Fix unnecessary register spill that depends on function ordering

Message ID AM0PR08MB5121FB517BC1AB5447CE7764926F0@AM0PR08MB5121.eurprd08.prod.outlook.com
State New
Headers show
Series
  • Fix unnecessary register spill that depends on function ordering
Related show

Commit Message

Omar Tahir June 30, 2020, 4:58 p.m.
Hi,

The variables first_moveable_pseudo and last_moveable_pseudo aren't reset
after compiling a function, which means they leak into the first scheduler
pass of the following function. In some cases, this can cause an extra spill
during register allocation of the second function.

This patch fixes this by setting

                first_moveable_pseudo = last_moveable_pseudo

at the beginning of the first scheduler pass.

Because the spill occurs in the middle of the IRA pass it is highly
target-sensitive, so I have provided a test case that works on aarch64. There
doesn't seem to be a target-independent way to trigger this bug, since the
RTL at the IRA stage is already quite target-specific.

Interestingly, the aarch64 test case here doesn't actually perform a complete
register spill - the value is stored on the stack but never retrieved.
Perhaps this points to another, deeper bug?

Bootstrapped and regression tested on aarch64-none-linux-gnu.

I don't have write privileges, so if it's fine could someone push for me?

Thanks,
Omar

----

gcc/ChangeLog:

2020-06-30: Omar Tahir <omar.tahir@arm.com>

                * sched-rgn.c: Include ira-int.h, ira.h, regs.h.
                (rest_of_handle_sched): Clear moveable pseudo range.

gcc/testsuite/ChangeLog:

2020-06-30: Omar Tahir <omar.tahir@arm.com>

                * gcc.target/aarch64/nospill.c: New test.

----

Comments

Richard Sandiford June 30, 2020, 8:07 p.m. | #1
Hi,

Thanks for the patch.

Omar Tahir <Omar.Tahir@arm.com> writes:
> Hi,

>

> The variables first_moveable_pseudo and last_moveable_pseudo aren't reset

> after compiling a function, which means they leak into the first scheduler

> pass of the following function. In some cases, this can cause an extra spill

> during register allocation of the second function.

>

> This patch fixes this by setting

>

>                 first_moveable_pseudo = last_moveable_pseudo

>

> at the beginning of the first scheduler pass.

>

> Because the spill occurs in the middle of the IRA pass it is highly

> target-sensitive, so I have provided a test case that works on aarch64. There

> doesn't seem to be a target-independent way to trigger this bug, since the

> RTL at the IRA stage is already quite target-specific.

>

> Interestingly, the aarch64 test case here doesn't actually perform a complete

> register spill - the value is stored on the stack but never retrieved.

> Perhaps this points to another, deeper bug?

>

> Bootstrapped and regression tested on aarch64-none-linux-gnu.

>

> I don't have write privileges, so if it's fine could someone push for me?

>

> Thanks,

> Omar

>

> ----

>

> gcc/ChangeLog:

>

> 2020-06-30: Omar Tahir <omar.tahir@arm.com>

>

>                 * sched-rgn.c: Include ira-int.h, ira.h, regs.h.

>                 (rest_of_handle_sched): Clear moveable pseudo range.

>

> gcc/testsuite/ChangeLog:

>

> 2020-06-30: Omar Tahir <omar.tahir@arm.com>

>

>                 * gcc.target/aarch64/nospill.c: New test.

>

> ----

>

> diff --git a/gcc/sched-rgn.c b/gcc/sched-rgn.c

> index 7f5dfdb..51329fc 100644

> --- a/gcc/sched-rgn.c

> +++ b/gcc/sched-rgn.c

> @@ -65,6 +65,9 @@ along with GCC; see the file COPYING3.  If not see

> #include "dbgcnt.h"

> #include "pretty-print.h"

> #include "print-rtl.h"

> +#include "ira.h"

> +#include "regs.h"

> +#include "ira-int.h"

>

> /* Disable warnings about quoting issues in the pp_xxx calls below

>     that (intentionally) don't follow GCC diagnostic conventions.  */

> @@ -3719,6 +3722,7 @@ static unsigned int

> rest_of_handle_sched (void)

> {

> #ifdef INSN_SCHEDULING

> +  first_moveable_pseudo = last_moveable_pseudo;

>    if (flag_selective_scheduling

>        && ! maybe_skip_selective_scheduling ())

>      run_selective_scheduling ();


I think instead we should zero out both variables at the end of IRA.
There are other places besides the scheduler that call into the IRA code,
so tackling the problem there seems more general.

> diff --git a/gcc/testsuite/gcc.target/aarch64/nospill.c b/gcc/testsuite/gcc.target/aarch64/nospill.c

> new file mode 100644

> index 0000000..60399d8

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/aarch64/nospill.c

> @@ -0,0 +1,35 @@

> +/* { dg-do compile } */

> +/* { dg-options "-O3" } */

> +

> +/* The pseudo for P is marked as moveable in the IRA pass. */

> +float

> +func_0 (float a, float b, float c)

> +{

> +  float p = c / a;

> +

> +  if (b > 1)

> +    {

> +      b /= p;

> +      if (c > 2)

> +        a /= 3;

> +    }

> +

> +  return b / c * a;

> +}

> +

> +/* If first_moveable_pseudo and last_moveable_pseudo are not reset correctly,

> +   they will carry over and spill the pseudo for Q. */

> +float

> +func_1 (float a, float b, float c)

> +{

> +  float q = a + b;

> +

> +  c *= a / (b + b);

> +  if (a > 0)

> +    c *= q;

> +

> +  return a * b * c;

> +}

> +

> +/* We have plenty of spare registers, so check nothing has been spilled. */

> +/* { dg-final { scan-assembler-not "str" } } */


The testcase looks good, but it's probably better to make that “\tstr\t”.
The problem with plain “str” is that assembly output can often include
pathnames and version strings, and it's possible that one of those could
contain “str”.

Thanks,
Richard
Omar Tahir July 1, 2020, 9:22 a.m. | #2
Hi Richard,

From: Richard Sandiford <richard.sandiford@arm.com>

> > @@ -3719,6 +3722,7 @@ static unsigned int rest_of_handle_sched (void) 

> > { #ifdef INSN_SCHEDULING

> > +  first_moveable_pseudo = last_moveable_pseudo;

> >    if (flag_selective_scheduling

> >        && ! maybe_skip_selective_scheduling ())

> >      run_selective_scheduling ();

> 

> I think instead we should zero out both variables at the end of IRA.

> There are other places besides the scheduler that call into the IRA code, so tackling the problem there seems more general.


If you zero first_moveable_pseudo and last_moveable_pseudo after IRA then
they'll be zero for the second scheduler pass, which uses them.

In fact, I've just realised that the GCSE and move_loop_invariants passes
also use them (they both call ira_set_pseudo_classes which calls
find_costs_and_classes which uses these variables).

They need to be zeroed or set equal to each other before the first pass that
uses them (move_loop_invariants), but kept alive until after the last pass
that uses them (GCSE). So if there's a function that sets things up right
before the RTL passes start then I think that's a good location candidate.

> > +/* We have plenty of spare registers, so check nothing has been 

> > +spilled. */

> > +/* { dg-final { scan-assembler-not "str" } } */

> 

> The testcase looks good, but it's probably better to make that “\tstr\t”.

> The problem with plain “str” is that assembly output can often include pathnames and version strings, and it's possible that one of those could contain “str”.


Good catch, I'll keep that tip in mind for future!

Thanks,
Omar
Richard Sandiford July 1, 2020, 9 p.m. | #3
Omar Tahir <Omar.Tahir@arm.com> writes:
> Hi Richard,

>

> From: Richard Sandiford <richard.sandiford@arm.com>

>> > @@ -3719,6 +3722,7 @@ static unsigned int rest_of_handle_sched (void) 

>> > { #ifdef INSN_SCHEDULING

>> > +  first_moveable_pseudo = last_moveable_pseudo;

>> >    if (flag_selective_scheduling

>> >        && ! maybe_skip_selective_scheduling ())

>> >      run_selective_scheduling ();

>> 

>> I think instead we should zero out both variables at the end of IRA.

>> There are other places besides the scheduler that call into the IRA code, so tackling the problem there seems more general.

>

> If you zero first_moveable_pseudo and last_moveable_pseudo after IRA then

> they'll be zero for the second scheduler pass, which uses them.


Are you sure?  It shouldn't be doing that, since there are no pseudos
left when the second scheduling pass runs.  RA replaces all pseudos
with hard registers.

So if the values in the variables has a noticeable effect on sched2,
I think that's even more reason to clear them after IRA :-)

Thanks,
Richard
Omar Tahir July 2, 2020, 10:52 a.m. | #4
> Omar Tahir <Omar.Tahir@arm.com> writes:

> > Hi Richard,

> >

> > From: Richard Sandiford <richard.sandiford@arm.com>

> >> > @@ -3719,6 +3722,7 @@ static unsigned int rest_of_handle_sched (void) 

> >> > { #ifdef INSN_SCHEDULING

> >> > +  first_moveable_pseudo = last_moveable_pseudo;

> >> >    if (flag_selective_scheduling

> >> >        && ! maybe_skip_selective_scheduling ())

> >> >      run_selective_scheduling ();

> >> 

> >> I think instead we should zero out both variables at the end of IRA.

> >> There are other places besides the scheduler that call into the IRA code, so tackling the problem there seems more general.

> >

> > If you zero first_moveable_pseudo and last_moveable_pseudo after IRA then

> > they'll be zero for the second scheduler pass, which uses them.

> 

> Are you sure?  It shouldn't be doing that, since there are no pseudos

> left when the second scheduling pass runs.  RA replaces all pseudos

> with hard registers.

> 

> So if the values in the variables has a noticeable effect on sched2,

> I think that's even more reason to clear them after IRA :-)


That's a good point. A few other passes call functions that make use of
the moveable pseudo variables. But if they're before IRA then they should
be zero, and as you say if they're after IRA then there are no pseudos left!

I've moved the reset to the end of move_unallocated_pseudos. Unfortunately I
can't inline the patch as there's a form feed (^L) that's mangling the text,
not sure how to get around that.

Thanks,
Omar

----

gcc/ChangeLog:

2020-07-02: Omar Tahir <omar.tahir@arm.com>

	* ira.c (move_unallocated_pseudos): Zero first_moveable_pseudo and
	last_moveable_pseudo before returning.

gcc/testsuite/ChangeLog:
        
2020-07-02: Omar Tahir <omar.tahir@arm.com>

	* gcc.target/aarch64/nospill.c: New test.
Richard Sandiford July 9, 2020, 9:16 a.m. | #5
Sorry for the slow reply.

Omar Tahir <Omar.Tahir@arm.com> writes:
>> Omar Tahir <Omar.Tahir@arm.com> writes:

>> > Hi Richard,

>> >

>> > From: Richard Sandiford <richard.sandiford@arm.com>

>> >> > @@ -3719,6 +3722,7 @@ static unsigned int rest_of_handle_sched (void) 

>> >> > { #ifdef INSN_SCHEDULING

>> >> > +  first_moveable_pseudo = last_moveable_pseudo;

>> >> >    if (flag_selective_scheduling

>> >> >        && ! maybe_skip_selective_scheduling ())

>> >> >      run_selective_scheduling ();

>> >> 

>> >> I think instead we should zero out both variables at the end of IRA.

>> >> There are other places besides the scheduler that call into the IRA code, so tackling the problem there seems more general.

>> >

>> > If you zero first_moveable_pseudo and last_moveable_pseudo after IRA then

>> > they'll be zero for the second scheduler pass, which uses them.

>> 

>> Are you sure?  It shouldn't be doing that, since there are no pseudos

>> left when the second scheduling pass runs.  RA replaces all pseudos

>> with hard registers.

>> 

>> So if the values in the variables has a noticeable effect on sched2,

>> I think that's even more reason to clear them after IRA :-)

>

> That's a good point. A few other passes call functions that make use of

> the moveable pseudo variables. But if they're before IRA then they should

> be zero, and as you say if they're after IRA then there are no pseudos left!

>

> I've moved the reset to the end of move_unallocated_pseudos. Unfortunately I

> can't inline the patch as there's a form feed (^L) that's mangling the text,

> not sure how to get around that.


Thanks, pushed to master.

Richard

Patch

diff --git a/gcc/sched-rgn.c b/gcc/sched-rgn.c
index 7f5dfdb..51329fc 100644
--- a/gcc/sched-rgn.c
+++ b/gcc/sched-rgn.c
@@ -65,6 +65,9 @@  along with GCC; see the file COPYING3.  If not see
#include "dbgcnt.h"
#include "pretty-print.h"
#include "print-rtl.h"
+#include "ira.h"
+#include "regs.h"
+#include "ira-int.h"

/* Disable warnings about quoting issues in the pp_xxx calls below
    that (intentionally) don't follow GCC diagnostic conventions.  */
@@ -3719,6 +3722,7 @@  static unsigned int
rest_of_handle_sched (void)
{
#ifdef INSN_SCHEDULING
+  first_moveable_pseudo = last_moveable_pseudo;
   if (flag_selective_scheduling
       && ! maybe_skip_selective_scheduling ())
     run_selective_scheduling ();
diff --git a/gcc/testsuite/gcc.target/aarch64/nospill.c b/gcc/testsuite/gcc.target/aarch64/nospill.c
new file mode 100644
index 0000000..60399d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/nospill.c
@@ -0,0 +1,35 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* The pseudo for P is marked as moveable in the IRA pass. */
+float
+func_0 (float a, float b, float c)
+{
+  float p = c / a;
+
+  if (b > 1)
+    {
+      b /= p;
+      if (c > 2)
+        a /= 3;
+    }
+
+  return b / c * a;
+}
+
+/* If first_moveable_pseudo and last_moveable_pseudo are not reset correctly,
+   they will carry over and spill the pseudo for Q. */
+float
+func_1 (float a, float b, float c)
+{
+  float q = a + b;
+
+  c *= a / (b + b);
+  if (a > 0)
+    c *= q;
+
+  return a * b * c;
+}
+
+/* We have plenty of spare registers, so check nothing has been spilled. */
+/* { dg-final { scan-assembler-not "str" } } */