cprop_hardreg: Workaround for narrow mode != lowpart targets

Message ID 20220113171140.43735-1-krebbel@linux.ibm.com
State New
Headers show
Series
  • cprop_hardreg: Workaround for narrow mode != lowpart targets
Related show

Commit Message

apinski--- via Gcc-patches Jan. 13, 2022, 5:11 p.m.
The cprop_hardreg pass is built around the assumption that accessing a
register in a narrower mode is the same as accessing the lowpart of
the register.  This unfortunately is not true for vector registers on
IBM Z. This caused a miscompile of LLVM with GCC 8.5. The problem
could not be reproduced with upstream GCC unfortunately but we have to
assume that it is latent there. The right fix would require
substantial changes to the cprop pass and is certainly something we
would want for our platform. But since this would not be acceptable
for older GCCs I'll go with what Vladimir proposed in the RedHat BZ
and introduce a hopefully temporary and undocumented target hook to
disable that specific transformation in regcprop.c.

Here the RedHat BZ for reference:
https://bugzilla.redhat.com/show_bug.cgi?id=2028609

Bootstrapped and regression-tested on s390x.

Ok?

gcc/ChangeLog:

	* target.def (narrow_mode_refers_low_part_p): Add new target hook.
	* config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
	Implement new target hook for IBM Z.
	(TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
	* regcprop.c (maybe_mode_change): Disable transformation depending
	on the new target hook.
---
 gcc/config/s390/s390.c | 14 ++++++++++++++
 gcc/regcprop.c         |  3 ++-
 gcc/target.def         | 12 +++++++++++-
 3 files changed, 27 insertions(+), 2 deletions(-)

-- 
2.33.1

Comments

apinski--- via Gcc-patches Jan. 13, 2022, 10:03 p.m. | #1
On 1/13/22 18:11, Andreas Krebbel via Gcc-patches wrote:
...
> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done.  As long as the\n\

>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\

>  be used unless some pattern's constraint asks for one.",

>   bool, (unsigned int regno, machine_mode mode),

> - hook_bool_uint_mode_true)

> + hook_bool_uint_true)

>  

>  DEFHOOK

>  (modes_tieable_p,


That hunk was a copy and paste bug and does not belong to the patch.

Andreas
apinski--- via Gcc-patches Jan. 14, 2022, 7:37 a.m. | #2
On Thu, Jan 13, 2022 at 6:12 PM Andreas Krebbel via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>

> The cprop_hardreg pass is built around the assumption that accessing a

> register in a narrower mode is the same as accessing the lowpart of

> the register.  This unfortunately is not true for vector registers on

> IBM Z. This caused a miscompile of LLVM with GCC 8.5. The problem

> could not be reproduced with upstream GCC unfortunately but we have to

> assume that it is latent there. The right fix would require

> substantial changes to the cprop pass and is certainly something we

> would want for our platform. But since this would not be acceptable

> for older GCCs I'll go with what Vladimir proposed in the RedHat BZ

> and introduce a hopefully temporary and undocumented target hook to

> disable that specific transformation in regcprop.c.

>

> Here the RedHat BZ for reference:

> https://bugzilla.redhat.com/show_bug.cgi?id=2028609


Can the gist of this bug be put into the GCC bugzilla so the rev can
refer to it?  Can we have a testcase even?

I'm not quite understanding the problem but is it that, say,

 (subreg:DI (reg:V2DI ..) 0)

isn't the same as

 (lowpart:DI (reg:V2DI ...) 0)

?  The regcprop code looks more like asking whether the larger reg
is a composition of multiple other hardregs and will return the specific
hardreg corresponding to the lowpart - so like if on s390 the vector
registers overlap with some other regset.  But then doing the actual
accesses via the other regset regs doesn't actually work?  Isn't the
backend then lying to us (aka the mode_change_ok returns the
wrong answer)?

How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we
be sure this hack isn't still present in 10 years from now?

Thanks,
Richard.

> Bootstrapped and regression-tested on s390x.

>

> Ok?

>

> gcc/ChangeLog:

>

>         * target.def (narrow_mode_refers_low_part_p): Add new target hook.

>         * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):

>         Implement new target hook for IBM Z.

>         (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.

>         * regcprop.c (maybe_mode_change): Disable transformation depending

>         on the new target hook.

> ---

>  gcc/config/s390/s390.c | 14 ++++++++++++++

>  gcc/regcprop.c         |  3 ++-

>  gcc/target.def         | 12 +++++++++++-

>  3 files changed, 27 insertions(+), 2 deletions(-)

>

> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c

> index 056002e4a4a..aafc6d63be6 100644

> --- a/gcc/config/s390/s390.c

> +++ b/gcc/config/s390/s390.c

> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)

>    return false;

>  }

>

> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */

> +

> +static bool

> +s390_narrow_mode_refers_low_part_p (unsigned int regno)

> +{

> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))

> +    return false;

> +

> +  return true;

> +}

> +

> +

>  /* Implement TARGET_MODES_TIEABLE_P.  */

>

>  static bool

> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,

>  #undef TARGET_VECTORIZE_VEC_PERM_CONST

>  #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const

>

> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P

> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p

>

>  struct gcc_target targetm = TARGET_INITIALIZER;

>

> diff --git a/gcc/regcprop.c b/gcc/regcprop.c

> index 1a9bcf0a1ad..aaf94ad9b51 100644

> --- a/gcc/regcprop.c

> +++ b/gcc/regcprop.c

> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,

>

>    if (orig_mode == new_mode)

>      return gen_raw_REG (new_mode, regno);

> -  else if (mode_change_ok (orig_mode, new_mode, regno))

> +  else if (mode_change_ok (orig_mode, new_mode, regno)

> +          && targetm.narrow_mode_refers_low_part_p (regno))

>      {

>        int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);

>        int use_nregs = hard_regno_nregs (copy_regno, new_mode);

> diff --git a/gcc/target.def b/gcc/target.def

> index 8fd2533e90a..598eea501ff 100644

> --- a/gcc/target.def

> +++ b/gcc/target.def

> @@ -5446,6 +5446,16 @@ value that the middle-end intended.",

>   bool, (machine_mode from, machine_mode to, reg_class_t rclass),

>   hook_bool_mode_mode_reg_class_t_true)

>

> +/* This hook is used to work around a problem in regcprop. Hardcoded

> +assumptions currently prevent it from working correctly for targets

> +where the low part of a multi-word register doesn't align to accessing

> +the register with a narrower mode.  */

> +DEFHOOK_UNDOC

> +(narrow_mode_refers_low_part_p,

> +"",

> +bool, (unsigned int regno),

> +hook_bool_unit_true)

> +

>  /* Change pseudo allocno class calculated by IRA.  */

>  DEFHOOK

>  (ira_change_pseudo_allocno_class,

> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done.  As long as the\n\

>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\

>  be used unless some pattern's constraint asks for one.",

>   bool, (unsigned int regno, machine_mode mode),

> - hook_bool_uint_mode_true)

> + hook_bool_uint_true)

>

>  DEFHOOK

>  (modes_tieable_p,

> --

> 2.33.1

>
apinski--- via Gcc-patches Jan. 14, 2022, 7:41 p.m. | #3
On 1/14/22 08:37, Richard Biener wrote:
...
> Can the gist of this bug be put into the GCC bugzilla so the rev can

> refer to it? 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034

> Can we have a testcase even?

The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to
include it in my patch.

> I'm not quite understanding the problem but is it that, say,

> 

>  (subreg:DI (reg:V2DI ..) 0)

> 

> isn't the same as

> 

>  (lowpart:DI (reg:V2DI ...) 0)


(reg:DI v0) does not match the lower order bits of (reg:TI v0)

> ?  The regcprop code looks more like asking whether the larger reg

> is a composition of multiple other hardregs and will return the specific

> hardreg corresponding to the lowpart - so like if on s390 the vector

> registers overlap with some other regset.  But then doing the actual

> accesses via the other regset regs doesn't actually work?  Isn't the

> backend then lying to us (aka the mode_change_ok returns the

> wrong answer)?


can_change_mode_class should do the right thing. We return false in case somebody wants to change TI
to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop
only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok.

Before cprop we have:

(insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])
        (reg:TI 8 %r8)) -1
     (nil))


(insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])
        (reg:DI 16 %f0)) 1277 {*movdi_64}
     (nil))


(insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])
        (unspec:DI [
                (reg:V2DI 16 %f0)
                (const_int 1 [0x1])
            ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}
     (expr_list:REG_DEAD (reg:V2DI 16 %f0)
        (nil)))

So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop assuming that (reg:DI
f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) with (reg:DI 9 %r9)
what would be the DImode lowpart of (reg:TI r8)

Insn 155 and 156 are the result of applying the following splitter:

; Split a VR -> GPR TImode move into 2 vector load GR from VR element.
; For the higher order bits we do simply a DImode move while the
; second part is done via vec extract.  Both will end up as vlgvg.
(define_split
  [(set (match_operand:TI 0 "register_operand" "")
        (match_operand:TI 1 "register_operand" ""))]
  "TARGET_VX && reload_completed
   && GENERAL_REG_P (operands[0])
   && VECTOR_REG_P (operands[1])"
  [(set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
                                 UNSPEC_VEC_EXTRACT))]
{
  operands[2] = operand_subword (operands[0], 0, 0, TImode);
  operands[3] = operand_subword (operands[0], 1, 0, TImode);
  operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
  operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
})

Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle end is not expected
to do - because we prevent it in can_change_mode_class. However, I don't see anything wrong with
doing that in the splitter. In our backend this is well-defined as being the first element in the
vector register - the high part of the TImode vector register value.

Unfortunately it confuses cprop :(

Andreas



> 

> How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we

> be sure this hack isn't still present in 10 years from now?

> 

> Thanks,

> Richard.

> 

>> Bootstrapped and regression-tested on s390x.

>>

>> Ok?

>>

>> gcc/ChangeLog:

>>

>>         * target.def (narrow_mode_refers_low_part_p): Add new target hook.

>>         * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):

>>         Implement new target hook for IBM Z.

>>         (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.

>>         * regcprop.c (maybe_mode_change): Disable transformation depending

>>         on the new target hook.

>> ---

>>  gcc/config/s390/s390.c | 14 ++++++++++++++

>>  gcc/regcprop.c         |  3 ++-

>>  gcc/target.def         | 12 +++++++++++-

>>  3 files changed, 27 insertions(+), 2 deletions(-)

>>

>> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c

>> index 056002e4a4a..aafc6d63be6 100644

>> --- a/gcc/config/s390/s390.c

>> +++ b/gcc/config/s390/s390.c

>> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)

>>    return false;

>>  }

>>

>> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */

>> +

>> +static bool

>> +s390_narrow_mode_refers_low_part_p (unsigned int regno)

>> +{

>> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))

>> +    return false;

>> +

>> +  return true;

>> +}

>> +

>> +

>>  /* Implement TARGET_MODES_TIEABLE_P.  */

>>

>>  static bool

>> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,

>>  #undef TARGET_VECTORIZE_VEC_PERM_CONST

>>  #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const

>>

>> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P

>> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p

>>

>>  struct gcc_target targetm = TARGET_INITIALIZER;

>>

>> diff --git a/gcc/regcprop.c b/gcc/regcprop.c

>> index 1a9bcf0a1ad..aaf94ad9b51 100644

>> --- a/gcc/regcprop.c

>> +++ b/gcc/regcprop.c

>> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,

>>

>>    if (orig_mode == new_mode)

>>      return gen_raw_REG (new_mode, regno);

>> -  else if (mode_change_ok (orig_mode, new_mode, regno))

>> +  else if (mode_change_ok (orig_mode, new_mode, regno)

>> +          && targetm.narrow_mode_refers_low_part_p (regno))

>>      {

>>        int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);

>>        int use_nregs = hard_regno_nregs (copy_regno, new_mode);

>> diff --git a/gcc/target.def b/gcc/target.def

>> index 8fd2533e90a..598eea501ff 100644

>> --- a/gcc/target.def

>> +++ b/gcc/target.def

>> @@ -5446,6 +5446,16 @@ value that the middle-end intended.",

>>   bool, (machine_mode from, machine_mode to, reg_class_t rclass),

>>   hook_bool_mode_mode_reg_class_t_true)

>>

>> +/* This hook is used to work around a problem in regcprop. Hardcoded

>> +assumptions currently prevent it from working correctly for targets

>> +where the low part of a multi-word register doesn't align to accessing

>> +the register with a narrower mode.  */

>> +DEFHOOK_UNDOC

>> +(narrow_mode_refers_low_part_p,

>> +"",

>> +bool, (unsigned int regno),

>> +hook_bool_unit_true)

>> +

>>  /* Change pseudo allocno class calculated by IRA.  */

>>  DEFHOOK

>>  (ira_change_pseudo_allocno_class,

>> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done.  As long as the\n\

>>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\

>>  be used unless some pattern's constraint asks for one.",

>>   bool, (unsigned int regno, machine_mode mode),

>> - hook_bool_uint_mode_true)

>> + hook_bool_uint_true)

>>

>>  DEFHOOK

>>  (modes_tieable_p,

>> --

>> 2.33.1

>>
apinski--- via Gcc-patches Jan. 14, 2022, 7:51 p.m. | #4
On Fri, Jan 14, 2022 at 11:42 AM Andreas Krebbel via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>

> On 1/14/22 08:37, Richard Biener wrote:

> ...

> > Can the gist of this bug be put into the GCC bugzilla so the rev can

> > refer to it?

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034

>

> > Can we have a testcase even?

> The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to

> include it in my patch.


So I also noticed there was PR 101260 which had the exact same
analysis as this patch with respect to s390 and TI mode splitting and
regcprop.
So I marked PR 104034 as a duplicate of PR 101260.

Thanks,
Andrew

>

> > I'm not quite understanding the problem but is it that, say,

> >

> >  (subreg:DI (reg:V2DI ..) 0)

> >

> > isn't the same as

> >

> >  (lowpart:DI (reg:V2DI ...) 0)

>

> (reg:DI v0) does not match the lower order bits of (reg:TI v0)

>

> > ?  The regcprop code looks more like asking whether the larger reg

> > is a composition of multiple other hardregs and will return the specific

> > hardreg corresponding to the lowpart - so like if on s390 the vector

> > registers overlap with some other regset.  But then doing the actual

> > accesses via the other regset regs doesn't actually work?  Isn't the

> > backend then lying to us (aka the mode_change_ok returns the

> > wrong answer)?

>

> can_change_mode_class should do the right thing. We return false in case somebody wants to change TI

> to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop

> only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok.

>

> Before cprop we have:

>

> (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])

>         (reg:TI 8 %r8)) -1

>      (nil))

>

>

> (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])

>         (reg:DI 16 %f0)) 1277 {*movdi_64}

>      (nil))

>

>

> (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])

>         (unspec:DI [

>                 (reg:V2DI 16 %f0)

>                 (const_int 1 [0x1])

>             ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}

>      (expr_list:REG_DEAD (reg:V2DI 16 %f0)

>         (nil)))

>

> So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop assuming that (reg:DI

> f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) with (reg:DI 9 %r9)

> what would be the DImode lowpart of (reg:TI r8)

>

> Insn 155 and 156 are the result of applying the following splitter:

>

> ; Split a VR -> GPR TImode move into 2 vector load GR from VR element.

> ; For the higher order bits we do simply a DImode move while the

> ; second part is done via vec extract.  Both will end up as vlgvg.

> (define_split

>   [(set (match_operand:TI 0 "register_operand" "")

>         (match_operand:TI 1 "register_operand" ""))]

>   "TARGET_VX && reload_completed

>    && GENERAL_REG_P (operands[0])

>    && VECTOR_REG_P (operands[1])"

>   [(set (match_dup 2) (match_dup 4))

>    (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]

>                                  UNSPEC_VEC_EXTRACT))]

> {

>   operands[2] = operand_subword (operands[0], 0, 0, TImode);

>   operands[3] = operand_subword (operands[0], 1, 0, TImode);

>   operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));

>   operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));

> })

>

> Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle end is not expected

> to do - because we prevent it in can_change_mode_class. However, I don't see anything wrong with

> doing that in the splitter. In our backend this is well-defined as being the first element in the

> vector register - the high part of the TImode vector register value.

>

> Unfortunately it confuses cprop :(

>

> Andreas

>

>

>

> >

> > How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we

> > be sure this hack isn't still present in 10 years from now?

> >

> > Thanks,

> > Richard.

> >

> >> Bootstrapped and regression-tested on s390x.

> >>

> >> Ok?

> >>

> >> gcc/ChangeLog:

> >>

> >>         * target.def (narrow_mode_refers_low_part_p): Add new target hook.

> >>         * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):

> >>         Implement new target hook for IBM Z.

> >>         (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.

> >>         * regcprop.c (maybe_mode_change): Disable transformation depending

> >>         on the new target hook.

> >> ---

> >>  gcc/config/s390/s390.c | 14 ++++++++++++++

> >>  gcc/regcprop.c         |  3 ++-

> >>  gcc/target.def         | 12 +++++++++++-

> >>  3 files changed, 27 insertions(+), 2 deletions(-)

> >>

> >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c

> >> index 056002e4a4a..aafc6d63be6 100644

> >> --- a/gcc/config/s390/s390.c

> >> +++ b/gcc/config/s390/s390.c

> >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)

> >>    return false;

> >>  }

> >>

> >> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */

> >> +

> >> +static bool

> >> +s390_narrow_mode_refers_low_part_p (unsigned int regno)

> >> +{

> >> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))

> >> +    return false;

> >> +

> >> +  return true;

> >> +}

> >> +

> >> +

> >>  /* Implement TARGET_MODES_TIEABLE_P.  */

> >>

> >>  static bool

> >> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,

> >>  #undef TARGET_VECTORIZE_VEC_PERM_CONST

> >>  #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const

> >>

> >> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P

> >> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p

> >>

> >>  struct gcc_target targetm = TARGET_INITIALIZER;

> >>

> >> diff --git a/gcc/regcprop.c b/gcc/regcprop.c

> >> index 1a9bcf0a1ad..aaf94ad9b51 100644

> >> --- a/gcc/regcprop.c

> >> +++ b/gcc/regcprop.c

> >> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,

> >>

> >>    if (orig_mode == new_mode)

> >>      return gen_raw_REG (new_mode, regno);

> >> -  else if (mode_change_ok (orig_mode, new_mode, regno))

> >> +  else if (mode_change_ok (orig_mode, new_mode, regno)

> >> +          && targetm.narrow_mode_refers_low_part_p (regno))

> >>      {

> >>        int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);

> >>        int use_nregs = hard_regno_nregs (copy_regno, new_mode);

> >> diff --git a/gcc/target.def b/gcc/target.def

> >> index 8fd2533e90a..598eea501ff 100644

> >> --- a/gcc/target.def

> >> +++ b/gcc/target.def

> >> @@ -5446,6 +5446,16 @@ value that the middle-end intended.",

> >>   bool, (machine_mode from, machine_mode to, reg_class_t rclass),

> >>   hook_bool_mode_mode_reg_class_t_true)

> >>

> >> +/* This hook is used to work around a problem in regcprop. Hardcoded

> >> +assumptions currently prevent it from working correctly for targets

> >> +where the low part of a multi-word register doesn't align to accessing

> >> +the register with a narrower mode.  */

> >> +DEFHOOK_UNDOC

> >> +(narrow_mode_refers_low_part_p,

> >> +"",

> >> +bool, (unsigned int regno),

> >> +hook_bool_unit_true)

> >> +

> >>  /* Change pseudo allocno class calculated by IRA.  */

> >>  DEFHOOK

> >>  (ira_change_pseudo_allocno_class,

> >> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done.  As long as the\n\

> >>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\

> >>  be used unless some pattern's constraint asks for one.",

> >>   bool, (unsigned int regno, machine_mode mode),

> >> - hook_bool_uint_mode_true)

> >> + hook_bool_uint_true)

> >>

> >>  DEFHOOK

> >>  (modes_tieable_p,

> >> --

> >> 2.33.1

> >>

>
apinski--- via Gcc-patches Jan. 14, 2022, 7:56 p.m. | #5
On 1/14/22 20:41, Andreas Krebbel via Gcc-patches wrote:
> On 1/14/22 08:37, Richard Biener wrote:

> ...

>> Can the gist of this bug be put into the GCC bugzilla so the rev can

>> refer to it? 

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034

> 

>> Can we have a testcase even?

> The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to

> include it in my patch.

> 

>> I'm not quite understanding the problem but is it that, say,

>>

>>  (subreg:DI (reg:V2DI ..) 0)

>>

>> isn't the same as

>>

>>  (lowpart:DI (reg:V2DI ...) 0)

> 

> (reg:DI v0) does not match the lower order bits of (reg:TI v0)

> 

>> ?  The regcprop code looks more like asking whether the larger reg

>> is a composition of multiple other hardregs and will return the specific

>> hardreg corresponding to the lowpart - so like if on s390 the vector

>> registers overlap with some other regset.  But then doing the actual

>> accesses via the other regset regs doesn't actually work?  Isn't the

>> backend then lying to us (aka the mode_change_ok returns the

>> wrong answer)?

> 

> can_change_mode_class should do the right thing. We return false in case somebody wants to change TI

> to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop

> only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok.


After writing this I'm wondering whether this would be a better fix:

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 18132425ab2..b6a3f4e3804 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -402,7 +402,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,

   if (orig_mode == new_mode)
     return gen_raw_REG (new_mode, regno);
-  else if (mode_change_ok (orig_mode, new_mode, regno))
+  else if (mode_change_ok (orig_mode, new_mode, regno)
+           && mode_change_ok (copy_mode, new_mode, copy_regno))
     {
       int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
       int use_nregs = hard_regno_nregs (copy_regno, new_mode);


Andreas

Patch

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 056002e4a4a..aafc6d63be6 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10488,6 +10488,18 @@  s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
   return false;
 }
 
+/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */
+
+static bool
+s390_narrow_mode_refers_low_part_p (unsigned int regno)
+{
+  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))
+    return false;
+
+  return true;
+}
+
+
 /* Implement TARGET_MODES_TIEABLE_P.  */
 
 static bool
@@ -17472,6 +17484,8 @@  s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,
 #undef TARGET_VECTORIZE_VEC_PERM_CONST
 #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const
 
+#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P
+#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p
 
 struct gcc_target targetm = TARGET_INITIALIZER;
 
diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 1a9bcf0a1ad..aaf94ad9b51 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -426,7 +426,8 @@  maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,
 
   if (orig_mode == new_mode)
     return gen_raw_REG (new_mode, regno);
-  else if (mode_change_ok (orig_mode, new_mode, regno))
+  else if (mode_change_ok (orig_mode, new_mode, regno)
+	   && targetm.narrow_mode_refers_low_part_p (regno))
     {
       int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
       int use_nregs = hard_regno_nregs (copy_regno, new_mode);
diff --git a/gcc/target.def b/gcc/target.def
index 8fd2533e90a..598eea501ff 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5446,6 +5446,16 @@  value that the middle-end intended.",
  bool, (machine_mode from, machine_mode to, reg_class_t rclass),
  hook_bool_mode_mode_reg_class_t_true)
 
+/* This hook is used to work around a problem in regcprop. Hardcoded
+assumptions currently prevent it from working correctly for targets
+where the low part of a multi-word register doesn't align to accessing
+the register with a narrower mode.  */
+DEFHOOK_UNDOC
+(narrow_mode_refers_low_part_p,
+"",
+bool, (unsigned int regno),
+hook_bool_unit_true)
+
 /* Change pseudo allocno class calculated by IRA.  */
 DEFHOOK
 (ira_change_pseudo_allocno_class,
@@ -5949,7 +5959,7 @@  register if floating point arithmetic is not being done.  As long as the\n\
 floating registers are not in class @code{GENERAL_REGS}, they will not\n\
 be used unless some pattern's constraint asks for one.",
  bool, (unsigned int regno, machine_mode mode),
- hook_bool_uint_mode_true)
+ hook_bool_uint_true)
 
 DEFHOOK
 (modes_tieable_p,