[RFC] tree-optimization/100499 - multiple_of_p bad behavior wrt niter analysis

Message ID o28o7rp-279-nop1-p7q7-39snq252qnp3@fhfr.qr
State New
Headers show
Series
  • [RFC] tree-optimization/100499 - multiple_of_p bad behavior wrt niter analysis
Related show

Commit Message

Richard Biener July 22, 2021, 10:41 a.m.
This avoids using multiple_of_p in niter analysis when its behavior
to assume the tested expression does not invoke integer overflow
produces wrong niter analysis results.  For the cases multiple_of_p
handles power-of-two values of bottom look safe which should also be
the majority of cases we care about in addition to the constant case
now handled explicitely.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I'm unsure how important a "perfect" solution is (rewriting
multiple_of_p), and wonder whether this solution is preferable
for now (and especially for branches).  I've not yet tried
to sanitize multiple_of_p plus use range info to prove
no-overflow where TYPE_OVERFLOW_UNDEFINED doesn't tell us
immediately.

2021-07-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/100499
	* tree-ssa-loop-niter.c (number_of_iterations_ne): Restrict
	multiple_of_p queries to power-of-two bottoms, handle
	the all constant case inline.

	* gcc.dg/torture/pr100499-1.c: New testcase.
	* gcc.dg/torture/pr100499-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr100499-1.c | 28 +++++++++++++++++++++++
 gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +++++++++++++
 gcc/tree-ssa-loop-niter.c                 |  8 ++++++-
 3 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c

-- 
2.26.2

Comments

Indu Bhagat via Gcc-patches July 26, 2021, 3:40 a.m. | #1
On Thu, Jul 22, 2021 at 6:41 PM Richard Biener <rguenther@suse.de> wrote:
>

> This avoids using multiple_of_p in niter analysis when its behavior

Hmm, but this patch actually introduces one more call to
multiple_of_p, also it doesn't touch the below use:
if (niter->control.no_overflow && multiple_of_p (type, c, s))
  {
    niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, c, s);
    return true;
  }

> to assume the tested expression does not invoke integer overflow

> produces wrong niter analysis results.  For the cases multiple_of_p

> handles power-of-two values of bottom look safe which should also be

> the majority of cases we care about in addition to the constant case

> now handled explicitely.

>

> Bootstrapped and tested on x86_64-unknown-linux-gnu.

>

> I'm unsure how important a "perfect" solution is (rewriting

> multiple_of_p), and wonder whether this solution is preferable

I am still for this approach now, it only needs to be conservative,
rather than perfect, especially if there are not many breakages with a
conservative version multiple_of_p?

> for now (and especially for branches).  I've not yet tried

> to sanitize multiple_of_p plus use range info to prove

> no-overflow where TYPE_OVERFLOW_UNDEFINED doesn't tell us

> immediately.

>

> 2021-07-22  Richard Biener  <rguenther@suse.de>

>

>         PR tree-optimization/100499

>         * tree-ssa-loop-niter.c (number_of_iterations_ne): Restrict

>         multiple_of_p queries to power-of-two bottoms, handle

>         the all constant case inline.

>

>         * gcc.dg/torture/pr100499-1.c: New testcase.

>         * gcc.dg/torture/pr100499-2.c: Likewise.

> ---

>  gcc/testsuite/gcc.dg/torture/pr100499-1.c | 28 +++++++++++++++++++++++

>  gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +++++++++++++

>  gcc/tree-ssa-loop-niter.c                 |  8 ++++++-

>  3 files changed, 51 insertions(+), 1 deletion(-)

>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c

>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c

>

> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-1.c b/gcc/testsuite/gcc.dg/torture/pr100499-1.c

> new file mode 100644

> index 00000000000..97ab6051554

> --- /dev/null

> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-1.c

> @@ -0,0 +1,28 @@

> +/* { dg-do run } */

> +/* { dg-require-effective-target int32plus } */

> +

> +typedef __UINT16_TYPE__ uint16_t;

> +static uint16_t g_2823 = 0xEC75L;

> +static uint16_t g_116 = 0xBC07L;

> +

> +static uint16_t

> +safe_mul_func_uint16_t_u_u(uint16_t ui1, uint16_t ui2)

> +{

> +  return ((unsigned int)ui1) * ((unsigned int)ui2);

> +}

> +

> +int main (int argc, char* argv[])

> +{

> +  uint16_t l_2815 = 65535UL;

> +  uint16_t *l_2821 = &g_116;

> +  uint16_t *l_2822 = &g_2823;

> +

> +lbl_2826:

> +  l_2815 &= 0x9DEF1EAEL;

> +  if (+(safe_mul_func_uint16_t_u_u(((*l_2821) = l_2815), (--(*l_2822)))))

> +    goto lbl_2826;

> +

> +  if (g_2823 != 32768)

> +    __builtin_abort ();

> +  return 0;

> +}

> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-2.c b/gcc/testsuite/gcc.dg/torture/pr100499-2.c

> new file mode 100644

> index 00000000000..999f931806a

> --- /dev/null

> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-2.c

> @@ -0,0 +1,16 @@

> +/* { dg-do run } */

> +

> +unsigned char ag = 55;

> +unsigned i;

> +int main()

> +{

> +  unsigned char c;

> +  unsigned char a = ag;

> +d:

> +  c = a-- * 52;

> +  if (c)

> +    goto d;

> +  if (a != 255)

> +    __builtin_abort ();

> +  return 0;

> +}

> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c

> index 1b5605c26b8..c6b953c5316 100644

> --- a/gcc/tree-ssa-loop-niter.c

> +++ b/gcc/tree-ssa-loop-niter.c

> @@ -1050,7 +1050,13 @@ number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv,

>       Note, for NE_EXPR, base equals to FINAL is a special case, in

>       which the loop exits immediately, and the iv does not overflow.  */

>    if (!niter->control.no_overflow

> -      && (integer_onep (s) || multiple_of_p (type, c, s)))

> +      && (integer_onep (s)

> +         || (poly_int_tree_p (c)

> +             && multiple_p (wi::to_poly_widest (c), wi::to_poly_widest (s)))

> +         /* ???  multiple_of_p assumes the expression 'c' does not overflow

> +            but that cannot be guaranteed, so we restrict 's' to power of

> +            two values where that should not be an issue.  See PR100499.  */

> +         || (integer_pow2p (s) && multiple_of_p (type, c, s))))

>      {

>        tree t, cond, new_c, relaxed_cond = boolean_false_node;

I (to be blamed) added this part of code to special handle cases like
pr34114, now I feel it's in the wrong direction.  Ideally this part of
code is unnecessary and conditions will be (it is now) incorporated
into niter->assumptions which should be simplified to 1/0
correspondingly.  The only problem is that assumptions are not
appropriately simplified.
Is it possible to incorporate a more powerful solver (like Z3) in GCC
for such cases, e.g., assumption simplification, multiple_of_p, etc..
Oh, we don't do SCEV analysis under assumptions either (LLVM does it),
which could be beneficial too.  Of course a solver is usually too
heavy so it has to be restricted in some way.

Thanks,
bin
>

> --

> 2.26.2
Richard Biener July 27, 2021, 12:15 p.m. | #2
On Mon, 26 Jul 2021, Bin.Cheng wrote:

> On Thu, Jul 22, 2021 at 6:41 PM Richard Biener <rguenther@suse.de> wrote:

> >

> > This avoids using multiple_of_p in niter analysis when its behavior

> Hmm, but this patch actually introduces one more call to

> multiple_of_p, also it doesn't touch the below use:

> if (niter->control.no_overflow && multiple_of_p (type, c, s))

>   {

>     niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, c, s);

>     return true;

>   }

> 

> > to assume the tested expression does not invoke integer overflow

> > produces wrong niter analysis results.  For the cases multiple_of_p

> > handles power-of-two values of bottom look safe which should also be

> > the majority of cases we care about in addition to the constant case

> > now handled explicitely.

> >

> > Bootstrapped and tested on x86_64-unknown-linux-gnu.

> >

> > I'm unsure how important a "perfect" solution is (rewriting

> > multiple_of_p), and wonder whether this solution is preferable

> I am still for this approach now, it only needs to be conservative,

> rather than perfect, especially if there are not many breakages with a

> conservative version multiple_of_p?


OK, I'll see how ugly it gets.

> > for now (and especially for branches).  I've not yet tried

> > to sanitize multiple_of_p plus use range info to prove

> > no-overflow where TYPE_OVERFLOW_UNDEFINED doesn't tell us

> > immediately.

> >

> > 2021-07-22  Richard Biener  <rguenther@suse.de>

> >

> >         PR tree-optimization/100499

> >         * tree-ssa-loop-niter.c (number_of_iterations_ne): Restrict

> >         multiple_of_p queries to power-of-two bottoms, handle

> >         the all constant case inline.

> >

> >         * gcc.dg/torture/pr100499-1.c: New testcase.

> >         * gcc.dg/torture/pr100499-2.c: Likewise.

> > ---

> >  gcc/testsuite/gcc.dg/torture/pr100499-1.c | 28 +++++++++++++++++++++++

> >  gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +++++++++++++

> >  gcc/tree-ssa-loop-niter.c                 |  8 ++++++-

> >  3 files changed, 51 insertions(+), 1 deletion(-)

> >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c

> >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c

> >

> > diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-1.c b/gcc/testsuite/gcc.dg/torture/pr100499-1.c

> > new file mode 100644

> > index 00000000000..97ab6051554

> > --- /dev/null

> > +++ b/gcc/testsuite/gcc.dg/torture/pr100499-1.c

> > @@ -0,0 +1,28 @@

> > +/* { dg-do run } */

> > +/* { dg-require-effective-target int32plus } */

> > +

> > +typedef __UINT16_TYPE__ uint16_t;

> > +static uint16_t g_2823 = 0xEC75L;

> > +static uint16_t g_116 = 0xBC07L;

> > +

> > +static uint16_t

> > +safe_mul_func_uint16_t_u_u(uint16_t ui1, uint16_t ui2)

> > +{

> > +  return ((unsigned int)ui1) * ((unsigned int)ui2);

> > +}

> > +

> > +int main (int argc, char* argv[])

> > +{

> > +  uint16_t l_2815 = 65535UL;

> > +  uint16_t *l_2821 = &g_116;

> > +  uint16_t *l_2822 = &g_2823;

> > +

> > +lbl_2826:

> > +  l_2815 &= 0x9DEF1EAEL;

> > +  if (+(safe_mul_func_uint16_t_u_u(((*l_2821) = l_2815), (--(*l_2822)))))

> > +    goto lbl_2826;

> > +

> > +  if (g_2823 != 32768)

> > +    __builtin_abort ();

> > +  return 0;

> > +}

> > diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-2.c b/gcc/testsuite/gcc.dg/torture/pr100499-2.c

> > new file mode 100644

> > index 00000000000..999f931806a

> > --- /dev/null

> > +++ b/gcc/testsuite/gcc.dg/torture/pr100499-2.c

> > @@ -0,0 +1,16 @@

> > +/* { dg-do run } */

> > +

> > +unsigned char ag = 55;

> > +unsigned i;

> > +int main()

> > +{

> > +  unsigned char c;

> > +  unsigned char a = ag;

> > +d:

> > +  c = a-- * 52;

> > +  if (c)

> > +    goto d;

> > +  if (a != 255)

> > +    __builtin_abort ();

> > +  return 0;

> > +}

> > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c

> > index 1b5605c26b8..c6b953c5316 100644

> > --- a/gcc/tree-ssa-loop-niter.c

> > +++ b/gcc/tree-ssa-loop-niter.c

> > @@ -1050,7 +1050,13 @@ number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv,

> >       Note, for NE_EXPR, base equals to FINAL is a special case, in

> >       which the loop exits immediately, and the iv does not overflow.  */

> >    if (!niter->control.no_overflow

> > -      && (integer_onep (s) || multiple_of_p (type, c, s)))

> > +      && (integer_onep (s)

> > +         || (poly_int_tree_p (c)

> > +             && multiple_p (wi::to_poly_widest (c), wi::to_poly_widest (s)))

> > +         /* ???  multiple_of_p assumes the expression 'c' does not overflow

> > +            but that cannot be guaranteed, so we restrict 's' to power of

> > +            two values where that should not be an issue.  See PR100499.  */

> > +         || (integer_pow2p (s) && multiple_of_p (type, c, s))))

> >      {

> >        tree t, cond, new_c, relaxed_cond = boolean_false_node;

> I (to be blamed) added this part of code to special handle cases like

> pr34114, now I feel it's in the wrong direction.  Ideally this part of

> code is unnecessary and conditions will be (it is now) incorporated

> into niter->assumptions which should be simplified to 1/0

> correspondingly.  The only problem is that assumptions are not

> appropriately simplified.


Yeah, that's an issue.

> Is it possible to incorporate a more powerful solver (like Z3) in GCC

> for such cases, e.g., assumption simplification, multiple_of_p, etc..

> Oh, we don't do SCEV analysis under assumptions either (LLVM does it),

> which could be beneficial too.  Of course a solver is usually too

> heavy so it has to be restricted in some way.


And yes, we have to deal with the SCEV analysis which also should
produce assumptions itself under which evolutions are affine.

Richard.

Patch

diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-1.c b/gcc/testsuite/gcc.dg/torture/pr100499-1.c
new file mode 100644
index 00000000000..97ab6051554
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr100499-1.c
@@ -0,0 +1,28 @@ 
+/* { dg-do run } */
+/* { dg-require-effective-target int32plus } */
+
+typedef __UINT16_TYPE__ uint16_t;
+static uint16_t g_2823 = 0xEC75L;
+static uint16_t g_116 = 0xBC07L;
+
+static uint16_t
+safe_mul_func_uint16_t_u_u(uint16_t ui1, uint16_t ui2)
+{
+  return ((unsigned int)ui1) * ((unsigned int)ui2);
+}
+
+int main (int argc, char* argv[])
+{
+  uint16_t l_2815 = 65535UL;
+  uint16_t *l_2821 = &g_116;
+  uint16_t *l_2822 = &g_2823;
+
+lbl_2826:
+  l_2815 &= 0x9DEF1EAEL;
+  if (+(safe_mul_func_uint16_t_u_u(((*l_2821) = l_2815), (--(*l_2822)))))
+    goto lbl_2826;
+
+  if (g_2823 != 32768)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-2.c b/gcc/testsuite/gcc.dg/torture/pr100499-2.c
new file mode 100644
index 00000000000..999f931806a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr100499-2.c
@@ -0,0 +1,16 @@ 
+/* { dg-do run } */
+
+unsigned char ag = 55;
+unsigned i;
+int main()
+{
+  unsigned char c;
+  unsigned char a = ag;
+d:
+  c = a-- * 52;
+  if (c)
+    goto d;
+  if (a != 255)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 1b5605c26b8..c6b953c5316 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1050,7 +1050,13 @@  number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv,
      Note, for NE_EXPR, base equals to FINAL is a special case, in
      which the loop exits immediately, and the iv does not overflow.  */
   if (!niter->control.no_overflow
-      && (integer_onep (s) || multiple_of_p (type, c, s)))
+      && (integer_onep (s)
+	  || (poly_int_tree_p (c)
+	      && multiple_p (wi::to_poly_widest (c), wi::to_poly_widest (s)))
+	  /* ???  multiple_of_p assumes the expression 'c' does not overflow
+	     but that cannot be guaranteed, so we restrict 's' to power of
+	     two values where that should not be an issue.  See PR100499.  */
+	  || (integer_pow2p (s) && multiple_of_p (type, c, s))))
     {
       tree t, cond, new_c, relaxed_cond = boolean_false_node;