More NEGATE_EXPR folding in match.pd

Message ID 002801d7a562$997998a0$cc6cc9e0$@nextmovesoftware.com
State New
Headers show
Series
  • More NEGATE_EXPR folding in match.pd
Related show

Commit Message

Roger Sayle Sept. 9, 2021, 10:08 a.m.
As observed by Jakub in comment #2 of PR 98865, the expression -(a>>63)
is optimized in GENERIC but not in GIMPLE.  Investigating further it
turns out that this is one of a few transformations performed by
fold_negate_expr in fold-const.c that aren't yet performed by match.pd.
This patch moves/duplicates them there, and should be relatively safe
as these transformations are already performed by the compiler, but
just in different passes.

Alas the one minor complication is that some of these transformations
are only wins, if the intermediate result (of the multiplication or
division) is only used once, to avoid duplication/performing them again.
See gcc.dg/tree-ssa/ssa-free-88.c.  Normally, this is the perfect usage
of match's single_use (aka SSA's has_single_use).  Alas, single_use is
not always accurate in match.pd, as some passes will construct and
simplify an expression/stmt before inserting it into GIMPLE, and folding
during this process sees the temporary undercount from the data-flow.
To solve this, this patch introduces a new single_use_is_op_p that
double checks that the single_use has the expected tree_code/operation
and skips the transformation if we can tell single_use might be invalid.

A follow-up patch might be to investigate whether genmatch.c can be
tweaked to use this new helper function to implement the :s qualifier
when the enclosing context is/should be known, but that's overkill
to just unblock Jakub and Andrew on 98865.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?


2021-09-09  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* generic-match-head.c (single_use_is_op_p): New helper function.
	* gimple-match-head.c (single_use_is_op_p): New helper function.
	* match.pd (negation simplifications): Implement some negation
	folding transformations from fold-const.c's fold_negate_expr.

gcc/testsuite/ChangeLog
	* gcc.dg/fold-negate-1.c: New test case.

Roger
--

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

#define SHIFT ((8*__SIZEOF_INT__)-1)

int test_rshift_1(int x)
{
  int t = x >> SHIFT;
  return -t;
}

int test_rshift_2(int x)
{
  unsigned int t = (unsigned int)x >> SHIFT;
  return -t;
}

int test_rshift_3(int x)
{
  int t = (unsigned int)x >> SHIFT;
  return -t;
}

int test_rshift_4(int x)
{
  unsigned int t = x >> SHIFT;
  return -t;
}

double test_mul_1(double x)
{
  double t = -5.0 * x;
  return -t;
}

double test_mul_2(double x, double y)
{
  double t1 = -x;
  double t2 = t1 * y;
  return -t2;
}

double test_rdiv_1(double x, double y)
{
  double t1 = -x;
  double t2 = t1 / y;
  return -t2;
}

double test_rdiv_2(double x, double y)
{
  double t1 = -y;
  double t2 = x / t1;
  return -t2;
}

/* { dg-final { scan-tree-dump-not " -" "optimized" } } */

Comments

Harald Anlauf via Gcc-patches Sept. 9, 2021, 11:05 a.m. | #1
On Thu, 9 Sept 2021 at 15:38, Roger Sayle <roger@nextmovesoftware.com> wrote:
>

>

> As observed by Jakub in comment #2 of PR 98865, the expression -(a>>63)

> is optimized in GENERIC but not in GIMPLE.  Investigating further it

> turns out that this is one of a few transformations performed by

> fold_negate_expr in fold-const.c that aren't yet performed by match.pd.

> This patch moves/duplicates them there, and should be relatively safe

> as these transformations are already performed by the compiler, but

> just in different passes.

IIRC, we should probably remove the code from fold-const.c while
porting the pattern to
match.pd, which I suppose is case RSHIFT_EXPR block in
fold_negate_expr_1 near the
end of the function ?
>

> Alas the one minor complication is that some of these transformations

> are only wins, if the intermediate result (of the multiplication or

> division) is only used once, to avoid duplication/performing them again.

> See gcc.dg/tree-ssa/ssa-free-88.c.  Normally, this is the perfect usage

> of match's single_use (aka SSA's has_single_use).  Alas, single_use is

> not always accurate in match.pd, as some passes will construct and

> simplify an expression/stmt before inserting it into GIMPLE, and folding

> during this process sees the temporary undercount from the data-flow.

> To solve this, this patch introduces a new single_use_is_op_p that

> double checks that the single_use has the expected tree_code/operation

> and skips the transformation if we can tell single_use might be invalid.

>

> A follow-up patch might be to investigate whether genmatch.c can be

> tweaked to use this new helper function to implement the :s qualifier

> when the enclosing context is/should be known, but that's overkill

> to just unblock Jakub and Andrew on 98865.

>

> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"

> and "make -k check" with no new failures.  Ok for mainline?

>

>

> 2021-09-09  Roger Sayle  <roger@nextmovesoftware.com>

>

> gcc/ChangeLog

>         * generic-match-head.c (single_use_is_op_p): New helper function.

>         * gimple-match-head.c (single_use_is_op_p): New helper function.

>         * match.pd (negation simplifications): Implement some negation

>         folding transformations from fold-const.c's fold_negate_expr.

>

> gcc/testsuite/ChangeLog

>         * gcc.dg/fold-negate-1.c: New test case.

Just a small nit -- please include the test-case as part of the patch
(instead of separate attachment), thanks!

Thanks,
Prathamesh
>

> Roger

> --

>
Harald Anlauf via Gcc-patches Sept. 9, 2021, 12:04 p.m. | #2
On Thu, Sep 9, 2021 at 12:08 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>

>

> As observed by Jakub in comment #2 of PR 98865, the expression -(a>>63)

> is optimized in GENERIC but not in GIMPLE.  Investigating further it

> turns out that this is one of a few transformations performed by

> fold_negate_expr in fold-const.c that aren't yet performed by match.pd.

> This patch moves/duplicates them there, and should be relatively safe

> as these transformations are already performed by the compiler, but

> just in different passes.

>

> Alas the one minor complication is that some of these transformations

> are only wins, if the intermediate result (of the multiplication or

> division) is only used once, to avoid duplication/performing them again.

> See gcc.dg/tree-ssa/ssa-free-88.c.  Normally, this is the perfect usage

> of match's single_use (aka SSA's has_single_use).  Alas, single_use is

> not always accurate in match.pd, as some passes will construct and

> simplify an expression/stmt before inserting it into GIMPLE, and folding

> during this process sees the temporary undercount from the data-flow.

> To solve this, this patch introduces a new single_use_is_op_p that

> double checks that the single_use has the expected tree_code/operation

> and skips the transformation if we can tell single_use might be invalid.

>

> A follow-up patch might be to investigate whether genmatch.c can be

> tweaked to use this new helper function to implement the :s qualifier

> when the enclosing context is/should be known, but that's overkill

> to just unblock Jakub and Andrew on 98865.

>

> This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"

> and "make -k check" with no new failures.  Ok for mainline?


I think that single_use_is_op_p is "bad" heuristics since it stretches
the SSA operand / immediate use use a bit too far.  Generally
fold_stmt and thus match.pd patterns may not rely on stmt
operands or immediate uses as only generated by update_stmt.

In fact the whole gimple_build machinery relies on match.pd
patterns operating on stmts that are _not_ in the IL yet and it
carefully restricts instruction combining to those stmts
(see gimple_build_valueize).  I suppose the cases you are running
into cross the boundary which is where these kind of issues
can happen.

That said, it's a heuristic that can't be perfect - some passes
are building a lot of IL into a sequence via gimple_build and
they are generally not too careful to flush stmts when they
make use of the result multiple times, so allowing has_zero_uses
is not conservative either.

That said - I'm not sure - (x*y) -> x * -y for easily negatable
y is something we should pursue during folding when we're
facing expression graphs.  That looks like sth for backprop.

Iff we want to improve single_use heuristics on the side
of saying 'no' when we're not absolutely sure that there's
a single_use we probably want to think of some way
of tracking immediate use reliability and documenting
constraints and assumptions we make when matching
patterns.

For the ssa-fre-88.c issue at hand it's probably
visit_nary_op that shouldn't simplify the expression
it wants to insert when it does the reverse transform.
So sth like

diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 8058a1e3c6a..8b11def93bc 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -2333,15 +2333,16 @@ vn_nary_build_or_lookup_1 (gimple_match_op
*res_op, bool insert)
      So first simplify and lookup this expression to see if it
      is already available.  */
   /* For simplification valueize.  */
-  unsigned i;
-  for (i = 0; i < res_op->num_ops; ++i)
-    if (TREE_CODE (res_op->ops[i]) == SSA_NAME)
-      {
-       tree tem = vn_valueize (res_op->ops[i]);
-       if (!tem)
-         break;
-       res_op->ops[i] = tem;
-      }
+  unsigned i = 0;
+  if (!insert)
+    for (; i < res_op->num_ops; ++i)
+      if (TREE_CODE (res_op->ops[i]) == SSA_NAME)
+       {
+         tree tem = vn_valueize (res_op->ops[i]);
+         if (!tem)
+           break;
+         res_op->ops[i] = tem;
+       }
   /* If valueization of an operand fails (it is not available), skip
      simplification.  */
   bool res = false;

or rather adding a new flag to the function, bool simplify and explicitely
disabling it from the negation transform.

Richard.

>

> 2021-09-09  Roger Sayle  <roger@nextmovesoftware.com>

>

> gcc/ChangeLog

>         * generic-match-head.c (single_use_is_op_p): New helper function.

>         * gimple-match-head.c (single_use_is_op_p): New helper function.

>         * match.pd (negation simplifications): Implement some negation

>         folding transformations from fold-const.c's fold_negate_expr.

>

> gcc/testsuite/ChangeLog

>         * gcc.dg/fold-negate-1.c: New test case.

>

> Roger

> --

>

Patch

diff --git a/gcc/generic-match-head.c b/gcc/generic-match-head.c
index f426208..64115a2 100644
--- a/gcc/generic-match-head.c
+++ b/gcc/generic-match-head.c
@@ -63,6 +63,16 @@  single_use (tree t ATTRIBUTE_UNUSED)
   return true;
 }
 
+/* Like single_use above, but confirm that the single use (if any)
+   is an expression of tree_code OP.  For GENERIC, assume the worst
+   and postpone folding transformations until GIMPLE.  */
+
+static inline bool
+single_use_is_op_p (tree t ATTRIBUTE_UNUSED, tree_code op ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
 /* Return true if math operations should be canonicalized,
    e.g. sqrt(sqrt(x)) -> pow(x, 0.25).  */
 
diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
index 7112c11..2026f96a 100644
--- a/gcc/gimple-match-head.c
+++ b/gcc/gimple-match-head.c
@@ -1141,6 +1141,22 @@  single_use (tree t)
   return TREE_CODE (t) != SSA_NAME || has_zero_uses (t) || has_single_use (t);
 }
 
+/* Like single_use above, but confirm that the single use (if any)
+   is an expression of tree_code OP.  */
+
+static inline bool
+single_use_is_op_p (tree t, tree_code op)
+{
+  use_operand_p use;
+  gimple *stmt;
+
+  return TREE_CODE (t) != SSA_NAME
+	 || has_zero_uses (t)
+	 || (single_imm_use (t, &use, &stmt)
+	     && is_gimple_assign (stmt)
+	     && gimple_assign_rhs_code (stmt) == op);
+}
+
 /* Return true if math operations should be canonicalized,
    e.g. sqrt(sqrt(x)) -> pow(x, 0.25).  */
 
diff --git a/gcc/match.pd b/gcc/match.pd
index 008f775..091da6c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1481,6 +1481,36 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (!FIXED_POINT_TYPE_P (type))
  (plus @0 (negate @1))))
 
+/* Other simplifications of negation (c.f. fold_negate_expr_1).  */
+(simplify
+ (negate (mult:c@0 @1 negate_expr_p@2))
+ (if (! TYPE_UNSIGNED (type)
+      && ! HONOR_SIGN_DEPENDENT_ROUNDING (type)
+      && single_use_is_op_p (@0, NEGATE_EXPR))
+  (mult @1 (negate @2))))
+
+(simplify
+ (negate (rdiv@0 @1 negate_expr_p@2))
+ (if (! HONOR_SIGN_DEPENDENT_ROUNDING (type)
+      && single_use_is_op_p (@0, NEGATE_EXPR))
+  (rdiv @1 (negate @2))))
+
+(simplify
+ (negate (rdiv@0 negate_expr_p@1 @2))
+ (if (! HONOR_SIGN_DEPENDENT_ROUNDING (type)
+      && single_use_is_op_p (@0, NEGATE_EXPR))
+  (rdiv (negate @1) @2)))
+
+/* Fold -((int)x >> (prec - 1)) into (unsigned)x >> (prec - 1).  */
+(simplify
+ (negate (convert? (rshift @0 INTEGER_CST@1)))
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
+      && wi::to_wide (@1) == element_precision (type) - 1)
+  (with { tree stype = TREE_TYPE (@0);
+	  tree ntype = TYPE_UNSIGNED (stype) ? signed_type_for (stype)
+					     : unsigned_type_for (stype); }
+   (convert (rshift:ntype (convert:ntype @0) @1)))))
+
 /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
    when profitable.
    For bitwise binary operations apply operand conversions to the