tree-optimization/99954 - fix loop distribution memcpy classification

Message ID nycvar.YFH.7.76.2104071429350.30412@elmra.sevgm.obk
State New
Headers show
Series
  • tree-optimization/99954 - fix loop distribution memcpy classification
Related show

Commit Message

Richard Biener April 7, 2021, 12:29 p.m.
This fixes bogus classification of a copy as memcpy.  We cannot use
plain dependence analysis to decide between memcpy and memmove when
it computes no dependence.  Instead we have to try harder later which
the patch does for the gcc.dg/tree-ssa/ldist-24.c testcase by resorting
to tree-affine to compute the difference between src and dest and
compare against the copy size.


Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-04-07  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/99954
	* tree-loop-distribution.c: Include tree-affine.h.
	(generate_memcpy_builtin): Try using tree-affine to prove
	non-overlap.
	(loop_distribution::classify_builtin_ldst): Always classify
	as PKIND_MEMMOVE.

	* gcc.dg/torture/pr99954.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr99954.c | 30 ++++++++++++++++++++++++++
 gcc/tree-loop-distribution.c           | 17 +++++++++++++--
 2 files changed, 45 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr99954.c

-- 
2.26.2

Patch

diff --git a/gcc/testsuite/gcc.dg/torture/pr99954.c b/gcc/testsuite/gcc.dg/torture/pr99954.c
new file mode 100644
index 00000000000..7d447035912
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr99954.c
@@ -0,0 +1,30 @@ 
+/* { dg-do run } */
+
+#include <assert.h>
+
+#define CONTAINER_KIND union
+
+typedef CONTAINER_KIND container { int value; } container;
+
+void move(container* end, container* start) {
+    container* p;
+    for (p = end; p > start; p--) {
+	(p)->value = (p-1)->value;
+    }
+}
+
+#define N 100
+
+int main(int argc, char* argv[]) {
+    container vals[N];
+    int i;
+    for (i=0; i<N; i++) {
+        vals[i].value = argc + i;
+    }
+    move(&vals[N-1], &vals[0]);
+    assert(vals[0].value == argc + 0);
+    for (i=1; i<N; i++) {
+        assert(vals[i].value == argc + i - 1);
+    }
+    return 0;
+}
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 583bb062b76..8b91a30d1c6 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -115,6 +115,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-vectorizer.h"
 #include "tree-eh.h"
 #include "gimple-fold.h"
+#include "tree-affine.h"
 
 
 #define MAX_DATAREFS_NUM \
@@ -1212,6 +1213,18 @@  generate_memcpy_builtin (class loop *loop, partition *partition)
     kind = BUILT_IN_MEMCPY;
   else
     kind = BUILT_IN_MEMMOVE;
+  /* Try harder if we're copying a constant size.  */
+  if (kind == BUILT_IN_MEMMOVE && poly_int_tree_p (nb_bytes))
+    {
+      aff_tree asrc, adest;
+      tree_to_aff_combination (src, ptr_type_node, &asrc);
+      tree_to_aff_combination (dest, ptr_type_node, &adest);
+      aff_combination_scale (&adest, -1);
+      aff_combination_add (&asrc, &adest);
+      if (aff_comb_cannot_overlap_p (&asrc, wi::to_poly_widest (nb_bytes),
+				     wi::to_poly_widest (nb_bytes)))
+	kind = BUILT_IN_MEMCPY;
+    }
 
   dest = force_gimple_operand_gsi (&gsi, dest, true, NULL_TREE,
 				   false, GSI_CONTINUE_LINKING);
@@ -1759,11 +1772,11 @@  loop_distribution::classify_builtin_ldst (loop_p loop, struct graph *rdg,
   /* Now check that if there is a dependence.  */
   ddr_p ddr = get_data_dependence (rdg, src_dr, dst_dr);
 
-  /* Classify as memcpy if no dependence between load and store.  */
+  /* Classify as memmove if no dependence between load and store.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
     {
       partition->builtin = alloc_builtin (dst_dr, src_dr, base, src_base, size);
-      partition->kind = PKIND_MEMCPY;
+      partition->kind = PKIND_MEMMOVE;
       return;
     }