rs6000: Fix PR91275

Message ID 6569d067-8ebc-8360-61a8-499a3700d216@linux.ibm.com
State New
Headers show
Series
  • rs6000: Fix PR91275
Related show

Commit Message

Bill Schmidt Oct. 1, 2019, 1:24 a.m.
Hi,

PR91275 observes that __builtin_crypto_vpmsumd fails to work properly
with -O1 or higher with -mcpu=power8.  That combination spells swap
optimization.  Indeed, all vpmsum* instructions were being accepted
as swappable operations.  This is correct for all of them but vpmsumd,
which creates a 128-bit result.

The -std=gnu11 in the testcase is there to avoid errors about long long
not being accepted with pure ANSI.  The "11" part is arbitrary.  The
testcase is modified from the original bug report.

This patch disallows swap optimization in the presence of vpmsumd.
Bootstrapped and tested on powerpc64le-unknown-linux-gnu.  Is this okay
for trunk, and for backport to all active branches after an appropriate
waiting period?

Thanks,
Bill


[gcc]

2019-09-30  Bill Schmidt  <wschmidt@linux.ibm.com>

	* config/rs6000/rs6000-p8swap.c (rtx_is_swappable_p): Don't swap
	vpmsumd.

[gcc/testsuite]

2019-09-30  Bill Schmidt  <wschmdit@linux.ibm.com>

	* gcc.target/powerpc/pr91275.c: New.

Comments

Segher Boessenkool Oct. 1, 2019, 2:26 a.m. | #1
Hi Bill,

On Mon, Sep 30, 2019 at 08:24:09PM -0500, Bill Schmidt wrote:
> PR91275 observes that __builtin_crypto_vpmsumd fails to work properly

> with -O1 or higher with -mcpu=power8.  That combination spells swap

> optimization.  Indeed, all vpmsum* instructions were being accepted

> as swappable operations.  This is correct for all of them but vpmsumd,

> which creates a 128-bit result.


Yeah, that is obvious once you see it spelled out :-)

> The -std=gnu11 in the testcase is there to avoid errors about long long

> not being accepted with pure ANSI.  The "11" part is arbitrary.  The

> testcase is modified from the original bug report.


That is fine.

> This patch disallows swap optimization in the presence of vpmsumd.

> Bootstrapped and tested on powerpc64le-unknown-linux-gnu.  Is this okay

> for trunk, and for backport to all active branches after an appropriate

> waiting period?


Yes please.  Thanks,


Segher


> 2019-09-30  Bill Schmidt  <wschmidt@linux.ibm.com>

> 

> 	* config/rs6000/rs6000-p8swap.c (rtx_is_swappable_p): Don't swap

> 	vpmsumd.

> 

> [gcc/testsuite]

> 

> 2019-09-30  Bill Schmidt  <wschmdit@linux.ibm.com>

> 

> 	* gcc.target/powerpc/pr91275.c: New.

Patch

Index: gcc/config/rs6000/rs6000-p8swap.c
===================================================================
--- gcc/config/rs6000/rs6000-p8swap.c	(revision 276360)
+++ gcc/config/rs6000/rs6000-p8swap.c	(working copy)
@@ -791,6 +791,11 @@  rtx_is_swappable_p (rtx op, unsigned int *special)
  	  case UNSPEC_REDUC_PLUS:
  	  case UNSPEC_REDUC:
  	    return 1;
+	  case UNSPEC_VPMSUM:
+	    /* vpmsumd is not swappable, but vpmsum[bhw] are.  */
+	    if (GET_MODE (op) == V2DImode)
+	      return 0;
+	    break;
  	  }
        }
  
Index: gcc/testsuite/gcc.target/powerpc/pr91275.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr91275.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr91275.c	(working copy)
@@ -0,0 +1,21 @@ 
+/* Test that we generate vpmsumd correctly without a swap error.  */
+
+/* { dg-do run { target { p8vector_hw } } } */
+/* { dg-options "-O2 -std=gnu11" } */
+
+#include <altivec.h>
+
+int main() {
+
+  const unsigned long long r0l = 0x8e7dfceac070e3a0;
+  vector unsigned long long r0 = (vector unsigned long long) {r0l, 0}, v;
+  const vector unsigned long long pd
+    = (vector unsigned long) {0xc2LLU << 56, 0};
+
+  v = __builtin_crypto_vpmsumd ((vector unsigned long long) {r0[0], 0}, pd);
+
+  if (v[0] != 0x4000000000000000 || v[1] != 0x65bd7ab605a4a8ff)
+    __builtin_abort ();
+
+  return 0;
+}