middle-end: Optimize (A&C)^(B&C) to (A^B)&C in simplify_rtx.

Message ID 006501d64024$93550030$b9ff0090$@nextmovesoftware.com
State New
Headers show
Series
  • middle-end: Optimize (A&C)^(B&C) to (A^B)&C in simplify_rtx.
Related show

Commit Message

Roger Sayle June 11, 2020, 7:14 p.m.
My apologies in advance for a middle-end patch without a test case.   The
patch below

implements a simple/safe missing transformation in the RTL optimizers, that
transforms

(A&C)^(B&C) into the equivalent (A^B)&C, when C doesn't side-effect, such as
a constant.

 

I originally identified this opportunity in gfortran, where:

 

integer function foo(i) result(res)

  integer(kind=16), intent(in) :: i

  res = poppar(i)

end function

 

currently on x86_64 with -O2 -march=corei7 gfortran produces:

foo_:   popcntq (%rdi), %rdx

        popcntq 8(%rdi), %rax

        andl    $1, %edx

        andl    $1, %eax

        xorl    %edx, %eax

        ret

 

But with this patch, now produces:

foo_:   popcntq (%rdi), %rdx

        popcntq 8(%rdi), %rax

        xorl    %edx, %eax

        andl    $1, %eax

        ret

 

The equivalent C/C++ testcase is:

 

unsigned int foo(unsigned int x, unsigned  int y)

{

  return __builtin_parityll(x) ^ __builtin_parityll(y);

}

 

where GCC currently generates:

foo:    movl    %esi, %eax

        movl    %edi, %edi

        popcntq %rdi, %rdi

        popcntq %rax, %rax

        andl    $1, %edi

        andl    $1, %eax

        xorl    %edi, %eax

        ret

 

and with this patch, it instead generates:

foo:    movl    %esi, %eax

        movl    %edi, %edi

        popcntq %rdi, %rdi

        popcntq %rax, %rax

        xorl    %edi, %eax

        andl    $1, %eax

        ret

 

 

The trouble is I'm just about to submit a patch to improve constant folding
of parity in the

middle-end's match.pd, which will generate different RTL code sequences for
the above

two examples.  Hopefully, folks agree it's better to have a RTL optimization
that's difficult

to test, than not perform this simplification at all.  The
semantics/correctness of this

transformation are tested by the run-time tests in
gfortran.dg/popcnt_poppar_2.F90

 

This patch has been tested with "make bootstrap" and "make -k check" on

x86_64-pc-linux-gnu with no regressions.  If approved, I'd very much
appreciate it if

someone could commit this change for me.

 

 

2020-06-11  Roger Sayle  <roger@nextmovesoftware.com>

 

        * simplify-rtx.c (simplify_binary_operation_1): Simplify

        (X & C) ^ (Y & C) to (X ^ Y) & C, when C is simple (i.e. a
constant).

 

 

Thanks very much in advance,

Roger

--

Roger Sayle

NextMove Software

Cambridge, UK

Comments

Richard Sandiford June 12, 2020, 7:31 a.m. | #1
"Roger Sayle" <roger@nextmovesoftware.com> writes:
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c

> index 28c2dc6..ccf5f6d 100644

> --- a/gcc/simplify-rtx.c

> +++ b/gcc/simplify-rtx.c

> @@ -3128,6 +3128,17 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,

>  				     mode);

>        }

>  

> +      /* Convert (xor (and A C) (and B C)) into (and (xor A B) C).  */

> +      if (GET_CODE (op0) == AND 

> +	  && GET_CODE (op1) == AND

> +	  && rtx_equal_p (XEXP (op0, 1), XEXP (op1, 1))

> +	  && ! side_effects_p (XEXP (op0, 1)))

> +	return simplify_gen_binary (AND, mode,

> +				    simplify_gen_binary (XOR, mode,

> +							 XEXP (op0, 0),

> +							 XEXP (op1, 0)),

> +				    XEXP (op0, 1));

> +

>        /* Convert (xor (and A B) B) to (and (not A) B).  The latter may

>  	 correspond to a machine insn or result in further simplifications

>  	 if B is a constant.  */


Looks good, but I guess we might as well add the corresponding
(ior (and …) (and …)) and (and (and …) (and …)) rules at the
same time.  E.g. maybe have a new subfunction for logical ops that
is shared by all of AND, IOR and XOR.  In the case of AND, this ought
to have precedence over simplify_associative_operation.

No separate testcase is fine, but maybe it'd be worth adding some
selftests to simplify_rtx_c_tests.

Hope this isn't the best being the enemy of the good…

Thanks,
Richard

Patch

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 28c2dc6..ccf5f6d 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3128,6 +3128,17 @@  simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 				     mode);
       }
 
+      /* Convert (xor (and A C) (and B C)) into (and (xor A B) C).  */
+      if (GET_CODE (op0) == AND 
+	  && GET_CODE (op1) == AND
+	  && rtx_equal_p (XEXP (op0, 1), XEXP (op1, 1))
+	  && ! side_effects_p (XEXP (op0, 1)))
+	return simplify_gen_binary (AND, mode,
+				    simplify_gen_binary (XOR, mode,
+							 XEXP (op0, 0),
+							 XEXP (op1, 0)),
+				    XEXP (op0, 1));
+
       /* Convert (xor (and A B) B) to (and (not A) B).  The latter may
 	 correspond to a machine insn or result in further simplifications
 	 if B is a constant.  */