[v2,0/6] rs6000: Add SSE4.1 "blend", "ceil", "floor"

Message ID 20210716135022.489455-1-pc@us.ibm.com

Paul A. Clarke via Gcc-patches July 16, 2021, 1:50 p.m.
This set combines three independent "v1" patchsets.  The "blend"
patches were originally posted together with the "test" patches,
which have since been merged.

Instead of copying tests from gcc/testsuite/gcc.target/i386, I wrote
new ones.  The i386 tests in question use rand() to generate the input
data and assembly to compute the rounded values.  Using rand() to
generate test inputs seems wrong, and the assembly is x86-specific and
so not portable.  The new tests use static data, primarily exercising
the edges of the dynamic range (the magnitudes at which fractions start
to be unrepresentable).
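
To illustrate what "edges of the dynamic range" can mean for doubles,
here is a minimal sketch.  It is not taken from the patch: the struct
layout, the names (round_case, ref_ceil, ref_floor,
count_round_mismatches) and the chosen values are all assumptions made
for illustration.  It relies only on the fact that an IEEE double has
a 53-bit significand, so fractions stop being representable at a
magnitude of 2^52.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for one static test case; not from the patch.  */
struct round_case {
  double input;
  double want_ceil;
  double want_floor;
};

/* At magnitude 2^52 the spacing between adjacent doubles is 1.0, so no
   fractional part can be stored.  Just below 2^52 the spacing is 0.5.  */
static const struct round_case round_cases[] = {
  {  0.5,                    1.0,                    0.0                   },
  { -0.5,                   -0.0,                   -1.0                   },
  {  0x1.fffffffffffffp+51,  0x1p+52,                0x1.ffffffffffffep+51 },
  { -0x1.fffffffffffffp+51, -0x1.ffffffffffffep+51, -0x1p+52               },
  {  0x1p+52,                0x1p+52,                0x1p+52               },
};

/* Scalar reference floor that avoids libm.  NaN is not handled.  */
static double
ref_floor (double x)
{
  assert (x == x);                      /* reject NaN in this sketch */
  if (x >= 0x1p+52 || x <= -0x1p+52)
    return x;                           /* already integral */
  double t = (double) (long long) x;    /* truncate toward zero */
  return t > x ? t - 1.0 : t;
}

/* Scalar reference ceil, same restrictions as ref_floor.  */
static double
ref_ceil (double x)
{
  assert (x == x);
  if (x >= 0x1p+52 || x <= -0x1p+52)
    return x;
  double t = (double) (long long) x;
  return t < x ? t + 1.0 : t;
}

/* Count the cases whose expected values disagree with the reference.  */
static size_t
count_round_mismatches (void)
{
  size_t bad = 0;
  for (size_t i = 0; i < sizeof round_cases / sizeof round_cases[0]; i++)
    {
      const struct round_case *c = &round_cases[i];
      if (ref_ceil (c->input) != c->want_ceil)
        bad++;
      if (ref_floor (c->input) != c->want_floor)
        bad++;
    }
  return bad;
}
```

A real test would compare the intrinsic's result against the expected
columns; the reference functions here only check that the table is
self-consistent.  The == comparison treats -0.0 and 0.0 as equal, so a
test that cares about the sign of zero would need a bitwise comparison.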

Tested on ppc64le, ppc64, ppc.

v2:
- Rewrite blends to use vec_perm.
- Improve formatting.
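
As a rough picture of how a blend can be expressed as a byte permute,
here is a portable scalar model.  It is a sketch under stated
assumptions, not the code in smmintrin.h: model_perm only imitates the
idea behind vec_perm (each control byte selects one byte from the
32-byte concatenation of two inputs), and all function names are
hypothetical.  Byte and lane indices are in memory order, so the model
sidesteps the endianness handling a real implementation needs.  The
blend semantics are those of the SSE4.1 blendpd operation: bit i of
the immediate selects lane i from the second operand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a 16-byte permute: control values 0..15 pick a byte
   from A, 16..31 pick a byte from B.  Not the real vec_perm.  */
static void
model_perm (const uint8_t a[16], const uint8_t b[16],
            const uint8_t ctl[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    {
      uint8_t sel = ctl[i] & 31;
      out[i] = sel < 16 ? a[sel] : b[sel - 16];
    }
}

/* Build a permute control for a two-lane double blend.  Bit LANE of
   IMM2 set means "take that 8-byte lane from B".  */
static void
blendpd_control (int imm2, uint8_t ctl[16])
{
  assert ((imm2 & ~3) == 0);            /* only two lanes exist */
  for (int lane = 0; lane < 2; lane++)
    for (int byte = 0; byte < 8; byte++)
      {
        int idx = lane * 8 + byte;
        ctl[idx] = ((imm2 >> lane) & 1) ? 16 + idx : idx;
      }
}

/* Blend two pairs of doubles by going through the byte permute.  */
static void
model_blend_pd (const double a[2], const double b[2], int imm2,
                double out[2])
{
  uint8_t ab[16], bb[16], ctl[16], ob[16];
  memcpy (ab, a, sizeof ab);
  memcpy (bb, b, sizeof bb);
  blendpd_control (imm2, ctl);
  model_perm (ab, bb, ctl, ob);
  memcpy (out, ob, sizeof ob);
}
```

With a = {1.0, 2.0} and b = {10.0, 20.0}, an immediate of 2 yields
{1.0, 20.0} and an immediate of 1 yields {10.0, 2.0}.  Since the
immediate is a compile-time constant, the control vector can be too.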

Paul A. Clarke (6):
  rs6000: Add support for SSE4.1 "blend" intrinsics
  rs6000: Add tests for SSE4.1 "blend" intrinsics
  rs6000: Add support for SSE4.1 "ceil" intrinsics
  rs6000: Add tests for SSE4.1 "ceil" intrinsics
  rs6000: Add support for SSE4.1 "floor" intrinsics
  rs6000: Add tests for SSE4.1 "floor" intrinsics

 gcc/config/rs6000/smmintrin.h                 | 124 ++++++++++++++++++
 .../gcc.target/powerpc/sse4_1-blendpd.c       |  89 +++++++++++++
 .../gcc.target/powerpc/sse4_1-blendps-2.c     |  81 ++++++++++++
 .../gcc.target/powerpc/sse4_1-blendps.c       |  90 +++++++++++++
 .../gcc.target/powerpc/sse4_1-blendvpd.c      |  65 +++++++++
 .../gcc.target/powerpc/sse4_1-ceilpd.c        |  51 +++++++
 .../gcc.target/powerpc/sse4_1-ceilps.c        |  41 ++++++
 .../gcc.target/powerpc/sse4_1-ceilsd.c        | 119 +++++++++++++++++
 .../gcc.target/powerpc/sse4_1-ceilss.c        |  95 ++++++++++++++
 .../gcc.target/powerpc/sse4_1-check.h         |   4 +
 .../gcc.target/powerpc/sse4_1-floorpd.c       |  51 +++++++
 .../gcc.target/powerpc/sse4_1-floorps.c       |  41 ++++++
 .../gcc.target/powerpc/sse4_1-floorsd.c       | 119 +++++++++++++++++
 .../gcc.target/powerpc/sse4_1-floorss.c       |  95 ++++++++++++++
 .../gcc.target/powerpc/sse4_1-round-data.h    |  20 +++
 .../gcc.target/powerpc/sse4_1-round.h         |  27 ++++
 .../gcc.target/powerpc/sse4_1-round2.h        |  27 ++++
 .../gcc.target/powerpc/sse4_1-roundpd-2.c     |  36 +++++
 .../gcc.target/powerpc/sse4_1-roundpd-3.c     |  36 +++++
 19 files changed, 1211 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-blendpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-blendps-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-blendps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-blendvpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-ceilpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-ceilps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-ceilsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-ceilss.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-floorpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-floorps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-floorsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-floorss.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round-data.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round2.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd-3.c

-- 
2.27.0