Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]

Message ID 20210330171833.2184942-1-Paul.Zimmermann@inria.fr
State Superseded
Headers show
Series
  • Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469, #14470, #14471, #14472]
Related show

Commit Message

Paul Zimmermann March 30, 2021, 5:18 p.m.
For j0f/j1f/y0f/y1f, the largest error for all binary32
inputs is reduced to at most 9 ulps for all rounding modes.

The new code is enabled only when there is a cancellation at the very end of
the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not
give any visible slowdown on average.  Two different algorithms are used:

* around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of
  degree 3 are used, computed using the Sollya tool (https://www.sollya.org/)

* for large inputs, an asymptotic formula from [1] is used

[1] Fast and Accurate Bessel Function Computation,
    John Harrison, Proceedings of Arith 19, 2009.

Inputs yielding the new largest errors are added to auto-libm-test-in,
and ulps are regenerated for various targets (thanks Adhemerval Zanella).

Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu.
---
 math/auto-libm-test-in             |  15 +
 math/auto-libm-test-out-j0         |  50 +++
 math/auto-libm-test-out-j1         |  50 +++
 math/auto-libm-test-out-y0         |  50 +++
 math/auto-libm-test-out-y1         |  75 +++++
 sysdeps/aarch64/libm-test-ulps     |  70 ++--
 sysdeps/ieee754/flt-32/e_j0f.c     | 515 ++++++++++++++++++++++++++---
 sysdeps/ieee754/flt-32/e_j1f.c     | 512 ++++++++++++++++++++++++++--
 sysdeps/powerpc/fpu/libm-test-ulps |  62 ++--
 sysdeps/s390/fpu/libm-test-ulps    |  68 ++--
 sysdeps/sparc/fpu/libm-test-ulps   |  68 ++--
 sysdeps/x86_64/fpu/libm-test-ulps  |  76 ++---
 12 files changed, 1369 insertions(+), 242 deletions(-)

-- 
2.30.2

Comments

Stafford Horne via Libc-alpha March 30, 2021, 6:18 p.m. | #1
On 30/03/2021 14:18, Paul Zimmermann wrote:
> For j0f/j1f/y0f/y1f, the largest error for all binary32

> inputs is reduced to at most 9 ulps for all rounding modes.

> 

> The new code is enabled only when there is a cancellation at the very end of

> the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not

> give any visible slowdown on average.  Two different algorithms are used:

> 

> * around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of

>   degree 3 are used, computed using the Sollya tool (https://www.sollya.org/)

> 

> * for large inputs, an asymptotic formula from [1] is used

> 

> [1] Fast and Accurate Bessel Function Computation,

>     John Harrison, Proceedings of Arith 19, 2009.

> 

> Inputs yielding the new largest errors are added to auto-libm-test-in,

> and ulps are regenerated for various targets (thanks Adhemerval Zanella).

> 

> Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu.


Patch looks good, thanks for working on this.  Besides the minor style
nits regarding comments, please open bug reports for the xfail entries
added on auto-libm-test-in and add a comment to reference them.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>


> ---

>  math/auto-libm-test-in             |  15 +

>  math/auto-libm-test-out-j0         |  50 +++

>  math/auto-libm-test-out-j1         |  50 +++

>  math/auto-libm-test-out-y0         |  50 +++

>  math/auto-libm-test-out-y1         |  75 +++++

>  sysdeps/aarch64/libm-test-ulps     |  70 ++--

>  sysdeps/ieee754/flt-32/e_j0f.c     | 515 ++++++++++++++++++++++++++---

>  sysdeps/ieee754/flt-32/e_j1f.c     | 512 ++++++++++++++++++++++++++--

>  sysdeps/powerpc/fpu/libm-test-ulps |  62 ++--

>  sysdeps/s390/fpu/libm-test-ulps    |  68 ++--

>  sysdeps/sparc/fpu/libm-test-ulps   |  68 ++--

>  sysdeps/x86_64/fpu/libm-test-ulps  |  76 ++---

>  12 files changed, 1369 insertions(+), 242 deletions(-)

> 

> diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in

> index 4edaaa8ee1..9ccbd04567 100644

> --- a/math/auto-libm-test-in

> +++ b/math/auto-libm-test-in

> @@ -5792,6 +5792,9 @@ j0 0x1p16382

>  j0 0x1p16383

>  # the next value generates larger error bounds on x86_64 (binary32)

>  j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc

> +# the next values yield the largest error (9 ulps) for binary32 (x86_64)

> +j0 0x1.04c39cp+6

> +j0 0x1.4b7066p+7

>  # the next value exercises the flt-32 code path for x >= 2^127

>  j0 0x8.2f4ecp+124

>  

> @@ -5827,6 +5830,9 @@ j1 0x1p-600

>  j1 0x1p-10000

>  # the next value generates larger error bounds on x86_64 (binary32)

>  j1 0x3.ae4b2p+0 xfail-rounding:ibm128-libgcc

> +# the next two values yield large errors in the binary32 format

> +j1 0x1.2f28eap+7 xfail-rounding:binary64 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc

> +j1 0x1.a1d20ap+6 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc

>  j1 min

>  j1 -min

>  j1 min_subnorm


The BZ #14469, #14470, #14471, #14472 refers explicit to the float
version, so we will need to open bug reports and reference them for the
double and long double version, as indicated by math/gen-auto-libm-tests.c:

| "xfail", "xfail-rounding", "spurious-" and "missing-" flags should be
| accompanied by a comment referring to an open bug in glibc Bugzilla.


> @@ -8275,6 +8281,11 @@ y0 0x1p-600

>  y0 0x1p-10000

>  # the next value generates larger error bounds on x86_64 (binary32)

>  y0 0xd.3432bp-4

> +# the next two values yield the largest error (9 ulps) for binary32 (x86_64)

> +# the first one fails for binary64 and ldbl-96 because y0 is known to have

> +# large errors (#16492)

> +y0 0x1.33eaacp+5 xfail:binary64 xfail:intel96 xfail-rounding:ibm128-libgcc

> +y0 0x1.a681cep-1 xfail-rounding:ibm128-libgcc

>  y0 min

>  y0 min_subnorm

>  

> @@ -8303,6 +8314,10 @@ y1 0x1p-100

>  y1 0x1p-110

>  y1 0x1p-600

>  y1 0x1p-10000

> +# the next three values yield the largest error in the binary32 format

> +y1 0x1.065194p+7 xfail-rounding:binary64 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc

> +y1 0x1.c1badep+0 xfail-rounding:ibm128-libgcc

> +y1 0x1.c1bc2ep+0

>  y1 min

>  y1 min_subnorm

>  

> diff --git a/math/auto-libm-test-out-j0 b/math/auto-libm-test-out-j0

> index cc1fea074e..85c4148de3 100644

> --- a/math/auto-libm-test-out-j0

> +++ b/math/auto-libm-test-out-j0

> @@ -1359,6 +1359,56 @@ j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc

>  = j0 tonearest ibm128 0x2.602774p+0 : 0x3.e83779fe19911fa806cee83ab7p-8 : inexact-ok

>  = j0 towardzero ibm128 0x2.602774p+0 : 0x3.e83779fe19911fa806cee83ab7p-8 : xfail:ibm128-libgcc inexact-ok

>  = j0 upward ibm128 0x2.602774p+0 : 0x3.e83779fe19911fa806cee83ab8p-8 : xfail:ibm128-libgcc inexact-ok

> +j0 0x1.04c39cp+6

> += j0 downward binary32 0x4.130e7p+4 : -0x6.dd65d8p-16 : inexact-ok

> += j0 tonearest binary32 0x4.130e7p+4 : -0x6.dd65dp-16 : inexact-ok

> += j0 towardzero binary32 0x4.130e7p+4 : -0x6.dd65dp-16 : inexact-ok

> += j0 upward binary32 0x4.130e7p+4 : -0x6.dd65dp-16 : inexact-ok

> += j0 downward binary64 0x4.130e7p+4 : -0x6.dd65d0005a19p-16 : inexact-ok

> += j0 tonearest binary64 0x4.130e7p+4 : -0x6.dd65d0005a18cp-16 : inexact-ok

> += j0 towardzero binary64 0x4.130e7p+4 : -0x6.dd65d0005a18cp-16 : inexact-ok

> += j0 upward binary64 0x4.130e7p+4 : -0x6.dd65d0005a18cp-16 : inexact-ok

> += j0 downward intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok

> += j0 tonearest intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok

> += j0 towardzero intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok

> += j0 upward intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok

> += j0 downward m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok

> += j0 tonearest m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok

> += j0 towardzero m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok

> += j0 upward m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok

> += j0 downward binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb64p-16 : inexact-ok

> += j0 tonearest binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb6p-16 : inexact-ok

> += j0 towardzero binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb6p-16 : inexact-ok

> += j0 upward binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb6p-16 : inexact-ok

> += j0 downward ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bcp-16 : inexact-ok

> += j0 tonearest ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bcp-16 : inexact-ok

> += j0 towardzero ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bap-16 : inexact-ok

> += j0 upward ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bap-16 : inexact-ok

> +j0 0x1.4b7066p+7

> += j0 downward binary32 0xa.5b833p+4 : 0xf.80edp-20 : inexact-ok

> += j0 tonearest binary32 0xa.5b833p+4 : 0xf.80edp-20 : inexact-ok

> += j0 towardzero binary32 0xa.5b833p+4 : 0xf.80edp-20 : inexact-ok

> += j0 upward binary32 0xa.5b833p+4 : 0xf.80ed1p-20 : inexact-ok

> += j0 downward binary64 0xa.5b833p+4 : 0xf.80ed0055a8e58p-20 : inexact-ok

> += j0 tonearest binary64 0xa.5b833p+4 : 0xf.80ed0055a8e6p-20 : inexact-ok

> += j0 towardzero binary64 0xa.5b833p+4 : 0xf.80ed0055a8e58p-20 : inexact-ok

> += j0 upward binary64 0xa.5b833p+4 : 0xf.80ed0055a8e6p-20 : inexact-ok

> += j0 downward intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok

> += j0 tonearest intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok

> += j0 towardzero intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok

> += j0 upward intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok

> += j0 downward m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok

> += j0 tonearest m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok

> += j0 towardzero m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok

> += j0 upward m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok

> += j0 downward binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12ap-20 : inexact-ok

> += j0 tonearest binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12a8p-20 : inexact-ok

> += j0 towardzero binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12ap-20 : inexact-ok

> += j0 upward binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12a8p-20 : inexact-ok

> += j0 downward ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe1p-20 : inexact-ok

> += j0 tonearest ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe14p-20 : inexact-ok

> += j0 towardzero ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe1p-20 : inexact-ok

> += j0 upward ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe14p-20 : inexact-ok

>  j0 0x8.2f4ecp+124

>  = j0 downward binary32 0x8.2f4ecp+124 : 0xd.331efp-68 : inexact-ok

>  = j0 tonearest binary32 0x8.2f4ecp+124 : 0xd.331efp-68 : inexact-ok

> diff --git a/math/auto-libm-test-out-j1 b/math/auto-libm-test-out-j1

> index 6bc3bbe5a4..f808801976 100644

> --- a/math/auto-libm-test-out-j1

> +++ b/math/auto-libm-test-out-j1

> @@ -993,6 +993,56 @@ j1 0x3.ae4b2p+0 xfail-rounding:ibm128-libgcc

>  = j1 tonearest ibm128 0x3.ae4b2p+0 : 0xf.d085c66e86f30267f22d6f787cp-8 : inexact-ok

>  = j1 towardzero ibm128 0x3.ae4b2p+0 : 0xf.d085c66e86f30267f22d6f787cp-8 : xfail:ibm128-libgcc inexact-ok

>  = j1 upward ibm128 0x3.ae4b2p+0 : 0xf.d085c66e86f30267f22d6f788p-8 : xfail:ibm128-libgcc inexact-ok

> +j1 0x1.2f28eap+7 xfail-rounding:binary64 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc

> += j1 downward binary32 0x9.79475p+4 : 0x2.49a6fp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest binary32 0x9.79475p+4 : 0x2.49a6f4p-16 : inexact-ok

> += j1 towardzero binary32 0x9.79475p+4 : 0x2.49a6fp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward binary32 0x9.79475p+4 : 0x2.49a6f4p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward binary64 0x9.79475p+4 : 0x2.49a6f3fb00346p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest binary64 0x9.79475p+4 : 0x2.49a6f3fb00346p-16 : inexact-ok

> += j1 towardzero binary64 0x9.79475p+4 : 0x2.49a6f3fb00346p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward binary64 0x9.79475p+4 : 0x2.49a6f3fb00348p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward intel96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest intel96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : inexact-ok

> += j1 towardzero intel96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward intel96 0x9.79475p+4 : 0x2.49a6f3fb003462a4p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : inexact-ok

> += j1 towardzero m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462a4p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05cp-16 : inexact-ok

> += j1 towardzero binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05cp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cecp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cecp-16 : inexact-ok

> += j1 towardzero ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cecp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec1p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> +j1 0x1.a1d20ap+6 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc

> += j1 downward binary32 0x6.874828p+4 : -0x3.d6f0b8p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest binary32 0x6.874828p+4 : -0x3.d6f0b4p-16 : inexact-ok

> += j1 towardzero binary32 0x6.874828p+4 : -0x3.d6f0b4p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward binary32 0x6.874828p+4 : -0x3.d6f0b4p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward binary64 0x6.874828p+4 : -0x3.d6f0b408112dp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest binary64 0x6.874828p+4 : -0x3.d6f0b408112dp-16 : inexact-ok

> += j1 towardzero binary64 0x6.874828p+4 : -0x3.d6f0b408112cep-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward binary64 0x6.874828p+4 : -0x3.d6f0b408112cep-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3dp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : inexact-ok

> += j1 towardzero intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3dp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : inexact-ok

> += j1 towardzero m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7088p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7086p-16 : inexact-ok

> += j1 towardzero binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7086p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7086p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 downward ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca71p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 tonearest ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca71p-16 : inexact-ok

> += j1 towardzero ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += j1 upward ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok

>  j1 min

>  = j1 downward binary32 0x4p-128 : 0x1.fffff8p-128 : inexact-ok underflow errno-erange-ok

>  = j1 tonearest binary32 0x4p-128 : 0x2p-128 : inexact-ok underflow errno-erange-ok

> diff --git a/math/auto-libm-test-out-y0 b/math/auto-libm-test-out-y0

> index 8ebb585170..f9742b0f69 100644

> --- a/math/auto-libm-test-out-y0

> +++ b/math/auto-libm-test-out-y0

> @@ -820,6 +820,56 @@ y0 0xd.3432bp-4

>  = y0 tonearest ibm128 0xd.3432bp-4 : -0xf.fdd871793bc71f92d6b137b44p-8 : inexact-ok

>  = y0 towardzero ibm128 0xd.3432bp-4 : -0xf.fdd871793bc71f92d6b137b44p-8 : inexact-ok

>  = y0 upward ibm128 0xd.3432bp-4 : -0xf.fdd871793bc71f92d6b137b44p-8 : inexact-ok

> +y0 0x1.33eaacp+5 xfail:binary64 xfail:intel96 xfail-rounding:ibm128-libgcc

> += y0 downward binary32 0x2.67d558p+4 : 0xf.6b0ep-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 tonearest binary32 0x2.67d558p+4 : 0xf.6b0ep-16 : xfail:binary64 xfail:intel96 inexact-ok

> += y0 towardzero binary32 0x2.67d558p+4 : 0xf.6b0ep-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 upward binary32 0x2.67d558p+4 : 0xf.6b0e1p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 downward binary64 0x2.67d558p+4 : 0xf.6b0e005e95ad8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 tonearest binary64 0x2.67d558p+4 : 0xf.6b0e005e95ad8p-16 : xfail:binary64 xfail:intel96 inexact-ok

> += y0 towardzero binary64 0x2.67d558p+4 : 0xf.6b0e005e95ad8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 upward binary64 0x2.67d558p+4 : 0xf.6b0e005e95aep-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 downward intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 tonearest intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 inexact-ok

> += y0 towardzero intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 upward intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82bp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 downward m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 tonearest m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 inexact-ok

> += y0 towardzero m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 upward m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82bp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 downward binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b86p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 tonearest binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b86p-16 : xfail:binary64 xfail:intel96 inexact-ok

> += y0 towardzero binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b86p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 upward binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b868p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 downward ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 tonearest ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b8p-16 : xfail:binary64 xfail:intel96 inexact-ok

> += y0 towardzero ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y0 upward ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6bcp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> +y0 0x1.a681cep-1 xfail-rounding:ibm128-libgcc

> += y0 downward binary32 0xd.340e7p-4 : -0xf.ffff8p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 tonearest binary32 0xd.340e7p-4 : -0xf.ffff8p-8 : inexact-ok

> += y0 towardzero binary32 0xd.340e7p-4 : -0xf.ffff7p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 upward binary32 0xd.340e7p-4 : -0xf.ffff7p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 downward binary64 0xd.340e7p-4 : -0xf.ffff7f7e5aba8p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 tonearest binary64 0xd.340e7p-4 : -0xf.ffff7f7e5aba8p-8 : inexact-ok

> += y0 towardzero binary64 0xd.340e7p-4 : -0xf.ffff7f7e5abap-8 : xfail:ibm128-libgcc inexact-ok

> += y0 upward binary64 0xd.340e7p-4 : -0xf.ffff7f7e5abap-8 : xfail:ibm128-libgcc inexact-ok

> += y0 downward intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 tonearest intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : inexact-ok

> += y0 towardzero intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 upward intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 downward m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 tonearest m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : inexact-ok

> += y0 towardzero m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 upward m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 downward binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25de8p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 tonearest binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25dep-8 : inexact-ok

> += y0 towardzero binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25dep-8 : xfail:ibm128-libgcc inexact-ok

> += y0 upward binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25dep-8 : xfail:ibm128-libgcc inexact-ok

> += y0 downward ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b26p-8 : xfail:ibm128-libgcc inexact-ok

> += y0 tonearest ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25cp-8 : inexact-ok

> += y0 towardzero ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25cp-8 : xfail:ibm128-libgcc inexact-ok

> += y0 upward ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25cp-8 : xfail:ibm128-libgcc inexact-ok

>  y0 min

>  = y0 downward binary32 0x4p-128 : -0x3.7ac89cp+4 : inexact-ok

>  = y0 tonearest binary32 0x4p-128 : -0x3.7ac89cp+4 : inexact-ok

> diff --git a/math/auto-libm-test-out-y1 b/math/auto-libm-test-out-y1

> index af68e6c05a..ab920d1542 100644

> --- a/math/auto-libm-test-out-y1

> +++ b/math/auto-libm-test-out-y1

> @@ -795,6 +795,81 @@ y1 0x1p-10000

>  = y1 tonearest binary128 0x1p-10000 : -0xa.2f9836e4e441529fc2757d1f535p+9996 : inexact-ok

>  = y1 towardzero binary128 0x1p-10000 : -0xa.2f9836e4e441529fc2757d1f5348p+9996 : inexact-ok

>  = y1 upward binary128 0x1p-10000 : -0xa.2f9836e4e441529fc2757d1f5348p+9996 : inexact-ok

> +y1 0x1.065194p+7 xfail-rounding:binary64 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc

> += y1 downward binary32 0x8.328cap+4 : -0x3.2fe874p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 tonearest binary32 0x8.328cap+4 : -0x3.2fe87p-16 : inexact-ok

> += y1 towardzero binary32 0x8.328cap+4 : -0x3.2fe87p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 upward binary32 0x8.328cap+4 : -0x3.2fe87p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 downward binary64 0x8.328cap+4 : -0x3.2fe87001a1df4p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 tonearest binary64 0x8.328cap+4 : -0x3.2fe87001a1df4p-16 : inexact-ok

> += y1 towardzero binary64 0x8.328cap+4 : -0x3.2fe87001a1df2p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 upward binary64 0x8.328cap+4 : -0x3.2fe87001a1df2p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 downward intel96 0x8.328cap+4 : -0x3.2fe87001a1df3754p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 tonearest intel96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : inexact-ok

> += y1 towardzero intel96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 upward intel96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 downward m68k96 0x8.328cap+4 : -0x3.2fe87001a1df3754p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 tonearest m68k96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : inexact-ok

> += y1 towardzero m68k96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 upward m68k96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 downward binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecb2p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 tonearest binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecb2p-16 : inexact-ok

> += y1 towardzero binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecbp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 upward binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecbp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 downward ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aedp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 tonearest ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aedp-16 : inexact-ok

> += y1 towardzero ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> += y1 upward ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok

> +y1 0x1.c1badep+0 xfail-rounding:ibm128-libgcc

> += y1 downward binary32 0x1.c1badep+0 : -0x3.ff6378p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 tonearest binary32 0x1.c1badep+0 : -0x3.ff6374p-4 : inexact-ok

> += y1 towardzero binary32 0x1.c1badep+0 : -0x3.ff6374p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 upward binary32 0x1.c1badep+0 : -0x3.ff6374p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 downward binary64 0x1.c1badep+0 : -0x3.ff6375c9f4404p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 tonearest binary64 0x1.c1badep+0 : -0x3.ff6375c9f4402p-4 : inexact-ok

> += y1 towardzero binary64 0x1.c1badep+0 : -0x3.ff6375c9f4402p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 upward binary64 0x1.c1badep+0 : -0x3.ff6375c9f4402p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 downward intel96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : xfail:ibm128-libgcc inexact-ok

> += y1 tonearest intel96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : inexact-ok

> += y1 towardzero intel96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 upward intel96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 downward m68k96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : xfail:ibm128-libgcc inexact-ok

> += y1 tonearest m68k96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : inexact-ok

> += y1 towardzero m68k96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 upward m68k96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok

> += y1 downward binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ebp-4 : xfail:ibm128-libgcc inexact-ok

> += y1 tonearest binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ebp-4 : inexact-ok

> += y1 towardzero binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1eaep-4 : xfail:ibm128-libgcc inexact-ok

> += y1 upward binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1eaep-4 : xfail:ibm128-libgcc inexact-ok

> += y1 downward ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1fp-4 : xfail:ibm128-libgcc inexact-ok

> += y1 tonearest ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1fp-4 : inexact-ok

> += y1 towardzero ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ep-4 : xfail:ibm128-libgcc inexact-ok

> += y1 upward ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ep-4 : xfail:ibm128-libgcc inexact-ok

> +y1 0x1.c1bc2ep+0

> += y1 downward binary32 0x1.c1bc2ep+0 : -0x3.ff56acp-4 : inexact-ok

> += y1 tonearest binary32 0x1.c1bc2ep+0 : -0x3.ff56a8p-4 : inexact-ok

> += y1 towardzero binary32 0x1.c1bc2ep+0 : -0x3.ff56a8p-4 : inexact-ok

> += y1 upward binary32 0x1.c1bc2ep+0 : -0x3.ff56a8p-4 : inexact-ok

> += y1 downward binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca78ap-4 : inexact-ok

> += y1 tonearest binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca788p-4 : inexact-ok

> += y1 towardzero binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca788p-4 : inexact-ok

> += y1 upward binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca788p-4 : inexact-ok

> += y1 downward intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd8p-4 : inexact-ok

> += y1 tonearest intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok

> += y1 towardzero intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok

> += y1 upward intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok

> += y1 downward m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd8p-4 : inexact-ok

> += y1 tonearest m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok

> += y1 towardzero m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok

> += y1 upward m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok

> += y1 downward binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd5084138201442p-4 : inexact-ok

> += y1 tonearest binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd5084138201442p-4 : inexact-ok

> += y1 towardzero binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd508413820144p-4 : inexact-ok

> += y1 upward binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd508413820144p-4 : inexact-ok

> += y1 downward ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382015p-4 : inexact-ok

> += y1 tonearest ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382014p-4 : inexact-ok

> += y1 towardzero ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382014p-4 : inexact-ok

> += y1 upward ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382014p-4 : inexact-ok

>  y1 min

>  = y1 downward binary32 0x4p-128 : -0x2.8be61p+124 : inexact-ok

>  = y1 tonearest binary32 0x4p-128 : -0x2.8be60cp+124 : inexact-ok

> diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps

> index 21ff7dc5ce..a0329dea63 100644

> --- a/sysdeps/aarch64/libm-test-ulps

> +++ b/sysdeps/aarch64/libm-test-ulps

> @@ -1066,44 +1066,44 @@ double: 1

>  ldouble: 1

>  

>  Function: "j0":

> -double: 2

> -float: 8

> +double: 3

> +float: 9

>  ldouble: 2

>  

>  Function: "j0_downward":

> -double: 2

> -float: 4

> -ldouble: 4

> +double: 6

> +float: 9

> +ldouble: 9

>  

>  Function: "j0_towardzero":

> -double: 5

> -float: 6

> -ldouble: 4

> +double: 7

> +float: 9

> +ldouble: 9

>  

>  Function: "j0_upward":

> -double: 4

> -float: 5

> -ldouble: 5

> +double: 9

> +float: 8

> +ldouble: 7

>  

>  Function: "j1":

> -double: 2

> -float: 8

> +double: 4

> +float: 9

>  ldouble: 4

>  

>  Function: "j1_downward":

>  double: 3

> -float: 5

> -ldouble: 4

> +float: 8

> +ldouble: 6

>  

>  Function: "j1_towardzero":

> -double: 3

> -float: 2

> -ldouble: 4

> +double: 4

> +float: 8

> +ldouble: 9

>  

>  Function: "j1_upward":

> -double: 3

> -float: 4

> -ldouble: 3

> +double: 9

> +float: 9

> +ldouble: 9

>  

>  Function: "jn":

>  double: 4

> @@ -1364,42 +1364,42 @@ ldouble: 4

>  

>  Function: "y0":

>  double: 2

> -float: 6

> +float: 8

>  ldouble: 3

>  

>  Function: "y0_downward":

>  double: 3

> -float: 4

> -ldouble: 4

> +float: 8

> +ldouble: 7

>  

>  Function: "y0_towardzero":

>  double: 3

> -float: 3

> +float: 8

>  ldouble: 3

>  

>  Function: "y0_upward":

>  double: 2

> -float: 5

> -ldouble: 3

> +float: 8

> +ldouble: 4

>  

>  Function: "y1":

>  double: 3

> -float: 2

> -ldouble: 2

> +float: 9

> +ldouble: 5

>  

>  Function: "y1_downward":

> -double: 3

> -float: 2

> -ldouble: 4

> +double: 6

> +float: 8

> +ldouble: 5

>  

>  Function: "y1_towardzero":

>  double: 3

> -float: 2

> +float: 9

>  ldouble: 2

>  

>  Function: "y1_upward":

> -double: 5

> -float: 2

> +double: 6

> +float: 9

>  ldouble: 5

>  

>  Function: "yn":


Ok.

> diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c

> index 5d29611eb7..b4a4ede0f1 100644

> --- a/sysdeps/ieee754/flt-32/e_j0f.c

> +++ b/sysdeps/ieee754/flt-32/e_j0f.c

> @@ -16,7 +16,9 @@

>  #include <math.h>

>  #include <math-barriers.h>

>  #include <math_private.h>

> +#include <fenv_private.h>

>  #include <libm-alias-finite.h>

> +#include <reduce_aux.h>

>  

>  static float pzerof(float), qzerof(float);

>  

> @@ -37,6 +39,218 @@ S04  =  1.1661400734e-09; /* 0x30a045e8 */

>  

>  static const float zero = 0.0;

>  

> +/* This is the nearest approximation of the first zero of j0.  */

> +#define FIRST_ZERO_J0 0xf.26247p-28f

> +

> +#define SMALL_SIZE 64

> +

> +/* The following table contains successive zeros of j0 and degree-3

> +   polynomial approximations of j0 around these zeros: Pj[0] for the first

> +   zero (2.40482), Pj[1] for the second one (5.520078), and so on.

> +   Each line contains:

> +              {x0, xmid, x1, p0, p1, p2, p3}

> +   where [x0,x1] is the interval around the zero, xmid is the binary32 number

> +   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation

> +   polynomial.  Each polynomial was generated using Sollya on the interval

> +   [x0,x1] around the corresponding zero where the error exceeds 9 ulps

> +   for the alternate code.  Degree 3 is enough to get an error <= 9 ulps.

> +*/

> +static const float Pj[SMALL_SIZE][7] = {

> +  /* The following polynomial was optimized by hand with respect to the one

> +     generated by Sollya, to ensure the maximal error is at most 9 ulps,

> +     both if the polynomial is evaluated with fma or not.  */

> +  { 0x1.31e5c4p+1, 0x1.33d152p+1, 0x1.3b58dep+1, 0xf.2623fp-28,

> +    -0x8.4e6d7p-4, 0x1.ba2aaap-4, 0xe.4b9ap-8 }, /* 0 */

> +  { 0x1.60eafap+2, 0x1.6148f6p+2, 0x1.62955cp+2, 0x6.9205fp-28,

> +    0x5.71b98p-4, -0x7.e3e798p-8, -0xd.87d1p-8 }, /* 1 */

> +  { 0x1.14cde2p+3, 0x1.14eb56p+3, 0x1.1525c6p+3, 0x1.bcc1cap-24,

> +    -0x4.57de6p-4, 0x4.03e7cp-8, 0xb.39a37p-8 }, /* 2 */

> +  { 0x1.7931d8p+3, 0x1.79544p+3, 0x1.7998d6p+3, -0xf.2976fp-32,

> +    0x3.b827ccp-4, -0x2.8603ep-8, -0x9.bf49bp-8 }, /* 3 */

> +  { 0x1.ddb6d4p+3, 0x1.ddca14p+3, 0x1.ddf0c8p+3, -0x1.bd67d8p-28,

> +    -0x3.4e03ap-4, 0x1.c562a2p-8, 0x8.90ec2p-8 }, /* 4 */

> +  { 0x1.2118e4p+4, 0x1.212314p+4, 0x1.21375p+4, 0x1.62209cp-28,

> +    0x3.00efecp-4, -0x1.5458dap-8, -0x8.10063p-8 }, /* 5 */

> +  { 0x1.535d28p+4, 0x1.5362dep+4, 0x1.536e48p+4, -0x2.853f74p-24,

> +    -0x2.c5b274p-4, 0x1.0b9db4p-8, 0x7.8c3578p-8 }, /* 6 */

> +  { 0x1.859ddp+4, 0x1.85a3bap+4, 0x1.85aff4p+4, 0x2.19ed1cp-24,

> +    0x2.96545cp-4, -0xd.997e6p-12, -0x6.d9af28p-8 }, /* 7 */

> +  { 0x1.b7decap+4, 0x1.b7e54ap+4, 0x1.b7f038p+4, 0xe.959aep-28,

> +    -0x2.6f5594p-4, 0xb.538dp-12, 0x7.003ea8p-8 }, /* 8 */

> +  { 0x1.ea21c6p+4, 0x1.ea275ap+4, 0x1.ea337ap+4, 0x2.0c3964p-24,

> +    0x2.4e80fcp-4, -0x9.a2216p-12, -0x6.61e0a8p-8 }, /* 9 */

> +  { 0x1.0e3316p+5, 0x1.0e34e2p+5, 0x1.0e379ap+5, -0x3.642554p-24,

> +    -0x2.325e48p-4, 0x8.4f49cp-12, 0x7.d37c3p-8 }, /* 10 */

> +  { 0x1.275456p+5, 0x1.275638p+5, 0x1.2759e2p+5, 0x1.6c015ap-24,

> +    0x2.19e7d8p-4, -0x7.4c1bf8p-12, -0x4.af7ef8p-8 }, /* 11 */

> +  { 0x1.4075ecp+5, 0x1.4077a8p+5, 0x1.407b96p+5, -0x4.b18c9p-28,

> +    -0x2.046174p-4, 0x6.705618p-12, 0x5.f2d28p-8 }, /* 12 */

> +  { 0x1.59973p+5, 0x1.59992cp+5, 0x1.599b2ap+5, -0x1.8b8792p-24,

> +    0x1.f13fbp-4, -0x5.c14938p-12, -0x5.73e0cp-8 }, /* 13 */

> +  { 0x1.72b958p+5, 0x1.72bacp+5, 0x1.72bc5ap+5, 0x3.a26e0cp-24,

> +    -0x1.e018dap-4, 0x5.30e8dp-12, 0x2.81099p-8 }, /* 14 */

> +  { 0x1.8bdb4ap+5, 0x1.8bdc62p+5, 0x1.8bde7ep+5, -0x2.18fabcp-24,

> +    0x1.d09b22p-4, -0x4.b0b688p-12, -0x5.5fd308p-8 }, /* 15 */

> +  { 0x1.a4fcecp+5, 0x1.a4fe0ep+5, 0x1.a50042p+5, 0x3.2370e8p-24,

> +    -0x1.c28614p-4, 0x4.4647e8p-12, 0x5.68a28p-8 }, /* 16 */

> +  { 0x1.be1ebcp+5, 0x1.be1fc4p+5, 0x1.be21fp+5, -0x5.9eae3p-28,

> +    0x1.b5a622p-4, -0x3.eb9054p-12, -0x5.12d8cp-8 }, /* 17 */

> +  { 0x1.d7405p+5, 0x1.d7418p+5, 0x1.d74294p+5, 0x2.9fa1e8p-24,

> +    -0x1.a9d184p-4, 0x3.9d1e7p-12, 0x4.33d058p-8 }, /* 18 */

> +  { 0x1.f0624p+5, 0x1.f06344p+5, 0x1.f0645ep+5, 0x9.9ac67p-28,

> +    0x1.9ee5eep-4, -0x3.5816e8p-12, -0x2.6e5004p-8 }, /* 19 */

> +  { 0x1.04c22ep+6, 0x1.04c286p+6, 0x1.04c316p+6, 0xd.6ab94p-28,

> +    -0x1.94c6f6p-4, 0x3.174efcp-12, 0x7.9a092p-8 }, /* 20 */

> +  { 0x1.1153p+6, 0x1.11536cp+6, 0x1.11541p+6, -0x4.4cb2d8p-24,

> +    0x1.8b5cccp-4, -0x2.e3c238p-12, -0x4.e5437p-8 }, /* 21 */

> +  { 0x1.1de3d8p+6, 0x1.1de456p+6, 0x1.1de4dap+6, -0x4.4aa8c8p-24,

> +    -0x1.829356p-4, 0x2.b45124p-12, 0x5.baf638p-8 }, /* 22 */

> +  { 0x1.2a74f8p+6, 0x1.2a754p+6, 0x1.2a75bp+6, 0x2.077c38p-24,

> +    0x1.7a597ep-4, -0x2.8a0414p-12, -0x2.838d3p-8 }, /* 23 */

> +  { 0x1.3705d4p+6, 0x1.37062cp+6, 0x1.3706b2p+6, -0x2.6a6cd8p-24,

> +    -0x1.72a09ap-4, 0x2.623a3cp-12, 0x5.5256a8p-8 }, /* 24 */

> +  { 0x1.4396dp+6, 0x1.439718p+6, 0x1.43976ep+6, -0x5.08287p-24,

> +    0x1.6b5c06p-4, -0x2.3da154p-12, -0x7.a2254p-8 }, /* 25 */

> +  { 0x1.5027acp+6, 0x1.502808p+6, 0x1.50288cp+6, -0x3.4598dcp-24,

> +    -0x1.6480c4p-4, 0x2.1cb944p-12, 0x7.27c77p-8 }, /* 26 */

> +  { 0x1.5cb89ap+6, 0x1.5cb8f8p+6, 0x1.5cb97ep+6, 0x5.4e74bp-24,

> +    0x1.5e0544p-4, -0x2.00b158p-12, -0x5.9bc4a8p-8 }, /* 27 */

> +  { 0x1.69498cp+6, 0x1.6949e8p+6, 0x1.694a42p+6, -0x2.05751cp-24,

> +    -0x1.57e12p-4, 0x1.e78edcp-12, 0x9.9667dp-8 }, /* 28 */

> +  { 0x1.75da7ep+6, 0x1.75dadap+6, 0x1.75db3p+6, 0x4.c5e278p-24,

> +    0x1.520ceep-4, -0x1.d0127cp-12, -0xd.62681p-8 }, /* 29 */

> +  { 0x1.826b7ep+6, 0x1.826bccp+6, 0x1.826c2cp+6, -0x3.50e62cp-24,

> +    -0x1.4c822p-4, 0x1.ba5832p-12, -0x1.eb2ee2p-8 }, /* 30 */

> +  { 0x1.8efc84p+6, 0x1.8efcbep+6, 0x1.8efd16p+6, -0x1.c39f38p-24,

> +    0x1.473ae6p-4, -0x1.a616c8p-12, 0xf.f352ap-12 }, /* 31 */

> +  { 0x1.9b8d84p+6, 0x1.9b8db2p+6, 0x1.9b8e7p+6, -0x1.9245b6p-28,

> +    -0x1.42320ap-4, 0x1.932a04p-12, 0x2.dc113cp-8 }, /* 32 */

> +  { 0x1.a81e72p+6, 0x1.a81ea6p+6, 0x1.a81f04p+6, -0x1.0acf8p-24,

> +    0x1.3d62e6p-4, -0x1.7c4b14p-12, -0x1.cfc5c2p-4 }, /* 33 */

> +  { 0x1.b4af6ap+6, 0x1.b4af9ap+6, 0x1.b4afeep+6, 0x4.cd92d8p-24,

> +    -0x1.38c94ap-4, 0x1.643154p-12, 0x1.4c2a06p-4 }, /* 34 */

> +  { 0x1.c1406p+6, 0x1.c1409p+6, 0x1.c140cp+6, -0x1.37bf8ap-24,

> +    0x1.34617p-4, -0x1.5f504ap-12, -0x1.e2d324p-4 }, /* 35 */

> +  { 0x1.cdd154p+6, 0x1.cdd186p+6, 0x1.cdd1eap+6, -0x1.8f62dep-28,

> +    -0x1.3027fp-4, 0x1.534a02p-12, 0x2.c7f144p-12 }, /* 36 */

> +  { 0x1.da6248p+6, 0x1.da627cp+6, 0x1.da62e6p+6, -0x9.81e79p-28,

> +    0x1.2c19b4p-4, -0x1.4b8288p-12, 0x7.2d8bap-8 }, /* 37 */

> +  { 0x1.e6f33ep+6, 0x1.e6f372p+6, 0x1.e6f3a8p+6, 0x3.103b3p-24,

> +    -0x1.2833eep-4, 0x1.36f4d2p-12, 0x9.29f91p-8 }, /* 38 */

> +  { 0x1.f38434p+6, 0x1.f3846ap+6, 0x1.f384d8p+6, 0x2.07b058p-24,

> +    0x1.24740ap-4, -0x1.2ee58ap-12, 0xd.f1393p-12 }, /* 39 */

> +  { 0x1.000a98p+7, 0x1.000abp+7, 0x1.000ac8p+7, 0x3.87576cp-24,

> +    -0x1.20d7b6p-4, 0x1.2083e2p-12, 0x3.9a7aap-8 }, /* 40 */

> +  { 0x1.06531p+7, 0x1.06532cp+7, 0x1.065348p+7, -0x1.691ecp-24,

> +    0x1.1d5ccap-4, -0x1.166726p-12, -0x1.e4af48p-8 }, /* 41 */

> +  { 0x1.0c9b9ap+7, 0x1.0c9ba8p+7, 0x1.0c9bbep+7, 0x9.b406dp-28,

> +    -0x1.1a015p-4, 0x1.038f9cp-12, -0x4.021058p-4 }, /* 42 */

> +  { 0x1.12e412p+7, 0x1.12e424p+7, 0x1.12e436p+7, -0xf.bfd8fp-28,

> +    0x1.16c37ap-4, -0x1.039edep-12, 0x1.f0033p-4 }, /* 43 */

> +  { 0x1.192c92p+7, 0x1.192cap+7, 0x1.192cb6p+7, 0x2.6d50c8p-24,

> +    -0x1.13a19ep-4, 0xf.9df8ap-16, 0x4.ecd978p-8 }, /* 44 */

> +  { 0x1.1f7512p+7, 0x1.1f751cp+7, 0x1.1f753ap+7, -0x4.d475c8p-24,

> +    0x1.109a32p-4, -0x1.04fb3ap-12, -0xd.c271p-12 }, /* 45 */

> +  { 0x1.25bd8ep+7, 0x1.25bd98p+7, 0x1.25bdap+7, 0x8.1982p-24,

> +    -0x1.0dabc8p-4, 0xe.88eabp-16, -0x4.ed75dp-4 }, /* 46 */

> +  { 0x1.2c060cp+7, 0x1.2c0616p+7, 0x1.2c0644p+7, 0x4.864518p-24,

> +    0x1.0ad51p-4, -0xe.27196p-16, 0xb.97a3ep-8 }, /* 47 */

> +  { 0x1.324e86p+7, 0x1.324e92p+7, 0x1.324e9ep+7, 0x6.8917a8p-28,

> +    -0x1.0814d4p-4, 0xd.4fe7ep-16, -0x6.8d8d6p-4 }, /* 48 */

> +  { 0x1.389702p+7, 0x1.38970ep+7, 0x1.389728p+7, -0x5.fa18fp-24,

> +    0x1.0569fp-4, -0xd.5b0d4p-16, 0x1.50353ap-4 }, /* 49 */

> +  { 0x1.3edf84p+7, 0x1.3edf8cp+7, 0x1.3edfaap+7, -0x4.0e5c98p-24,

> +    -0x1.02d354p-4, 0xb.7b255p-16, 0x7.8a916p-4 }, /* 50 */

> +  { 0x1.4527fp+7, 0x1.452808p+7, 0x1.452812p+7, -0x2.c3ddbp-24,

> +    0x1.005004p-4, -0xd.7729cp-16, -0x3.bcc354p-8 }, /* 51 */

> +  { 0x1.4b7076p+7, 0x1.4b7086p+7, 0x1.4b70a4p+7, -0x5.d052p-24,

> +    -0xf.ddf16p-8, 0xc.318c1p-16, 0x5.7947p-8 }, /* 52 */

> +  { 0x1.51b8f4p+7, 0x1.51b902p+7, 0x1.51b90ep+7, -0x2.0b97dcp-24,

> +    0xf.b7fafp-8, -0xc.1429dp-16, -0x3.43c36p-4 }, /* 53 */

> +  { 0x1.580168p+7, 0x1.58018p+7, 0x1.580188p+7, -0x5.4aab5p-24,

> +    -0xf.930fep-8, 0xa.ecc24p-16, 0x9.c62cdp-12 }, /* 54 */

> +  { 0x1.5e49eap+7, 0x1.5e49fcp+7, 0x1.5e4a12p+7, -0x3.6dadd8p-24,

> +    0xf.6f245p-8, -0xb.6816cp-16, 0xa.d731ap-8 }, /* 55 */

> +  { 0x1.649272p+7, 0x1.64927ap+7, 0x1.64929p+7, -0x2.d7e038p-24,

> +    -0xf.4c2cep-8, 0xb.118bep-16, 0xb.69a4ep-8 }, /* 56 */

> +  { 0x1.6adae6p+7, 0x1.6adaf6p+7, 0x1.6adb04p+7, -0x6.977a1p-24,

> +    0xf.2a1fp-8, -0xa.a8911p-16, -0x4.bf6d2p-8 }, /* 57 */

> +  { 0x1.712366p+7, 0x1.712374p+7, 0x1.71238ep+7, 0x1.3cc95ep-24,

> +    -0xf.08f0ap-8, 0x9.f0858p-16, 0x1.77f7f4p-4 }, /* 58 */

> +  { 0x1.776beap+7, 0x1.776bf2p+7, 0x1.776bfap+7, 0x3.a4921p-24,

> +    0xe.e8986p-8, -0xa.39dfp-16, -0x6.7ba3dp-4 }, /* 59 */

> +  { 0x1.7db464p+7, 0x1.7db46ep+7, 0x1.7db476p+7, 0x6.b45a7p-24,

> +    -0xe.c90d8p-8, 0xa.e586fp-16, -0x1.d66becp-4 }, /* 60 */

> +  { 0x1.83fce2p+7, 0x1.83fcecp+7, 0x1.83fd0ep+7, -0x2.8f34a4p-24,

> +    0xe.aa478p-8, -0x9.810bp-16, -0x3.a5f3fcp-8 }, /* 61 */

> +  { 0x1.8a455cp+7, 0x1.8a456ap+7, 0x1.8a4588p+7, -0x1.325968p-24,

> +    -0xe.8c3eap-8, 0x9.0a765p-16, 0x1.29a54ap-4 }, /* 62 */

> +  { 0x1.908dd8p+7, 0x1.908de8p+7, 0x1.908df4p+7, 0x4.96b808p-24,

> +    0xe.6eeb5p-8, -0x9.0251bp-16, 0x1.41a488p-4 }, /* 63 */

> +};

> +

> +/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:

> +   j0f(x) ~ sqrt(2/(pi*x))*beta0(x)*cos(x-pi/4-alpha0(x))

> +   where beta0(x) = 1 - 1/(16*x^2) + 53/(512*x^4)

> +   and alpha0(x) = 1/(8*x) - 25/(384*x^3).  */

> +static float

> +j0f_asympt (float x)

> +{

> +  /* The following code fails to give an error <= 9 ulps in only two cases,

> +     for which we tabulate the result.  */

> +  if (x == 0x1.4665d2p+24f)

> +    return 0xa.50206p-52f;

> +  if (x == 0x1.a9afdep+7f)

> +    return 0xf.47039p-28f;

> +  double y = 1.0 / (double) x;

> +  double y2 = y * y;

> +  double beta0 = 1.0f + y2 * (-0x1p-4f + 0x1.a8p-4 * y2);

> +  double alpha0 = y * (0x2p-4 - 0x1.0aaaaap-4 * y2);

> +  double h;

> +  int n;

> +  h = reduce_aux (x, &n, alpha0);

> +  /* Now x - pi/4 - alpha0 = h + n*pi/2 mod (2*pi).  */

> +  float xr = (float) h;

> +  n = n & 3;

> +  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */

> +  float t = cst / sqrtf (x) * (float) beta0;

> +  if (n == 0)

> +    return t * __cosf (xr);

> +  else if (n == 2) /* cos(x+pi) = -cos(x) */

> +    return -t * __cosf (xr);

> +  else if (n == 1) /* cos(x+pi/2) = -sin(x) */

> +    return -t * __sinf (xr);

> +  else             /* cos(x+3pi/2) = sin(x) */

> +    return t * __sinf (xr);


Double space before comment close.

> +}

> +

> +/* Special code for x near a root of j0.

> +   z is the value computed by the generic code.

> +   For small x, we use a polynomial approximating j0 around its root.

> +   For large x, we use an asymptotic formula (j0f_asympt).  */

> +static float

> +j0f_near_root (float x, float z)

> +{

> +  float index_f;

> +  int index;

> +

> +  index_f = roundf ((x - FIRST_ZERO_J0) / (float) M_PI);

> +  /* j0f_asympt fails to give an error <= 9 ulps for x=0x1.324e92p+7

> +     (index 48) thus we can't reduce SMALL_SIZE below 49.  */

> +  if (index_f >= SMALL_SIZE)

> +    return j0f_asympt (x);

> +  index = (int) index_f;

> +  const float *p = Pj[index];

> +  float x0 = p[0];

> +  float x1 = p[2];

> +  /* If not in the interval [x0,x1] around xmid, we return the value z.  */

> +  if (! (x0 <= x && x <= x1))

> +    return z;

> +  float xmid = p[1];

> +  float y = x - xmid;

> +  return p[3] + y * (p[4] + y * (p[5] + y * p[6]));

> +}

> +

>  float

>  __ieee754_j0f(float x)

>  {

> @@ -48,39 +262,35 @@ __ieee754_j0f(float x)

>  	if(ix>=0x7f800000) return one/(x*x);

>  	x = fabsf(x);

>  	if(ix >= 0x40000000) {	/* |x| >= 2.0 */

> +                SET_RESTORE_ROUNDF (FE_TONEAREST);

>  		__sincosf (x, &s, &c);

>  		ss = s-c;

>  		cc = s+c;

> -		if(ix<0x7f000000) {  /* make sure x+x not overflow */

> -		    z = -__cosf(x+x);

> -		    if ((s*c)<zero) cc = z/ss;

> -		    else	    ss = z/cc;

> -		} else {

> -		    /* We subtract (exactly) a value x0 such that

> -		       cos(x0)+sin(x0) is very near to 0, and use the identity

> -		       sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get

> -		       sin(x) + cos(x) with extra accuracy.  */

> -		    float x0 = 0xe.d4108p+124f;

> -		    float y = x - x0; /* exact  */

> -		    /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0)  */

> -		    z = __sinf (y);

> -		    float eps = 0x1.5f263ep-24f;

> -		    /* cos(x0) ~ -sin(x0) + eps  */

> -		    z += eps * __cosf (x);

> -		    /* now z ~ (sin(x)-cos(x))*cos(x0)  */

> -		    float cosx0 = -0xb.504f3p-4f;

> -		    cc = z / cosx0;

> -                }

> +                if (ix >= 0x7f000000)

> +		  /* x >= 2^127: use asymptotic expansion.  */

> +                  return j0f_asympt (x);

> +                /* Now we are sure that x+x cannot overflow.  */

> +                z = -__cosf(x+x);

> +                if ((s*c)<zero) cc = z/ss;

> +                else	    ss = z/cc;

>  	/*

>  	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)

>  	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)

>  	 */

> -		if(ix>0x5c000000) z = (invsqrtpi*cc)/sqrtf(x);

> -		else {

> -		    u = pzerof(x); v = qzerof(x);

> -		    z = invsqrtpi*(u*cc-v*ss)/sqrtf(x);

> -		}

> -		return z;

> +                if (ix <= 0x5c000000)

> +                  {

> +                    u = pzerof(x); v = qzerof(x);

> +                    cc = u*cc-v*ss;

> +                  }

> +                z = (invsqrtpi * cc) / sqrtf(x);

> +                /* The following threshold is optimal: for x=0x1.3b58dep+1

> +                   and rounding upwards, |cc|=0x1.579b26p-4 and z is 10 ulps

> +                   far from the correctly rounded value.  */

> +                float threshold = 0x1.579b26p-4;

> +                if (fabsf (cc) > threshold)

> +                  return z;

> +                else

> +                  return j0f_near_root (x, z);

>  	}

>  	if(ix<0x39000000) {	/* |x| < 2**-13 */

>  	    math_force_eval(huge+x);		/* raise inexact if x != 0 */

> @@ -112,6 +322,219 @@ v02  =  7.6006865129e-05, /* 0x389f65e0 */

>  v03  =  2.5915085189e-07, /* 0x348b216c */

>  v04  =  4.4111031494e-10; /* 0x2ff280c2 */

>  

> +/* This is the nearest approximation of the first zero of y0.  */

> +#define FIRST_ZERO_Y0 0xe.4c166p-4f

> +

> +/* The following table contains successive zeros of y0 and degree-3

> +   polynomial approximations of y0 around these zeros: Py[0] for the first

> +   zero (0.89358), Py[1] for the second one (3.957678), and so on.

> +   Each line contains:

> +              {x0, xmid, x1, p0, p1, p2, p3}

> +   where [x0,x1] is the interval around the zero, xmid is the binary32 number

> +   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation

> +   polynomial.  Each polynomial was generated using Sollya on the interval

> +   [x0,x1] around the corresponding zero where the error exceeds 9 ulps

> +   for the alternate code.  Degree 3 is enough, except for index 0 where we

> +   use degree 5, and the coefficients of degree 4 and 5 are hard-coded in

> +   y0f_near_root.

> +*/

> +static const float Py[SMALL_SIZE][7] = {

> +  { 0x1.a681dap-1, 0x1.c982ecp-1, 0x1.ef6bcap-1, 0x3.274468p-28,

> +    0xe.121b8p-4, -0x7.df8b3p-4, 0x3.877be4p-4

> +    /*, -0x3.a46c9p-4, 0x3.735478p-4*/ }, /* 0 */

> +  { 0x1.f957c6p+1, 0x1.fa9534p+1, 0x1.fd11b2p+1, 0xa.f1f83p-28,

> +    -0x6.70d098p-4, 0xd.04d48p-8, 0xe.f61a9p-8 }, /* 1 */

> +  { 0x1.c51832p+2, 0x1.c581dcp+2, 0x1.c65164p+2, -0x5.e2a51p-28,

> +    0x4.cd3328p-4, -0x5.6bbe08p-8, -0xc.46d8p-8 }, /* 2 */

> +  { 0x1.46fd84p+3, 0x1.471d74p+3, 0x1.475bfcp+3, -0x1.4b0aeep-24,

> +    -0x3.fec6b8p-4, 0x3.2068a4p-8, 0xa.76e2dp-8 }, /* 3 */

> +  { 0x1.ab7afap+3, 0x1.ab8e1cp+3, 0x1.abb294p+3, -0x8.179d7p-28,

> +    0x3.7e6544p-4, -0x2.1799fp-8, -0x9.0e1c4p-8 }, /* 4 */

> +  { 0x1.07f9aap+4, 0x1.0803c8p+4, 0x1.08170cp+4, -0x2.5b8078p-24,

> +    -0x3.24b868p-4, 0x1.8631ecp-8, 0x8.3cb46p-8 }, /* 5 */

> +  { 0x1.3a38eap+4, 0x1.3a42cep+4, 0x1.3a4d8ap+4, 0x1.cd304ap-28,

> +    0x2.e189ecp-4, -0x1.2c6954p-8, -0x7.8178ep-8 }, /* 6 */

> +  { 0x1.6c7d42p+4, 0x1.6c833p+4, 0x1.6c99fp+4, -0x6.c63f1p-28,

> +    -0x2.acc9a8p-4, 0xf.09e31p-12, 0x7.0b5ab8p-8 }, /* 7 */

> +  { 0x1.9ebec4p+4, 0x1.9ec47p+4, 0x1.9ed016p+4, 0x1.e53838p-24,

> +    0x2.81f2p-4, -0xc.5ff51p-12, -0x7.05ep-8 }, /* 8 */

> +  { 0x1.d1008ep+4, 0x1.d10644p+4, 0x1.d11262p+4, 0x1.636feep-24,

> +    -0x2.5e40dcp-4, 0xa.6f81dp-12, 0x5.ff6p-8 }, /* 9 */

> +  { 0x1.01a27cp+5, 0x1.01a442p+5, 0x1.01a924p+5, -0x4.04e1bp-28,

> +    0x2.3febd8p-4, -0x8.f11e2p-12, -0x6.0111ap-8 }, /* 10 */

> +  { 0x1.1ac3bcp+5, 0x1.1ac588p+5, 0x1.1ac912p+5, 0x3.6063d8p-24,

> +    -0x2.25baacp-4, 0x7.c93cdp-12, 0x4.e7577p-8 }, /* 11 */

> +  { 0x1.33e508p+5, 0x1.33e6ecp+5, 0x1.33ea1ap+5, -0x3.f39ebcp-24,

> +    0x2.0ed04cp-4, -0x6.d9434p-12, -0x4.fc0b7p-8 }, /* 12 */

> +  { 0x1.4d0666p+5, 0x1.4d0868p+5, 0x1.4d0c14p+5, -0x4.ea23p-28,

> +    -0x1.fa8b4p-4, 0x6.1470e8p-12, 0x5.97f71p-8 }, /* 13 */

> +  { 0x1.6628b8p+5, 0x1.6629f4p+5, 0x1.662e0ep+5, -0x3.5df0c8p-24,

> +    0x1.e8727ep-4, -0x5.76a038p-12, -0x4.ee37c8p-8 }, /* 14 */

> +  { 0x1.7f4a7cp+5, 0x1.7f4b9p+5, 0x1.7f4daap+5, 0x1.1ef09ep-24,

> +    -0x1.d8293ap-4, 0x4.ed8a28p-12, 0x4.d43708p-8 }, /* 15 */

> +  { 0x1.986c5cp+5, 0x1.986d38p+5, 0x1.986f6p+5, 0x1.b70cecp-24,

> +    0x1.c967p-4, -0x4.7a70b8p-12, -0x5.6840e8p-8 }, /* 16 */

> +  { 0x1.b18dcap+5, 0x1.b18ee8p+5, 0x1.b19122p+5, 0x1.abaadcp-24,

> +    -0x1.bbf246p-4, 0x4.1a35bp-12, 0x3.c2d46p-8 }, /* 17 */

> +  { 0x1.caaf86p+5, 0x1.cab0a2p+5, 0x1.cab2fep+5, 0x1.63989ep-24,

> +    0x1.af9cb4p-4, -0x3.c2f2dcp-12, -0x4.cf8108p-8 }, /* 18 */

> +  { 0x1.e3d146p+5, 0x1.e3d262p+5, 0x1.e3d492p+5, -0x1.68a8ecp-24,

> +    -0x1.a4407ep-4, 0x3.7733ecp-12, 0x5.97916p-8 }, /* 19 */

> +  { 0x1.fcf316p+5, 0x1.fcf428p+5, 0x1.fcf59ap+5, 0x1.e1bb5p-24,

> +    0x1.99be74p-4, -0x3.37210cp-12, -0x5.d19bf8p-8 }, /* 20 */

> +  { 0x1.0b0a7cp+6, 0x1.0b0afap+6, 0x1.0b0b9cp+6, -0x5.5bbcfp-24,

> +    -0x1.8ffc9ap-4, 0x2.ffe638p-12, 0x2.ed28e8p-8 }, /* 21 */

> +  { 0x1.179b66p+6, 0x1.179bep+6, 0x1.179d0ap+6, -0x4.9e34a8p-24,

> +    0x1.86e51cp-4, -0x2.cc7a68p-12, -0x3.3642c4p-8 }, /* 22 */

> +  { 0x1.242c5cp+6, 0x1.242ccap+6, 0x1.242d68p+6, 0x1.966706p-24,

> +    -0x1.7e657p-4, 0x2.9aed4cp-12, 0x7.b87a58p-8 }, /* 23 */

> +  { 0x1.30bd62p+6, 0x1.30bdb6p+6, 0x1.30beb2p+6, 0x3.4b3b68p-24,

> +    0x1.766dc2p-4, -0x2.72651cp-12, -0x3.e347f8p-8 }, /* 24 */

> +  { 0x1.3d4e56p+6, 0x1.3d4ea2p+6, 0x1.3d4f2ep+6, 0x6.a99008p-28,

> +    -0x1.6ef07ep-4, 0x2.53aec4p-12, 0x2.9e3d88p-12 }, /* 25 */

> +  { 0x1.49df38p+6, 0x1.49df9p+6, 0x1.49e042p+6, -0x7.a9fa6p-32,

> +    0x1.67e1dap-4, -0x2.324d7p-12, -0xc.0e669p-12 }, /* 26 */

> +  { 0x1.56702ep+6, 0x1.56708p+6, 0x1.567116p+6, -0x5.026808p-24,

> +    -0x1.613798p-4, 0x2.114594p-12, 0x1.a22402p-8 }, /* 27 */

> +  { 0x1.630126p+6, 0x1.63017p+6, 0x1.630226p+6, 0x4.46aa2p-24,

> +    0x1.5ae8c2p-4, -0x1.f4aaa4p-12, -0x3.58593cp-8 }, /* 28 */

> +  { 0x1.6f9234p+6, 0x1.6f926p+6, 0x1.6f92b2p+6, 0x1.5cfccp-24,

> +    -0x1.54ed76p-4, 0x1.dd540ap-12, -0xb.e9429p-12 }, /* 29 */

> +  { 0x1.7c22fep+6, 0x1.7c2352p+6, 0x1.7c23c2p+6, -0xb.4dc4cp-28,

> +    0x1.4f3ebcp-4, -0x1.c463fp-12, -0x1.e94c54p-8 }, /* 30 */

> +  { 0x1.88b412p+6, 0x1.88b444p+6, 0x1.88b50ap+6, 0x3.f5343p-24,

> +    -0x1.49d668p-4, 0x1.a53f24p-12, 0x5.128198p-8 }, /* 31 */

> +  { 0x1.9544dcp+6, 0x1.954538p+6, 0x1.95459p+6, -0x6.e6f32p-28,

> +    0x1.44aefap-4, -0x1.9a6ef8p-12, -0x6.c639cp-8 }, /* 32 */

> +  { 0x1.a1d5fap+6, 0x1.a1d62cp+6, 0x1.a1d674p+6, 0x1.f359c2p-28,

> +    -0x1.3fc386p-4, 0x1.887ebep-12, 0x1.6c813cp-8 }, /* 33 */

> +  { 0x1.ae66cp+6, 0x1.ae672p+6, 0x1.ae6788p+6, -0x2.9de748p-24,

> +    0x1.3b0fa4p-4, -0x1.777f26p-12, 0x1.c128ccp-8 }, /* 34 */

> +  { 0x1.baf7c2p+6, 0x1.baf816p+6, 0x1.baf86cp+6, -0x2.24ccc8p-24,

> +    -0x1.368f5cp-4, 0x1.62bd9ep-12, 0xa.df002p-8 }, /* 35 */

> +  { 0x1.c788dap+6, 0x1.c7890cp+6, 0x1.c7896cp+6, 0x4.7dcea8p-24,

> +    0x1.323f16p-4, -0x1.61abf4p-12, 0xa.ad73ep-8 }, /* 36 */

> +  { 0x1.d419ccp+6, 0x1.d41a02p+6, 0x1.d41a68p+6, -0x4.b538p-24,

> +    -0x1.2e1b98p-4, 0x1.4a4d64p-12, 0x3.a47674p-8 }, /* 37 */

> +  { 0x1.e0aacep+6, 0x1.e0aaf8p+6, 0x1.e0ab5ep+6, 0x3.09dc4cp-24,

> +    0x1.2a21ecp-4, -0x1.3fa20cp-12, 0x2.216e8cp-8 }, /* 38 */

> +  { 0x1.ed3bb8p+6, 0x1.ed3beep+6, 0x1.ed3c56p+6, 0x4.d5a58p-28,

> +    -0x1.264f66p-4, 0x1.32c4cep-12, 0x1.53cbb4p-8 }, /* 39 */

> +  { 0x1.f9ccaep+6, 0x1.f9cce6p+6, 0x1.f9cd52p+6, 0x3.f4c44cp-24,

> +    0x1.22a192p-4, -0x1.1f8514p-12, -0xc.0de32p-8 }, /* 40 */

> +  { 0x1.032ed6p+7, 0x1.032eeep+7, 0x1.032f0cp+7, 0x2.4beae8p-24,

> +    -0x1.1f1634p-4, 0x1.171664p-12, 0x1.72a654p-4 }, /* 41 */

> +  { 0x1.097756p+7, 0x1.09776ap+7, 0x1.09779cp+7, -0xd.a581ep-28,

> +    0x1.1bab3cp-4, -0xf.9f507p-16, -0xc.ba2d4p-8 }, /* 42 */

> +  { 0x1.0fbfdp+7, 0x1.0fbfe6p+7, 0x1.0fbff6p+7, 0xa.7c0bdp-28,

> +    -0x1.185eccp-4, 0x1.01d7dep-12, -0x1.a2186ep-4 }, /* 43 */

> +  { 0x1.160856p+7, 0x1.160862p+7, 0x1.16087ap+7, -0x1.9452ecp-24,

> +    0x1.152f26p-4, -0x1.07b4aap-12, 0x1.6bbd7ep-4 }, /* 44 */

> +  { 0x1.1c50dp+7, 0x1.1c50dep+7, 0x1.1c5118p+7, 0x3.83b7b8p-24,

> +    -0x1.121ab2p-4, 0x1.0e938cp-12, -0x5.1a6dfp-8 }, /* 45 */

> +  { 0x1.22995p+7, 0x1.22995ap+7, 0x1.229976p+7, -0x6.5ca3a8p-24,

> +    0x1.0f1ff2p-4, -0xe.f198p-16, -0x3.8e98b8p-8 }, /* 46 */

> +  { 0x1.28e1ccp+7, 0x1.28e1d8p+7, 0x1.28e1f4p+7, -0x6.bb61ap-24,

> +    -0x1.0c3d8ap-4, 0xf.a14b9p-16, 0x9.81e82p-4 }, /* 47 */

> +  { 0x1.2f2a48p+7, 0x1.2f2a54p+7, 0x1.2f2a74p+7, 0x2.2438p-24,

> +    0x1.097236p-4, -0xd.fed5ep-16, -0x3.19eb5cp-8 }, /* 48 */

> +  { 0x1.3572b8p+7, 0x1.3572dp+7, 0x1.3572ecp+7, 0x3.1e0054p-24,

> +    -0x1.06bcc8p-4, 0xd.d2596p-16, -0x1.67e00ap-4 }, /* 49 */

> +  { 0x1.3bbb3ep+7, 0x1.3bbb4ep+7, 0x1.3bbb6ap+7, 0x7.46c908p-24,

> +    0x1.041c28p-4, -0xd.04045p-16, -0x8.fb297p-8 }, /* 50 */

> +  { 0x1.4203aep+7, 0x1.4203cap+7, 0x1.4203e6p+7, -0xb.4f158p-28,

> +    -0x1.018f52p-4, 0xc.ccf6fp-16, 0x1.4d5dp-4 }, /* 51 */

> +  { 0x1.484c38p+7, 0x1.484c46p+7, 0x1.484c56p+7, -0x6.5a89c8p-24,

> +    0xf.f155p-8, -0xc.5d21dp-16, -0xd.aca34p-8 }, /* 52 */

> +  { 0x1.4e94b8p+7, 0x1.4e94c4p+7, 0x1.4e94d4p+7, -0x1.ef16c8p-24,

> +    -0xf.cad3fp-8, 0xb.d75f8p-16, 0x1.f36732p-4 }, /* 53 */

> +  { 0x1.54dd36p+7, 0x1.54dd4p+7, 0x1.54dd52p+7, -0x6.1e7b68p-24,

> +    0xf.a564cp-8, -0xb.ec1cfp-16, 0xe.e7421p-8 }, /* 54 */

> +  { 0x1.5b25aep+7, 0x1.5b25bep+7, 0x1.5b25d4p+7, -0xf.8c858p-28,

> +    -0xf.80faep-8, 0xb.8b6c5p-16, -0x5.835ed8p-8 }, /* 55 */

> +  { 0x1.616e34p+7, 0x1.616e3cp+7, 0x1.616e4ep+7, 0x7.75d858p-24,

> +    0xf.5d8abp-8, -0xb.b3779p-16, 0x2.40b948p-4 }, /* 56 */

> +  { 0x1.67b6bp+7, 0x1.67b6b8p+7, 0x1.67b6dp+7, 0x1.d78632p-24,

> +    -0xf.3b096p-8, 0xa.daf89p-16, 0x1.aa8548p-8 }, /* 57 */

> +  { 0x1.6dff28p+7, 0x1.6dff36p+7, 0x1.6dff54p+7, 0x3.b24794p-24,

> +    0xf.196c7p-8, -0xb.1afe1p-16, -0x1.77538cp-8 }, /* 58 */

> +  { 0x1.7447a2p+7, 0x1.7447b2p+7, 0x1.7447cap+7, 0x6.39cbc8p-24,

> +    -0xe.f8aa5p-8, 0xa.50daap-16, 0x1.9592c2p-8 }, /* 59 */

> +  { 0x1.7a902p+7, 0x1.7a903p+7, 0x1.7a903ep+7, -0x1.820e3ap-24,

> +    0xe.d8b9dp-8, -0xa.998cp-16, -0x2.c35d78p-4 }, /* 60 */

> +  { 0x1.80d89ep+7, 0x1.80d8aep+7, 0x1.80d8bep+7, -0x2.c7e038p-24,

> +    -0xe.b9925p-8, 0x9.ce06p-16, -0x2.2b3054p-4 }, /* 61 */

> +  { 0x1.87211cp+7, 0x1.87212cp+7, 0x1.872144p+7, 0x6.ab31c8p-24,

> +    0xe.9b2bep-8, -0x9.4de7p-16, -0x1.32cb5ep-4 }, /* 62 */

> +  { 0x1.8d699ap+7, 0x1.8d69a8p+7, 0x1.8d69bp+7, 0x4.4ef25p-24,

> +    -0xe.7d7ecp-8, 0x9.a0f1ep-16, 0x1.6ac076p-4 }, /* 63 */

> +};

> +

> +/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:

> +   y0(x) ~ sqrt(2/(pi*x))*beta0(x)*sin(x-pi/4-alpha0(x))

> +   where beta0(x) = 1 - 1/(16*x^2) + 53/(512*x^4)

> +   and alpha0(x) = 1/(8*x) - 25/(384*x^3).  */

> +static float

> +y0f_asympt (float x)

> +{

> +  /* The following code fails to give an error <= 9 ulps in only two cases,

> +     for which we tabulate the correctly-rounded result.  */

> +  if (x == 0x1.bfad96p+7f)

> +    return -0x7.f32bdp-32f;

> +  if (x == 0x1.2e2a42p+17f)

> +    return 0x1.a48974p-40f;

> +  double y = 1.0 / (double) x;

> +  double y2 = y * y;

> +  double beta0 = 1.0f + y2 * (-0x1p-4f + 0x1.a8p-4 * y2);

> +  double alpha0 = y * (0x2p-4 - 0x1.0aaaaap-4 * y2);

> +  double h;

> +  int n;

> +  h = reduce_aux (x, &n, alpha0);

> +  /* Now x - pi/4 - alpha0 = h + n*pi/2 mod (2*pi).  */

> +  float xr = (float) h;

> +  n = n & 3;

> +  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */

> +  float t = cst / sqrtf (x) * (float) beta0;

> +  if (n == 0)

> +    return t * __sinf (xr);

> +  else if (n == 2) /* sin(x+pi) = -sin(x) */

> +    return -t * __sinf (xr);

> +  else if (n == 1) /* sin(x+pi/2) = cos(x) */

> +    return t * __cosf (xr);

> +  else             /* sin(x+3pi/2) = -cos(x) */


Double space before closing comment.

> +    return -t * __cosf (xr);

> +}

> +

> +/* Special code for x near a root of y0.

> +   z is the value computed by the generic code.

> +   For small x, use a polynomial approximating y0 around its root.

> +   For large x, use an asymptotic formula (y0f_asympt).  */

> +static float

> +y0f_near_root (float x, float z)

> +{

> +  float index_f;

> +  int index;

> +

> +  index_f = roundf ((x - FIRST_ZERO_Y0) / (float) M_PI);

> +  if (index_f >= SMALL_SIZE)

> +    return y0f_asympt (x);

> +  index = (int) index_f;

> +  const float *p = Py[index];

> +  float x0 = p[0];

> +  float x1 = p[2];

> +  /* If not in the interval [x0,x1] around xmid, return the value z.  */

> +  if (! (x0 <= x && x <= x1))

> +    return z;

> +  float xmid = p[1];

> +  float y = x - xmid;

> +  /* For degree 0 use a degree-5 polynomial, where the coefficients of

> +     degree 4 and 5 are hard-coded.  */

> +  float p6 = (index > 0) ? p[6]

> +    : p[6] + y * (-0x3.a46c9p-4 + y * 0x3.735478p-4);

> +  float res = p[3] + y * (p[4] + y * (p[5] + y * p6));

> +  return res;

> +}

> +

>  float

>  __ieee754_y0f(float x)

>  {

> @@ -124,7 +547,9 @@ __ieee754_y0f(float x)

>  	if(ix>=0x7f800000) return  one/(x+x*x);

>  	if(ix==0) return -1/zero; /* -inf and divide by zero exception.  */

>  	if(hx<0) return zero/(zero*x);

> -	if(ix >= 0x40000000) {  /* |x| >= 2.0 */

> +	if(ix >= 0x40000000 || (0x3f5340ed <= ix && ix <= 0x3f77b5e5)) {

> +          /* |x| >= 2.0 or

> +	     0x1.a681dap-1 <= |x| <= 0x1.ef6bcap-1 (around 1st zero) */

>  	/* y0(x) = sqrt(2/(pi*x))*(p0(x)*sin(x0)+q0(x)*cos(x0))

>  	 * where x0 = x-pi/4

>  	 *      Better formula:

> @@ -136,6 +561,7 @@ __ieee754_y0f(float x)

>  	 *              sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x))

>  	 * to compute the worse one.

>  	 */

> +                SET_RESTORE_ROUNDF (FE_TONEAREST);

>  		__sincosf (x, &s, &c);

>  		ss = s-c;

>  		cc = s+c;

> @@ -143,17 +569,26 @@ __ieee754_y0f(float x)

>  	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)

>  	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)

>  	 */

> -		if(ix<0x7f000000) {  /* make sure x+x not overflow */

> -		    z = -__cosf(x+x);

> -		    if ((s*c)<zero) cc = z/ss;

> -		    else            ss = z/cc;

> -		}

> -		if(ix>0x5c000000) z = (invsqrtpi*ss)/sqrtf(x);

> -		else {

> -		    u = pzerof(x); v = qzerof(x);

> -		    z = invsqrtpi*(u*ss+v*cc)/sqrtf(x);

> -		}

> -		return z;

> +                if (ix >= 0x7f000000)

> +		  /* x >= 2^127: use asymptotic expansion.  */

> +                  return y0f_asympt (x);

> +                /* Now we are sure that x+x cannot overflow.  */

> +                z = -__cosf(x+x);

> +                if ((s*c)<zero) cc = z/ss;

> +                else            ss = z/cc;

> +                if (ix <= 0x5c000000)

> +                  {

> +                    u = pzerof(x); v = qzerof(x);

> +                    ss = u*ss+v*cc;

> +                  }

> +                z = (invsqrtpi*ss)/sqrtf(x);

> +                /* The following threshold is optimal (determined on

> +                   aarch64-linux-gnu).  */

> +                float threshold = 0x1.be585ap-4;

> +                if (fabsf (ss) > threshold)

> +                  return z;

> +                else

> +                  return y0f_near_root (x, z);

>  	}

>  	if(ix<=0x39800000) {	/* x < 2**-13 */

>  	    return(u00 + tpi*__ieee754_logf(x));

> @@ -165,7 +600,7 @@ __ieee754_y0f(float x)

>  }

>  libm_alias_finite (__ieee754_y0f, __y0f)

>  

> -/* The asymptotic expansions of pzero is

> +/* The asymptotic expansion of pzero is

>   *	1 - 9/128 s^2 + 11025/98304 s^4 - ...,	where s = 1/x.

>   * For x >= 2, We approximate pzero by

>   *	pzero(x) = 1 + (R/S)

> @@ -257,7 +692,7 @@ pzerof(float x)

>  }

>  

>  

> -/* For x >= 8, the asymptotic expansions of qzero is

> +/* For x >= 8, the asymptotic expansion of qzero is

>   *	-1/8 s + 75/1024 s^3 - ..., where s = 1/x.

>   * We approximate pzero by

>   *	qzero(x) = s*(-1.25 + (R/S))


Ok.

> diff --git a/sysdeps/ieee754/flt-32/e_j1f.c b/sysdeps/ieee754/flt-32/e_j1f.c

> index ac5bb76828..eca1c0bae1 100644

> --- a/sysdeps/ieee754/flt-32/e_j1f.c

> +++ b/sysdeps/ieee754/flt-32/e_j1f.c

> @@ -21,6 +21,7 @@

>  #include <fenv_private.h>

>  #include <math-underflow.h>

>  #include <libm-alias-finite.h>

> +#include <reduce_aux.h>

>  

>  static float ponef(float), qonef(float);

>  

> @@ -42,6 +43,223 @@ s05  =  1.2354227016e-11; /* 0x2d59567e */

>  

>  static const float zero    = 0.0;

>  

> +/* This is the nearest approximation of the first positive zero of j1.  */

> +#define FIRST_ZERO_J1 0x3.d4eabp+0f

> +

> +#define SMALL_SIZE 64

> +

> +/* The following table contains successive zeros of j1 and degree-3

> +   polynomial approximations of j1 around these zeros: Pj[0] for the first

> +   positive zero (3.831705), Pj[1] for the second one (7.015586), and so on.

> +   Each line contains:

> +              {x0, xmid, x1, p0, p1, p2, p3}

> +   where [x0,x1] is the interval around the zero, xmid is the binary32 number

> +   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation

> +   polynomial.  Each polynomial was generated using Sollya on the interval

> +   [x0,x1] around the corresponding zero where the error exceeds 9 ulps

> +   for the alternate code.  Degree 3 is enough to get an error at most

> +   9 ulps, except around the first zero.

> +*/

> +static const float Pj[SMALL_SIZE][7] = {

> +  /* For index 0, we use a degree-4 polynomial generated by Sollya, with the

> +     coefficient of degree 4 hard-coded in j1f_near_root().  */

> +  { 0x1.e09e5ep+1, 0x1.ea7558p+1, 0x1.ef7352p+1, -0x8.4f069p-28,

> +    -0x6.71b3d8p-4, 0xd.744a2p-8, 0xd.acd9p-8/*, -0x1.3e51aap-8*/ }, /* 0 */

> +  { 0x1.bdb4c2p+2, 0x1.c0ff6p+2, 0x1.c27a8cp+2, 0xe.c2858p-28,

> +    0x4.cd464p-4, -0x5.79b71p-8, -0xc.11124p-8 }, /* 1 */

> +  { 0x1.43b214p+3, 0x1.458d0ep+3, 0x1.460ccep+3, -0x1.e7acecp-24,

> +    -0x3.feca9p-4, 0x3.2470f8p-8, 0xa.625b7p-8 }, /* 2 */

> +  { 0x1.a9c98p+3, 0x1.aa5bbp+3, 0x1.aaa4d8p+3, 0x1.698158p-24,

> +    0x3.7e666cp-4, -0x2.1900ap-8, -0x9.2755p-8 }, /* 3 */

> +  { 0x1.073be4p+4, 0x1.0787b4p+4, 0x1.07aed8p+4, -0x1.f5f658p-24,

> +    -0x3.24b8ep-4, 0x1.86e35cp-8, 0x8.4e4bbp-8 }, /* 4 */

> +  { 0x1.39ae2ap+4, 0x1.39da8ep+4, 0x1.39f3dap+4, -0x1.4e744p-24,

> +    0x2.e18a24p-4, -0x1.2ccd16p-8, -0x7.a27ep-8 }, /* 5 */

> +  { 0x1.6bfa46p+4, 0x1.6c294ep+4, 0x1.6c412p+4, 0xa.3fb7fp-28,

> +    -0x2.acc9c4p-4, 0xf.0b783p-12, 0x7.1c0d3p-8 }, /* 6 */

> +  { 0x1.9e42bep+4, 0x1.9e757p+4, 0x1.9e876ep+4, -0x2.29f6f4p-24,

> +    0x2.81f21p-4, -0xc.641bp-12, -0x6.a7ea58p-8 }, /* 7 */

> +  { 0x1.d08a3ep+4, 0x1.d0bfdp+4, 0x1.d0cd3cp+4, -0x1.b5d196p-24,

> +    -0x2.5e40e4p-4, 0xa.7059fp-12, 0x6.4d6bfp-8 }, /* 8 */

> +  { 0x1.017794p+5, 0x1.018476p+5, 0x1.018b8cp+5, -0x4.0e001p-24,

> +    0x2.3febep-4, -0x8.f23aap-12, -0x6.0102cp-8 }, /* 9 */

> +  { 0x1.1a9e78p+5, 0x1.1aa89p+5, 0x1.1aaf88p+5, 0x3.b26f2p-24,

> +    -0x2.25babp-4, 0x7.c6d948p-12, 0x5.a1d988p-8 }, /* 10 */

> +  { 0x1.33bddep+5, 0x1.33cc52p+5, 0x1.33d2e4p+5, -0xf.c8cdap-28,

> +    0x2.0ed05p-4, -0x6.d97dbp-12, -0x5.8da498p-8 }, /* 11 */

> +  { 0x1.4ce7cp+5, 0x1.4cefdp+5, 0x1.4cf7d4p+5, -0x3.9940e4p-24,

> +    -0x1.fa8b4p-4, 0x6.16108p-12, 0x5.4355e8p-8 }, /* 12 */

> +  { 0x1.6603e8p+5, 0x1.661316p+5, 0x1.66173ap+5, 0x9.da15dp-28,

> +    0x1.e8727ep-4, -0x5.742468p-12, -0x5.117c28p-8 }, /* 13 */

> +  { 0x1.7f2ebcp+5, 0x1.7f3632p+5, 0x1.7f3a7ep+5, -0x3.39b218p-24,

> +    -0x1.d8293ap-4, 0x4.ee3348p-12, 0x4.f9bep-8 }, /* 14 */

> +  { 0x1.9850e6p+5, 0x1.985928p+5, 0x1.985d9ep+5, -0x3.7b5108p-24,

> +    0x1.c96702p-4, -0x4.7b0d08p-12, -0x4.c784a8p-8 }, /* 15 */

> +  { 0x1.b172e8p+5, 0x1.b17c04p+5, 0x1.b1805cp+5, -0x1.91e43ep-24,

> +    -0x1.bbf246p-4, 0x4.18ad78p-12, 0x4.9bfae8p-8 }, /* 16 */

> +  { 0x1.ca955ap+5, 0x1.ca9ec6p+5, 0x1.caa2a4p+5, 0x1.28453cp-24,

> +    0x1.af9cb4p-4, -0x3.c3a494p-12, -0x4.78b69p-8 }, /* 17 */

> +  { 0x1.e3bc94p+5, 0x1.e3c174p+5, 0x1.e3c64p+5, -0x2.e7fef4p-24,

> +    -0x1.a4407ep-4, 0x3.79b228p-12, 0x4.874f7p-8 }, /* 18 */

> +  { 0x1.fcdf16p+5, 0x1.fce40ep+5, 0x1.fce71p+5, -0x3.23b2fcp-24,

> +    0x1.99be76p-4, -0x3.39ad7cp-12, -0x4.92a55p-8 }, /* 19 */

> +  { 0x1.0afe34p+6, 0x1.0b034ep+6, 0x1.0b054ap+6, -0xd.19e93p-28,

> +    -0x1.8ffc9cp-4, 0x2.fee7f8p-12, 0x4.2d33b8p-8 }, /* 20 */

> +  { 0x1.179344p+6, 0x1.17948ep+6, 0x1.1795bp+6, 0x1.c2ac48p-24,

> +    0x1.86e51cp-4, -0x2.cc5abp-12, -0x4.866d08p-8 }, /* 21 */

> +  { 0x1.24231ep+6, 0x1.2425c8p+6, 0x1.2426e2p+6, -0xd.31027p-28,

> +    -0x1.7e656ep-4, 0x2.9db23cp-12, 0x3.cc63c8p-8 }, /* 22 */

> +  { 0x1.30b5a8p+6, 0x1.30b6fep+6, 0x1.30b84ep+6, 0x5.b5e53p-24,

> +    0x1.766dc2p-4, -0x2.754cfcp-12, -0x3.c39bb4p-8 }, /* 23 */

> +  { 0x1.3d46ccp+6, 0x1.3d482ep+6, 0x1.3d495ep+6, -0x1.340a8ap-24,

> +    -0x1.6ef07ep-4, 0x2.4ff9d4p-12, 0x3.9b36e4p-8 }, /* 24 */

> +  { 0x1.49d688p+6, 0x1.49d95ap+6, 0x1.49dabep+6, -0x3.ba66p-24,

> +    0x1.67e1dcp-4, -0x2.2f32b8p-12, -0x3.e2aaf4p-8 }, /* 25 */

> +  { 0x1.566916p+6, 0x1.566a84p+6, 0x1.566bcp+6, 0x6.47ca5p-28,

> +    -0x1.61379ap-4, 0x2.1096acp-12, 0x4.2d0968p-8 }, /* 26 */

> +  { 0x1.62f8dap+6, 0x1.62fbaap+6, 0x1.62fc9cp+6, -0x2.12affp-24,

> +    0x1.5ae8c4p-4, -0x1.f32444p-12, -0x3.9e592p-8 }, /* 27 */

> +  { 0x1.6f89e6p+6, 0x1.6f8ccep+6, 0x1.6f8e34p+6, -0x7.4853ap-28,

> +    -0x1.54ed76p-4, 0x1.db004ap-12, 0x3.907034p-8 }, /* 28 */

> +  { 0x1.7c1c6ap+6, 0x1.7c1deep+6, 0x1.7c1f4cp+6, -0x4.f0a998p-24,

> +    0x1.4f3ebcp-4, -0x1.c26808p-12, -0x2.da8df8p-8 }, /* 29 */

> +  { 0x1.88adaep+6, 0x1.88af0ep+6, 0x1.88afc4p+6, -0x1.80c246p-24,

> +    -0x1.49d668p-4, 0x1.aebc26p-12, 0x3.af7b5cp-8 }, /* 30 */

> +  { 0x1.953d7p+6, 0x1.95402ap+6, 0x1.9540ep+6, -0x2.22aff8p-24,

> +    0x1.44aefap-4, -0x1.99f25p-12, -0x3.5e9198p-8 }, /* 31 */

> +  { 0x1.a1d01ep+6, 0x1.a1d146p+6, 0x1.a1d20ap+6, -0x3.aad6d4p-24,

> +    -0x1.3fc386p-4, 0x1.892858p-12, 0x3.fe0184p-8 }, /* 32 */

> +  { 0x1.ae60ecp+6, 0x1.ae625ep+6, 0x1.ae6326p+6, -0x2.010be4p-24,

> +    0x1.3b0fa4p-4, -0x1.7539ap-12, -0x2.b2c9bp-8 }, /* 33 */

> +  { 0x1.baf234p+6, 0x1.baf376p+6, 0x1.baf442p+6, -0xd.4fd17p-32,

> +    -0x1.368f5cp-4, 0x1.6734e4p-12, 0x3.59f514p-8 }, /* 34 */

> +  { 0x1.c782e6p+6, 0x1.c7848cp+6, 0x1.c78516p+6, -0xa.d662dp-28,

> +    0x1.323f18p-4, -0x1.571c02p-12, -0x3.2be5bp-8 }, /* 35 */

> +  { 0x1.d4144ep+6, 0x1.d415ap+6, 0x1.d41622p+6, 0x4.9f217p-24,

> +    -0x1.2e1b9ap-4, 0x1.4a2edap-12, 0x3.a4e96p-8 }, /* 36 */

> +  { 0x1.e0a5ep+6, 0x1.e0a6b4p+6, 0x1.e0a788p+6, -0x2.d167p-24,

> +    0x1.2a21eep-4, -0x1.3c4b46p-12, -0x4.9e0978p-8 }, /* 37 */

> +  { 0x1.ed36eep+6, 0x1.ed37c8p+6, 0x1.ed3892p+6, -0x4.15a83p-24,

> +    -0x1.264f66p-4, 0x1.31dea4p-12, 0x3.d125ecp-8 }, /* 38 */

> +  { 0x1.f9c77p+6, 0x1.f9c8d8p+6, 0x1.f9c9acp+6, -0x2.a5bbbp-24,

> +    0x1.22a192p-4, -0x1.25e59ep-12, -0x2.ef6934p-8 }, /* 39 */

> +  { 0x1.032c54p+7, 0x1.032cf4p+7, 0x1.032d66p+7, 0x4.e828bp-24,

> +    -0x1.1f1634p-4, 0x1.1c2394p-12, 0x3.6d744cp-8 }, /* 40 */

> +  { 0x1.09750cp+7, 0x1.09757cp+7, 0x1.0975b6p+7, -0x3.28a3bcp-24,

> +    0x1.1bab3ep-4, -0x1.1569cep-12, -0x5.84da7p-8 }, /* 41 */

> +  { 0x1.0fbd9ap+7, 0x1.0fbe04p+7, 0x1.0fbe5ep+7, -0x2.2f667p-24,

> +    -0x1.185eccp-4, 0x1.07f42cp-12, 0x2.87896cp-8 }, /* 42 */

> +  { 0x1.160628p+7, 0x1.16068ap+7, 0x1.1606cep+7, -0x6.9097dp-24,

> +    0x1.152f28p-4, -0x1.0227fep-12, -0x5.da6e6p-8 }, /* 43 */

> +  { 0x1.1c4e9ap+7, 0x1.1c4f12p+7, 0x1.1c4f7cp+7, -0x5.1b408p-24,

> +    -0x1.121abp-4, 0xf.6efcp-16, 0x2.c5e954p-8 }, /* 44 */

> +  { 0x1.2296aap+7, 0x1.229798p+7, 0x1.2297d4p+7, 0x2.70d0dp-24,

> +    0x1.0f1ffp-4, -0xf.523f5p-16, -0x3.5c0568p-8 }, /* 45 */

> +  { 0x1.28dfa4p+7, 0x1.28e01ep+7, 0x1.28e054p+7, -0x2.7c176p-24,

> +    -0x1.0c3d8ap-4, 0xe.8329ap-16, 0x3.5eb34p-8 }, /* 46 */

> +  { 0x1.2f282ap+7, 0x1.2f28a4p+7, 0x1.2f28dep+7, 0x4.fd6368p-24,

> +    0x1.097236p-4, -0xe.17299p-16, -0x3.73a2e4p-8 }, /* 47 */

> +  { 0x1.3570bp+7, 0x1.357128p+7, 0x1.35716p+7, 0x6.b05f68p-24,

> +    -0x1.06bccap-4, 0xd.527b8p-16, 0x2.b8bf9cp-8 }, /* 48 */

> +  { 0x1.3bb932p+7, 0x1.3bb9aep+7, 0x1.3bb9eap+7, 0x4.0d622p-28,

> +    0x1.041c28p-4, -0xd.0ac11p-16, -0x1.65f2ccp-8 }, /* 49 */

> +  { 0x1.4201b6p+7, 0x1.420232p+7, 0x1.42027p+7, 0x7.0d98cp-24,

> +    -0x1.018f52p-4, 0xc.c4d8ep-16, 0x2.8f250cp-8 }, /* 50 */

> +  { 0x1.484a78p+7, 0x1.484ab8p+7, 0x1.484af4p+7, 0x3.93d10cp-24,

> +    0xf.f154fp-8, -0xc.7b7fep-16, -0x3.6b6e4cp-8 }, /* 51 */

> +  { 0x1.4e92c8p+7, 0x1.4e933cp+7, 0x1.4e9368p+7, 0xd.88185p-32,

> +    -0xf.cad3fp-8, 0xc.1462p-16, 0x2.bd66p-8 }, /* 52 */

> +  { 0x1.54db84p+7, 0x1.54dbcp+7, 0x1.54dbf4p+7, -0x1.fe6b92p-24,

> +    0xf.a564cp-8, -0xb.c4e6cp-16, -0x3.d51decp-8 }, /* 53 */

> +  { 0x1.5b23c4p+7, 0x1.5b2444p+7, 0x1.5b2486p+7, 0x2.6137f4p-24,

> +    -0xf.80faep-8, 0xb.5199ep-16, 0x1.9ca85ap-8 }, /* 54 */

> +  { 0x1.616c62p+7, 0x1.616cc8p+7, 0x1.616d0ap+7, -0x1.55468p-24,

> +    0xf.5d8acp-8, -0xb.21d16p-16, -0x1.b8809ap-8 }, /* 55 */

> +  { 0x1.67b4fp+7, 0x1.67b54cp+7, 0x1.67b588p+7, -0x1.08c6bep-24,

> +    -0xf.3b096p-8, 0xa.e65efp-16, 0x3.642304p-8 }, /* 56 */

> +  { 0x1.6dfd8ep+7, 0x1.6dfddp+7, 0x1.6dfe0ap+7, 0x4.9ebfa8p-24,

> +    0xf.196c7p-8, -0xa.ba8c8p-16, -0x5.ad6a2p-8 }, /* 57 */

> +  { 0x1.74461p+7, 0x1.744652p+7, 0x1.744692p+7, 0x5.a4017p-24,

> +    -0xe.f8aa5p-8, 0xa.49748p-16, 0x2.a86498p-8 }, /* 58 */

> +  { 0x1.7a8e5ep+7, 0x1.7a8ed6p+7, 0x1.7a8ef8p+7, 0x3.bcb2a8p-28,

> +    0xe.d8b9dp-8, -0x9.c43bep-16, -0x1.e7124ap-8 }, /* 59 */

> +  { 0x1.80d6cep+7, 0x1.80d75ap+7, 0x1.80d78ap+7, -0x7.1091fp-24,

> +    -0xe.b9925p-8, 0x9.c43dap-16, 0x1.aba86p-8 }, /* 60 */

> +  { 0x1.871f58p+7, 0x1.871fdcp+7, 0x1.87201ep+7, 0x2.ca1cf4p-28,

> +    0xe.9b2bep-8, -0x9.843b3p-16, -0x2.093e68p-8 }, /* 61 */

> +  { 0x1.8d67e8p+7, 0x1.8d685ep+7, 0x1.8d688ep+7, 0x5.aa8908p-24,

> +    -0xe.7d7ecp-8, 0x9.501a8p-16, 0x2.54a754p-8 }, /* 62 */

> +  { 0x1.93b09cp+7, 0x1.93b0e2p+7, 0x1.93b10ep+7, 0x3.d9cd9cp-24,

> +    0xe.6083ap-8, -0x9.45dadp-16, -0x5.112908p-8 }, /* 63 */

> +};

> +

> +/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:

> +   j1f(x) ~ sqrt(2/(pi*x))*beta1(x)*cos(x-3pi/4-alpha1(x))

> +   where beta1(x) = 1 + 3/(16*x^2) - 99/(512*x^4)

> +   and alpha1(x) = -3/(8*x) + 21/(128*x^3) - 1899/(5120*x^5).  */

> +static float

> +j1f_asympt (float x)

> +{

> +  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */

> +  if (x < 0)

> +    {

> +      x = -x;

> +      cst = -cst;

> +    }

> +  double y = 1.0 / (double) x;

> +  double y2 = y * y;

> +  double beta1 = 1.0f + y2 * (0x3p-4 - 0x3.18p-4 * y2);

> +  double alpha1;

> +  alpha1 = y * (-0x6p-4 + y2 * (0x2.ap-4 - 0x5.ef33333333334p-4 * y2));

> +  double h;

> +  int n;

> +  h = reduce_aux (x, &n, alpha1);

> +  n--; /* Subtract pi/2.  */

> +  /* Now x - 3pi/4 - alpha1 = h + n*pi/2 mod (2*pi).  */

> +  float xr = (float) h;

> +  n = n & 3;

> +  float t = cst / sqrtf (x) * (float) beta1;

> +  if (n == 0)

> +    return t * __cosf (xr);

> +  else if (n == 2) /* cos(x+pi) = -cos(x) */

> +    return -t * __cosf (xr);

> +  else if (n == 1) /* cos(x+pi/2) = -sin(x) */

> +    return -t * __sinf (xr);

> +  else             /* cos(x+3pi/2) = sin(x) */

> +    return t * __sinf (xr);

> +}

> +

> +/* Special code for x near a root of j1.

> +   z is the value computed by the generic code.

> +   For small x, we use a polynomial approximating j1 around its root.

> +   For large x, we use an asymptotic formula (j1f_asympt).  */

> +static float

> +j1f_near_root (float x, float z)

> +{

> +  float index_f, sign = 1.0f;

> +  int index;

> +

> +  if (x < 0)

> +    {

> +      x = -x;

> +      sign = -1.0f;

> +    }

> +  index_f = roundf ((x - FIRST_ZERO_J1) / (float) M_PI);

> +  if (index_f >= SMALL_SIZE)

> +    return sign * j1f_asympt (x);

> +  index = (int) index_f;

> +  const float *p = Pj[index];

> +  float x0 = p[0];

> +  float x1 = p[2];

> +  /* If not in the interval [x0,x1] around xmid, return the value z.  */

> +  if (! (x0 <= x && x <= x1))

> +    return z;

> +  float xmid = p[1];

> +  float y = x - xmid;

> +  float p6 = (index > 0) ? p[6] : p[6] + y * -0x1.3e51aap-8f;

> +  return sign * (p[3] + y * (p[4] + y * (p[5] + y * p6)));

> +}

> +

>  float

>  __ieee754_j1f(float x)

>  {

> @@ -53,25 +271,37 @@ __ieee754_j1f(float x)

>  	if(__builtin_expect(ix>=0x7f800000, 0)) return one/x;

>  	y = fabsf(x);

>  	if(ix >= 0x40000000) {	/* |x| >= 2.0 */

> +                SET_RESTORE_ROUNDF (FE_TONEAREST);

>  		__sincosf (y, &s, &c);

>  		ss = -s-c;

>  		cc = s-c;

> -		if(ix<0x7f000000) {  /* make sure y+y not overflow */

> -		    z = __cosf(y+y);

> -		    if ((s*c)>zero) cc = z/ss;

> -		    else	    ss = z/cc;

> -		}

> +                if (ix >= 0x7f000000)

> +		  /* x >= 2^127: use asymptotic expansion.  */

> +                  return j1f_asympt (x);

> +                /* Now we are sure that x+x cannot overflow.  */

> +                z = __cosf(y+y);

> +                if ((s*c)>zero) cc = z/ss;

> +                else	        ss = z/cc;

>  	/*

>  	 * j1(x) = 1/sqrt(pi) * (P(1,x)*cc - Q(1,x)*ss) / sqrt(x)

>  	 * y1(x) = 1/sqrt(pi) * (P(1,x)*ss + Q(1,x)*cc) / sqrt(x)

>  	 */

> -		if(ix>0x5c000000) z = (invsqrtpi*cc)/sqrtf(y);

> -		else {

> +		if (ix <= 0x5c000000)

> +                  {

>  		    u = ponef(y); v = qonef(y);

> -		    z = invsqrtpi*(u*cc-v*ss)/sqrtf(y);

> -		}

> -		if(hx<0) return -z;

> -		else	 return  z;

> +		    cc = u*cc-v*ss;

> +                  }

> +                z = (invsqrtpi * cc) / sqrtf(y);

> +                /* Adjust sign of z.  */

> +                z = (hx < 0) ? -z : z;

> +                /* The following threshold is optimal: for x=0x1.e09e5ep+1

> +                   and rounding upwards, cc=0x1.b79638p-4 and z is 10 ulps

> +                   far from the correctly rounded value.  */

> +                float threshold = 0x1.b79638p-4;

> +                if (fabsf (cc) > threshold)

> +                  return z;

> +                else

> +                  return j1f_near_root (x, z);

>  	}

>  	if(__builtin_expect(ix<0x32000000, 0)) {	/* |x|<2**-27 */

>  	    if(huge+x>one) {		/* inexact if x!=0 necessary */

> @@ -105,6 +335,218 @@ static const float V0[5] = {

>    1.6655924903e-11, /* 0x2d9281cf */

>  };

>  

> +/* This is the nearest approximation of the first zero of y1.  */

> +#define FIRST_ZERO_Y1 0x2.3277dcp+0f

> +

> +/* The following table contains successive zeros of y1 and degree-3

> +   polynomial approximations of y1 around these zeros: Py[0] for the first

> +   positive zero (2.197141), Py[1] for the second one (5.429681), and so on.

> +   Each line contains:

> +              {x0, xmid, x1, p0, p1, p2, p3}

> +   where [x0,x1] is the interval around the zero, xmid is the binary32 number

> +   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation

> +   polynomial.  Each polynomial was generated using Sollya on the interval

> +   [x0,x1] around the corresponding zero where the error exceeds 9 ulps

> +   for the alternate code.  Degree 3 is enough, except for the first roots.

> +*/

> +static const float Py[SMALL_SIZE][7] = {

> +  /* For index 0, we use a degree-5 polynomial generated by Sollya, with the

> +     coefficients of degree 4 and 5 hard-coded in y1f_near_root().  */

> +  { 0x1.f7f16ap+0, 0x1.193beep+1, 0x1.2105dcp+1, 0xb.96749p-28,

> +    0x8.55241p-4, -0x1.e570bp-4, -0x8.68b61p-8

> +    /*, -0x1.28043p-8, 0x2.50e83p-8*/ }, /* 0 */

> +  /* For index 1, we use a degree-4 polynomial generated by Sollya, with the

> +     coefficient of degree 4 hard-coded in y1f_near_root().  */

> +  { 0x1.55c6d2p+2, 0x1.5b7fe4p+2, 0x1.5cf8cap+2, 0x1.3c7822p-24,

> +    -0x5.71f158p-4, 0x8.05cb4p-8, 0xd.0b15p-8/*, -0xf.ff6b8p-12*/ }, /* 1 */

> +  { 0x1.113c6p+3, 0x1.13127ap+3, 0x1.1387dcp+3, -0x1.f3ad8ep-24,

> +    0x4.57e66p-4, -0x4.0afb58p-8, -0xb.29207p-8 }, /* 2 */

> +  { 0x1.76e7dep+3, 0x1.77f914p+3, 0x1.786a6ap+3, -0xd.5608fp-28,

> +    -0x3.b829d4p-4, 0x2.8852cp-8, 0x9.b70e3p-8 }, /* 3 */

> +  { 0x1.dc2794p+3, 0x1.dcb7d8p+3, 0x1.dd032p+3, -0xe.a7c04p-28,

> +    0x3.4e0458p-4, -0x1.c64b18p-8, -0x8.b0e7fp-8 }, /* 4 */

> +  { 0x1.20874p+4, 0x1.20b1c6p+4, 0x1.20c71p+4, 0x1.c2626p-24,

> +    -0x3.00f03cp-4, 0x1.54f806p-8, 0x7.f9cf9p-8 }, /* 5 */

> +  { 0x1.52d848p+4, 0x1.530254p+4, 0x1.531962p+4, -0x1.9503ecp-24,

> +    0x2.c5b29cp-4, -0x1.0bf28p-8, -0x7.562e58p-8 }, /* 6 */

> +  { 0x1.851e64p+4, 0x1.854fa4p+4, 0x1.85679p+4, -0x2.8d40fcp-24,

> +    -0x2.96547p-4, 0xd.9c38bp-12, 0x6.dcbf8p-8 }, /* 7 */

> +  { 0x1.b7808ep+4, 0x1.b79acep+4, 0x1.b7b2a8p+4, -0x2.36df5cp-24,

> +    0x2.6f55ap-4, -0xb.57f9fp-12, -0x6.82569p-8 }, /* 8 */

> +  { 0x1.e9c8fp+4, 0x1.e9e48p+4, 0x1.e9f24p+4, 0xd.e2eb7p-28,

> +    -0x2.4e8104p-4, 0x9.a4be2p-12, 0x6.2541fp-8 }, /* 9 */

> +  { 0x1.0e0808p+5, 0x1.0e169p+5, 0x1.0e1d92p+5, -0x2.3070f4p-24,

> +    0x2.325e4cp-4, -0x8.53604p-12, -0x5.ca03a8p-8 }, /* 10 */

> +  { 0x1.272e08p+5, 0x1.273a7cp+5, 0x1.2741fcp+5, -0x3.525508p-24,

> +    -0x2.19e7dcp-4, 0x7.49d1dp-12, 0x5.9cb02p-8 }, /* 11 */

> +  { 0x1.404ec6p+5, 0x1.405e18p+5, 0x1.4065cep+5, -0xe.6e158p-28,

> +    0x2.046174p-4, -0x6.71b3dp-12, -0x5.4c3c8p-8 }, /* 12 */

> +  { 0x1.5971dcp+5, 0x1.598178p+5, 0x1.598592p+5, 0x1.e72698p-24,

> +    -0x1.f13fb2p-4, 0x5.c0f938p-12, 0x5.28ca78p-8 }, /* 13 */

> +  { 0x1.729c4ep+5, 0x1.72a4a8p+5, 0x1.72a8eap+5, -0x1.5bed9cp-24,

> +    0x1.e018dcp-4, -0x5.2f11e8p-12, -0x5.16ce48p-8 }, /* 14 */

> +  { 0x1.8bbf4ep+5, 0x1.8bc7b2p+5, 0x1.8bcc1p+5, -0x3.6b654cp-24,

> +    -0x1.d09b2p-4, 0x4.b1747p-12, 0x4.bd22fp-8 }, /* 15 */

> +  { 0x1.a4e272p+5, 0x1.a4ea9ap+5, 0x1.a4eef4p+5, 0x1.6f11bp-24,

> +    0x1.c28612p-4, -0x4.47462p-12, -0x4.947c5p-8 }, /* 16 */

> +  { 0x1.be08bep+5, 0x1.be0d68p+5, 0x1.be1088p+5, -0x2.0bc074p-24,

> +    -0x1.b5a622p-4, 0x3.ed52d4p-12, 0x4.b76fc8p-8 }, /* 17 */

> +  { 0x1.d7272ap+5, 0x1.d7301ep+5, 0x1.d734aep+5, -0x2.87dd4p-24,

> +    0x1.a9d184p-4, -0x3.9cf494p-12, -0x4.6303ep-8 }, /* 18 */

> +  { 0x1.f0499ap+5, 0x1.f052c4p+5, 0x1.f05758p+5, -0x2.fb964p-24,

> +    -0x1.9ee5eep-4, 0x3.5800dp-12, 0x4.4e9f9p-8 }, /* 19 */

> +  { 0x1.04b63ap+6, 0x1.04baacp+6, 0x1.04bc92p+6, 0x2.cf5adp-24,

> +    0x1.94c6f4p-4, -0x3.1a83e4p-12, -0x4.2311fp-8 }, /* 20 */

> +  { 0x1.1146dp+6, 0x1.114beep+6, 0x1.114e12p+6, 0x3.6766fp-24,

> +    -0x1.8b5cccp-4, 0x2.e4a4e4p-12, 0x4.20bf9p-8 }, /* 21 */

> +  { 0x1.1dda8cp+6, 0x1.1ddd2cp+6, 0x1.1dde7ap+6, 0x3.501424p-24,

> +    0x1.829356p-4, -0x2.b47524p-12, -0x4.04bf18p-8 }, /* 22 */

> +  { 0x1.2a6bcp+6, 0x1.2a6e64p+6, 0x1.2a6faap+6, -0x5.c05808p-24,

> +    -0x1.7a597ep-4, 0x2.8a0498p-12, 0x4.187258p-8 }, /* 23 */

> +  { 0x1.36fcd6p+6, 0x1.36ff96p+6, 0x1.3700f6p+6, 0x7.1e1478p-28,

> +    0x1.72a09ap-4, -0x2.61a7fp-12, -0x3.c0b54p-8 }, /* 24 */

> +  { 0x1.438f46p+6, 0x1.4390c4p+6, 0x1.4392p+6, 0x3.e36e6cp-24,

> +    -0x1.6b5c06p-4, 0x2.3f612p-12, 0x4.18f868p-8 }, /* 25 */

> +  { 0x1.501f4cp+6, 0x1.5021fp+6, 0x1.50235p+6, 0x1.3f9e5ap-24,

> +    0x1.6480c4p-4, -0x2.1f28fcp-12, -0x3.bb4e3cp-8 }, /* 26 */

> +  { 0x1.5cb07cp+6, 0x1.5cb318p+6, 0x1.5cb464p+6, -0x2.39e41cp-24,

> +    -0x1.5e0544p-4, 0x2.0189f4p-12, 0x3.8b55acp-8 }, /* 27 */

> +  { 0x1.694166p+6, 0x1.69443cp+6, 0x1.694594p+6, -0x2.912f84p-24,

> +    0x1.57e12p-4, -0x1.e6fabep-12, -0x3.850174p-8 }, /* 28 */

> +  { 0x1.75d27cp+6, 0x1.75d55ep+6, 0x1.75d67ep+6, 0x3.d5b00cp-24,

> +    -0x1.520ceep-4, 0x1.d0286ep-12, 0x3.8e7d1p-8 }, /* 29 */

> +  { 0x1.82653ep+6, 0x1.82667ep+6, 0x1.82674p+6, -0x3.1726ecp-24,

> +    0x1.4c8222p-4, -0x1.b98206p-12, -0x3.f34978p-8 }, /* 30 */

> +  { 0x1.8ef4b4p+6, 0x1.8ef79cp+6, 0x1.8ef888p+6, 0x1.949e22p-24,

> +    -0x1.473ae6p-4, 0x1.a47388p-12, 0x3.69eefcp-8 }, /* 31 */

> +  { 0x1.9b8728p+6, 0x1.9b88b8p+6, 0x1.9b896cp+6, -0x5.5553bp-28,

> +    0x1.42320ap-4, -0x1.90f0b8p-12, -0x3.6565p-8 }, /* 32 */

> +  { 0x1.a8183cp+6, 0x1.a819d2p+6, 0x1.a81aecp+6, 0x3.2df7ecp-28,

> +    -0x1.3d62e4p-4, 0x1.7dae28p-12, 0x2.9eb128p-8 }, /* 33 */

> +  { 0x1.b4aa1cp+6, 0x1.b4aaeap+6, 0x1.b4abb8p+6, -0x1.e13fcep-24,

> +    0x1.38c948p-4, -0x1.6eb0ecp-12, -0x1.f9ddf8p-8 }, /* 34 */

> +  { 0x1.c13a7ap+6, 0x1.c13c02p+6, 0x1.c13cbp+6, -0x3.ad9974p-24,

> +    -0x1.34616ep-4, 0x1.5e36ecp-12, 0x2.a9fc5p-8 }, /* 35 */

> +  { 0x1.cdcb76p+6, 0x1.cdcd16p+6, 0x1.cdcde4p+6, -0x3.6977e8p-24,

> +    0x1.3027fp-4, -0x1.4f703p-12, -0x2.9817d4p-8 }, /* 36 */

> +  { 0x1.da5cdep+6, 0x1.da5e2ap+6, 0x1.da5efp+6, 0x4.654cbp-24,

> +    -0x1.2c19b6p-4, 0x1.455982p-12, 0x3.f1c564p-8 }, /* 37 */

> +  { 0x1.e6edccp+6, 0x1.e6ef3ep+6, 0x1.e6f00ap+6, 0x8.825c8p-32,

> +    0x1.2833eep-4, -0x1.39097p-12, -0x3.b2646p-8 }, /* 38 */

> +  { 0x1.f37f72p+6, 0x1.f3805p+6, 0x1.f3812ap+6, -0x2.0d11d8p-28,

> +    -0x1.24740ap-4, 0x1.2c16p-12, 0x1.fc3804p-8 }, /* 39 */

> +  { 0x1.000842p+7, 0x1.0008bp+7, 0x1.000908p+7, -0x4.4e495p-24,

> +    0x1.20d7b6p-4, -0x1.20816p-12, -0x2.d1ebe8p-8 }, /* 40 */

> +  { 0x1.06505cp+7, 0x1.065138p+7, 0x1.06518p+7, 0x4.81c1c8p-24,

> +    -0x1.1d5ccap-4, 0x1.17ad5ap-12, 0x2.fda33p-8 }, /* 41 */

> +  { 0x1.0c98dap+7, 0x1.0c99cp+7, 0x1.0c9a28p+7, -0xe.99386p-28,

> +    0x1.1a015p-4, -0x1.0bd50ap-12, -0x2.9dfb68p-8 }, /* 42 */

> +  { 0x1.12e212p+7, 0x1.12e248p+7, 0x1.12e29p+7, -0x6.16f1c8p-24,

> +    -0x1.16c37ap-4, 0x1.0303dcp-12, 0x4.34316p-8 }, /* 43 */

> +  { 0x1.192a68p+7, 0x1.192acep+7, 0x1.192b02p+7, -0x1.129336p-24,

> +    0x1.13a19ep-4, -0xf.bd247p-16, -0x3.851d18p-8 }, /* 44 */

> +  { 0x1.1f727p+7, 0x1.1f7354p+7, 0x1.1f73ap+7, 0x5.19c09p-24,

> +    -0x1.109a32p-4, 0xf.09644p-16, 0x2.d78194p-8 }, /* 45 */

> +  { 0x1.25bb8p+7, 0x1.25bbdap+7, 0x1.25bc12p+7, -0x6.497dp-24,

> +    0x1.0dabc8p-4, -0xe.a1d25p-16, -0x2.3378bp-8 }, /* 46 */

> +  { 0x1.2c04p+7, 0x1.2c046p+7, 0x1.2c04ap+7, 0x4.e4f338p-24,

> +    -0x1.0ad512p-4, 0xe.52d84p-16, 0x4.3bfa08p-8 }, /* 47 */

> +  { 0x1.324cbp+7, 0x1.324ce6p+7, 0x1.324d4p+7, -0x1.287c58p-24,

> +    0x1.0814d4p-4, -0xe.03a95p-16, 0x3.9930ap-12 }, /* 48 */

> +  { 0x1.3894f6p+7, 0x1.38956cp+7, 0x1.3895ap+7, -0x4.b594ep-24,

> +    -0x1.0569fp-4, 0xd.6787ep-16, 0x4.0a5148p-8 }, /* 49 */

> +  { 0x1.3edd98p+7, 0x1.3eddfp+7, 0x1.3ede2ap+7, -0x3.a8f164p-24,

> +    0x1.02d354p-4, -0xd.0309dp-16, -0x3.2ebfb4p-8 }, /* 50 */

> +  { 0x1.452638p+7, 0x1.452676p+7, 0x1.4526b4p+7, -0x6.12505p-24,

> +    -0x1.005004p-4, 0xc.a0045p-16, 0x4.87c67p-8 }, /* 51 */

> +  { 0x1.4b6e8p+7, 0x1.4b6efap+7, 0x1.4b6f34p+7, 0x1.8acf4ep-24,

> +    0xf.ddf16p-8, -0xc.2d207p-16, -0x1.da6c36p-8 }, /* 52 */

> +  { 0x1.51b742p+7, 0x1.51b77ep+7, 0x1.51b7b2p+7, 0x1.39cf86p-24,

> +    -0xf.b7faep-8, 0xb.db598p-16, -0x8.945b1p-12 }, /* 53 */

> +  { 0x1.57ffc4p+7, 0x1.580002p+7, 0x1.58003cp+7, -0x2.5f8de8p-24,

> +    0xf.930fep-8, -0xb.91889p-16, -0xa.30df9p-12 }, /* 54 */

> +  { 0x1.5e483p+7, 0x1.5e4886p+7, 0x1.5e48c8p+7, 0x2.073d64p-24,

> +    -0xf.6f245p-8, 0xb.4085fp-16, 0x2.128188p-8 }, /* 55 */

> +  { 0x1.64908cp+7, 0x1.64910ap+7, 0x1.64912ap+7, -0x4.ed26ep-28,

> +    0xf.4c2cep-8, -0xa.fe719p-16, -0x2.9374b8p-8 }, /* 56 */

> +  { 0x1.6ad91ep+7, 0x1.6ad98ep+7, 0x1.6ad9cep+7, -0x2.ae5204p-24,

> +    -0xf.2a1efp-8, 0xa.aa585p-16, 0x2.1c0834p-8 }, /* 57 */

> +  { 0x1.7121cep+7, 0x1.712212p+7, 0x1.712238p+7, 0x6.d72168p-24,

> +    0xf.08f09p-8, -0xa.7da49p-16, -0x3.4f5f1cp-8 }, /* 58 */

> +  { 0x1.776a0cp+7, 0x1.776a94p+7, 0x1.776accp+7, 0x2.d3f294p-24,

> +    -0xe.e8986p-8, 0xa.23ccdp-16, 0x2.2a6678p-8 }, /* 59 */

> +  { 0x1.7db2e8p+7, 0x1.7db318p+7, 0x1.7db35ap+7, 0x3.88c0fp-24,

> +    0xe.c90d7p-8, -0x9.eaeap-16, -0x2.86438cp-8 }, /* 60 */

> +  { 0x1.83fb56p+7, 0x1.83fb9ap+7, 0x1.83fbep+7, 0x3.d94d34p-24,

> +    -0xe.aa478p-8, 0x9.abac7p-16, 0x1.ac2d84p-8 }, /* 61 */

> +  { 0x1.8a43e8p+7, 0x1.8a441ep+7, 0x1.8a446p+7, 0x4.66b7ep-24,

> +    0xe.8c3e9p-8, -0x9.87682p-16, -0x7.9ab4a8p-12 }, /* 62 */

> +  { 0x1.908c6p+7, 0x1.908cap+7, 0x1.908ce6p+7, 0xf.f7ac9p-28,

> +    -0xe.6eeb6p-8, 0x9.4423p-16, 0x4.54c4d8p-8 }, /* 63 */

> +};

> +

> +/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:

> +   y1f(x) ~ sqrt(2/(pi*x))*beta1(x)*sin(x-3pi/4-alpha1(x))

> +   where beta1(x) = 1 + 3/(16*x^2) - 99/(512*x^4)

> +   and alpha1(x) = -3/(8*x) + 21/(128*x^3) - 1899/(5120*x^5).  */

> +static float

> +y1f_asympt (float x)

> +{

> +  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */

> +  double y = 1.0 / (double) x;

> +  double y2 = y * y;

> +  double beta1 = 1.0f + y2 * (0x3p-4 - 0x3.18p-4 * y2);

> +  double alpha1;

> +  alpha1 = y * (-0x6p-4 + y2 * (0x2.ap-4 - 0x5.ef33333333334p-4 * y2));

> +  double h;

> +  int n;

> +  h = reduce_aux (x, &n, alpha1);

> +  n--; /* Subtract pi/2.  */

> +  /* Now x - 3pi/4 - alpha1 = h + n*pi/2 mod (2*pi).  */

> +  float xr = (float) h;

> +  n = n & 3;

> +  float t = cst / sqrtf (x) * (float) beta1;

> +  if (n == 0)

> +    return t * __sinf (xr);

> +  else if (n == 2) /* sin(x+pi) = -sin(x) */

> +    return -t * __sinf (xr);

> +  else if (n == 1) /* sin(x+pi/2) = cos(x) */

> +    return t * __cosf (xr);

> +  else             /* sin(x+3pi/2) = -cos(x) */


Double space before closing comment.

> +    return -t * __cosf (xr);

> +}

> +

> +/* Special code for x near a root of y1.

> +   z is the value computed by the generic code.

> +   For small x, we use a polynomial approximating y1 around its root.

> +   For large x, we use an asymptotic formula (y1f_asympt).  */

> +static float

> +y1f_near_root (float x, float z)

> +{

> +  float index_f;

> +  int index;

> +

> +  index_f = roundf ((x - FIRST_ZERO_Y1) / (float) M_PI);

> +  if (index_f >= SMALL_SIZE)

> +    return y1f_asympt (x);

> +  index = (int) index_f;

> +  const float *p = Py[index];

> +  float x0 = p[0];

> +  float x1 = p[2];

> +  /* If not in the interval [x0,x1] around xmid, return the value z.  */

> +  if (! (x0 <= x && x <= x1))

> +    return z;

> +  float xmid = p[1];

> +  float y = x - xmid, p6;

> +  if (index == 0)

> +    p6 = p[6] + y * (-0x1.28043p-8 + y * 0x2.50e83p-8);

> +  else if (index == 1)

> +    p6 = p[6] + y * -0xf.ff6b8p-12;

> +  else

> +    p6 = p[6];

> +  return p[3] + y * (p[4] + y * (p[5] + y * p6));

> +}

> +

>  float

>  __ieee754_y1f(float x)

>  {

> @@ -118,16 +560,18 @@ __ieee754_y1f(float x)

>  	if(__builtin_expect(ix==0, 0))

>  		return -1/zero; /* -inf and divide by zero exception.  */

>  	if(__builtin_expect(hx<0, 0)) return zero/(zero*x);

> -	if(ix >= 0x40000000) {  /* |x| >= 2.0 */

> +        if (ix >= 0x3fe0dfbc) { /* |x| >= 0x1.c1bf78p+0 */

>  		SET_RESTORE_ROUNDF (FE_TONEAREST);

>  		__sincosf (x, &s, &c);

>  		ss = -s-c;

>  		cc = s-c;

> -		if(ix<0x7f000000) {  /* make sure x+x not overflow */

> -		    z = __cosf(x+x);

> -		    if ((s*c)>zero) cc = z/ss;

> -		    else            ss = z/cc;

> -		}

> +                if (ix >= 0x7f000000)

> +		  /* x >= 2^127: use asymptotic expansion.  */

> +                  return y1f_asympt (x);

> +                /* Now we are sure that x+x cannot overflow.  */

> +                z = __cosf(x+x);

> +                if ((s*c)>zero) cc = z/ss;

> +                else            ss = z/cc;

>  	/* y1(x) = sqrt(2/(pi*x))*(p1(x)*sin(x0)+q1(x)*cos(x0))

>  	 * where x0 = x-3pi/4

>  	 *      Better formula:

> @@ -139,12 +583,20 @@ __ieee754_y1f(float x)

>  	 *              sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x))

>  	 * to compute the worse one.

>  	 */

> -		if(ix>0x5c000000) z = (invsqrtpi*ss)/sqrtf(x);

> -		else {

> -		    u = ponef(x); v = qonef(x);

> -		    z = invsqrtpi*(u*ss+v*cc)/sqrtf(x);

> -		}

> -		return z;

> +                if (ix <= 0x5c000000)

> +                  {

> +                    u = ponef(x); v = qonef(x);

> +                    ss = u*ss+v*cc;

> +                  }

> +                z = (invsqrtpi * ss) / sqrtf(x);

> +                float threshold = 0x1.3e014cp-2;

> +                /* The following threshold is optimal: for x=0x1.f7f16ap+0

> +                   and rounding upwards, |ss|=-0x1.3e014cp-2 and z is 11 ulps

> +                   far from the correctly rounded value.  */

> +                if (fabsf (ss) > threshold)

> +                  return z;

> +                else

> +                  return y1f_near_root (x, z);

>  	}

>  	if(__builtin_expect(ix<=0x33000000, 0)) {    /* x < 2**-25 */

>  	    z = -tpi / x;

> @@ -152,6 +604,7 @@ __ieee754_y1f(float x)

>  		__set_errno (ERANGE);

>  	    return z;

>  	}

> +        /* Now 2**-25 <= x < 0x1.c1bf78p+0.  */

>  	z = x*x;

>  	u = U0[0]+z*(U0[1]+z*(U0[2]+z*(U0[3]+z*U0[4])));

>  	v = one+z*(V0[0]+z*(V0[1]+z*(V0[2]+z*(V0[3]+z*V0[4]))));

> @@ -159,7 +612,7 @@ __ieee754_y1f(float x)

>  }

>  libm_alias_finite (__ieee754_y1f, __y1f)

>  

> -/* For x >= 8, the asymptotic expansions of pone is

> +/* For x >= 8, the asymptotic expansion of pone is

>   *	1 + 15/128 s^2 - 4725/2^15 s^4 - ...,	where s = 1/x.

>   * We approximate pone by

>   *	pone(x) = 1 + (R/S)

> @@ -252,8 +705,7 @@ ponef(float x)

>  	return one+ r/s;

>  }

>  

> -

> -/* For x >= 8, the asymptotic expansions of qone is

> +/* For x >= 8, the asymptotic expansion of qone is

>   *	3/8 s - 105/1024 s^3 - ..., where s = 1/x.

>   * We approximate pone by

>   *	qone(x) = s*(0.375 + (R/S))

> @@ -340,10 +792,10 @@ qonef(float x)

>  	GET_FLOAT_WORD(ix,x);

>  	ix &= 0x7fffffff;

>  	/* ix >= 0x40000000 for all calls to this function.  */

> -	if(ix>=0x40200000)     {p = qr8; q= qs8;}

> -	else if(ix>=0x40f71c58){p = qr5; q= qs5;}

> -	else if(ix>=0x4036db68){p = qr3; q= qs3;}

> -	else {p = qr2; q= qs2;}

> +	if(ix>=0x41000000)     {p = qr8; q= qs8;} /* x >= 8 */

> +	else if(ix>=0x40f71c58){p = qr5; q= qs5;} /* x >= 7.722209930e+00 */

> +	else if(ix>=0x4036db68){p = qr3; q= qs3;} /* x >= 2.857141495e+00 */

> +	else {p = qr2; q= qs2;}                   /* x >= 2 */


Double space before closing comment.

>  	z = one/(x*x);

>  	r = p[0]+z*(p[1]+z*(p[2]+z*(p[3]+z*(p[4]+z*p[5]))));

>  	s = one+z*(q[0]+z*(q[1]+z*(q[2]+z*(q[3]+z*(q[4]+z*q[5])))));

> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps

> index 173388b92a..3010e10c75 100644

> --- a/sysdeps/powerpc/fpu/libm-test-ulps

> +++ b/sysdeps/powerpc/fpu/libm-test-ulps

> @@ -1310,50 +1310,50 @@ float128: 1

>  ldouble: 3

>  

>  Function: "j0":

> -double: 2

> -float: 8

> +double: 3

> +float: 9

>  float128: 2

> -ldouble: 2

> +ldouble: 5

>  

>  Function: "j0_downward":

> -double: 2

> -float: 4

> +double: 6

> +float: 9

>  float128: 4

>  ldouble: 12

>  

>  Function: "j0_towardzero":

> -double: 5

> -float: 6

> +double: 7

> +float: 9

>  float128: 4

>  ldouble: 16

>  

>  Function: "j0_upward":

> -double: 4

> -float: 5

> +double: 9

> +float: 8

>  float128: 5

> -ldouble: 6

> +ldouble: 14

>  

>  Function: "j1":

> -double: 2

> -float: 8

> +double: 4

> +float: 9

>  float128: 4

> -ldouble: 3

> +ldouble: 6

>  

>  Function: "j1_downward":

>  double: 3

> -float: 5

> +float: 8

>  float128: 4

>  ldouble: 7

>  

>  Function: "j1_towardzero":

> -double: 3

> -float: 2

> +double: 4

> +float: 8

>  float128: 4

>  ldouble: 7

>  

>  Function: "j1_upward":

> -double: 3

> -float: 4

> +double: 9

> +float: 9

>  float128: 3

>  ldouble: 6

>  

> @@ -1706,49 +1706,49 @@ ldouble: 5

>  

>  Function: "y0":

>  double: 2

> -float: 6

> +float: 8

>  float128: 3

> -ldouble: 1

> +ldouble: 10

>  

>  Function: "y0_downward":

>  double: 3

> -float: 4

> +float: 8

>  float128: 4

>  ldouble: 10

>  

>  Function: "y0_towardzero":

>  double: 3

> -float: 3

> +float: 8

>  float128: 3

> -ldouble: 8

> +ldouble: 9

>  

>  Function: "y0_upward":

>  double: 2

> -float: 5

> +float: 8

>  float128: 3

>  ldouble: 9

>  

>  Function: "y1":

>  double: 3

> -float: 2

> +float: 9

>  float128: 2

>  ldouble: 2

>  

>  Function: "y1_downward":

> -double: 3

> -float: 2

> +double: 6

> +float: 8

>  float128: 4

> -ldouble: 7

> +ldouble: 11

>  

>  Function: "y1_towardzero":

>  double: 3

> -float: 2

> +float: 9

>  float128: 2

>  ldouble: 9

>  

>  Function: "y1_upward":

> -double: 5

> -float: 2

> +double: 6

> +float: 9

>  float128: 5

>  ldouble: 9

>  

> diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps

> index 91f2c4c80f..9f85f66d4c 100644

> --- a/sysdeps/s390/fpu/libm-test-ulps

> +++ b/sysdeps/s390/fpu/libm-test-ulps

> @@ -1062,44 +1062,44 @@ double: 1

>  ldouble: 1

>  

>  Function: "j0":

> -double: 2

> -float: 8

> +double: 3

> +float: 9

>  ldouble: 2

>  

>  Function: "j0_downward":

> -double: 2

> -float: 4

> -ldouble: 4

> +double: 6

> +float: 9

> +ldouble: 9

>  

>  Function: "j0_towardzero":

> -double: 5

> -float: 6

> -ldouble: 4

> +double: 7

> +float: 9

> +ldouble: 9

>  

>  Function: "j0_upward":

> -double: 4

> -float: 5

> -ldouble: 5

> +double: 9

> +float: 8

> +ldouble: 7

>  

>  Function: "j1":

> -double: 2

> -float: 8

> +double: 4

> +float: 9

>  ldouble: 4

>  

>  Function: "j1_downward":

>  double: 3

> -float: 5

> -ldouble: 4

> +float: 8

> +ldouble: 6

>  

>  Function: "j1_towardzero":

> -double: 3

> -float: 2

> -ldouble: 4

> +double: 4

> +float: 8

> +ldouble: 9

>  

>  Function: "j1_upward":

> -double: 3

> -float: 4

> -ldouble: 3

> +double: 9

> +float: 9

> +ldouble: 9

>  

>  Function: "jn":

>  double: 4

> @@ -1348,42 +1348,42 @@ ldouble: 4

>  

>  Function: "y0":

>  double: 2

> -float: 6

> +float: 8

>  ldouble: 3

>  

>  Function: "y0_downward":

>  double: 3

> -float: 4

> -ldouble: 4

> +float: 8

> +ldouble: 7

>  

>  Function: "y0_towardzero":

>  double: 3

> -float: 3

> +float: 8

>  ldouble: 3

>  

>  Function: "y0_upward":

>  double: 3

> -float: 5

> -ldouble: 3

> +float: 8

> +ldouble: 4

>  

>  Function: "y1":

>  double: 3

> -float: 2

> -ldouble: 2

> +float: 9

> +ldouble: 5

>  

>  Function: "y1_downward":

> -double: 3

> -float: 2

> -ldouble: 4

> +double: 6

> +float: 8

> +ldouble: 5

>  

>  Function: "y1_towardzero":

>  double: 3

> -float: 2

> +float: 9

>  ldouble: 2

>  

>  Function: "y1_upward":

>  double: 7

> -float: 2

> +float: 9

>  ldouble: 5

>  

>  Function: "yn":

> diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps

> index 3304dcf85e..15967c2102 100644

> --- a/sysdeps/sparc/fpu/libm-test-ulps

> +++ b/sysdeps/sparc/fpu/libm-test-ulps

> @@ -1064,43 +1064,43 @@ ldouble: 1

>  

>  Function: "j0":

>  double: 2

> -float: 8

> +float: 9

>  ldouble: 2

>  

>  Function: "j0_downward":

> -double: 2

> -float: 4

> -ldouble: 4

> +double: 5

> +float: 9

> +ldouble: 9

>  

>  Function: "j0_towardzero":

> -double: 4

> -float: 5

> -ldouble: 4

> +double: 6

> +float: 9

> +ldouble: 9

>  

>  Function: "j0_upward":

> -double: 4

> -float: 5

> -ldouble: 5

> +double: 9

> +float: 9

> +ldouble: 7

>  

>  Function: "j1":

> -double: 2

> +double: 4

>  float: 9

>  ldouble: 4

>  

>  Function: "j1_downward":

> -double: 3

> -float: 5

> -ldouble: 4

> +double: 5

> +float: 8

> +ldouble: 6

>  

>  Function: "j1_towardzero":

> -double: 3

> -float: 2

> -ldouble: 4

> +double: 4

> +float: 8

> +ldouble: 9

>  

>  Function: "j1_upward":

> -double: 3

> -float: 5

> -ldouble: 3

> +double: 9

> +float: 9

> +ldouble: 9

>  

>  Function: "jn":

>  double: 4

> @@ -1360,42 +1360,42 @@ ldouble: 4

>  

>  Function: "y0":

>  double: 3

> -float: 8

> +float: 9

>  ldouble: 3

>  

>  Function: "y0_downward":

>  double: 3

> -float: 6

> -ldouble: 4

> +float: 9

> +ldouble: 7

>  

>  Function: "y0_towardzero":

> -double: 3

> -float: 3

> +double: 4

> +float: 9

>  ldouble: 3

>  

>  Function: "y0_upward":

>  double: 3

> -float: 6

> -ldouble: 3

> +float: 9

> +ldouble: 4

>  

>  Function: "y1":

>  double: 3

> -float: 2

> -ldouble: 2

> +float: 9

> +ldouble: 5

>  

>  Function: "y1_downward":

> -double: 3

> -float: 2

> -ldouble: 4

> +double: 6

> +float: 9

> +ldouble: 5

>  

>  Function: "y1_towardzero":

>  double: 3

> -float: 2

> +float: 9

>  ldouble: 2

>  

>  Function: "y1_upward":

>  double: 7

> -float: 2

> +float: 9

>  ldouble: 5

>  

>  Function: "yn":

> diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps

> index bd1fa63702..0edb95e14c 100644

> --- a/sysdeps/x86_64/fpu/libm-test-ulps

> +++ b/sysdeps/x86_64/fpu/libm-test-ulps

> @@ -1317,50 +1317,50 @@ ldouble: 1

>  

>  Function: "j0":

>  double: 2

> -float: 8

> +float: 9

>  float128: 2

> -ldouble: 2

> +ldouble: 8

>  

>  Function: "j0_downward":

> -double: 2

> -float: 4

> -float128: 4

> +double: 5

> +float: 9

> +float128: 9

>  ldouble: 6

>  

>  Function: "j0_towardzero":

> -double: 5

> -float: 6

> -float128: 4

> +double: 6

> +float: 9

> +float128: 9

>  ldouble: 6

>  

>  Function: "j0_upward":

> -double: 4

> -float: 5

> -float128: 5

> +double: 9

> +float: 9

> +float128: 7

>  ldouble: 6

>  

>  Function: "j1":

> -double: 2

> +double: 4

>  float: 9

>  float128: 4

> -ldouble: 5

> +ldouble: 9

>  

>  Function: "j1_downward":

> -double: 3

> -float: 5

> -float128: 4

> -ldouble: 4

> +double: 6

> +float: 8

> +float128: 6

> +ldouble: 8

>  

>  Function: "j1_towardzero":

> -double: 3

> -float: 2

> -float128: 4

> +double: 4

> +float: 9

> +float128: 9

>  ldouble: 4

>  

>  Function: "j1_upward":

> -double: 3

> -float: 5

> -float128: 3

> +double: 9

> +float: 9

> +float128: 9

>  ldouble: 3

>  

>  Function: "jn":

> @@ -1753,27 +1753,27 @@ ldouble: 5

>  

>  Function: "y0":

>  double: 3

> -float: 8

> +float: 9

>  float128: 3

> -ldouble: 1

> +ldouble: 2

>  

>  Function: "y0_downward":

> -double: 3

> -float: 6

> -float128: 4

> -ldouble: 5

> +double: 4

> +float: 9

> +float128: 7

> +ldouble: 7

>  

>  Function: "y0_towardzero":

> -double: 3

> -float: 3

> +double: 4

> +float: 9

>  float128: 3

> -ldouble: 6

> +ldouble: 8

>  

>  Function: "y0_upward":

>  double: 3

> -float: 6

> -float128: 3

> -ldouble: 5

> +float: 9

> +float128: 4

> +ldouble: 7

>  

>  Function: "y1":

>  double: 6

> @@ -1782,14 +1782,14 @@ float128: 5

>  ldouble: 3

>  

>  Function: "y1_downward":

> -double: 3

> -float: 2

> +double: 6

> +float: 9

>  float128: 5

>  ldouble: 7

>  

>  Function: "y1_towardzero":

>  double: 4

> -float: 5

> +float: 9

>  float128: 6

>  ldouble: 5

>  

> 


Ok.
Stafford Horne via Libc-alpha March 30, 2021, 6:41 p.m. | #2
On 30/03/2021 15:18, Adhemerval Zanella wrote:
> 

> 

> On 30/03/2021 14:18, Paul Zimmermann wrote:

>> ---

>>  math/auto-libm-test-in             |  15 +

>>  math/auto-libm-test-out-j0         |  50 +++

>>  math/auto-libm-test-out-j1         |  50 +++

>>  math/auto-libm-test-out-y0         |  50 +++

>>  math/auto-libm-test-out-y1         |  75 +++++

>>  sysdeps/aarch64/libm-test-ulps     |  70 ++--

>>  sysdeps/ieee754/flt-32/e_j0f.c     | 515 ++++++++++++++++++++++++++---

>>  sysdeps/ieee754/flt-32/e_j1f.c     | 512 ++++++++++++++++++++++++++--

>>  sysdeps/powerpc/fpu/libm-test-ulps |  62 ++--

>>  sysdeps/s390/fpu/libm-test-ulps    |  68 ++--

>>  sysdeps/sparc/fpu/libm-test-ulps   |  68 ++--

>>  sysdeps/x86_64/fpu/libm-test-ulps  |  76 ++---

>>  12 files changed, 1369 insertions(+), 242 deletions(-)



>> diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c

>> index 5d29611eb7..b4a4ede0f1 100644

>> --- a/sysdeps/ieee754/flt-32/e_j0f.c

>> +++ b/sysdeps/ieee754/flt-32/e_j0f.c

>> @@ -16,7 +16,9 @@

>>  #include <math.h>

>>  #include <math-barriers.h>

>>  #include <math_private.h>

>> +#include <fenv_private.h>

>>  #include <libm-alias-finite.h>

>> +#include <reduce_aux.h>


You forgot to include the reduce_aux.h on this version.
Paul Zimmermann March 31, 2021, 5:57 a.m. | #3
Dear Adhemerval,

> Patch looks good, thanks for working on this.  Besides the minor style

> nits regarding comments, please open bug reports for the xfail entries

> added on auto-libm-test-in and add a comment to reference them.


thank you very much for your review. There was already a bug open for y0
(BZ #16492). I have created a new one for the three other functions
(j0, j1, y1), it is BZ #27670, and added a pointer in auto-libm-test-in
to those around the xfail entries. We can later split #27670 if needed,
but it is very likely that those functions will be addressed simultaneously.

I'll submit a new patch with these changes, the double spaces added, and
the missing reduce_aux.h.

Paul

Patch

diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in
index 4edaaa8ee1..9ccbd04567 100644
--- a/math/auto-libm-test-in
+++ b/math/auto-libm-test-in
@@ -5792,6 +5792,9 @@  j0 0x1p16382
 j0 0x1p16383
 # the next value generates larger error bounds on x86_64 (binary32)
 j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc
+# the next values yield the largest error (9 ulps) for binary32 (x86_64)
+j0 0x1.04c39cp+6
+j0 0x1.4b7066p+7
 # the next value exercises the flt-32 code path for x >= 2^127
 j0 0x8.2f4ecp+124
 
@@ -5827,6 +5830,9 @@  j1 0x1p-600
 j1 0x1p-10000
 # the next value generates larger error bounds on x86_64 (binary32)
 j1 0x3.ae4b2p+0 xfail-rounding:ibm128-libgcc
+# the next two values yield large errors in the binary32 format
+j1 0x1.2f28eap+7 xfail-rounding:binary64 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc
+j1 0x1.a1d20ap+6 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc
 j1 min
 j1 -min
 j1 min_subnorm
@@ -8275,6 +8281,11 @@  y0 0x1p-600
 y0 0x1p-10000
 # the next value generates larger error bounds on x86_64 (binary32)
 y0 0xd.3432bp-4
+# the next two values yield the largest error (9 ulps) for binary32 (x86_64)
+# the first one fails for binary64 and ldbl-96 because y0 is known to have
+# large errors (#16492)
+y0 0x1.33eaacp+5 xfail:binary64 xfail:intel96 xfail-rounding:ibm128-libgcc
+y0 0x1.a681cep-1 xfail-rounding:ibm128-libgcc
 y0 min
 y0 min_subnorm
 
@@ -8303,6 +8314,10 @@  y1 0x1p-100
 y1 0x1p-110
 y1 0x1p-600
 y1 0x1p-10000
+# the next three values yield the largest error in the binary32 format
+y1 0x1.065194p+7 xfail-rounding:binary64 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc
+y1 0x1.c1badep+0 xfail-rounding:ibm128-libgcc
+y1 0x1.c1bc2ep+0
 y1 min
 y1 min_subnorm
 
diff --git a/math/auto-libm-test-out-j0 b/math/auto-libm-test-out-j0
index cc1fea074e..85c4148de3 100644
--- a/math/auto-libm-test-out-j0
+++ b/math/auto-libm-test-out-j0
@@ -1359,6 +1359,56 @@  j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc
 = j0 tonearest ibm128 0x2.602774p+0 : 0x3.e83779fe19911fa806cee83ab7p-8 : inexact-ok
 = j0 towardzero ibm128 0x2.602774p+0 : 0x3.e83779fe19911fa806cee83ab7p-8 : xfail:ibm128-libgcc inexact-ok
 = j0 upward ibm128 0x2.602774p+0 : 0x3.e83779fe19911fa806cee83ab8p-8 : xfail:ibm128-libgcc inexact-ok
+j0 0x1.04c39cp+6
+= j0 downward binary32 0x4.130e7p+4 : -0x6.dd65d8p-16 : inexact-ok
+= j0 tonearest binary32 0x4.130e7p+4 : -0x6.dd65dp-16 : inexact-ok
+= j0 towardzero binary32 0x4.130e7p+4 : -0x6.dd65dp-16 : inexact-ok
+= j0 upward binary32 0x4.130e7p+4 : -0x6.dd65dp-16 : inexact-ok
+= j0 downward binary64 0x4.130e7p+4 : -0x6.dd65d0005a19p-16 : inexact-ok
+= j0 tonearest binary64 0x4.130e7p+4 : -0x6.dd65d0005a18cp-16 : inexact-ok
+= j0 towardzero binary64 0x4.130e7p+4 : -0x6.dd65d0005a18cp-16 : inexact-ok
+= j0 upward binary64 0x4.130e7p+4 : -0x6.dd65d0005a18cp-16 : inexact-ok
+= j0 downward intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok
+= j0 tonearest intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok
+= j0 towardzero intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok
+= j0 upward intel96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok
+= j0 downward m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok
+= j0 tonearest m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c04p-16 : inexact-ok
+= j0 towardzero m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok
+= j0 upward m68k96 0x4.130e7p+4 : -0x6.dd65d0005a18c038p-16 : inexact-ok
+= j0 downward binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb64p-16 : inexact-ok
+= j0 tonearest binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb6p-16 : inexact-ok
+= j0 towardzero binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb6p-16 : inexact-ok
+= j0 upward binary128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bb6p-16 : inexact-ok
+= j0 downward ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bcp-16 : inexact-ok
+= j0 tonearest ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bcp-16 : inexact-ok
+= j0 towardzero ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bap-16 : inexact-ok
+= j0 upward ibm128 0x4.130e7p+4 : -0x6.dd65d0005a18c03fc1c61141bap-16 : inexact-ok
+j0 0x1.4b7066p+7
+= j0 downward binary32 0xa.5b833p+4 : 0xf.80edp-20 : inexact-ok
+= j0 tonearest binary32 0xa.5b833p+4 : 0xf.80edp-20 : inexact-ok
+= j0 towardzero binary32 0xa.5b833p+4 : 0xf.80edp-20 : inexact-ok
+= j0 upward binary32 0xa.5b833p+4 : 0xf.80ed1p-20 : inexact-ok
+= j0 downward binary64 0xa.5b833p+4 : 0xf.80ed0055a8e58p-20 : inexact-ok
+= j0 tonearest binary64 0xa.5b833p+4 : 0xf.80ed0055a8e6p-20 : inexact-ok
+= j0 towardzero binary64 0xa.5b833p+4 : 0xf.80ed0055a8e58p-20 : inexact-ok
+= j0 upward binary64 0xa.5b833p+4 : 0xf.80ed0055a8e6p-20 : inexact-ok
+= j0 downward intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok
+= j0 tonearest intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok
+= j0 towardzero intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok
+= j0 upward intel96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok
+= j0 downward m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok
+= j0 tonearest m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok
+= j0 towardzero m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c88p-20 : inexact-ok
+= j0 upward m68k96 0xa.5b833p+4 : 0xf.80ed0055a8e5c89p-20 : inexact-ok
+= j0 downward binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12ap-20 : inexact-ok
+= j0 tonearest binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12a8p-20 : inexact-ok
+= j0 towardzero binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12ap-20 : inexact-ok
+= j0 upward binary128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe12a8p-20 : inexact-ok
+= j0 downward ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe1p-20 : inexact-ok
+= j0 tonearest ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe14p-20 : inexact-ok
+= j0 towardzero ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe1p-20 : inexact-ok
+= j0 upward ibm128 0xa.5b833p+4 : 0xf.80ed0055a8e5c88af9c0edfe14p-20 : inexact-ok
 j0 0x8.2f4ecp+124
 = j0 downward binary32 0x8.2f4ecp+124 : 0xd.331efp-68 : inexact-ok
 = j0 tonearest binary32 0x8.2f4ecp+124 : 0xd.331efp-68 : inexact-ok
diff --git a/math/auto-libm-test-out-j1 b/math/auto-libm-test-out-j1
index 6bc3bbe5a4..f808801976 100644
--- a/math/auto-libm-test-out-j1
+++ b/math/auto-libm-test-out-j1
@@ -993,6 +993,56 @@  j1 0x3.ae4b2p+0 xfail-rounding:ibm128-libgcc
 = j1 tonearest ibm128 0x3.ae4b2p+0 : 0xf.d085c66e86f30267f22d6f787cp-8 : inexact-ok
 = j1 towardzero ibm128 0x3.ae4b2p+0 : 0xf.d085c66e86f30267f22d6f787cp-8 : xfail:ibm128-libgcc inexact-ok
 = j1 upward ibm128 0x3.ae4b2p+0 : 0xf.d085c66e86f30267f22d6f788p-8 : xfail:ibm128-libgcc inexact-ok
+j1 0x1.2f28eap+7 xfail-rounding:binary64 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc
+= j1 downward binary32 0x9.79475p+4 : 0x2.49a6fp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest binary32 0x9.79475p+4 : 0x2.49a6f4p-16 : inexact-ok
+= j1 towardzero binary32 0x9.79475p+4 : 0x2.49a6fp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward binary32 0x9.79475p+4 : 0x2.49a6f4p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward binary64 0x9.79475p+4 : 0x2.49a6f3fb00346p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest binary64 0x9.79475p+4 : 0x2.49a6f3fb00346p-16 : inexact-ok
+= j1 towardzero binary64 0x9.79475p+4 : 0x2.49a6f3fb00346p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward binary64 0x9.79475p+4 : 0x2.49a6f3fb00348p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward intel96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest intel96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : inexact-ok
+= j1 towardzero intel96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward intel96 0x9.79475p+4 : 0x2.49a6f3fb003462a4p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : inexact-ok
+= j1 towardzero m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward m68k96 0x9.79475p+4 : 0x2.49a6f3fb003462a4p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05cp-16 : inexact-ok
+= j1 towardzero binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05ap-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward binary128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec05cp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cecp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cecp-16 : inexact-ok
+= j1 towardzero ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cecp-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward ibm128 0x9.79475p+4 : 0x2.49a6f3fb003462a1f15135cec1p-16 : xfail:binary64 xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+j1 0x1.a1d20ap+6 xfail-rounding:binary128 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc
+= j1 downward binary32 0x6.874828p+4 : -0x3.d6f0b8p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest binary32 0x6.874828p+4 : -0x3.d6f0b4p-16 : inexact-ok
+= j1 towardzero binary32 0x6.874828p+4 : -0x3.d6f0b4p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward binary32 0x6.874828p+4 : -0x3.d6f0b4p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward binary64 0x6.874828p+4 : -0x3.d6f0b408112dp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest binary64 0x6.874828p+4 : -0x3.d6f0b408112dp-16 : inexact-ok
+= j1 towardzero binary64 0x6.874828p+4 : -0x3.d6f0b408112cep-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward binary64 0x6.874828p+4 : -0x3.d6f0b408112cep-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3dp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : inexact-ok
+= j1 towardzero intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward intel96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3dp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : inexact-ok
+= j1 towardzero m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward m68k96 0x6.874828p+4 : -0x3.d6f0b408112cf3ccp-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7088p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7086p-16 : inexact-ok
+= j1 towardzero binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7086p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward binary128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7086p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 downward ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca71p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 tonearest ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca71p-16 : inexact-ok
+= j1 towardzero ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= j1 upward ibm128 0x6.874828p+4 : -0x3.d6f0b408112cf3cdb53ca3ca7p-16 : xfail:binary128 xfail:intel96 xfail:ibm128-libgcc inexact-ok
 j1 min
 = j1 downward binary32 0x4p-128 : 0x1.fffff8p-128 : inexact-ok underflow errno-erange-ok
 = j1 tonearest binary32 0x4p-128 : 0x2p-128 : inexact-ok underflow errno-erange-ok
diff --git a/math/auto-libm-test-out-y0 b/math/auto-libm-test-out-y0
index 8ebb585170..f9742b0f69 100644
--- a/math/auto-libm-test-out-y0
+++ b/math/auto-libm-test-out-y0
@@ -820,6 +820,56 @@  y0 0xd.3432bp-4
 = y0 tonearest ibm128 0xd.3432bp-4 : -0xf.fdd871793bc71f92d6b137b44p-8 : inexact-ok
 = y0 towardzero ibm128 0xd.3432bp-4 : -0xf.fdd871793bc71f92d6b137b44p-8 : inexact-ok
 = y0 upward ibm128 0xd.3432bp-4 : -0xf.fdd871793bc71f92d6b137b44p-8 : inexact-ok
+y0 0x1.33eaacp+5 xfail:binary64 xfail:intel96 xfail-rounding:ibm128-libgcc
+= y0 downward binary32 0x2.67d558p+4 : 0xf.6b0ep-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 tonearest binary32 0x2.67d558p+4 : 0xf.6b0ep-16 : xfail:binary64 xfail:intel96 inexact-ok
+= y0 towardzero binary32 0x2.67d558p+4 : 0xf.6b0ep-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 upward binary32 0x2.67d558p+4 : 0xf.6b0e1p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 downward binary64 0x2.67d558p+4 : 0xf.6b0e005e95ad8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 tonearest binary64 0x2.67d558p+4 : 0xf.6b0e005e95ad8p-16 : xfail:binary64 xfail:intel96 inexact-ok
+= y0 towardzero binary64 0x2.67d558p+4 : 0xf.6b0e005e95ad8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 upward binary64 0x2.67d558p+4 : 0xf.6b0e005e95aep-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 downward intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 tonearest intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 inexact-ok
+= y0 towardzero intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 upward intel96 0x2.67d558p+4 : 0xf.6b0e005e95ad82bp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 downward m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 tonearest m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 inexact-ok
+= y0 towardzero m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82ap-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 upward m68k96 0x2.67d558p+4 : 0xf.6b0e005e95ad82bp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 downward binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b86p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 tonearest binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b86p-16 : xfail:binary64 xfail:intel96 inexact-ok
+= y0 towardzero binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b86p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 upward binary128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b868p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 downward ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 tonearest ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b8p-16 : xfail:binary64 xfail:intel96 inexact-ok
+= y0 towardzero ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6b8p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y0 upward ibm128 0x2.67d558p+4 : 0xf.6b0e005e95ad82a5093316a6bcp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+y0 0x1.a681cep-1 xfail-rounding:ibm128-libgcc
+= y0 downward binary32 0xd.340e7p-4 : -0xf.ffff8p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 tonearest binary32 0xd.340e7p-4 : -0xf.ffff8p-8 : inexact-ok
+= y0 towardzero binary32 0xd.340e7p-4 : -0xf.ffff7p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 upward binary32 0xd.340e7p-4 : -0xf.ffff7p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 downward binary64 0xd.340e7p-4 : -0xf.ffff7f7e5aba8p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 tonearest binary64 0xd.340e7p-4 : -0xf.ffff7f7e5aba8p-8 : inexact-ok
+= y0 towardzero binary64 0xd.340e7p-4 : -0xf.ffff7f7e5abap-8 : xfail:ibm128-libgcc inexact-ok
+= y0 upward binary64 0xd.340e7p-4 : -0xf.ffff7f7e5abap-8 : xfail:ibm128-libgcc inexact-ok
+= y0 downward intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 tonearest intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : inexact-ok
+= y0 towardzero intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 upward intel96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 downward m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 tonearest m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba736p-8 : inexact-ok
+= y0 towardzero m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 upward m68k96 0xd.340e7p-4 : -0xf.ffff7f7e5aba735p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 downward binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25de8p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 tonearest binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25dep-8 : inexact-ok
+= y0 towardzero binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25dep-8 : xfail:ibm128-libgcc inexact-ok
+= y0 upward binary128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25dep-8 : xfail:ibm128-libgcc inexact-ok
+= y0 downward ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b26p-8 : xfail:ibm128-libgcc inexact-ok
+= y0 tonearest ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25cp-8 : inexact-ok
+= y0 towardzero ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25cp-8 : xfail:ibm128-libgcc inexact-ok
+= y0 upward ibm128 0xd.340e7p-4 : -0xf.ffff7f7e5aba735ccb0b13b25cp-8 : xfail:ibm128-libgcc inexact-ok
 y0 min
 = y0 downward binary32 0x4p-128 : -0x3.7ac89cp+4 : inexact-ok
 = y0 tonearest binary32 0x4p-128 : -0x3.7ac89cp+4 : inexact-ok
diff --git a/math/auto-libm-test-out-y1 b/math/auto-libm-test-out-y1
index af68e6c05a..ab920d1542 100644
--- a/math/auto-libm-test-out-y1
+++ b/math/auto-libm-test-out-y1
@@ -795,6 +795,81 @@  y1 0x1p-10000
 = y1 tonearest binary128 0x1p-10000 : -0xa.2f9836e4e441529fc2757d1f535p+9996 : inexact-ok
 = y1 towardzero binary128 0x1p-10000 : -0xa.2f9836e4e441529fc2757d1f5348p+9996 : inexact-ok
 = y1 upward binary128 0x1p-10000 : -0xa.2f9836e4e441529fc2757d1f5348p+9996 : inexact-ok
+y1 0x1.065194p+7 xfail-rounding:binary64 xfail-rounding:intel96 xfail-rounding:ibm128-libgcc
+= y1 downward binary32 0x8.328cap+4 : -0x3.2fe874p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 tonearest binary32 0x8.328cap+4 : -0x3.2fe87p-16 : inexact-ok
+= y1 towardzero binary32 0x8.328cap+4 : -0x3.2fe87p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 upward binary32 0x8.328cap+4 : -0x3.2fe87p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 downward binary64 0x8.328cap+4 : -0x3.2fe87001a1df4p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 tonearest binary64 0x8.328cap+4 : -0x3.2fe87001a1df4p-16 : inexact-ok
+= y1 towardzero binary64 0x8.328cap+4 : -0x3.2fe87001a1df2p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 upward binary64 0x8.328cap+4 : -0x3.2fe87001a1df2p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 downward intel96 0x8.328cap+4 : -0x3.2fe87001a1df3754p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 tonearest intel96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : inexact-ok
+= y1 towardzero intel96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 upward intel96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 downward m68k96 0x8.328cap+4 : -0x3.2fe87001a1df3754p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 tonearest m68k96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : inexact-ok
+= y1 towardzero m68k96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 upward m68k96 0x8.328cap+4 : -0x3.2fe87001a1df375p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 downward binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecb2p-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 tonearest binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecb2p-16 : inexact-ok
+= y1 towardzero binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecbp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 upward binary128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecbp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 downward ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aedp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 tonearest ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aedp-16 : inexact-ok
+= y1 towardzero ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+= y1 upward ibm128 0x8.328cap+4 : -0x3.2fe87001a1df375068d0356aecp-16 : xfail:binary64 xfail:intel96 xfail:ibm128-libgcc inexact-ok
+y1 0x1.c1badep+0 xfail-rounding:ibm128-libgcc
+= y1 downward binary32 0x1.c1badep+0 : -0x3.ff6378p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 tonearest binary32 0x1.c1badep+0 : -0x3.ff6374p-4 : inexact-ok
+= y1 towardzero binary32 0x1.c1badep+0 : -0x3.ff6374p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 upward binary32 0x1.c1badep+0 : -0x3.ff6374p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 downward binary64 0x1.c1badep+0 : -0x3.ff6375c9f4404p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 tonearest binary64 0x1.c1badep+0 : -0x3.ff6375c9f4402p-4 : inexact-ok
+= y1 towardzero binary64 0x1.c1badep+0 : -0x3.ff6375c9f4402p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 upward binary64 0x1.c1badep+0 : -0x3.ff6375c9f4402p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 downward intel96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : xfail:ibm128-libgcc inexact-ok
+= y1 tonearest intel96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : inexact-ok
+= y1 towardzero intel96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 upward intel96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 downward m68k96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : xfail:ibm128-libgcc inexact-ok
+= y1 tonearest m68k96 0x1.c1badep+0 : -0x3.ff6375c9f440230cp-4 : inexact-ok
+= y1 towardzero m68k96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 upward m68k96 0x1.c1badep+0 : -0x3.ff6375c9f4402308p-4 : xfail:ibm128-libgcc inexact-ok
+= y1 downward binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ebp-4 : xfail:ibm128-libgcc inexact-ok
+= y1 tonearest binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ebp-4 : inexact-ok
+= y1 towardzero binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1eaep-4 : xfail:ibm128-libgcc inexact-ok
+= y1 upward binary128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1eaep-4 : xfail:ibm128-libgcc inexact-ok
+= y1 downward ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1fp-4 : xfail:ibm128-libgcc inexact-ok
+= y1 tonearest ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1fp-4 : inexact-ok
+= y1 towardzero ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ep-4 : xfail:ibm128-libgcc inexact-ok
+= y1 upward ibm128 0x1.c1badep+0 : -0x3.ff6375c9f440230a8962842c1ep-4 : xfail:ibm128-libgcc inexact-ok
+y1 0x1.c1bc2ep+0
+= y1 downward binary32 0x1.c1bc2ep+0 : -0x3.ff56acp-4 : inexact-ok
+= y1 tonearest binary32 0x1.c1bc2ep+0 : -0x3.ff56a8p-4 : inexact-ok
+= y1 towardzero binary32 0x1.c1bc2ep+0 : -0x3.ff56a8p-4 : inexact-ok
+= y1 upward binary32 0x1.c1bc2ep+0 : -0x3.ff56a8p-4 : inexact-ok
+= y1 downward binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca78ap-4 : inexact-ok
+= y1 tonearest binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca788p-4 : inexact-ok
+= y1 towardzero binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca788p-4 : inexact-ok
+= y1 upward binary64 0x1.c1bc2ep+0 : -0x3.ff56a991ca788p-4 : inexact-ok
+= y1 downward intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd8p-4 : inexact-ok
+= y1 tonearest intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok
+= y1 towardzero intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok
+= y1 upward intel96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok
+= y1 downward m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd8p-4 : inexact-ok
+= y1 tonearest m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok
+= y1 towardzero m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok
+= y1 upward m68k96 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd4p-4 : inexact-ok
+= y1 downward binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd5084138201442p-4 : inexact-ok
+= y1 tonearest binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd5084138201442p-4 : inexact-ok
+= y1 towardzero binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd508413820144p-4 : inexact-ok
+= y1 upward binary128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd508413820144p-4 : inexact-ok
+= y1 downward ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382015p-4 : inexact-ok
+= y1 tonearest ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382014p-4 : inexact-ok
+= y1 towardzero ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382014p-4 : inexact-ok
+= y1 upward ibm128 0x1.c1bc2ep+0 : -0x3.ff56a991ca788cd50841382014p-4 : inexact-ok
 y1 min
 = y1 downward binary32 0x4p-128 : -0x2.8be61p+124 : inexact-ok
 = y1 tonearest binary32 0x4p-128 : -0x2.8be60cp+124 : inexact-ok
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 21ff7dc5ce..a0329dea63 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1066,44 +1066,44 @@  double: 1
 ldouble: 1
 
 Function: "j0":
-double: 2
-float: 8
+double: 3
+float: 9
 ldouble: 2
 
 Function: "j0_downward":
-double: 2
-float: 4
-ldouble: 4
+double: 6
+float: 9
+ldouble: 9
 
 Function: "j0_towardzero":
-double: 5
-float: 6
-ldouble: 4
+double: 7
+float: 9
+ldouble: 9
 
 Function: "j0_upward":
-double: 4
-float: 5
-ldouble: 5
+double: 9
+float: 8
+ldouble: 7
 
 Function: "j1":
-double: 2
-float: 8
+double: 4
+float: 9
 ldouble: 4
 
 Function: "j1_downward":
 double: 3
-float: 5
-ldouble: 4
+float: 8
+ldouble: 6
 
 Function: "j1_towardzero":
-double: 3
-float: 2
-ldouble: 4
+double: 4
+float: 8
+ldouble: 9
 
 Function: "j1_upward":
-double: 3
-float: 4
-ldouble: 3
+double: 9
+float: 9
+ldouble: 9
 
 Function: "jn":
 double: 4
@@ -1364,42 +1364,42 @@  ldouble: 4
 
 Function: "y0":
 double: 2
-float: 6
+float: 8
 ldouble: 3
 
 Function: "y0_downward":
 double: 3
-float: 4
-ldouble: 4
+float: 8
+ldouble: 7
 
 Function: "y0_towardzero":
 double: 3
-float: 3
+float: 8
 ldouble: 3
 
 Function: "y0_upward":
 double: 2
-float: 5
-ldouble: 3
+float: 8
+ldouble: 4
 
 Function: "y1":
 double: 3
-float: 2
-ldouble: 2
+float: 9
+ldouble: 5
 
 Function: "y1_downward":
-double: 3
-float: 2
-ldouble: 4
+double: 6
+float: 8
+ldouble: 5
 
 Function: "y1_towardzero":
 double: 3
-float: 2
+float: 9
 ldouble: 2
 
 Function: "y1_upward":
-double: 5
-float: 2
+double: 6
+float: 9
 ldouble: 5
 
 Function: "yn":
diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c
index 5d29611eb7..b4a4ede0f1 100644
--- a/sysdeps/ieee754/flt-32/e_j0f.c
+++ b/sysdeps/ieee754/flt-32/e_j0f.c
@@ -16,7 +16,9 @@ 
 #include <math.h>
 #include <math-barriers.h>
 #include <math_private.h>
+#include <fenv_private.h>
 #include <libm-alias-finite.h>
+#include <reduce_aux.h>
 
 static float pzerof(float), qzerof(float);
 
@@ -37,6 +39,218 @@  S04  =  1.1661400734e-09; /* 0x30a045e8 */
 
 static const float zero = 0.0;
 
+/* This is the nearest approximation of the first zero of j0.  */
+#define FIRST_ZERO_J0 0xf.26247p-28f
+
+#define SMALL_SIZE 64
+
+/* The following table contains successive zeros of j0 and degree-3
+   polynomial approximations of j0 around these zeros: Pj[0] for the first
+   zero (2.40482), Pj[1] for the second one (5.520078), and so on.
+   Each line contains:
+              {x0, xmid, x1, p0, p1, p2, p3}
+   where [x0,x1] is the interval around the zero, xmid is the binary32 number
+   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation
+   polynomial.  Each polynomial was generated using Sollya on the interval
+   [x0,x1] around the corresponding zero where the error exceeds 9 ulps
+   for the alternate code.  Degree 3 is enough to get an error <= 9 ulps.
+*/
+static const float Pj[SMALL_SIZE][7] = {
+  /* The following polynomial was optimized by hand with respect to the one
+     generated by Sollya, to ensure the maximal error is at most 9 ulps,
+     both if the polynomial is evaluated with fma or not.  */
+  { 0x1.31e5c4p+1, 0x1.33d152p+1, 0x1.3b58dep+1, 0xf.2623fp-28,
+    -0x8.4e6d7p-4, 0x1.ba2aaap-4, 0xe.4b9ap-8 }, /* 0 */
+  { 0x1.60eafap+2, 0x1.6148f6p+2, 0x1.62955cp+2, 0x6.9205fp-28,
+    0x5.71b98p-4, -0x7.e3e798p-8, -0xd.87d1p-8 }, /* 1 */
+  { 0x1.14cde2p+3, 0x1.14eb56p+3, 0x1.1525c6p+3, 0x1.bcc1cap-24,
+    -0x4.57de6p-4, 0x4.03e7cp-8, 0xb.39a37p-8 }, /* 2 */
+  { 0x1.7931d8p+3, 0x1.79544p+3, 0x1.7998d6p+3, -0xf.2976fp-32,
+    0x3.b827ccp-4, -0x2.8603ep-8, -0x9.bf49bp-8 }, /* 3 */
+  { 0x1.ddb6d4p+3, 0x1.ddca14p+3, 0x1.ddf0c8p+3, -0x1.bd67d8p-28,
+    -0x3.4e03ap-4, 0x1.c562a2p-8, 0x8.90ec2p-8 }, /* 4 */
+  { 0x1.2118e4p+4, 0x1.212314p+4, 0x1.21375p+4, 0x1.62209cp-28,
+    0x3.00efecp-4, -0x1.5458dap-8, -0x8.10063p-8 }, /* 5 */
+  { 0x1.535d28p+4, 0x1.5362dep+4, 0x1.536e48p+4, -0x2.853f74p-24,
+    -0x2.c5b274p-4, 0x1.0b9db4p-8, 0x7.8c3578p-8 }, /* 6 */
+  { 0x1.859ddp+4, 0x1.85a3bap+4, 0x1.85aff4p+4, 0x2.19ed1cp-24,
+    0x2.96545cp-4, -0xd.997e6p-12, -0x6.d9af28p-8 }, /* 7 */
+  { 0x1.b7decap+4, 0x1.b7e54ap+4, 0x1.b7f038p+4, 0xe.959aep-28,
+    -0x2.6f5594p-4, 0xb.538dp-12, 0x7.003ea8p-8 }, /* 8 */
+  { 0x1.ea21c6p+4, 0x1.ea275ap+4, 0x1.ea337ap+4, 0x2.0c3964p-24,
+    0x2.4e80fcp-4, -0x9.a2216p-12, -0x6.61e0a8p-8 }, /* 9 */
+  { 0x1.0e3316p+5, 0x1.0e34e2p+5, 0x1.0e379ap+5, -0x3.642554p-24,
+    -0x2.325e48p-4, 0x8.4f49cp-12, 0x7.d37c3p-8 }, /* 10 */
+  { 0x1.275456p+5, 0x1.275638p+5, 0x1.2759e2p+5, 0x1.6c015ap-24,
+    0x2.19e7d8p-4, -0x7.4c1bf8p-12, -0x4.af7ef8p-8 }, /* 11 */
+  { 0x1.4075ecp+5, 0x1.4077a8p+5, 0x1.407b96p+5, -0x4.b18c9p-28,
+    -0x2.046174p-4, 0x6.705618p-12, 0x5.f2d28p-8 }, /* 12 */
+  { 0x1.59973p+5, 0x1.59992cp+5, 0x1.599b2ap+5, -0x1.8b8792p-24,
+    0x1.f13fbp-4, -0x5.c14938p-12, -0x5.73e0cp-8 }, /* 13 */
+  { 0x1.72b958p+5, 0x1.72bacp+5, 0x1.72bc5ap+5, 0x3.a26e0cp-24,
+    -0x1.e018dap-4, 0x5.30e8dp-12, 0x2.81099p-8 }, /* 14 */
+  { 0x1.8bdb4ap+5, 0x1.8bdc62p+5, 0x1.8bde7ep+5, -0x2.18fabcp-24,
+    0x1.d09b22p-4, -0x4.b0b688p-12, -0x5.5fd308p-8 }, /* 15 */
+  { 0x1.a4fcecp+5, 0x1.a4fe0ep+5, 0x1.a50042p+5, 0x3.2370e8p-24,
+    -0x1.c28614p-4, 0x4.4647e8p-12, 0x5.68a28p-8 }, /* 16 */
+  { 0x1.be1ebcp+5, 0x1.be1fc4p+5, 0x1.be21fp+5, -0x5.9eae3p-28,
+    0x1.b5a622p-4, -0x3.eb9054p-12, -0x5.12d8cp-8 }, /* 17 */
+  { 0x1.d7405p+5, 0x1.d7418p+5, 0x1.d74294p+5, 0x2.9fa1e8p-24,
+    -0x1.a9d184p-4, 0x3.9d1e7p-12, 0x4.33d058p-8 }, /* 18 */
+  { 0x1.f0624p+5, 0x1.f06344p+5, 0x1.f0645ep+5, 0x9.9ac67p-28,
+    0x1.9ee5eep-4, -0x3.5816e8p-12, -0x2.6e5004p-8 }, /* 19 */
+  { 0x1.04c22ep+6, 0x1.04c286p+6, 0x1.04c316p+6, 0xd.6ab94p-28,
+    -0x1.94c6f6p-4, 0x3.174efcp-12, 0x7.9a092p-8 }, /* 20 */
+  { 0x1.1153p+6, 0x1.11536cp+6, 0x1.11541p+6, -0x4.4cb2d8p-24,
+    0x1.8b5cccp-4, -0x2.e3c238p-12, -0x4.e5437p-8 }, /* 21 */
+  { 0x1.1de3d8p+6, 0x1.1de456p+6, 0x1.1de4dap+6, -0x4.4aa8c8p-24,
+    -0x1.829356p-4, 0x2.b45124p-12, 0x5.baf638p-8 }, /* 22 */
+  { 0x1.2a74f8p+6, 0x1.2a754p+6, 0x1.2a75bp+6, 0x2.077c38p-24,
+    0x1.7a597ep-4, -0x2.8a0414p-12, -0x2.838d3p-8 }, /* 23 */
+  { 0x1.3705d4p+6, 0x1.37062cp+6, 0x1.3706b2p+6, -0x2.6a6cd8p-24,
+    -0x1.72a09ap-4, 0x2.623a3cp-12, 0x5.5256a8p-8 }, /* 24 */
+  { 0x1.4396dp+6, 0x1.439718p+6, 0x1.43976ep+6, -0x5.08287p-24,
+    0x1.6b5c06p-4, -0x2.3da154p-12, -0x7.a2254p-8 }, /* 25 */
+  { 0x1.5027acp+6, 0x1.502808p+6, 0x1.50288cp+6, -0x3.4598dcp-24,
+    -0x1.6480c4p-4, 0x2.1cb944p-12, 0x7.27c77p-8 }, /* 26 */
+  { 0x1.5cb89ap+6, 0x1.5cb8f8p+6, 0x1.5cb97ep+6, 0x5.4e74bp-24,
+    0x1.5e0544p-4, -0x2.00b158p-12, -0x5.9bc4a8p-8 }, /* 27 */
+  { 0x1.69498cp+6, 0x1.6949e8p+6, 0x1.694a42p+6, -0x2.05751cp-24,
+    -0x1.57e12p-4, 0x1.e78edcp-12, 0x9.9667dp-8 }, /* 28 */
+  { 0x1.75da7ep+6, 0x1.75dadap+6, 0x1.75db3p+6, 0x4.c5e278p-24,
+    0x1.520ceep-4, -0x1.d0127cp-12, -0xd.62681p-8 }, /* 29 */
+  { 0x1.826b7ep+6, 0x1.826bccp+6, 0x1.826c2cp+6, -0x3.50e62cp-24,
+    -0x1.4c822p-4, 0x1.ba5832p-12, -0x1.eb2ee2p-8 }, /* 30 */
+  { 0x1.8efc84p+6, 0x1.8efcbep+6, 0x1.8efd16p+6, -0x1.c39f38p-24,
+    0x1.473ae6p-4, -0x1.a616c8p-12, 0xf.f352ap-12 }, /* 31 */
+  { 0x1.9b8d84p+6, 0x1.9b8db2p+6, 0x1.9b8e7p+6, -0x1.9245b6p-28,
+    -0x1.42320ap-4, 0x1.932a04p-12, 0x2.dc113cp-8 }, /* 32 */
+  { 0x1.a81e72p+6, 0x1.a81ea6p+6, 0x1.a81f04p+6, -0x1.0acf8p-24,
+    0x1.3d62e6p-4, -0x1.7c4b14p-12, -0x1.cfc5c2p-4 }, /* 33 */
+  { 0x1.b4af6ap+6, 0x1.b4af9ap+6, 0x1.b4afeep+6, 0x4.cd92d8p-24,
+    -0x1.38c94ap-4, 0x1.643154p-12, 0x1.4c2a06p-4 }, /* 34 */
+  { 0x1.c1406p+6, 0x1.c1409p+6, 0x1.c140cp+6, -0x1.37bf8ap-24,
+    0x1.34617p-4, -0x1.5f504ap-12, -0x1.e2d324p-4 }, /* 35 */
+  { 0x1.cdd154p+6, 0x1.cdd186p+6, 0x1.cdd1eap+6, -0x1.8f62dep-28,
+    -0x1.3027fp-4, 0x1.534a02p-12, 0x2.c7f144p-12 }, /* 36 */
+  { 0x1.da6248p+6, 0x1.da627cp+6, 0x1.da62e6p+6, -0x9.81e79p-28,
+    0x1.2c19b4p-4, -0x1.4b8288p-12, 0x7.2d8bap-8 }, /* 37 */
+  { 0x1.e6f33ep+6, 0x1.e6f372p+6, 0x1.e6f3a8p+6, 0x3.103b3p-24,
+    -0x1.2833eep-4, 0x1.36f4d2p-12, 0x9.29f91p-8 }, /* 38 */
+  { 0x1.f38434p+6, 0x1.f3846ap+6, 0x1.f384d8p+6, 0x2.07b058p-24,
+    0x1.24740ap-4, -0x1.2ee58ap-12, 0xd.f1393p-12 }, /* 39 */
+  { 0x1.000a98p+7, 0x1.000abp+7, 0x1.000ac8p+7, 0x3.87576cp-24,
+    -0x1.20d7b6p-4, 0x1.2083e2p-12, 0x3.9a7aap-8 }, /* 40 */
+  { 0x1.06531p+7, 0x1.06532cp+7, 0x1.065348p+7, -0x1.691ecp-24,
+    0x1.1d5ccap-4, -0x1.166726p-12, -0x1.e4af48p-8 }, /* 41 */
+  { 0x1.0c9b9ap+7, 0x1.0c9ba8p+7, 0x1.0c9bbep+7, 0x9.b406dp-28,
+    -0x1.1a015p-4, 0x1.038f9cp-12, -0x4.021058p-4 }, /* 42 */
+  { 0x1.12e412p+7, 0x1.12e424p+7, 0x1.12e436p+7, -0xf.bfd8fp-28,
+    0x1.16c37ap-4, -0x1.039edep-12, 0x1.f0033p-4 }, /* 43 */
+  { 0x1.192c92p+7, 0x1.192cap+7, 0x1.192cb6p+7, 0x2.6d50c8p-24,
+    -0x1.13a19ep-4, 0xf.9df8ap-16, 0x4.ecd978p-8 }, /* 44 */
+  { 0x1.1f7512p+7, 0x1.1f751cp+7, 0x1.1f753ap+7, -0x4.d475c8p-24,
+    0x1.109a32p-4, -0x1.04fb3ap-12, -0xd.c271p-12 }, /* 45 */
+  { 0x1.25bd8ep+7, 0x1.25bd98p+7, 0x1.25bdap+7, 0x8.1982p-24,
+    -0x1.0dabc8p-4, 0xe.88eabp-16, -0x4.ed75dp-4 }, /* 46 */
+  { 0x1.2c060cp+7, 0x1.2c0616p+7, 0x1.2c0644p+7, 0x4.864518p-24,
+    0x1.0ad51p-4, -0xe.27196p-16, 0xb.97a3ep-8 }, /* 47 */
+  { 0x1.324e86p+7, 0x1.324e92p+7, 0x1.324e9ep+7, 0x6.8917a8p-28,
+    -0x1.0814d4p-4, 0xd.4fe7ep-16, -0x6.8d8d6p-4 }, /* 48 */
+  { 0x1.389702p+7, 0x1.38970ep+7, 0x1.389728p+7, -0x5.fa18fp-24,
+    0x1.0569fp-4, -0xd.5b0d4p-16, 0x1.50353ap-4 }, /* 49 */
+  { 0x1.3edf84p+7, 0x1.3edf8cp+7, 0x1.3edfaap+7, -0x4.0e5c98p-24,
+    -0x1.02d354p-4, 0xb.7b255p-16, 0x7.8a916p-4 }, /* 50 */
+  { 0x1.4527fp+7, 0x1.452808p+7, 0x1.452812p+7, -0x2.c3ddbp-24,
+    0x1.005004p-4, -0xd.7729cp-16, -0x3.bcc354p-8 }, /* 51 */
+  { 0x1.4b7076p+7, 0x1.4b7086p+7, 0x1.4b70a4p+7, -0x5.d052p-24,
+    -0xf.ddf16p-8, 0xc.318c1p-16, 0x5.7947p-8 }, /* 52 */
+  { 0x1.51b8f4p+7, 0x1.51b902p+7, 0x1.51b90ep+7, -0x2.0b97dcp-24,
+    0xf.b7fafp-8, -0xc.1429dp-16, -0x3.43c36p-4 }, /* 53 */
+  { 0x1.580168p+7, 0x1.58018p+7, 0x1.580188p+7, -0x5.4aab5p-24,
+    -0xf.930fep-8, 0xa.ecc24p-16, 0x9.c62cdp-12 }, /* 54 */
+  { 0x1.5e49eap+7, 0x1.5e49fcp+7, 0x1.5e4a12p+7, -0x3.6dadd8p-24,
+    0xf.6f245p-8, -0xb.6816cp-16, 0xa.d731ap-8 }, /* 55 */
+  { 0x1.649272p+7, 0x1.64927ap+7, 0x1.64929p+7, -0x2.d7e038p-24,
+    -0xf.4c2cep-8, 0xb.118bep-16, 0xb.69a4ep-8 }, /* 56 */
+  { 0x1.6adae6p+7, 0x1.6adaf6p+7, 0x1.6adb04p+7, -0x6.977a1p-24,
+    0xf.2a1fp-8, -0xa.a8911p-16, -0x4.bf6d2p-8 }, /* 57 */
+  { 0x1.712366p+7, 0x1.712374p+7, 0x1.71238ep+7, 0x1.3cc95ep-24,
+    -0xf.08f0ap-8, 0x9.f0858p-16, 0x1.77f7f4p-4 }, /* 58 */
+  { 0x1.776beap+7, 0x1.776bf2p+7, 0x1.776bfap+7, 0x3.a4921p-24,
+    0xe.e8986p-8, -0xa.39dfp-16, -0x6.7ba3dp-4 }, /* 59 */
+  { 0x1.7db464p+7, 0x1.7db46ep+7, 0x1.7db476p+7, 0x6.b45a7p-24,
+    -0xe.c90d8p-8, 0xa.e586fp-16, -0x1.d66becp-4 }, /* 60 */
+  { 0x1.83fce2p+7, 0x1.83fcecp+7, 0x1.83fd0ep+7, -0x2.8f34a4p-24,
+    0xe.aa478p-8, -0x9.810bp-16, -0x3.a5f3fcp-8 }, /* 61 */
+  { 0x1.8a455cp+7, 0x1.8a456ap+7, 0x1.8a4588p+7, -0x1.325968p-24,
+    -0xe.8c3eap-8, 0x9.0a765p-16, 0x1.29a54ap-4 }, /* 62 */
+  { 0x1.908dd8p+7, 0x1.908de8p+7, 0x1.908df4p+7, 0x4.96b808p-24,
+    0xe.6eeb5p-8, -0x9.0251bp-16, 0x1.41a488p-4 }, /* 63 */
+};
+
+/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:
+   j0f(x) ~ sqrt(2/(pi*x))*beta0(x)*cos(x-pi/4-alpha0(x))
+   where beta0(x) = 1 - 1/(16*x^2) + 53/(512*x^4)
+   and alpha0(x) = 1/(8*x) - 25/(384*x^3).  */
+static float
+j0f_asympt (float x)
+{
+  /* The following code fails to give an error <= 9 ulps in only two cases,
+     for which we tabulate the result.  */
+  if (x == 0x1.4665d2p+24f)
+    return 0xa.50206p-52f;
+  if (x == 0x1.a9afdep+7f)
+    return 0xf.47039p-28f;
+  double y = 1.0 / (double) x;
+  double y2 = y * y;
+  double beta0 = 1.0f + y2 * (-0x1p-4f + 0x1.a8p-4 * y2);
+  double alpha0 = y * (0x2p-4 - 0x1.0aaaaap-4 * y2);
+  double h;
+  int n;
+  h = reduce_aux (x, &n, alpha0);
+  /* Now x - pi/4 - alpha0 = h + n*pi/2 mod (2*pi).  */
+  float xr = (float) h;
+  n = n & 3;
+  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */
+  float t = cst / sqrtf (x) * (float) beta0;
+  if (n == 0)
+    return t * __cosf (xr);
+  else if (n == 2) /* cos(x+pi) = -cos(x) */
+    return -t * __cosf (xr);
+  else if (n == 1) /* cos(x+pi/2) = -sin(x) */
+    return -t * __sinf (xr);
+  else             /* cos(x+3pi/2) = sin(x) */
+    return t * __sinf (xr);
+}
+
+/* Special code for x near a root of j0.
+   z is the value computed by the generic code.
+   For small x, we use a polynomial approximating j0 around its root.
+   For large x, we use an asymptotic formula (j0f_asympt).  */
+static float
+j0f_near_root (float x, float z)
+{
+  float index_f;
+  int index;
+
+  index_f = roundf ((x - FIRST_ZERO_J0) / (float) M_PI);
+  /* j0f_asympt fails to give an error <= 9 ulps for x=0x1.324e92p+7
+     (index 48) thus we can't reduce SMALL_SIZE below 49.  */
+  if (index_f >= SMALL_SIZE)
+    return j0f_asympt (x);
+  index = (int) index_f;
+  const float *p = Pj[index];
+  float x0 = p[0];
+  float x1 = p[2];
+  /* If not in the interval [x0,x1] around xmid, we return the value z.  */
+  if (! (x0 <= x && x <= x1))
+    return z;
+  float xmid = p[1];
+  float y = x - xmid;
+  return p[3] + y * (p[4] + y * (p[5] + y * p[6]));
+}
+
 float
 __ieee754_j0f(float x)
 {
@@ -48,39 +262,35 @@  __ieee754_j0f(float x)
 	if(ix>=0x7f800000) return one/(x*x);
 	x = fabsf(x);
 	if(ix >= 0x40000000) {	/* |x| >= 2.0 */
+                SET_RESTORE_ROUNDF (FE_TONEAREST);
 		__sincosf (x, &s, &c);
 		ss = s-c;
 		cc = s+c;
-		if(ix<0x7f000000) {  /* make sure x+x not overflow */
-		    z = -__cosf(x+x);
-		    if ((s*c)<zero) cc = z/ss;
-		    else	    ss = z/cc;
-		} else {
-		    /* We subtract (exactly) a value x0 such that
-		       cos(x0)+sin(x0) is very near to 0, and use the identity
-		       sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
-		       sin(x) + cos(x) with extra accuracy.  */
-		    float x0 = 0xe.d4108p+124f;
-		    float y = x - x0; /* exact  */
-		    /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0)  */
-		    z = __sinf (y);
-		    float eps = 0x1.5f263ep-24f;
-		    /* cos(x0) ~ -sin(x0) + eps  */
-		    z += eps * __cosf (x);
-		    /* now z ~ (sin(x)-cos(x))*cos(x0)  */
-		    float cosx0 = -0xb.504f3p-4f;
-		    cc = z / cosx0;
-                }
+                if (ix >= 0x7f000000)
+		  /* x >= 2^127: use asymptotic expansion.  */
+                  return j0f_asympt (x);
+                /* Now we are sure that x+x cannot overflow.  */
+                z = -__cosf(x+x);
+                if ((s*c)<zero) cc = z/ss;
+                else	    ss = z/cc;
 	/*
 	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
 	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
 	 */
-		if(ix>0x5c000000) z = (invsqrtpi*cc)/sqrtf(x);
-		else {
-		    u = pzerof(x); v = qzerof(x);
-		    z = invsqrtpi*(u*cc-v*ss)/sqrtf(x);
-		}
-		return z;
+                if (ix <= 0x5c000000)
+                  {
+                    u = pzerof(x); v = qzerof(x);
+                    cc = u*cc-v*ss;
+                  }
+                z = (invsqrtpi * cc) / sqrtf(x);
+                /* The following threshold is optimal: for x=0x1.3b58dep+1
+                   and rounding upwards, |cc|=0x1.579b26p-4 and z is 10 ulps
+                   far from the correctly rounded value.  */
+                float threshold = 0x1.579b26p-4;
+                if (fabsf (cc) > threshold)
+                  return z;
+                else
+                  return j0f_near_root (x, z);
 	}
 	if(ix<0x39000000) {	/* |x| < 2**-13 */
 	    math_force_eval(huge+x);		/* raise inexact if x != 0 */
@@ -112,6 +322,219 @@  v02  =  7.6006865129e-05, /* 0x389f65e0 */
 v03  =  2.5915085189e-07, /* 0x348b216c */
 v04  =  4.4111031494e-10; /* 0x2ff280c2 */
 
+/* This is the nearest approximation of the first zero of y0.  */
+#define FIRST_ZERO_Y0 0xe.4c166p-4f
+
+/* The following table contains successive zeros of y0 and degree-3
+   polynomial approximations of y0 around these zeros: Py[0] for the first
+   zero (0.89358), Py[1] for the second one (3.957678), and so on.
+   Each line contains:
+              {x0, xmid, x1, p0, p1, p2, p3}
+   where [x0,x1] is the interval around the zero, xmid is the binary32 number
+   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation
+   polynomial.  Each polynomial was generated using Sollya on the interval
+   [x0,x1] around the corresponding zero where the error exceeds 9 ulps
+   for the alternate code.  Degree 3 is enough, except for index 0 where we
+   use degree 5, and the coefficients of degree 4 and 5 are hard-coded in
+   y0f_near_root.
+*/
+static const float Py[SMALL_SIZE][7] = {
+  { 0x1.a681dap-1, 0x1.c982ecp-1, 0x1.ef6bcap-1, 0x3.274468p-28,
+    0xe.121b8p-4, -0x7.df8b3p-4, 0x3.877be4p-4
+    /*, -0x3.a46c9p-4, 0x3.735478p-4*/ }, /* 0 */
+  { 0x1.f957c6p+1, 0x1.fa9534p+1, 0x1.fd11b2p+1, 0xa.f1f83p-28,
+    -0x6.70d098p-4, 0xd.04d48p-8, 0xe.f61a9p-8 }, /* 1 */
+  { 0x1.c51832p+2, 0x1.c581dcp+2, 0x1.c65164p+2, -0x5.e2a51p-28,
+    0x4.cd3328p-4, -0x5.6bbe08p-8, -0xc.46d8p-8 }, /* 2 */
+  { 0x1.46fd84p+3, 0x1.471d74p+3, 0x1.475bfcp+3, -0x1.4b0aeep-24,
+    -0x3.fec6b8p-4, 0x3.2068a4p-8, 0xa.76e2dp-8 }, /* 3 */
+  { 0x1.ab7afap+3, 0x1.ab8e1cp+3, 0x1.abb294p+3, -0x8.179d7p-28,
+    0x3.7e6544p-4, -0x2.1799fp-8, -0x9.0e1c4p-8 }, /* 4 */
+  { 0x1.07f9aap+4, 0x1.0803c8p+4, 0x1.08170cp+4, -0x2.5b8078p-24,
+    -0x3.24b868p-4, 0x1.8631ecp-8, 0x8.3cb46p-8 }, /* 5 */
+  { 0x1.3a38eap+4, 0x1.3a42cep+4, 0x1.3a4d8ap+4, 0x1.cd304ap-28,
+    0x2.e189ecp-4, -0x1.2c6954p-8, -0x7.8178ep-8 }, /* 6 */
+  { 0x1.6c7d42p+4, 0x1.6c833p+4, 0x1.6c99fp+4, -0x6.c63f1p-28,
+    -0x2.acc9a8p-4, 0xf.09e31p-12, 0x7.0b5ab8p-8 }, /* 7 */
+  { 0x1.9ebec4p+4, 0x1.9ec47p+4, 0x1.9ed016p+4, 0x1.e53838p-24,
+    0x2.81f2p-4, -0xc.5ff51p-12, -0x7.05ep-8 }, /* 8 */
+  { 0x1.d1008ep+4, 0x1.d10644p+4, 0x1.d11262p+4, 0x1.636feep-24,
+    -0x2.5e40dcp-4, 0xa.6f81dp-12, 0x5.ff6p-8 }, /* 9 */
+  { 0x1.01a27cp+5, 0x1.01a442p+5, 0x1.01a924p+5, -0x4.04e1bp-28,
+    0x2.3febd8p-4, -0x8.f11e2p-12, -0x6.0111ap-8 }, /* 10 */
+  { 0x1.1ac3bcp+5, 0x1.1ac588p+5, 0x1.1ac912p+5, 0x3.6063d8p-24,
+    -0x2.25baacp-4, 0x7.c93cdp-12, 0x4.e7577p-8 }, /* 11 */
+  { 0x1.33e508p+5, 0x1.33e6ecp+5, 0x1.33ea1ap+5, -0x3.f39ebcp-24,
+    0x2.0ed04cp-4, -0x6.d9434p-12, -0x4.fc0b7p-8 }, /* 12 */
+  { 0x1.4d0666p+5, 0x1.4d0868p+5, 0x1.4d0c14p+5, -0x4.ea23p-28,
+    -0x1.fa8b4p-4, 0x6.1470e8p-12, 0x5.97f71p-8 }, /* 13 */
+  { 0x1.6628b8p+5, 0x1.6629f4p+5, 0x1.662e0ep+5, -0x3.5df0c8p-24,
+    0x1.e8727ep-4, -0x5.76a038p-12, -0x4.ee37c8p-8 }, /* 14 */
+  { 0x1.7f4a7cp+5, 0x1.7f4b9p+5, 0x1.7f4daap+5, 0x1.1ef09ep-24,
+    -0x1.d8293ap-4, 0x4.ed8a28p-12, 0x4.d43708p-8 }, /* 15 */
+  { 0x1.986c5cp+5, 0x1.986d38p+5, 0x1.986f6p+5, 0x1.b70cecp-24,
+    0x1.c967p-4, -0x4.7a70b8p-12, -0x5.6840e8p-8 }, /* 16 */
+  { 0x1.b18dcap+5, 0x1.b18ee8p+5, 0x1.b19122p+5, 0x1.abaadcp-24,
+    -0x1.bbf246p-4, 0x4.1a35bp-12, 0x3.c2d46p-8 }, /* 17 */
+  { 0x1.caaf86p+5, 0x1.cab0a2p+5, 0x1.cab2fep+5, 0x1.63989ep-24,
+    0x1.af9cb4p-4, -0x3.c2f2dcp-12, -0x4.cf8108p-8 }, /* 18 */
+  { 0x1.e3d146p+5, 0x1.e3d262p+5, 0x1.e3d492p+5, -0x1.68a8ecp-24,
+    -0x1.a4407ep-4, 0x3.7733ecp-12, 0x5.97916p-8 }, /* 19 */
+  { 0x1.fcf316p+5, 0x1.fcf428p+5, 0x1.fcf59ap+5, 0x1.e1bb5p-24,
+    0x1.99be74p-4, -0x3.37210cp-12, -0x5.d19bf8p-8 }, /* 20 */
+  { 0x1.0b0a7cp+6, 0x1.0b0afap+6, 0x1.0b0b9cp+6, -0x5.5bbcfp-24,
+    -0x1.8ffc9ap-4, 0x2.ffe638p-12, 0x2.ed28e8p-8 }, /* 21 */
+  { 0x1.179b66p+6, 0x1.179bep+6, 0x1.179d0ap+6, -0x4.9e34a8p-24,
+    0x1.86e51cp-4, -0x2.cc7a68p-12, -0x3.3642c4p-8 }, /* 22 */
+  { 0x1.242c5cp+6, 0x1.242ccap+6, 0x1.242d68p+6, 0x1.966706p-24,
+    -0x1.7e657p-4, 0x2.9aed4cp-12, 0x7.b87a58p-8 }, /* 23 */
+  { 0x1.30bd62p+6, 0x1.30bdb6p+6, 0x1.30beb2p+6, 0x3.4b3b68p-24,
+    0x1.766dc2p-4, -0x2.72651cp-12, -0x3.e347f8p-8 }, /* 24 */
+  { 0x1.3d4e56p+6, 0x1.3d4ea2p+6, 0x1.3d4f2ep+6, 0x6.a99008p-28,
+    -0x1.6ef07ep-4, 0x2.53aec4p-12, 0x2.9e3d88p-12 }, /* 25 */
+  { 0x1.49df38p+6, 0x1.49df9p+6, 0x1.49e042p+6, -0x7.a9fa6p-32,
+    0x1.67e1dap-4, -0x2.324d7p-12, -0xc.0e669p-12 }, /* 26 */
+  { 0x1.56702ep+6, 0x1.56708p+6, 0x1.567116p+6, -0x5.026808p-24,
+    -0x1.613798p-4, 0x2.114594p-12, 0x1.a22402p-8 }, /* 27 */
+  { 0x1.630126p+6, 0x1.63017p+6, 0x1.630226p+6, 0x4.46aa2p-24,
+    0x1.5ae8c2p-4, -0x1.f4aaa4p-12, -0x3.58593cp-8 }, /* 28 */
+  { 0x1.6f9234p+6, 0x1.6f926p+6, 0x1.6f92b2p+6, 0x1.5cfccp-24,
+    -0x1.54ed76p-4, 0x1.dd540ap-12, -0xb.e9429p-12 }, /* 29 */
+  { 0x1.7c22fep+6, 0x1.7c2352p+6, 0x1.7c23c2p+6, -0xb.4dc4cp-28,
+    0x1.4f3ebcp-4, -0x1.c463fp-12, -0x1.e94c54p-8 }, /* 30 */
+  { 0x1.88b412p+6, 0x1.88b444p+6, 0x1.88b50ap+6, 0x3.f5343p-24,
+    -0x1.49d668p-4, 0x1.a53f24p-12, 0x5.128198p-8 }, /* 31 */
+  { 0x1.9544dcp+6, 0x1.954538p+6, 0x1.95459p+6, -0x6.e6f32p-28,
+    0x1.44aefap-4, -0x1.9a6ef8p-12, -0x6.c639cp-8 }, /* 32 */
+  { 0x1.a1d5fap+6, 0x1.a1d62cp+6, 0x1.a1d674p+6, 0x1.f359c2p-28,
+    -0x1.3fc386p-4, 0x1.887ebep-12, 0x1.6c813cp-8 }, /* 33 */
+  { 0x1.ae66cp+6, 0x1.ae672p+6, 0x1.ae6788p+6, -0x2.9de748p-24,
+    0x1.3b0fa4p-4, -0x1.777f26p-12, 0x1.c128ccp-8 }, /* 34 */
+  { 0x1.baf7c2p+6, 0x1.baf816p+6, 0x1.baf86cp+6, -0x2.24ccc8p-24,
+    -0x1.368f5cp-4, 0x1.62bd9ep-12, 0xa.df002p-8 }, /* 35 */
+  { 0x1.c788dap+6, 0x1.c7890cp+6, 0x1.c7896cp+6, 0x4.7dcea8p-24,
+    0x1.323f16p-4, -0x1.61abf4p-12, 0xa.ad73ep-8 }, /* 36 */
+  { 0x1.d419ccp+6, 0x1.d41a02p+6, 0x1.d41a68p+6, -0x4.b538p-24,
+    -0x1.2e1b98p-4, 0x1.4a4d64p-12, 0x3.a47674p-8 }, /* 37 */
+  { 0x1.e0aacep+6, 0x1.e0aaf8p+6, 0x1.e0ab5ep+6, 0x3.09dc4cp-24,
+    0x1.2a21ecp-4, -0x1.3fa20cp-12, 0x2.216e8cp-8 }, /* 38 */
+  { 0x1.ed3bb8p+6, 0x1.ed3beep+6, 0x1.ed3c56p+6, 0x4.d5a58p-28,
+    -0x1.264f66p-4, 0x1.32c4cep-12, 0x1.53cbb4p-8 }, /* 39 */
+  { 0x1.f9ccaep+6, 0x1.f9cce6p+6, 0x1.f9cd52p+6, 0x3.f4c44cp-24,
+    0x1.22a192p-4, -0x1.1f8514p-12, -0xc.0de32p-8 }, /* 40 */
+  { 0x1.032ed6p+7, 0x1.032eeep+7, 0x1.032f0cp+7, 0x2.4beae8p-24,
+    -0x1.1f1634p-4, 0x1.171664p-12, 0x1.72a654p-4 }, /* 41 */
+  { 0x1.097756p+7, 0x1.09776ap+7, 0x1.09779cp+7, -0xd.a581ep-28,
+    0x1.1bab3cp-4, -0xf.9f507p-16, -0xc.ba2d4p-8 }, /* 42 */
+  { 0x1.0fbfdp+7, 0x1.0fbfe6p+7, 0x1.0fbff6p+7, 0xa.7c0bdp-28,
+    -0x1.185eccp-4, 0x1.01d7dep-12, -0x1.a2186ep-4 }, /* 43 */
+  { 0x1.160856p+7, 0x1.160862p+7, 0x1.16087ap+7, -0x1.9452ecp-24,
+    0x1.152f26p-4, -0x1.07b4aap-12, 0x1.6bbd7ep-4 }, /* 44 */
+  { 0x1.1c50dp+7, 0x1.1c50dep+7, 0x1.1c5118p+7, 0x3.83b7b8p-24,
+    -0x1.121ab2p-4, 0x1.0e938cp-12, -0x5.1a6dfp-8 }, /* 45 */
+  { 0x1.22995p+7, 0x1.22995ap+7, 0x1.229976p+7, -0x6.5ca3a8p-24,
+    0x1.0f1ff2p-4, -0xe.f198p-16, -0x3.8e98b8p-8 }, /* 46 */
+  { 0x1.28e1ccp+7, 0x1.28e1d8p+7, 0x1.28e1f4p+7, -0x6.bb61ap-24,
+    -0x1.0c3d8ap-4, 0xf.a14b9p-16, 0x9.81e82p-4 }, /* 47 */
+  { 0x1.2f2a48p+7, 0x1.2f2a54p+7, 0x1.2f2a74p+7, 0x2.2438p-24,
+    0x1.097236p-4, -0xd.fed5ep-16, -0x3.19eb5cp-8 }, /* 48 */
+  { 0x1.3572b8p+7, 0x1.3572dp+7, 0x1.3572ecp+7, 0x3.1e0054p-24,
+    -0x1.06bcc8p-4, 0xd.d2596p-16, -0x1.67e00ap-4 }, /* 49 */
+  { 0x1.3bbb3ep+7, 0x1.3bbb4ep+7, 0x1.3bbb6ap+7, 0x7.46c908p-24,
+    0x1.041c28p-4, -0xd.04045p-16, -0x8.fb297p-8 }, /* 50 */
+  { 0x1.4203aep+7, 0x1.4203cap+7, 0x1.4203e6p+7, -0xb.4f158p-28,
+    -0x1.018f52p-4, 0xc.ccf6fp-16, 0x1.4d5dp-4 }, /* 51 */
+  { 0x1.484c38p+7, 0x1.484c46p+7, 0x1.484c56p+7, -0x6.5a89c8p-24,
+    0xf.f155p-8, -0xc.5d21dp-16, -0xd.aca34p-8 }, /* 52 */
+  { 0x1.4e94b8p+7, 0x1.4e94c4p+7, 0x1.4e94d4p+7, -0x1.ef16c8p-24,
+    -0xf.cad3fp-8, 0xb.d75f8p-16, 0x1.f36732p-4 }, /* 53 */
+  { 0x1.54dd36p+7, 0x1.54dd4p+7, 0x1.54dd52p+7, -0x6.1e7b68p-24,
+    0xf.a564cp-8, -0xb.ec1cfp-16, 0xe.e7421p-8 }, /* 54 */
+  { 0x1.5b25aep+7, 0x1.5b25bep+7, 0x1.5b25d4p+7, -0xf.8c858p-28,
+    -0xf.80faep-8, 0xb.8b6c5p-16, -0x5.835ed8p-8 }, /* 55 */
+  { 0x1.616e34p+7, 0x1.616e3cp+7, 0x1.616e4ep+7, 0x7.75d858p-24,
+    0xf.5d8abp-8, -0xb.b3779p-16, 0x2.40b948p-4 }, /* 56 */
+  { 0x1.67b6bp+7, 0x1.67b6b8p+7, 0x1.67b6dp+7, 0x1.d78632p-24,
+    -0xf.3b096p-8, 0xa.daf89p-16, 0x1.aa8548p-8 }, /* 57 */
+  { 0x1.6dff28p+7, 0x1.6dff36p+7, 0x1.6dff54p+7, 0x3.b24794p-24,
+    0xf.196c7p-8, -0xb.1afe1p-16, -0x1.77538cp-8 }, /* 58 */
+  { 0x1.7447a2p+7, 0x1.7447b2p+7, 0x1.7447cap+7, 0x6.39cbc8p-24,
+    -0xe.f8aa5p-8, 0xa.50daap-16, 0x1.9592c2p-8 }, /* 59 */
+  { 0x1.7a902p+7, 0x1.7a903p+7, 0x1.7a903ep+7, -0x1.820e3ap-24,
+    0xe.d8b9dp-8, -0xa.998cp-16, -0x2.c35d78p-4 }, /* 60 */
+  { 0x1.80d89ep+7, 0x1.80d8aep+7, 0x1.80d8bep+7, -0x2.c7e038p-24,
+    -0xe.b9925p-8, 0x9.ce06p-16, -0x2.2b3054p-4 }, /* 61 */
+  { 0x1.87211cp+7, 0x1.87212cp+7, 0x1.872144p+7, 0x6.ab31c8p-24,
+    0xe.9b2bep-8, -0x9.4de7p-16, -0x1.32cb5ep-4 }, /* 62 */
+  { 0x1.8d699ap+7, 0x1.8d69a8p+7, 0x1.8d69bp+7, 0x4.4ef25p-24,
+    -0xe.7d7ecp-8, 0x9.a0f1ep-16, 0x1.6ac076p-4 }, /* 63 */
+};
+
+/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:
+   y0(x) ~ sqrt(2/(pi*x))*beta0(x)*sin(x-pi/4-alpha0(x))
+   where beta0(x) = 1 - 1/(16*x^2) + 53/(512*x^4)
+   and alpha0(x) = 1/(8*x) - 25/(384*x^3).  */
+static float
+y0f_asympt (float x)
+{
+  /* The following code fails to give an error <= 9 ulps in only two cases,
+     for which we tabulate the correctly-rounded result.  */
+  if (x == 0x1.bfad96p+7f)
+    return -0x7.f32bdp-32f;
+  if (x == 0x1.2e2a42p+17f)
+    return 0x1.a48974p-40f;
+  double y = 1.0 / (double) x;
+  double y2 = y * y;
+  double beta0 = 1.0f + y2 * (-0x1p-4f + 0x1.a8p-4 * y2);
+  double alpha0 = y * (0x2p-4 - 0x1.0aaaaap-4 * y2);
+  double h;
+  int n;
+  h = reduce_aux (x, &n, alpha0);
+  /* Now x - pi/4 - alpha0 = h + n*pi/2 mod (2*pi).  */
+  float xr = (float) h;
+  n = n & 3;
+  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */
+  float t = cst / sqrtf (x) * (float) beta0;
+  if (n == 0)
+    return t * __sinf (xr);
+  else if (n == 2) /* sin(x+pi) = -sin(x) */
+    return -t * __sinf (xr);
+  else if (n == 1) /* sin(x+pi/2) = cos(x) */
+    return t * __cosf (xr);
+  else             /* sin(x+3pi/2) = -cos(x) */
+    return -t * __cosf (xr);
+}
+
+/* Special code for x near a root of y0.
+   z is the value computed by the generic code.
+   For small x, use a polynomial approximating y0 around its root.
+   For large x, use an asymptotic formula (y0f_asympt).  */
+static float
+y0f_near_root (float x, float z)
+{
+  float index_f;
+  int index;
+
+  index_f = roundf ((x - FIRST_ZERO_Y0) / (float) M_PI);
+  if (index_f >= SMALL_SIZE)
+    return y0f_asympt (x);
+  index = (int) index_f;
+  const float *p = Py[index];
+  float x0 = p[0];
+  float x1 = p[2];
+  /* If not in the interval [x0,x1] around xmid, return the value z.  */
+  if (! (x0 <= x && x <= x1))
+    return z;
+  float xmid = p[1];
+  float y = x - xmid;
+  /* For degree 0 use a degree-5 polynomial, where the coefficients of
+     degree 4 and 5 are hard-coded.  */
+  float p6 = (index > 0) ? p[6]
+    : p[6] + y * (-0x3.a46c9p-4 + y * 0x3.735478p-4);
+  float res = p[3] + y * (p[4] + y * (p[5] + y * p6));
+  return res;
+}
+
 float
 __ieee754_y0f(float x)
 {
@@ -124,7 +547,9 @@  __ieee754_y0f(float x)
 	if(ix>=0x7f800000) return  one/(x+x*x);
 	if(ix==0) return -1/zero; /* -inf and divide by zero exception.  */
 	if(hx<0) return zero/(zero*x);
-	if(ix >= 0x40000000) {  /* |x| >= 2.0 */
+	if(ix >= 0x40000000 || (0x3f5340ed <= ix && ix <= 0x3f77b5e5)) {
+          /* |x| >= 2.0 or
+	     0x1.a681dap-1 <= |x| <= 0x1.ef6bcap-1 (around 1st zero) */
 	/* y0(x) = sqrt(2/(pi*x))*(p0(x)*sin(x0)+q0(x)*cos(x0))
 	 * where x0 = x-pi/4
 	 *      Better formula:
@@ -136,6 +561,7 @@  __ieee754_y0f(float x)
 	 *              sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x))
 	 * to compute the worse one.
 	 */
+                SET_RESTORE_ROUNDF (FE_TONEAREST);
 		__sincosf (x, &s, &c);
 		ss = s-c;
 		cc = s+c;
@@ -143,17 +569,26 @@  __ieee754_y0f(float x)
 	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
 	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
 	 */
-		if(ix<0x7f000000) {  /* make sure x+x not overflow */
-		    z = -__cosf(x+x);
-		    if ((s*c)<zero) cc = z/ss;
-		    else            ss = z/cc;
-		}
-		if(ix>0x5c000000) z = (invsqrtpi*ss)/sqrtf(x);
-		else {
-		    u = pzerof(x); v = qzerof(x);
-		    z = invsqrtpi*(u*ss+v*cc)/sqrtf(x);
-		}
-		return z;
+                if (ix >= 0x7f000000)
+		  /* x >= 2^127: use asymptotic expansion.  */
+                  return y0f_asympt (x);
+                /* Now we are sure that x+x cannot overflow.  */
+                z = -__cosf(x+x);
+                if ((s*c)<zero) cc = z/ss;
+                else            ss = z/cc;
+                if (ix <= 0x5c000000)
+                  {
+                    u = pzerof(x); v = qzerof(x);
+                    ss = u*ss+v*cc;
+                  }
+                z = (invsqrtpi*ss)/sqrtf(x);
+                /* The following threshold is optimal (determined on
+                   aarch64-linux-gnu).  */
+                float threshold = 0x1.be585ap-4;
+                if (fabsf (ss) > threshold)
+                  return z;
+                else
+                  return y0f_near_root (x, z);
 	}
 	if(ix<=0x39800000) {	/* x < 2**-13 */
 	    return(u00 + tpi*__ieee754_logf(x));
@@ -165,7 +600,7 @@  __ieee754_y0f(float x)
 }
 libm_alias_finite (__ieee754_y0f, __y0f)
 
-/* The asymptotic expansions of pzero is
+/* The asymptotic expansion of pzero is
  *	1 - 9/128 s^2 + 11025/98304 s^4 - ...,	where s = 1/x.
  * For x >= 2, We approximate pzero by
  *	pzero(x) = 1 + (R/S)
@@ -257,7 +692,7 @@  pzerof(float x)
 }
 
 
-/* For x >= 8, the asymptotic expansions of qzero is
+/* For x >= 8, the asymptotic expansion of qzero is
  *	-1/8 s + 75/1024 s^3 - ..., where s = 1/x.
  * We approximate pzero by
  *	qzero(x) = s*(-1.25 + (R/S))
diff --git a/sysdeps/ieee754/flt-32/e_j1f.c b/sysdeps/ieee754/flt-32/e_j1f.c
index ac5bb76828..eca1c0bae1 100644
--- a/sysdeps/ieee754/flt-32/e_j1f.c
+++ b/sysdeps/ieee754/flt-32/e_j1f.c
@@ -21,6 +21,7 @@ 
 #include <fenv_private.h>
 #include <math-underflow.h>
 #include <libm-alias-finite.h>
+#include <reduce_aux.h>
 
 static float ponef(float), qonef(float);
 
@@ -42,6 +43,223 @@  s05  =  1.2354227016e-11; /* 0x2d59567e */
 
 static const float zero    = 0.0;
 
+/* This is the nearest approximation of the first positive zero of j1.  */
+#define FIRST_ZERO_J1 0x3.d4eabp+0f
+
+#define SMALL_SIZE 64
+
+/* The following table contains successive zeros of j1 and degree-3
+   polynomial approximations of j1 around these zeros: Pj[0] for the first
+   positive zero (3.831705), Pj[1] for the second one (7.015586), and so on.
+   Each line contains:
+              {x0, xmid, x1, p0, p1, p2, p3}
+   where [x0,x1] is the interval around the zero, xmid is the binary32 number
+   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation
+   polynomial.  Each polynomial was generated using Sollya on the interval
+   [x0,x1] around the corresponding zero where the error exceeds 9 ulps
+   for the alternate code.  Degree 3 is enough to get an error at most
+   9 ulps, except around the first zero.
+*/
+static const float Pj[SMALL_SIZE][7] = {
+  /* For index 0, we use a degree-4 polynomial generated by Sollya, with the
+     coefficient of degree 4 hard-coded in j1f_near_root().  */
+  { 0x1.e09e5ep+1, 0x1.ea7558p+1, 0x1.ef7352p+1, -0x8.4f069p-28,
+    -0x6.71b3d8p-4, 0xd.744a2p-8, 0xd.acd9p-8/*, -0x1.3e51aap-8*/ }, /* 0 */
+  { 0x1.bdb4c2p+2, 0x1.c0ff6p+2, 0x1.c27a8cp+2, 0xe.c2858p-28,
+    0x4.cd464p-4, -0x5.79b71p-8, -0xc.11124p-8 }, /* 1 */
+  { 0x1.43b214p+3, 0x1.458d0ep+3, 0x1.460ccep+3, -0x1.e7acecp-24,
+    -0x3.feca9p-4, 0x3.2470f8p-8, 0xa.625b7p-8 }, /* 2 */
+  { 0x1.a9c98p+3, 0x1.aa5bbp+3, 0x1.aaa4d8p+3, 0x1.698158p-24,
+    0x3.7e666cp-4, -0x2.1900ap-8, -0x9.2755p-8 }, /* 3 */
+  { 0x1.073be4p+4, 0x1.0787b4p+4, 0x1.07aed8p+4, -0x1.f5f658p-24,
+    -0x3.24b8ep-4, 0x1.86e35cp-8, 0x8.4e4bbp-8 }, /* 4 */
+  { 0x1.39ae2ap+4, 0x1.39da8ep+4, 0x1.39f3dap+4, -0x1.4e744p-24,
+    0x2.e18a24p-4, -0x1.2ccd16p-8, -0x7.a27ep-8 }, /* 5 */
+  { 0x1.6bfa46p+4, 0x1.6c294ep+4, 0x1.6c412p+4, 0xa.3fb7fp-28,
+    -0x2.acc9c4p-4, 0xf.0b783p-12, 0x7.1c0d3p-8 }, /* 6 */
+  { 0x1.9e42bep+4, 0x1.9e757p+4, 0x1.9e876ep+4, -0x2.29f6f4p-24,
+    0x2.81f21p-4, -0xc.641bp-12, -0x6.a7ea58p-8 }, /* 7 */
+  { 0x1.d08a3ep+4, 0x1.d0bfdp+4, 0x1.d0cd3cp+4, -0x1.b5d196p-24,
+    -0x2.5e40e4p-4, 0xa.7059fp-12, 0x6.4d6bfp-8 }, /* 8 */
+  { 0x1.017794p+5, 0x1.018476p+5, 0x1.018b8cp+5, -0x4.0e001p-24,
+    0x2.3febep-4, -0x8.f23aap-12, -0x6.0102cp-8 }, /* 9 */
+  { 0x1.1a9e78p+5, 0x1.1aa89p+5, 0x1.1aaf88p+5, 0x3.b26f2p-24,
+    -0x2.25babp-4, 0x7.c6d948p-12, 0x5.a1d988p-8 }, /* 10 */
+  { 0x1.33bddep+5, 0x1.33cc52p+5, 0x1.33d2e4p+5, -0xf.c8cdap-28,
+    0x2.0ed05p-4, -0x6.d97dbp-12, -0x5.8da498p-8 }, /* 11 */
+  { 0x1.4ce7cp+5, 0x1.4cefdp+5, 0x1.4cf7d4p+5, -0x3.9940e4p-24,
+    -0x1.fa8b4p-4, 0x6.16108p-12, 0x5.4355e8p-8 }, /* 12 */
+  { 0x1.6603e8p+5, 0x1.661316p+5, 0x1.66173ap+5, 0x9.da15dp-28,
+    0x1.e8727ep-4, -0x5.742468p-12, -0x5.117c28p-8 }, /* 13 */
+  { 0x1.7f2ebcp+5, 0x1.7f3632p+5, 0x1.7f3a7ep+5, -0x3.39b218p-24,
+    -0x1.d8293ap-4, 0x4.ee3348p-12, 0x4.f9bep-8 }, /* 14 */
+  { 0x1.9850e6p+5, 0x1.985928p+5, 0x1.985d9ep+5, -0x3.7b5108p-24,
+    0x1.c96702p-4, -0x4.7b0d08p-12, -0x4.c784a8p-8 }, /* 15 */
+  { 0x1.b172e8p+5, 0x1.b17c04p+5, 0x1.b1805cp+5, -0x1.91e43ep-24,
+    -0x1.bbf246p-4, 0x4.18ad78p-12, 0x4.9bfae8p-8 }, /* 16 */
+  { 0x1.ca955ap+5, 0x1.ca9ec6p+5, 0x1.caa2a4p+5, 0x1.28453cp-24,
+    0x1.af9cb4p-4, -0x3.c3a494p-12, -0x4.78b69p-8 }, /* 17 */
+  { 0x1.e3bc94p+5, 0x1.e3c174p+5, 0x1.e3c64p+5, -0x2.e7fef4p-24,
+    -0x1.a4407ep-4, 0x3.79b228p-12, 0x4.874f7p-8 }, /* 18 */
+  { 0x1.fcdf16p+5, 0x1.fce40ep+5, 0x1.fce71p+5, -0x3.23b2fcp-24,
+    0x1.99be76p-4, -0x3.39ad7cp-12, -0x4.92a55p-8 }, /* 19 */
+  { 0x1.0afe34p+6, 0x1.0b034ep+6, 0x1.0b054ap+6, -0xd.19e93p-28,
+    -0x1.8ffc9cp-4, 0x2.fee7f8p-12, 0x4.2d33b8p-8 }, /* 20 */
+  { 0x1.179344p+6, 0x1.17948ep+6, 0x1.1795bp+6, 0x1.c2ac48p-24,
+    0x1.86e51cp-4, -0x2.cc5abp-12, -0x4.866d08p-8 }, /* 21 */
+  { 0x1.24231ep+6, 0x1.2425c8p+6, 0x1.2426e2p+6, -0xd.31027p-28,
+    -0x1.7e656ep-4, 0x2.9db23cp-12, 0x3.cc63c8p-8 }, /* 22 */
+  { 0x1.30b5a8p+6, 0x1.30b6fep+6, 0x1.30b84ep+6, 0x5.b5e53p-24,
+    0x1.766dc2p-4, -0x2.754cfcp-12, -0x3.c39bb4p-8 }, /* 23 */
+  { 0x1.3d46ccp+6, 0x1.3d482ep+6, 0x1.3d495ep+6, -0x1.340a8ap-24,
+    -0x1.6ef07ep-4, 0x2.4ff9d4p-12, 0x3.9b36e4p-8 }, /* 24 */
+  { 0x1.49d688p+6, 0x1.49d95ap+6, 0x1.49dabep+6, -0x3.ba66p-24,
+    0x1.67e1dcp-4, -0x2.2f32b8p-12, -0x3.e2aaf4p-8 }, /* 25 */
+  { 0x1.566916p+6, 0x1.566a84p+6, 0x1.566bcp+6, 0x6.47ca5p-28,
+    -0x1.61379ap-4, 0x2.1096acp-12, 0x4.2d0968p-8 }, /* 26 */
+  { 0x1.62f8dap+6, 0x1.62fbaap+6, 0x1.62fc9cp+6, -0x2.12affp-24,
+    0x1.5ae8c4p-4, -0x1.f32444p-12, -0x3.9e592p-8 }, /* 27 */
+  { 0x1.6f89e6p+6, 0x1.6f8ccep+6, 0x1.6f8e34p+6, -0x7.4853ap-28,
+    -0x1.54ed76p-4, 0x1.db004ap-12, 0x3.907034p-8 }, /* 28 */
+  { 0x1.7c1c6ap+6, 0x1.7c1deep+6, 0x1.7c1f4cp+6, -0x4.f0a998p-24,
+    0x1.4f3ebcp-4, -0x1.c26808p-12, -0x2.da8df8p-8 }, /* 29 */
+  { 0x1.88adaep+6, 0x1.88af0ep+6, 0x1.88afc4p+6, -0x1.80c246p-24,
+    -0x1.49d668p-4, 0x1.aebc26p-12, 0x3.af7b5cp-8 }, /* 30 */
+  { 0x1.953d7p+6, 0x1.95402ap+6, 0x1.9540ep+6, -0x2.22aff8p-24,
+    0x1.44aefap-4, -0x1.99f25p-12, -0x3.5e9198p-8 }, /* 31 */
+  { 0x1.a1d01ep+6, 0x1.a1d146p+6, 0x1.a1d20ap+6, -0x3.aad6d4p-24,
+    -0x1.3fc386p-4, 0x1.892858p-12, 0x3.fe0184p-8 }, /* 32 */
+  { 0x1.ae60ecp+6, 0x1.ae625ep+6, 0x1.ae6326p+6, -0x2.010be4p-24,
+    0x1.3b0fa4p-4, -0x1.7539ap-12, -0x2.b2c9bp-8 }, /* 33 */
+  { 0x1.baf234p+6, 0x1.baf376p+6, 0x1.baf442p+6, -0xd.4fd17p-32,
+    -0x1.368f5cp-4, 0x1.6734e4p-12, 0x3.59f514p-8 }, /* 34 */
+  { 0x1.c782e6p+6, 0x1.c7848cp+6, 0x1.c78516p+6, -0xa.d662dp-28,
+    0x1.323f18p-4, -0x1.571c02p-12, -0x3.2be5bp-8 }, /* 35 */
+  { 0x1.d4144ep+6, 0x1.d415ap+6, 0x1.d41622p+6, 0x4.9f217p-24,
+    -0x1.2e1b9ap-4, 0x1.4a2edap-12, 0x3.a4e96p-8 }, /* 36 */
+  { 0x1.e0a5ep+6, 0x1.e0a6b4p+6, 0x1.e0a788p+6, -0x2.d167p-24,
+    0x1.2a21eep-4, -0x1.3c4b46p-12, -0x4.9e0978p-8 }, /* 37 */
+  { 0x1.ed36eep+6, 0x1.ed37c8p+6, 0x1.ed3892p+6, -0x4.15a83p-24,
+    -0x1.264f66p-4, 0x1.31dea4p-12, 0x3.d125ecp-8 }, /* 38 */
+  { 0x1.f9c77p+6, 0x1.f9c8d8p+6, 0x1.f9c9acp+6, -0x2.a5bbbp-24,
+    0x1.22a192p-4, -0x1.25e59ep-12, -0x2.ef6934p-8 }, /* 39 */
+  { 0x1.032c54p+7, 0x1.032cf4p+7, 0x1.032d66p+7, 0x4.e828bp-24,
+    -0x1.1f1634p-4, 0x1.1c2394p-12, 0x3.6d744cp-8 }, /* 40 */
+  { 0x1.09750cp+7, 0x1.09757cp+7, 0x1.0975b6p+7, -0x3.28a3bcp-24,
+    0x1.1bab3ep-4, -0x1.1569cep-12, -0x5.84da7p-8 }, /* 41 */
+  { 0x1.0fbd9ap+7, 0x1.0fbe04p+7, 0x1.0fbe5ep+7, -0x2.2f667p-24,
+    -0x1.185eccp-4, 0x1.07f42cp-12, 0x2.87896cp-8 }, /* 42 */
+  { 0x1.160628p+7, 0x1.16068ap+7, 0x1.1606cep+7, -0x6.9097dp-24,
+    0x1.152f28p-4, -0x1.0227fep-12, -0x5.da6e6p-8 }, /* 43 */
+  { 0x1.1c4e9ap+7, 0x1.1c4f12p+7, 0x1.1c4f7cp+7, -0x5.1b408p-24,
+    -0x1.121abp-4, 0xf.6efcp-16, 0x2.c5e954p-8 }, /* 44 */
+  { 0x1.2296aap+7, 0x1.229798p+7, 0x1.2297d4p+7, 0x2.70d0dp-24,
+    0x1.0f1ffp-4, -0xf.523f5p-16, -0x3.5c0568p-8 }, /* 45 */
+  { 0x1.28dfa4p+7, 0x1.28e01ep+7, 0x1.28e054p+7, -0x2.7c176p-24,
+    -0x1.0c3d8ap-4, 0xe.8329ap-16, 0x3.5eb34p-8 }, /* 46 */
+  { 0x1.2f282ap+7, 0x1.2f28a4p+7, 0x1.2f28dep+7, 0x4.fd6368p-24,
+    0x1.097236p-4, -0xe.17299p-16, -0x3.73a2e4p-8 }, /* 47 */
+  { 0x1.3570bp+7, 0x1.357128p+7, 0x1.35716p+7, 0x6.b05f68p-24,
+    -0x1.06bccap-4, 0xd.527b8p-16, 0x2.b8bf9cp-8 }, /* 48 */
+  { 0x1.3bb932p+7, 0x1.3bb9aep+7, 0x1.3bb9eap+7, 0x4.0d622p-28,
+    0x1.041c28p-4, -0xd.0ac11p-16, -0x1.65f2ccp-8 }, /* 49 */
+  { 0x1.4201b6p+7, 0x1.420232p+7, 0x1.42027p+7, 0x7.0d98cp-24,
+    -0x1.018f52p-4, 0xc.c4d8ep-16, 0x2.8f250cp-8 }, /* 50 */
+  { 0x1.484a78p+7, 0x1.484ab8p+7, 0x1.484af4p+7, 0x3.93d10cp-24,
+    0xf.f154fp-8, -0xc.7b7fep-16, -0x3.6b6e4cp-8 }, /* 51 */
+  { 0x1.4e92c8p+7, 0x1.4e933cp+7, 0x1.4e9368p+7, 0xd.88185p-32,
+    -0xf.cad3fp-8, 0xc.1462p-16, 0x2.bd66p-8 }, /* 52 */
+  { 0x1.54db84p+7, 0x1.54dbcp+7, 0x1.54dbf4p+7, -0x1.fe6b92p-24,
+    0xf.a564cp-8, -0xb.c4e6cp-16, -0x3.d51decp-8 }, /* 53 */
+  { 0x1.5b23c4p+7, 0x1.5b2444p+7, 0x1.5b2486p+7, 0x2.6137f4p-24,
+    -0xf.80faep-8, 0xb.5199ep-16, 0x1.9ca85ap-8 }, /* 54 */
+  { 0x1.616c62p+7, 0x1.616cc8p+7, 0x1.616d0ap+7, -0x1.55468p-24,
+    0xf.5d8acp-8, -0xb.21d16p-16, -0x1.b8809ap-8 }, /* 55 */
+  { 0x1.67b4fp+7, 0x1.67b54cp+7, 0x1.67b588p+7, -0x1.08c6bep-24,
+    -0xf.3b096p-8, 0xa.e65efp-16, 0x3.642304p-8 }, /* 56 */
+  { 0x1.6dfd8ep+7, 0x1.6dfddp+7, 0x1.6dfe0ap+7, 0x4.9ebfa8p-24,
+    0xf.196c7p-8, -0xa.ba8c8p-16, -0x5.ad6a2p-8 }, /* 57 */
+  { 0x1.74461p+7, 0x1.744652p+7, 0x1.744692p+7, 0x5.a4017p-24,
+    -0xe.f8aa5p-8, 0xa.49748p-16, 0x2.a86498p-8 }, /* 58 */
+  { 0x1.7a8e5ep+7, 0x1.7a8ed6p+7, 0x1.7a8ef8p+7, 0x3.bcb2a8p-28,
+    0xe.d8b9dp-8, -0x9.c43bep-16, -0x1.e7124ap-8 }, /* 59 */
+  { 0x1.80d6cep+7, 0x1.80d75ap+7, 0x1.80d78ap+7, -0x7.1091fp-24,
+    -0xe.b9925p-8, 0x9.c43dap-16, 0x1.aba86p-8 }, /* 60 */
+  { 0x1.871f58p+7, 0x1.871fdcp+7, 0x1.87201ep+7, 0x2.ca1cf4p-28,
+    0xe.9b2bep-8, -0x9.843b3p-16, -0x2.093e68p-8 }, /* 61 */
+  { 0x1.8d67e8p+7, 0x1.8d685ep+7, 0x1.8d688ep+7, 0x5.aa8908p-24,
+    -0xe.7d7ecp-8, 0x9.501a8p-16, 0x2.54a754p-8 }, /* 62 */
+  { 0x1.93b09cp+7, 0x1.93b0e2p+7, 0x1.93b10ep+7, 0x3.d9cd9cp-24,
+    0xe.6083ap-8, -0x9.45dadp-16, -0x5.112908p-8 }, /* 63 */
+};
+
+/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:
+   j1f(x) ~ sqrt(2/(pi*x))*beta1(x)*cos(x-3pi/4-alpha1(x))
+   where beta1(x) = 1 + 3/(16*x^2) - 99/(512*x^4)
+   and alpha1(x) = -3/(8*x) + 21/(128*x^3) - 1899/(5120*x^5).  */
+static float
+j1f_asympt (float x)
+{
+  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */
+  if (x < 0)
+    {
+      x = -x;
+      cst = -cst;
+    }
+  double y = 1.0 / (double) x;
+  double y2 = y * y;
+  double beta1 = 1.0f + y2 * (0x3p-4 - 0x3.18p-4 * y2);
+  double alpha1;
+  alpha1 = y * (-0x6p-4 + y2 * (0x2.ap-4 - 0x5.ef33333333334p-4 * y2));
+  double h;
+  int n;
+  h = reduce_aux (x, &n, alpha1);
+  n--; /* Subtract pi/2.  */
+  /* Now x - 3pi/4 - alpha1 = h + n*pi/2 mod (2*pi).  */
+  float xr = (float) h;
+  n = n & 3;
+  float t = cst / sqrtf (x) * (float) beta1;
+  if (n == 0)
+    return t * __cosf (xr);
+  else if (n == 2) /* cos(x+pi) = -cos(x) */
+    return -t * __cosf (xr);
+  else if (n == 1) /* cos(x+pi/2) = -sin(x) */
+    return -t * __sinf (xr);
+  else             /* cos(x+3pi/2) = sin(x) */
+    return t * __sinf (xr);
+}
+
+/* Special code for x near a root of j1.
+   z is the value computed by the generic code.
+   For small x, we use a polynomial approximating j1 around its root.
+   For large x, we use an asymptotic formula (j1f_asympt).  */
+static float
+j1f_near_root (float x, float z)
+{
+  float index_f, sign = 1.0f;
+  int index;
+
+  if (x < 0)
+    {
+      x = -x;
+      sign = -1.0f;
+    }
+  index_f = roundf ((x - FIRST_ZERO_J1) / (float) M_PI);
+  if (index_f >= SMALL_SIZE)
+    return sign * j1f_asympt (x);
+  index = (int) index_f;
+  const float *p = Pj[index];
+  float x0 = p[0];
+  float x1 = p[2];
+  /* If not in the interval [x0,x1] around xmid, return the value z.  */
+  if (! (x0 <= x && x <= x1))
+    return z;
+  float xmid = p[1];
+  float y = x - xmid;
+  float p6 = (index > 0) ? p[6] : p[6] + y * -0x1.3e51aap-8f;
+  return sign * (p[3] + y * (p[4] + y * (p[5] + y * p6)));
+}
+
 float
 __ieee754_j1f(float x)
 {
@@ -53,25 +271,37 @@  __ieee754_j1f(float x)
 	if(__builtin_expect(ix>=0x7f800000, 0)) return one/x;
 	y = fabsf(x);
 	if(ix >= 0x40000000) {	/* |x| >= 2.0 */
+                SET_RESTORE_ROUNDF (FE_TONEAREST);
 		__sincosf (y, &s, &c);
 		ss = -s-c;
 		cc = s-c;
-		if(ix<0x7f000000) {  /* make sure y+y not overflow */
-		    z = __cosf(y+y);
-		    if ((s*c)>zero) cc = z/ss;
-		    else	    ss = z/cc;
-		}
+                if (ix >= 0x7f000000)
+		  /* x >= 2^127: use asymptotic expansion.  */
+                  return j1f_asympt (x);
+                /* Now we are sure that x+x cannot overflow.  */
+                z = __cosf(y+y);
+                if ((s*c)>zero) cc = z/ss;
+                else	        ss = z/cc;
 	/*
 	 * j1(x) = 1/sqrt(pi) * (P(1,x)*cc - Q(1,x)*ss) / sqrt(x)
 	 * y1(x) = 1/sqrt(pi) * (P(1,x)*ss + Q(1,x)*cc) / sqrt(x)
 	 */
-		if(ix>0x5c000000) z = (invsqrtpi*cc)/sqrtf(y);
-		else {
+		if (ix <= 0x5c000000)
+                  {
 		    u = ponef(y); v = qonef(y);
-		    z = invsqrtpi*(u*cc-v*ss)/sqrtf(y);
-		}
-		if(hx<0) return -z;
-		else	 return  z;
+		    cc = u*cc-v*ss;
+                  }
+                z = (invsqrtpi * cc) / sqrtf(y);
+                /* Adjust sign of z.  */
+                z = (hx < 0) ? -z : z;
+                /* The following threshold is optimal: for x=0x1.e09e5ep+1
+                   and rounding upwards, cc=0x1.b79638p-4 and z is 10 ulps
+                   far from the correctly rounded value.  */
+                float threshold = 0x1.b79638p-4;
+                if (fabsf (cc) > threshold)
+                  return z;
+                else
+                  return j1f_near_root (x, z);
 	}
 	if(__builtin_expect(ix<0x32000000, 0)) {	/* |x|<2**-27 */
 	    if(huge+x>one) {		/* inexact if x!=0 necessary */
@@ -105,6 +335,218 @@  static const float V0[5] = {
   1.6655924903e-11, /* 0x2d9281cf */
 };
 
+/* This is the nearest approximation of the first zero of y1.  */
+#define FIRST_ZERO_Y1 0x2.3277dcp+0f
+
+/* The following table contains successive zeros of y1 and degree-3
+   polynomial approximations of y1 around these zeros: Py[0] for the first
+   positive zero (2.197141), Py[1] for the second one (5.429681), and so on.
+   Each line contains:
+              {x0, xmid, x1, p0, p1, p2, p3}
+   where [x0,x1] is the interval around the zero, xmid is the binary32 number
+   closest to the zero, and p0+p1*x+p2*x^2+p3*x^3 is the approximation
+   polynomial.  Each polynomial was generated using Sollya on the interval
+   [x0,x1] around the corresponding zero where the error exceeds 9 ulps
+   for the alternate code.  Degree 3 is enough, except for the first roots.
+*/
+static const float Py[SMALL_SIZE][7] = {
+  /* For index 0, we use a degree-5 polynomial generated by Sollya, with the
+     coefficients of degree 4 and 5 hard-coded in y1f_near_root().  */
+  { 0x1.f7f16ap+0, 0x1.193beep+1, 0x1.2105dcp+1, 0xb.96749p-28,
+    0x8.55241p-4, -0x1.e570bp-4, -0x8.68b61p-8
+    /*, -0x1.28043p-8, 0x2.50e83p-8*/ }, /* 0 */
+  /* For index 1, we use a degree-4 polynomial generated by Sollya, with the
+     coefficient of degree 4 hard-coded in y1f_near_root().  */
+  { 0x1.55c6d2p+2, 0x1.5b7fe4p+2, 0x1.5cf8cap+2, 0x1.3c7822p-24,
+    -0x5.71f158p-4, 0x8.05cb4p-8, 0xd.0b15p-8/*, -0xf.ff6b8p-12*/ }, /* 1 */
+  { 0x1.113c6p+3, 0x1.13127ap+3, 0x1.1387dcp+3, -0x1.f3ad8ep-24,
+    0x4.57e66p-4, -0x4.0afb58p-8, -0xb.29207p-8 }, /* 2 */
+  { 0x1.76e7dep+3, 0x1.77f914p+3, 0x1.786a6ap+3, -0xd.5608fp-28,
+    -0x3.b829d4p-4, 0x2.8852cp-8, 0x9.b70e3p-8 }, /* 3 */
+  { 0x1.dc2794p+3, 0x1.dcb7d8p+3, 0x1.dd032p+3, -0xe.a7c04p-28,
+    0x3.4e0458p-4, -0x1.c64b18p-8, -0x8.b0e7fp-8 }, /* 4 */
+  { 0x1.20874p+4, 0x1.20b1c6p+4, 0x1.20c71p+4, 0x1.c2626p-24,
+    -0x3.00f03cp-4, 0x1.54f806p-8, 0x7.f9cf9p-8 }, /* 5 */
+  { 0x1.52d848p+4, 0x1.530254p+4, 0x1.531962p+4, -0x1.9503ecp-24,
+    0x2.c5b29cp-4, -0x1.0bf28p-8, -0x7.562e58p-8 }, /* 6 */
+  { 0x1.851e64p+4, 0x1.854fa4p+4, 0x1.85679p+4, -0x2.8d40fcp-24,
+    -0x2.96547p-4, 0xd.9c38bp-12, 0x6.dcbf8p-8 }, /* 7 */
+  { 0x1.b7808ep+4, 0x1.b79acep+4, 0x1.b7b2a8p+4, -0x2.36df5cp-24,
+    0x2.6f55ap-4, -0xb.57f9fp-12, -0x6.82569p-8 }, /* 8 */
+  { 0x1.e9c8fp+4, 0x1.e9e48p+4, 0x1.e9f24p+4, 0xd.e2eb7p-28,
+    -0x2.4e8104p-4, 0x9.a4be2p-12, 0x6.2541fp-8 }, /* 9 */
+  { 0x1.0e0808p+5, 0x1.0e169p+5, 0x1.0e1d92p+5, -0x2.3070f4p-24,
+    0x2.325e4cp-4, -0x8.53604p-12, -0x5.ca03a8p-8 }, /* 10 */
+  { 0x1.272e08p+5, 0x1.273a7cp+5, 0x1.2741fcp+5, -0x3.525508p-24,
+    -0x2.19e7dcp-4, 0x7.49d1dp-12, 0x5.9cb02p-8 }, /* 11 */
+  { 0x1.404ec6p+5, 0x1.405e18p+5, 0x1.4065cep+5, -0xe.6e158p-28,
+    0x2.046174p-4, -0x6.71b3dp-12, -0x5.4c3c8p-8 }, /* 12 */
+  { 0x1.5971dcp+5, 0x1.598178p+5, 0x1.598592p+5, 0x1.e72698p-24,
+    -0x1.f13fb2p-4, 0x5.c0f938p-12, 0x5.28ca78p-8 }, /* 13 */
+  { 0x1.729c4ep+5, 0x1.72a4a8p+5, 0x1.72a8eap+5, -0x1.5bed9cp-24,
+    0x1.e018dcp-4, -0x5.2f11e8p-12, -0x5.16ce48p-8 }, /* 14 */
+  { 0x1.8bbf4ep+5, 0x1.8bc7b2p+5, 0x1.8bcc1p+5, -0x3.6b654cp-24,
+    -0x1.d09b2p-4, 0x4.b1747p-12, 0x4.bd22fp-8 }, /* 15 */
+  { 0x1.a4e272p+5, 0x1.a4ea9ap+5, 0x1.a4eef4p+5, 0x1.6f11bp-24,
+    0x1.c28612p-4, -0x4.47462p-12, -0x4.947c5p-8 }, /* 16 */
+  { 0x1.be08bep+5, 0x1.be0d68p+5, 0x1.be1088p+5, -0x2.0bc074p-24,
+    -0x1.b5a622p-4, 0x3.ed52d4p-12, 0x4.b76fc8p-8 }, /* 17 */
+  { 0x1.d7272ap+5, 0x1.d7301ep+5, 0x1.d734aep+5, -0x2.87dd4p-24,
+    0x1.a9d184p-4, -0x3.9cf494p-12, -0x4.6303ep-8 }, /* 18 */
+  { 0x1.f0499ap+5, 0x1.f052c4p+5, 0x1.f05758p+5, -0x2.fb964p-24,
+    -0x1.9ee5eep-4, 0x3.5800dp-12, 0x4.4e9f9p-8 }, /* 19 */
+  { 0x1.04b63ap+6, 0x1.04baacp+6, 0x1.04bc92p+6, 0x2.cf5adp-24,
+    0x1.94c6f4p-4, -0x3.1a83e4p-12, -0x4.2311fp-8 }, /* 20 */
+  { 0x1.1146dp+6, 0x1.114beep+6, 0x1.114e12p+6, 0x3.6766fp-24,
+    -0x1.8b5cccp-4, 0x2.e4a4e4p-12, 0x4.20bf9p-8 }, /* 21 */
+  { 0x1.1dda8cp+6, 0x1.1ddd2cp+6, 0x1.1dde7ap+6, 0x3.501424p-24,
+    0x1.829356p-4, -0x2.b47524p-12, -0x4.04bf18p-8 }, /* 22 */
+  { 0x1.2a6bcp+6, 0x1.2a6e64p+6, 0x1.2a6faap+6, -0x5.c05808p-24,
+    -0x1.7a597ep-4, 0x2.8a0498p-12, 0x4.187258p-8 }, /* 23 */
+  { 0x1.36fcd6p+6, 0x1.36ff96p+6, 0x1.3700f6p+6, 0x7.1e1478p-28,
+    0x1.72a09ap-4, -0x2.61a7fp-12, -0x3.c0b54p-8 }, /* 24 */
+  { 0x1.438f46p+6, 0x1.4390c4p+6, 0x1.4392p+6, 0x3.e36e6cp-24,
+    -0x1.6b5c06p-4, 0x2.3f612p-12, 0x4.18f868p-8 }, /* 25 */
+  { 0x1.501f4cp+6, 0x1.5021fp+6, 0x1.50235p+6, 0x1.3f9e5ap-24,
+    0x1.6480c4p-4, -0x2.1f28fcp-12, -0x3.bb4e3cp-8 }, /* 26 */
+  { 0x1.5cb07cp+6, 0x1.5cb318p+6, 0x1.5cb464p+6, -0x2.39e41cp-24,
+    -0x1.5e0544p-4, 0x2.0189f4p-12, 0x3.8b55acp-8 }, /* 27 */
+  { 0x1.694166p+6, 0x1.69443cp+6, 0x1.694594p+6, -0x2.912f84p-24,
+    0x1.57e12p-4, -0x1.e6fabep-12, -0x3.850174p-8 }, /* 28 */
+  { 0x1.75d27cp+6, 0x1.75d55ep+6, 0x1.75d67ep+6, 0x3.d5b00cp-24,
+    -0x1.520ceep-4, 0x1.d0286ep-12, 0x3.8e7d1p-8 }, /* 29 */
+  { 0x1.82653ep+6, 0x1.82667ep+6, 0x1.82674p+6, -0x3.1726ecp-24,
+    0x1.4c8222p-4, -0x1.b98206p-12, -0x3.f34978p-8 }, /* 30 */
+  { 0x1.8ef4b4p+6, 0x1.8ef79cp+6, 0x1.8ef888p+6, 0x1.949e22p-24,
+    -0x1.473ae6p-4, 0x1.a47388p-12, 0x3.69eefcp-8 }, /* 31 */
+  { 0x1.9b8728p+6, 0x1.9b88b8p+6, 0x1.9b896cp+6, -0x5.5553bp-28,
+    0x1.42320ap-4, -0x1.90f0b8p-12, -0x3.6565p-8 }, /* 32 */
+  { 0x1.a8183cp+6, 0x1.a819d2p+6, 0x1.a81aecp+6, 0x3.2df7ecp-28,
+    -0x1.3d62e4p-4, 0x1.7dae28p-12, 0x2.9eb128p-8 }, /* 33 */
+  { 0x1.b4aa1cp+6, 0x1.b4aaeap+6, 0x1.b4abb8p+6, -0x1.e13fcep-24,
+    0x1.38c948p-4, -0x1.6eb0ecp-12, -0x1.f9ddf8p-8 }, /* 34 */
+  { 0x1.c13a7ap+6, 0x1.c13c02p+6, 0x1.c13cbp+6, -0x3.ad9974p-24,
+    -0x1.34616ep-4, 0x1.5e36ecp-12, 0x2.a9fc5p-8 }, /* 35 */
+  { 0x1.cdcb76p+6, 0x1.cdcd16p+6, 0x1.cdcde4p+6, -0x3.6977e8p-24,
+    0x1.3027fp-4, -0x1.4f703p-12, -0x2.9817d4p-8 }, /* 36 */
+  { 0x1.da5cdep+6, 0x1.da5e2ap+6, 0x1.da5efp+6, 0x4.654cbp-24,
+    -0x1.2c19b6p-4, 0x1.455982p-12, 0x3.f1c564p-8 }, /* 37 */
+  { 0x1.e6edccp+6, 0x1.e6ef3ep+6, 0x1.e6f00ap+6, 0x8.825c8p-32,
+    0x1.2833eep-4, -0x1.39097p-12, -0x3.b2646p-8 }, /* 38 */
+  { 0x1.f37f72p+6, 0x1.f3805p+6, 0x1.f3812ap+6, -0x2.0d11d8p-28,
+    -0x1.24740ap-4, 0x1.2c16p-12, 0x1.fc3804p-8 }, /* 39 */
+  { 0x1.000842p+7, 0x1.0008bp+7, 0x1.000908p+7, -0x4.4e495p-24,
+    0x1.20d7b6p-4, -0x1.20816p-12, -0x2.d1ebe8p-8 }, /* 40 */
+  { 0x1.06505cp+7, 0x1.065138p+7, 0x1.06518p+7, 0x4.81c1c8p-24,
+    -0x1.1d5ccap-4, 0x1.17ad5ap-12, 0x2.fda33p-8 }, /* 41 */
+  { 0x1.0c98dap+7, 0x1.0c99cp+7, 0x1.0c9a28p+7, -0xe.99386p-28,
+    0x1.1a015p-4, -0x1.0bd50ap-12, -0x2.9dfb68p-8 }, /* 42 */
+  { 0x1.12e212p+7, 0x1.12e248p+7, 0x1.12e29p+7, -0x6.16f1c8p-24,
+    -0x1.16c37ap-4, 0x1.0303dcp-12, 0x4.34316p-8 }, /* 43 */
+  { 0x1.192a68p+7, 0x1.192acep+7, 0x1.192b02p+7, -0x1.129336p-24,
+    0x1.13a19ep-4, -0xf.bd247p-16, -0x3.851d18p-8 }, /* 44 */
+  { 0x1.1f727p+7, 0x1.1f7354p+7, 0x1.1f73ap+7, 0x5.19c09p-24,
+    -0x1.109a32p-4, 0xf.09644p-16, 0x2.d78194p-8 }, /* 45 */
+  { 0x1.25bb8p+7, 0x1.25bbdap+7, 0x1.25bc12p+7, -0x6.497dp-24,
+    0x1.0dabc8p-4, -0xe.a1d25p-16, -0x2.3378bp-8 }, /* 46 */
+  { 0x1.2c04p+7, 0x1.2c046p+7, 0x1.2c04ap+7, 0x4.e4f338p-24,
+    -0x1.0ad512p-4, 0xe.52d84p-16, 0x4.3bfa08p-8 }, /* 47 */
+  { 0x1.324cbp+7, 0x1.324ce6p+7, 0x1.324d4p+7, -0x1.287c58p-24,
+    0x1.0814d4p-4, -0xe.03a95p-16, 0x3.9930ap-12 }, /* 48 */
+  { 0x1.3894f6p+7, 0x1.38956cp+7, 0x1.3895ap+7, -0x4.b594ep-24,
+    -0x1.0569fp-4, 0xd.6787ep-16, 0x4.0a5148p-8 }, /* 49 */
+  { 0x1.3edd98p+7, 0x1.3eddfp+7, 0x1.3ede2ap+7, -0x3.a8f164p-24,
+    0x1.02d354p-4, -0xd.0309dp-16, -0x3.2ebfb4p-8 }, /* 50 */
+  { 0x1.452638p+7, 0x1.452676p+7, 0x1.4526b4p+7, -0x6.12505p-24,
+    -0x1.005004p-4, 0xc.a0045p-16, 0x4.87c67p-8 }, /* 51 */
+  { 0x1.4b6e8p+7, 0x1.4b6efap+7, 0x1.4b6f34p+7, 0x1.8acf4ep-24,
+    0xf.ddf16p-8, -0xc.2d207p-16, -0x1.da6c36p-8 }, /* 52 */
+  { 0x1.51b742p+7, 0x1.51b77ep+7, 0x1.51b7b2p+7, 0x1.39cf86p-24,
+    -0xf.b7faep-8, 0xb.db598p-16, -0x8.945b1p-12 }, /* 53 */
+  { 0x1.57ffc4p+7, 0x1.580002p+7, 0x1.58003cp+7, -0x2.5f8de8p-24,
+    0xf.930fep-8, -0xb.91889p-16, -0xa.30df9p-12 }, /* 54 */
+  { 0x1.5e483p+7, 0x1.5e4886p+7, 0x1.5e48c8p+7, 0x2.073d64p-24,
+    -0xf.6f245p-8, 0xb.4085fp-16, 0x2.128188p-8 }, /* 55 */
+  { 0x1.64908cp+7, 0x1.64910ap+7, 0x1.64912ap+7, -0x4.ed26ep-28,
+    0xf.4c2cep-8, -0xa.fe719p-16, -0x2.9374b8p-8 }, /* 56 */
+  { 0x1.6ad91ep+7, 0x1.6ad98ep+7, 0x1.6ad9cep+7, -0x2.ae5204p-24,
+    -0xf.2a1efp-8, 0xa.aa585p-16, 0x2.1c0834p-8 }, /* 57 */
+  { 0x1.7121cep+7, 0x1.712212p+7, 0x1.712238p+7, 0x6.d72168p-24,
+    0xf.08f09p-8, -0xa.7da49p-16, -0x3.4f5f1cp-8 }, /* 58 */
+  { 0x1.776a0cp+7, 0x1.776a94p+7, 0x1.776accp+7, 0x2.d3f294p-24,
+    -0xe.e8986p-8, 0xa.23ccdp-16, 0x2.2a6678p-8 }, /* 59 */
+  { 0x1.7db2e8p+7, 0x1.7db318p+7, 0x1.7db35ap+7, 0x3.88c0fp-24,
+    0xe.c90d7p-8, -0x9.eaeap-16, -0x2.86438cp-8 }, /* 60 */
+  { 0x1.83fb56p+7, 0x1.83fb9ap+7, 0x1.83fbep+7, 0x3.d94d34p-24,
+    -0xe.aa478p-8, 0x9.abac7p-16, 0x1.ac2d84p-8 }, /* 61 */
+  { 0x1.8a43e8p+7, 0x1.8a441ep+7, 0x1.8a446p+7, 0x4.66b7ep-24,
+    0xe.8c3e9p-8, -0x9.87682p-16, -0x7.9ab4a8p-12 }, /* 62 */
+  { 0x1.908c6p+7, 0x1.908cap+7, 0x1.908ce6p+7, 0xf.f7ac9p-28,
+    -0xe.6eeb6p-8, 0x9.4423p-16, 0x4.54c4d8p-8 }, /* 63 */
+};
+
+/* Formula page 5 of https://www.cl.cam.ac.uk/~jrh13/papers/bessel.pdf:
+   y1f(x) ~ sqrt(2/(pi*x))*beta1(x)*sin(x-3pi/4-alpha1(x))
+   where beta1(x) = 1 + 3/(16*x^2) - 99/(512*x^4)
+   and alpha1(x) = -3/(8*x) + 21/(128*x^3) - 1899/(5120*x^5).  */
+static float
+y1f_asympt (float x)
+{
+  float cst = 0xc.c422ap-4; /* sqrt(2/pi) rounded to nearest */
+  double y = 1.0 / (double) x;
+  double y2 = y * y;
+  double beta1 = 1.0f + y2 * (0x3p-4 - 0x3.18p-4 * y2);
+  double alpha1;
+  alpha1 = y * (-0x6p-4 + y2 * (0x2.ap-4 - 0x5.ef33333333334p-4 * y2));
+  double h;
+  int n;
+  h = reduce_aux (x, &n, alpha1);
+  n--; /* Subtract pi/2.  */
+  /* Now x - 3pi/4 - alpha1 = h + n*pi/2 mod (2*pi).  */
+  float xr = (float) h;
+  n = n & 3;
+  float t = cst / sqrtf (x) * (float) beta1;
+  if (n == 0)
+    return t * __sinf (xr);
+  else if (n == 2) /* sin(x+pi) = -sin(x) */
+    return -t * __sinf (xr);
+  else if (n == 1) /* sin(x+pi/2) = cos(x) */
+    return t * __cosf (xr);
+  else             /* sin(x+3pi/2) = -cos(x) */
+    return -t * __cosf (xr);
+}
+
+/* Special code for x near a root of y1.
+   z is the value computed by the generic code.
+   For small x, we use a polynomial approximating y1 around its root.
+   For large x, we use an asymptotic formula (y1f_asympt).  */
+static float
+y1f_near_root (float x, float z)
+{
+  float index_f;
+  int index;
+
+  index_f = roundf ((x - FIRST_ZERO_Y1) / (float) M_PI);
+  if (index_f >= SMALL_SIZE)
+    return y1f_asympt (x);
+  index = (int) index_f;
+  const float *p = Py[index];
+  float x0 = p[0];
+  float x1 = p[2];
+  /* If not in the interval [x0,x1] around xmid, return the value z.  */
+  if (! (x0 <= x && x <= x1))
+    return z;
+  float xmid = p[1];
+  float y = x - xmid, p6;
+  if (index == 0)
+    p6 = p[6] + y * (-0x1.28043p-8 + y * 0x2.50e83p-8);
+  else if (index == 1)
+    p6 = p[6] + y * -0xf.ff6b8p-12;
+  else
+    p6 = p[6];
+  return p[3] + y * (p[4] + y * (p[5] + y * p6));
+}
+
 float
 __ieee754_y1f(float x)
 {
@@ -118,16 +560,18 @@  __ieee754_y1f(float x)
 	if(__builtin_expect(ix==0, 0))
 		return -1/zero; /* -inf and divide by zero exception.  */
 	if(__builtin_expect(hx<0, 0)) return zero/(zero*x);
-	if(ix >= 0x40000000) {  /* |x| >= 2.0 */
+        if (ix >= 0x3fe0dfbc) { /* |x| >= 0x1.c1bf78p+0 */
 		SET_RESTORE_ROUNDF (FE_TONEAREST);
 		__sincosf (x, &s, &c);
 		ss = -s-c;
 		cc = s-c;
-		if(ix<0x7f000000) {  /* make sure x+x not overflow */
-		    z = __cosf(x+x);
-		    if ((s*c)>zero) cc = z/ss;
-		    else            ss = z/cc;
-		}
+                if (ix >= 0x7f000000)
+		  /* x >= 2^127: use asymptotic expansion.  */
+                  return y1f_asympt (x);
+                /* Now we are sure that x+x cannot overflow.  */
+                z = __cosf(x+x);
+                if ((s*c)>zero) cc = z/ss;
+                else            ss = z/cc;
 	/* y1(x) = sqrt(2/(pi*x))*(p1(x)*sin(x0)+q1(x)*cos(x0))
 	 * where x0 = x-3pi/4
 	 *      Better formula:
@@ -139,12 +583,20 @@  __ieee754_y1f(float x)
 	 *              sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x))
 	 * to compute the worse one.
 	 */
-		if(ix>0x5c000000) z = (invsqrtpi*ss)/sqrtf(x);
-		else {
-		    u = ponef(x); v = qonef(x);
-		    z = invsqrtpi*(u*ss+v*cc)/sqrtf(x);
-		}
-		return z;
+                if (ix <= 0x5c000000)
+                  {
+                    u = ponef(x); v = qonef(x);
+                    ss = u*ss+v*cc;
+                  }
+                z = (invsqrtpi * ss) / sqrtf(x);
+                float threshold = 0x1.3e014cp-2;
+                /* The following threshold is optimal: for x=0x1.f7f16ap+0
+                   and rounding upwards, |ss|=-0x1.3e014cp-2 and z is 11 ulps
+                   far from the correctly rounded value.  */
+                if (fabsf (ss) > threshold)
+                  return z;
+                else
+                  return y1f_near_root (x, z);
 	}
 	if(__builtin_expect(ix<=0x33000000, 0)) {    /* x < 2**-25 */
 	    z = -tpi / x;
@@ -152,6 +604,7 @@  __ieee754_y1f(float x)
 		__set_errno (ERANGE);
 	    return z;
 	}
+        /* Now 2**-25 <= x < 0x1.c1bf78p+0.  */
 	z = x*x;
 	u = U0[0]+z*(U0[1]+z*(U0[2]+z*(U0[3]+z*U0[4])));
 	v = one+z*(V0[0]+z*(V0[1]+z*(V0[2]+z*(V0[3]+z*V0[4]))));
@@ -159,7 +612,7 @@  __ieee754_y1f(float x)
 }
 libm_alias_finite (__ieee754_y1f, __y1f)
 
-/* For x >= 8, the asymptotic expansions of pone is
+/* For x >= 8, the asymptotic expansion of pone is
  *	1 + 15/128 s^2 - 4725/2^15 s^4 - ...,	where s = 1/x.
  * We approximate pone by
  *	pone(x) = 1 + (R/S)
@@ -252,8 +705,7 @@  ponef(float x)
 	return one+ r/s;
 }
 
-
-/* For x >= 8, the asymptotic expansions of qone is
+/* For x >= 8, the asymptotic expansion of qone is
  *	3/8 s - 105/1024 s^3 - ..., where s = 1/x.
  * We approximate pone by
  *	qone(x) = s*(0.375 + (R/S))
@@ -340,10 +792,10 @@  qonef(float x)
 	GET_FLOAT_WORD(ix,x);
 	ix &= 0x7fffffff;
 	/* ix >= 0x40000000 for all calls to this function.  */
-	if(ix>=0x40200000)     {p = qr8; q= qs8;}
-	else if(ix>=0x40f71c58){p = qr5; q= qs5;}
-	else if(ix>=0x4036db68){p = qr3; q= qs3;}
-	else {p = qr2; q= qs2;}
+	if(ix>=0x41000000)     {p = qr8; q= qs8;} /* x >= 8 */
+	else if(ix>=0x40f71c58){p = qr5; q= qs5;} /* x >= 7.722209930e+00 */
+	else if(ix>=0x4036db68){p = qr3; q= qs3;} /* x >= 2.857141495e+00 */
+	else {p = qr2; q= qs2;}                   /* x >= 2 */
 	z = one/(x*x);
 	r = p[0]+z*(p[1]+z*(p[2]+z*(p[3]+z*(p[4]+z*p[5]))));
 	s = one+z*(q[0]+z*(q[1]+z*(q[2]+z*(q[3]+z*(q[4]+z*q[5])))));
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 173388b92a..3010e10c75 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1310,50 +1310,50 @@  float128: 1
 ldouble: 3
 
 Function: "j0":
-double: 2
-float: 8
+double: 3
+float: 9
 float128: 2
-ldouble: 2
+ldouble: 5
 
 Function: "j0_downward":
-double: 2
-float: 4
+double: 6
+float: 9
 float128: 4
 ldouble: 12
 
 Function: "j0_towardzero":
-double: 5
-float: 6
+double: 7
+float: 9
 float128: 4
 ldouble: 16
 
 Function: "j0_upward":
-double: 4
-float: 5
+double: 9
+float: 8
 float128: 5
-ldouble: 6
+ldouble: 14
 
 Function: "j1":
-double: 2
-float: 8
+double: 4
+float: 9
 float128: 4
-ldouble: 3
+ldouble: 6
 
 Function: "j1_downward":
 double: 3
-float: 5
+float: 8
 float128: 4
 ldouble: 7
 
 Function: "j1_towardzero":
-double: 3
-float: 2
+double: 4
+float: 8
 float128: 4
 ldouble: 7
 
 Function: "j1_upward":
-double: 3
-float: 4
+double: 9
+float: 9
 float128: 3
 ldouble: 6
 
@@ -1706,49 +1706,49 @@  ldouble: 5
 
 Function: "y0":
 double: 2
-float: 6
+float: 8
 float128: 3
-ldouble: 1
+ldouble: 10
 
 Function: "y0_downward":
 double: 3
-float: 4
+float: 8
 float128: 4
 ldouble: 10
 
 Function: "y0_towardzero":
 double: 3
-float: 3
+float: 8
 float128: 3
-ldouble: 8
+ldouble: 9
 
 Function: "y0_upward":
 double: 2
-float: 5
+float: 8
 float128: 3
 ldouble: 9
 
 Function: "y1":
 double: 3
-float: 2
+float: 9
 float128: 2
 ldouble: 2
 
 Function: "y1_downward":
-double: 3
-float: 2
+double: 6
+float: 8
 float128: 4
-ldouble: 7
+ldouble: 11
 
 Function: "y1_towardzero":
 double: 3
-float: 2
+float: 9
 float128: 2
 ldouble: 9
 
 Function: "y1_upward":
-double: 5
-float: 2
+double: 6
+float: 9
 float128: 5
 ldouble: 9
 
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index 91f2c4c80f..9f85f66d4c 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -1062,44 +1062,44 @@  double: 1
 ldouble: 1
 
 Function: "j0":
-double: 2
-float: 8
+double: 3
+float: 9
 ldouble: 2
 
 Function: "j0_downward":
-double: 2
-float: 4
-ldouble: 4
+double: 6
+float: 9
+ldouble: 9
 
 Function: "j0_towardzero":
-double: 5
-float: 6
-ldouble: 4
+double: 7
+float: 9
+ldouble: 9
 
 Function: "j0_upward":
-double: 4
-float: 5
-ldouble: 5
+double: 9
+float: 8
+ldouble: 7
 
 Function: "j1":
-double: 2
-float: 8
+double: 4
+float: 9
 ldouble: 4
 
 Function: "j1_downward":
 double: 3
-float: 5
-ldouble: 4
+float: 8
+ldouble: 6
 
 Function: "j1_towardzero":
-double: 3
-float: 2
-ldouble: 4
+double: 4
+float: 8
+ldouble: 9
 
 Function: "j1_upward":
-double: 3
-float: 4
-ldouble: 3
+double: 9
+float: 9
+ldouble: 9
 
 Function: "jn":
 double: 4
@@ -1348,42 +1348,42 @@  ldouble: 4
 
 Function: "y0":
 double: 2
-float: 6
+float: 8
 ldouble: 3
 
 Function: "y0_downward":
 double: 3
-float: 4
-ldouble: 4
+float: 8
+ldouble: 7
 
 Function: "y0_towardzero":
 double: 3
-float: 3
+float: 8
 ldouble: 3
 
 Function: "y0_upward":
 double: 3
-float: 5
-ldouble: 3
+float: 8
+ldouble: 4
 
 Function: "y1":
 double: 3
-float: 2
-ldouble: 2
+float: 9
+ldouble: 5
 
 Function: "y1_downward":
-double: 3
-float: 2
-ldouble: 4
+double: 6
+float: 8
+ldouble: 5
 
 Function: "y1_towardzero":
 double: 3
-float: 2
+float: 9
 ldouble: 2
 
 Function: "y1_upward":
 double: 7
-float: 2
+float: 9
 ldouble: 5
 
 Function: "yn":
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index 3304dcf85e..15967c2102 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -1064,43 +1064,43 @@  ldouble: 1
 
 Function: "j0":
 double: 2
-float: 8
+float: 9
 ldouble: 2
 
 Function: "j0_downward":
-double: 2
-float: 4
-ldouble: 4
+double: 5
+float: 9
+ldouble: 9
 
 Function: "j0_towardzero":
-double: 4
-float: 5
-ldouble: 4
+double: 6
+float: 9
+ldouble: 9
 
 Function: "j0_upward":
-double: 4
-float: 5
-ldouble: 5
+double: 9
+float: 9
+ldouble: 7
 
 Function: "j1":
-double: 2
+double: 4
 float: 9
 ldouble: 4
 
 Function: "j1_downward":
-double: 3
-float: 5
-ldouble: 4
+double: 5
+float: 8
+ldouble: 6
 
 Function: "j1_towardzero":
-double: 3
-float: 2
-ldouble: 4
+double: 4
+float: 8
+ldouble: 9
 
 Function: "j1_upward":
-double: 3
-float: 5
-ldouble: 3
+double: 9
+float: 9
+ldouble: 9
 
 Function: "jn":
 double: 4
@@ -1360,42 +1360,42 @@  ldouble: 4
 
 Function: "y0":
 double: 3
-float: 8
+float: 9
 ldouble: 3
 
 Function: "y0_downward":
 double: 3
-float: 6
-ldouble: 4
+float: 9
+ldouble: 7
 
 Function: "y0_towardzero":
-double: 3
-float: 3
+double: 4
+float: 9
 ldouble: 3
 
 Function: "y0_upward":
 double: 3
-float: 6
-ldouble: 3
+float: 9
+ldouble: 4
 
 Function: "y1":
 double: 3
-float: 2
-ldouble: 2
+float: 9
+ldouble: 5
 
 Function: "y1_downward":
-double: 3
-float: 2
-ldouble: 4
+double: 6
+float: 9
+ldouble: 5
 
 Function: "y1_towardzero":
 double: 3
-float: 2
+float: 9
 ldouble: 2
 
 Function: "y1_upward":
 double: 7
-float: 2
+float: 9
 ldouble: 5
 
 Function: "yn":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index bd1fa63702..0edb95e14c 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1317,50 +1317,50 @@  ldouble: 1
 
 Function: "j0":
 double: 2
-float: 8
+float: 9
 float128: 2
-ldouble: 2
+ldouble: 8
 
 Function: "j0_downward":
-double: 2
-float: 4
-float128: 4
+double: 5
+float: 9
+float128: 9
 ldouble: 6
 
 Function: "j0_towardzero":
-double: 5
-float: 6
-float128: 4
+double: 6
+float: 9
+float128: 9
 ldouble: 6
 
 Function: "j0_upward":
-double: 4
-float: 5
-float128: 5
+double: 9
+float: 9
+float128: 7
 ldouble: 6
 
 Function: "j1":
-double: 2
+double: 4
 float: 9
 float128: 4
-ldouble: 5
+ldouble: 9
 
 Function: "j1_downward":
-double: 3
-float: 5
-float128: 4
-ldouble: 4
+double: 6
+float: 8
+float128: 6
+ldouble: 8
 
 Function: "j1_towardzero":
-double: 3
-float: 2
-float128: 4
+double: 4
+float: 9
+float128: 9
 ldouble: 4
 
 Function: "j1_upward":
-double: 3
-float: 5
-float128: 3
+double: 9
+float: 9
+float128: 9
 ldouble: 3
 
 Function: "jn":
@@ -1753,27 +1753,27 @@  ldouble: 5
 
 Function: "y0":
 double: 3
-float: 8
+float: 9
 float128: 3
-ldouble: 1
+ldouble: 2
 
 Function: "y0_downward":
-double: 3
-float: 6
-float128: 4
-ldouble: 5
+double: 4
+float: 9
+float128: 7
+ldouble: 7
 
 Function: "y0_towardzero":
-double: 3
-float: 3
+double: 4
+float: 9
 float128: 3
-ldouble: 6
+ldouble: 8
 
 Function: "y0_upward":
 double: 3
-float: 6
-float128: 3
-ldouble: 5
+float: 9
+float128: 4
+ldouble: 7
 
 Function: "y1":
 double: 6
@@ -1782,14 +1782,14 @@  float128: 5
 ldouble: 3
 
 Function: "y1_downward":
-double: 3
-float: 2
+double: 6
+float: 9
 float128: 5
 ldouble: 7
 
 Function: "y1_towardzero":
 double: 4
-float: 5
+float: 9
 float128: 6
 ldouble: 5