fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is tiny (v2)

Message ID mw4kpqx0ph.fsf@tomate.loria.fr
State Superseded
Headers show
Series
  • fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is tiny (v2)
Related show

Commit Message

Paul Zimmermann July 29, 2020, 7:03 a.m.
Dear Adhemerval,

thank you for your review. Here is a v2 with all fixes. I did fix the
indentation issue, but did not see the "open brackets on next line" issue:
in other places of this file, we have "else {" with the open brackets on the
same line. I also did add an entry in math/auto-libm-test-in that corresponds
to the larger error for the new code path. No adjustment is needed however,
since the new code is more accurate.

Best regards,
Paul

From a4fe8c3e6c172c7eea198f3225efb05b6afe0f65 Mon Sep 17 00:00:00 2001
From: Paul Zimmermann <Paul.Zimmermann@inria.fr>

Date: Wed, 29 Jul 2020 08:59:12 +0200
Subject: [PATCH] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is
 tiny (v2)

---
 math/auto-libm-test-in         |  2 ++
 sysdeps/ieee754/flt-32/e_j0f.c | 16 ++++++++++++++++
 2 files changed, 18 insertions(+)

-- 
2.27.0

Comments

Adhemerval Zanella via Libc-alpha July 29, 2020, 8:25 p.m. | #1
On 29/07/2020 04:03, Paul Zimmermann wrote:
>        Dear Adhemerval,

> 

> thank you for your review. Here is a v2 with all fixes. I did fix the

> indentation issue, but did not see the "open brackets on next line" issue:

> in other places of this file, we have "else {" with the open brackets on the

> same line. I also did add an entry in math/auto-libm-test-in that corresponds

> to the larger error for the new code path. No adjustment is needed however,

> since the new code is more accurate.


Some code were imported from other projects and does not follow the glibc
code guidelines.  I am not sure which is the correct strategy to follow
in such cases, but even this patch does not really follow the file 
indentation (for instance, 'else' are added after the close bracket, so
no new line in this case).

In any case, following the already set file style should be fine.

> 

> Best regards,

> Paul

> 

> From a4fe8c3e6c172c7eea198f3225efb05b6afe0f65 Mon Sep 17 00:00:00 2001

> From: Paul Zimmermann <Paul.Zimmermann@inria.fr>

> Date: Wed, 29 Jul 2020 08:59:12 +0200

> Subject: [PATCH] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is

>  tiny (v2)

> 

> ---

>  math/auto-libm-test-in         |  2 ++

>  sysdeps/ieee754/flt-32/e_j0f.c | 16 ++++++++++++++++

>  2 files changed, 18 insertions(+)

> 

> diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in

> index 4414e54d93..5d488a8711 100644

> --- a/math/auto-libm-test-in

> +++ b/math/auto-libm-test-in

> @@ -5748,6 +5748,8 @@ j0 0x1p16382

>  j0 0x1p16383

>  # the next value generates larger error bounds on x86_64 (binary32)

>  j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc

> +# the next value exercises the flt-32 code path for x >= 2^127

> +j0 0x8.2f4ecp+124

>  

>  j1 -1.0

>  j1 0.0


Does it require any ULP adjustment (at least on the architecture you
tested it)?

> diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c

> index c89b9f2688..8540d00b7b 100644

> --- a/sysdeps/ieee754/flt-32/e_j0f.c

> +++ b/sysdeps/ieee754/flt-32/e_j0f.c

> @@ -56,6 +56,22 @@ __ieee754_j0f(float x)

>  		    if ((s*c)<zero) cc = z/ss;

>  		    else	    ss = z/cc;

>  		}

> +                else {


No new line for the else ( '} else {' ).

> +                  /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)


s/we/We.

> +                     is very near to 0, and use the identity

> +                     sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get

> +                     sin(x) + cos(x) with extra accuracy */


Period and double space prior comment close ('[...] accuracy.  */').

> +                  float x0 = 0xe.d4108p+124f;

> +                  float y = x - x0; /* exact */

> +                  /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */

> +                  z = __sinf (y);

> +                  float eps = 0x1.5f263ep-24f;

> +                  /* cos(x0) ~ -sin(x0) + eps */

> +                  z += eps * __cosf (x);

> +                  /* now z ~ (sin(x)-cos(x))*cos(x0) */

> +                  float cosx0 = -0xb.504f3p-4f;

> +                  cc = z / cosx0;

> +                }

>  	/*

>  	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)

>  	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)

>
Paul Zimmermann July 30, 2020, 7:20 a.m. | #2
Dear Adhemerval,

thank you again for your useful comments.

> Does it require any ULP adjustment (at least on the architecture you

> tested it)?


I reset to 0 the j0/float values in sysdeps/x86_64/fpu/libm-test-ulps,
then ran "make regen-ulps", and got:

$ diff ../sysdeps/x86_64/fpu/libm-test-ulps /localdisk/zimmerma/glibc/build/math/NewUlps
1315c1315
< float: 0
---
> float: 8

1321c1321
< float: 0
---
> float: 4

1327c1327
< float: 0
---
> float: 5

1333c1333
< float: 0
---
> float: 5


which were the original values except for j0_towardzero (it was 6).
However the same applies to "master", so this is independent from my change.
Anyway, I decreased the value from 6 to 5.

Below is a v3.

Paul

From 34d336cf2e8495bba21254157be13e02f52d934a Mon Sep 17 00:00:00 2001
From: Paul Zimmermann <Paul.Zimmermann@inria.fr>

Date: Thu, 30 Jul 2020 09:16:36 +0200
Subject: [PATCH] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is
 tiny (v3)

---
 math/auto-libm-test-in            |  2 ++
 sysdeps/ieee754/flt-32/e_j0f.c    | 17 ++++++++++++++++-
 sysdeps/x86_64/fpu/libm-test-ulps |  2 +-
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in
index 4414e54d93..5d488a8711 100644
--- a/math/auto-libm-test-in
+++ b/math/auto-libm-test-in
@@ -5748,6 +5748,8 @@ j0 0x1p16382
 j0 0x1p16383
 # the next value generates larger error bounds on x86_64 (binary32)
 j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc
+# the next value exercises the flt-32 code path for x >= 2^127
+j0 0x8.2f4ecp+124
 
 j1 -1.0
 j1 0.0
diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c
index c89b9f2688..91e8de8fe3 100644
--- a/sysdeps/ieee754/flt-32/e_j0f.c
+++ b/sysdeps/ieee754/flt-32/e_j0f.c
@@ -55,7 +55,22 @@ __ieee754_j0f(float x)
 		    z = -__cosf(x+x);
 		    if ((s*c)<zero) cc = z/ss;
 		    else	    ss = z/cc;
-		}
+		} else {
+                  /* We subtract (exactly) a value x0 such that cos(x0)+sin(x0)
+                     is very near to 0, and use the identity
+                     sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
+                     sin(x) + cos(x) with extra accuracy. */
+                  float x0 = 0xe.d4108p+124f;
+                  float y = x - x0; /* exact */
+                  /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
+                  z = __sinf (y);
+                  float eps = 0x1.5f263ep-24f;
+                  /* cos(x0) ~ -sin(x0) + eps */
+                  z += eps * __cosf (x);
+                  /* now z ~ (sin(x)-cos(x))*cos(x0) */
+                  float cosx0 = -0xb.504f3p-4f;
+                  cc = z / cosx0;
+                }
 	/*
 	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
 	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index d71d86d961..2561243fb7 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1324,7 +1324,7 @@ ldouble: 4
 
 Function: "j0_towardzero":
 double: 5
-float: 6
+float: 5
 float128: 2
 ldouble: 5
 
-- 
2.27.0
Adhemerval Zanella via Libc-alpha July 30, 2020, 8:53 a.m. | #3
* Paul Zimmermann:

>        Dear Adhemerval,

>

> thank you again for your useful comments.

>

>> Does it require any ULP adjustment (at least on the architecture you

>> tested it)?

>

> I reset to 0 the j0/float values in sysdeps/x86_64/fpu/libm-test-ulps,

> then ran "make regen-ulps", and got:

>

> $ diff ../sysdeps/x86_64/fpu/libm-test-ulps /localdisk/zimmerma/glibc/build/math/NewUlps

> 1315c1315

> < float: 0

> ---

>> float: 8

> 1321c1321

> < float: 0

> ---

>> float: 4

> 1327c1327

> < float: 0

> ---

>> float: 5

> 1333c1333

> < float: 0

> ---

>> float: 5

>

> which were the original values except for j0_towardzero (it was 6).

> However the same applies to "master", so this is independent from my change.

> Anyway, I decreased the value from 6 to 5.


I suggest to leave it at 6, other CPU variants may still need the 6
there.

Thanks,
Florian

Patch

diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in
index 4414e54d93..5d488a8711 100644
--- a/math/auto-libm-test-in
+++ b/math/auto-libm-test-in
@@ -5748,6 +5748,8 @@  j0 0x1p16382
 j0 0x1p16383
 # the next value generates larger error bounds on x86_64 (binary32)
 j0 0x2.602774p+0 xfail-rounding:ibm128-libgcc
+# the next value exercises the flt-32 code path for x >= 2^127
+j0 0x8.2f4ecp+124
 
 j1 -1.0
 j1 0.0
diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c
index c89b9f2688..8540d00b7b 100644
--- a/sysdeps/ieee754/flt-32/e_j0f.c
+++ b/sysdeps/ieee754/flt-32/e_j0f.c
@@ -56,6 +56,22 @@  __ieee754_j0f(float x)
 		    if ((s*c)<zero) cc = z/ss;
 		    else	    ss = z/cc;
 		}
+                else {
+                  /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)
+                     is very near to 0, and use the identity
+                     sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
+                     sin(x) + cos(x) with extra accuracy */
+                  float x0 = 0xe.d4108p+124f;
+                  float y = x - x0; /* exact */
+                  /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
+                  z = __sinf (y);
+                  float eps = 0x1.5f263ep-24f;
+                  /* cos(x0) ~ -sin(x0) + eps */
+                  z += eps * __cosf (x);
+                  /* now z ~ (sin(x)-cos(x))*cos(x0) */
+                  float cosx0 = -0xb.504f3p-4f;
+                  cc = z / cosx0;
+                }
 	/*
 	 * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
 	 * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)