[3/3] Fix large ulp error in pow without fma very near 1.0

Message ID 48429409-1742-ea7b-699c-6870bc90b5a4@arm.com
State Superseded
Headers show
Series
  • Updates to the new math code
Related show

Commit Message

Szabolcs Nagy July 4, 2018, 3:54 p.m.

Patch

From 068894aab1088aeda0a411d7f609ba38928881ff Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Tue, 3 Jul 2018 13:05:31 +0100
Subject: [PATCH 3/3] Fix large ulp error in pow without fma very near 1.0

The !__HAVE_FAST_FMA code path split r = z/c - 1 into r = rhi + rlo such
that when z = 1-tiny and c = 1 then rlo and rhi could have much larger
magnitude than r which later caused large rounding errors.

So do a nearest rounding instead of truncation at the split.

In newlib with default settings this was observable on some arm targets
that enable the new math code but has no fma.
---
 newlib/libm/common/pow.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/newlib/libm/common/pow.c b/newlib/libm/common/pow.c
index 7d8060751..4863821a5 100644
--- a/newlib/libm/common/pow.c
+++ b/newlib/libm/common/pow.c
@@ -79,11 +79,13 @@  log_inline (uint64_t ix, double_t *tail)
   logc = T[i].logc;
   logctail = T[i].logctail;
 
-  /* r = z/c - 1, arranged to be exact.  */
+  /* Note: 1/c is j/N or j/N/2 where j is an integer in [N,2N) and
+     |z/c - 1| < 1/N, so r = z/c - 1 is exactly representible.  */
 #if __HAVE_FAST_FMA
   r = fma (z, invc, -1.0);
 #else
-  double_t zhi = asdouble (iz & (-1ULL << 32));
+  /* Split z such that rhi, rlo and rhi*rhi are exact and |rlo| <= |r|.  */
+  double_t zhi = asdouble ((iz + (1ULL << 31)) & (-1ULL << 32));
   double_t zlo = z - zhi;
   double_t rhi = zhi * invc - 1.0;
   double_t rlo = zlo * invc;
-- 
2.14.1