[0/3] libm: Clean up gamma functions

Message ID 20200826170357.2551683-1-keithp@keithp.com
Headers show
Series
  • libm: Clean up gamma functions
Related show

Message

Torbjorn SVENSSON via Newlib Aug. 26, 2020, 5:03 p.m.
This  series  cleans   up  some  API  oddities   and  inaccuracies  in
errno/exception reporting  from the  various gamma  related functions.
With this series applied, newlib now  matches the POSIX spec (and glibc)
for  tgamma/lgamma  functions. I've  left  the  'gamma' functions  BSD
compatible (tgamma instead of lgamma) to avoid changing the API.

[PATCH 1/3] libm: Fix sign value returned from __ieee754_lgamma*_r(-0)

	Following the C spec, gamma(-0) should be -INFINITY rather than
	+INFINITY, so the *signgamp value in lgamma should be -1
	rather than +1 for this case.

[PATCH 2/3] libm: Remove __ieee754_gamma_r variants

	Back in 2002, a patch changed the various gamma functions so
	that the returned abs(gamma) instead of the ln of gamma. This
	left the _r variants in place, with their extra parameter to
	return the sign of the resulting value. This doesn't make any
	sense to me and looks like it was just a mistake. I've changed
	the __ieee754_gamma functions to apply the sign value locally
	and eliminated the gamma_r versions as those are no longer useful.

[PATCH 3/3] libm: Adjust errno/exception values for gamma/lgamma

	There's a pretty detailed discussion in the commit message
	here, the short version is that tgamma and lgamma are
	specified with different errno/exception behaviour for
	one class of inputs (negative integers).

Comments

Torbjorn SVENSSON via Newlib Aug. 27, 2020, 5:43 p.m. | #1
> ------------------------------

> *From:* Newlib <newlib-bounces@sourceware.org> on behalf of Keith Packard

> via Newlib <newlib@sourceware.org>

> *Sent:* Wednesday, August 26, 2020 1:03 PM

> *To:* newlib@sourceware.org <newlib@sourceware.org>

> *Subject:* [PATCH 0/3] libm: Clean up gamma functions

>

> EXTERNAL EMAIL - This email originated from outside of CACI. Do not click

> any links or attachments unless you recognize and trust the sender.

>

>

>

>

>

> This  series  cleans   up  some  API  oddities   and  inaccuracies  in

> errno/exception reporting  from the  various gamma  related functions.

> With this series applied, newlib now  matches the POSIX spec (and glibc)

> for  tgamma/lgamma  functions. I've  left  the  'gamma' functions  BSD

> compatible (tgamma instead of lgamma) to avoid changing the API.

>

> [PATCH 1/3] libm: Fix sign value returned from __ieee754_lgamma*_r(-0)

>

>         Following the C spec, gamma(-0) should be -INFINITY rather than

>         +INFINITY, so the *signgamp value in lgamma should be -1

>         rather than +1 for this case.

>

What C spec?  Neither C99 nor C11 say so that I see for lgamma().  POSIX
makes the direct statement "If x is a non-positive integer, a pole error
shall occur and lgamma(), lgammaf(), and lgammal() shall return +HUGE_VAL,
+HUGE_VALF, and +HUGE_VALL, respectively."  0, regardless of sign, falls
under that.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/lgamma.html
Additionally, the Linux lgamma man page, using GLIBC, says the same thing
as quoted from POSIX (not word for word, but the identical result).
(POSIX does require the +-0 +-INFINITY for tgamma().  Did these get crossed
up?)
Craig

>

> [PATCH 2/3] libm: Remove __ieee754_gamma_r variants

>

>         Back in 2002, a patch changed the various gamma functions so

>         that the returned abs(gamma) instead of the ln of gamma. This

>         left the _r variants in place, with their extra parameter to

>         return the sign of the resulting value. This doesn't make any

>         sense to me and looks like it was just a mistake. I've changed

>         the __ieee754_gamma functions to apply the sign value locally

>         and eliminated the gamma_r versions as those are no longer useful.

>

> [PATCH 3/3] libm: Adjust errno/exception values for gamma/lgamma

>

>         There's a pretty detailed discussion in the commit message

>         here, the short version is that tgamma and lgamma are

>         specified with different errno/exception behaviour for

>         one class of inputs (negative integers).

>
Torbjorn SVENSSON via Newlib Aug. 27, 2020, 11:59 p.m. | #2
C Howland via Newlib <newlib@sourceware.org> writes:

>>         Following the C spec, gamma(-0) should be -INFINITY rather than

>>         +INFINITY, so the *signgamp value in lgamma should be -1

>>         rather than +1 for this case.

>>

> What C spec?  Neither C99 nor C11 say so that I see for lgamma().  POSIX

> makes the direct statement "If x is a non-positive integer, a pole error

> shall occur and lgamma(), lgammaf(), and lgammal() shall return +HUGE_VAL,

> +HUGE_VALF, and +HUGE_VALL, respectively."  0, regardless of sign, falls

> under that.


Right, C99/C11 doesn't mention signgam at all, so POSIX is free to
define it as it likes. That spec only says that it's value is undefined
when the parameter is a negative integer.

C99 says that , tgamma should return -INFINITY for -0. As newlib makes
'gamma' an alias for tgamma, gamma should probably return the same
thing. To make that work with our implementation of tgamma, (which uses
lgamma and the 'signgam' value), that signgam value should be -1.

> https://pubs.opengroup.org/onlinepubs/9699919799/functions/lgamma.html

> Additionally, the Linux lgamma man page, using GLIBC, says the same thing

> as quoted from POSIX (not word for word, but the identical result).

> (POSIX does require the +-0 +-INFINITY for tgamma().  Did these get crossed

> up?)

> Craig


I think we're probably just confusing gamma between lgamma and tgamma --
newlib defines gamma 'oddly' (fixed in 2/3).

I think the only problematic patch in the series is in changing the
semantics of gamma/gammaf and removing the gamma_r/gammaf_r -- we need
to decide how to recover from what looks like a mistake introduced 18
years ago.

-- 
-keith
Brian Inglis Aug. 28, 2020, 2:03 a.m. | #3
On 2020-08-27 17:59, Keith Packard via Newlib wrote:
> C Howland via Newlib <newlib@sourceware.org> writes:

> 

>>> Following the C spec, gamma(-0) should be -INFINITY rather than 

>>> +INFINITY, so the *signgamp value in lgamma should be -1 rather than +1

>>> for this case.

>>>

>> What C spec?  Neither C99 nor C11 say so that I see for lgamma().  POSIX

>> makes the direct statement "If x is a non-positive integer, a pole error

>> shall occur and lgamma(), lgammaf(), and lgammal() shall return +HUGE_VAL,

>> +HUGE_VALF, and +HUGE_VALL, respectively."  0, regardless of sign, falls

>> under that.

> 

> Right, C99/C11 doesn't mention signgam at all, so POSIX is free to

> define it as it likes. That spec only says that it's value is undefined

> when the parameter is a negative integer.

> 

> C99 says that , tgamma should return -INFINITY for -0. As newlib makes

> 'gamma' an alias for tgamma, gamma should probably return the same

> thing. To make that work with our implementation of tgamma, (which uses

> lgamma and the 'signgam' value), that signgam value should be -1.


Last C17 public FDIS N2176 is available via:

https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf

see 7.12.8.3 lgamma and 7.12.8.4 tgamma on pp181-182, 7.26 tgmath #5 lgamma,
tgamma pp272-273, 7.31.1 complex clgamma, ctgamma p332, F10.5.3 lgamma and
F.10.5.4 tgamma.

>> https://pubs.opengroup.org/onlinepubs/9699919799/functions/lgamma.html

>> Additionally, the Linux lgamma man page, using GLIBC, says the same thing

>> as quoted from POSIX (not word for word, but the identical result).

>> (POSIX does require the +-0 +-INFINITY for tgamma().  Did these get crossed

>> up?)

>> Craig

> 

> I think we're probably just confusing gamma between lgamma and tgamma --

> newlib defines gamma 'oddly' (fixed in 2/3).

> 

> I think the only problematic patch in the series is in changing the

> semantics of gamma/gammaf and removing the gamma_r/gammaf_r -- we need

> to decide how to recover from what looks like a mistake introduced 18

> years ago.


-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in IEC units and prefixes, physical quantities in SI.]
Torbjorn SVENSSON via Newlib Aug. 28, 2020, 3:13 a.m. | #4
Brian Inglis <Brian.Inglis@SystematicSw.ab.ca> writes:

> Last C17 public FDIS N2176 is available via:

>

> https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf

>

> see 7.12.8.3 lgamma and 7.12.8.4 tgamma on pp181-182, 7.26 tgmath #5 lgamma,

> tgamma pp272-273, 7.31.1 complex clgamma, ctgamma p332, F10.5.3 lgamma and

> F.10.5.4 tgamma.


Thanks for the link -- the difference for lgamma is that in C99:

        A range error occurs if x is too large. A range error may occur
        if x is a negative integer or zero.

While in C17:

        A range error occurs if positive x is too large. A pole error
        may occur if x is a negative integer or zero.

Frustratingly, tgamma still returns different errors. In C99:

        A domain error or range error may occur if x is a negative
        integer or zero. A range error may occur if the magnitude of x
        is too large or too small.

While in C17:

        A domain error or pole error may occur if x is a negative
        integer or zero. A range error occurs if the magnitude of x is
        too large and may occur if the magnitude of x is too small.

C17 appears to have added this new 'pole' error, which seems to match
what the IEEE 754 calls a 'divideByZero exception'. Let's see if my
understanding of the various errors is correct:

IEEE            C17             POSIX   errno   Exception
divideByZero    Pole            Pole    ERANGE  FE_DIVBYZERO
invalid         Domain          Domain  EDOM    FE_INVALID
overflow        Range           Range   ERANGE  FE_OVERFLOW

POSIX is more specific than C17, but it does appear to have been updated
to follow that spec, including the term 'Pole Error' which wasn't in
C99.

As my tests all refer to the POSIX spec (which defines the
errno/exception values), I think they're correct. Here's the set of
values I'm testing for both tgamma and lgamma (I'm not testing gamma
currently; we need to sort out what it should be doing first):

Value                 tgamma                               lgamma
         POSIX   result   errno  exception      POSIX   result   errno   exception

+0       Pole   +INFINITY ERANGE FE_DIVBYZERO   Pole   +INFINITY ERANGE FE_DIVBYZERO
-0       Pole   -INFINITY ERANGE FE_DIVBYZERO   Pole   +INFINITY ERANGE FE_DIVBYZERO
-1       Domain    NAN    EDOM   FE_INVALID     Pole   +INFINITY ERANGE FE_DIVBYZERO
+3e38f   Range  +INFINITY ERANGE FE_OVERFLOW    Range  +INFINITY ERANGE FE_OVERFLOW
+1.7e308 Range  +INFINITY ERANGE FE_OVERFLOW    Range  +INFINITY ERANGE FE_OVERFLOW
-3e38f   Domain    NAN    EDOM   FE_INVALID     Pole   +INFINITY ERANGE FE_DIVBYZERO
-1.7e308 Domain    NAN    EDOM   FE_INVALID     Pole   +INFINITY ERANGE FE_DIVBYZERO
+inf     None   +INFINITY 0      0              None   +INFINITY 0      0
-inf     Domain    NAN    EDOM   FE_INVALID     None   +INFINITY 0      0

I have rationalized the differences in this way:

        Domain error means there is no defined limit for the function
        approaching the parameter. Pole error means that there *is* a
        defined limit in this case.

        For non-positive inputs, the limit of the value of the tgamma
        function depends on whether that limit is approached from above
        or below.

        For zero, we can separate out 'approach from below' and
        'approach from above' with the sign -- -0 might well be the
        result of an underflow from below, hence returning the value of
        the function as approached from below makes "sense", in this
        case -INFINITY. Similarly, +0 might well be underflow from
        above, so the function can return +INFINITY. In both cases, as
        the limit is well defined, this represents a Pole error rather
        than a Domain error, hence the differences in errno and exceptions.
        
        For negative integers, as the limit depends on whether the value
        is approached from below or above and we have no way of
        distinguishing these two cases from the input, the limit is not
        well defined, so we have a Domain error and return
        NAN/EDOM/FE_INVALID.

        For the lgamma function, the limit for non-positive integers is
        always +INFINITY because it returns ln(|Γ(x)|). This makes it
        well defined and hence a Pole error rather than a Domain error

        Huge negative values (-1e308) are negative integers, and hence
        treated that way, generating a Domain error for tgamma and Range
        error for lgamma

        +INFINITY input and +INFINITY output generate *no* error
        presumably because there's an assumption that the computation
        which resulted in +INFINITY being provided to these functions
        already raised a FE_OVERFLOW exception, and perhaps an ERANGE
        errno.

Please review the table above to see if you agree about what errno and
exception values each of the provided inputs should generate. I'd really
like to know if you can suggest additional test cases to use to validate
the correctness of these functions.

-- 
-keith
Brian Inglis Aug. 28, 2020, 3:51 a.m. | #5
On 2020-08-27 21:13, Keith Packard wrote:
> Brian Inglis writes:

> 

>> Last C17 public FDIS N2176 is available via:

>>

>> https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf

>>

>> see 7.12.8.3 lgamma and 7.12.8.4 tgamma on pp181-182, 7.26 tgmath #5 lgamma,

>> tgamma pp272-273, 7.31.1 complex clgamma, ctgamma p332, F10.5.3 lgamma and

>> F.10.5.4 tgamma.

> 

> Thanks for the link -- the difference for lgamma is that in C99:

> 

>         A range error occurs if x is too large. A range error may occur

>         if x is a negative integer or zero.

> 

> While in C17:

> 

>         A range error occurs if positive x is too large. A pole error

>         may occur if x is a negative integer or zero.

> 

> Frustratingly, tgamma still returns different errors. In C99:

> 

>         A domain error or range error may occur if x is a negative

>         integer or zero. A range error may occur if the magnitude of x

>         is too large or too small.

> 

> While in C17:

> 

>         A domain error or pole error may occur if x is a negative

>         integer or zero. A range error occurs if the magnitude of x is

>         too large and may occur if the magnitude of x is too small.

> 

> C17 appears to have added this new 'pole' error, which seems to match

> what the IEEE 754 calls a 'divideByZero exception'. Let's see if my

> understanding of the various errors is correct:

> 

> IEEE            C17             POSIX   errno   Exception

> divideByZero    Pole            Pole    ERANGE  FE_DIVBYZERO

> invalid         Domain          Domain  EDOM    FE_INVALID

> overflow        Range           Range   ERANGE  FE_OVERFLOW

> 

> POSIX is more specific than C17, but it does appear to have been updated

> to follow that spec, including the term 'Pole Error' which wasn't in

> C99.

> 

> As my tests all refer to the POSIX spec (which defines the

> errno/exception values), I think they're correct. Here's the set of

> values I'm testing for both tgamma and lgamma (I'm not testing gamma

> currently; we need to sort out what it should be doing first):

> 

> Value                 tgamma                               lgamma

>          POSIX   result   errno  exception      POSIX   result   errno   exception

> 

> +0       Pole   +INFINITY ERANGE FE_DIVBYZERO   Pole   +INFINITY ERANGE FE_DIVBYZERO

> -0       Pole   -INFINITY ERANGE FE_DIVBYZERO   Pole   +INFINITY ERANGE FE_DIVBYZERO

> -1       Domain    NAN    EDOM   FE_INVALID     Pole   +INFINITY ERANGE FE_DIVBYZERO

> +3e38f   Range  +INFINITY ERANGE FE_OVERFLOW    Range  +INFINITY ERANGE FE_OVERFLOW

> +1.7e308 Range  +INFINITY ERANGE FE_OVERFLOW    Range  +INFINITY ERANGE FE_OVERFLOW

> -3e38f   Domain    NAN    EDOM   FE_INVALID     Pole   +INFINITY ERANGE FE_DIVBYZERO

> -1.7e308 Domain    NAN    EDOM   FE_INVALID     Pole   +INFINITY ERANGE FE_DIVBYZERO

> +inf     None   +INFINITY 0      0              None   +INFINITY 0      0

> -inf     Domain    NAN    EDOM   FE_INVALID     None   +INFINITY 0      0

> 

> I have rationalized the differences in this way:

> 

>         Domain error means there is no defined limit for the function

>         approaching the parameter. Pole error means that there *is* a

>         defined limit in this case.

> 

>         For non-positive inputs, the limit of the value of the tgamma

>         function depends on whether that limit is approached from above

>         or below.

> 

>         For zero, we can separate out 'approach from below' and

>         'approach from above' with the sign -- -0 might well be the

>         result of an underflow from below, hence returning the value of

>         the function as approached from below makes "sense", in this

>         case -INFINITY. Similarly, +0 might well be underflow from

>         above, so the function can return +INFINITY. In both cases, as

>         the limit is well defined, this represents a Pole error rather

>         than a Domain error, hence the differences in errno and exceptions.

>         

>         For negative integers, as the limit depends on whether the value

>         is approached from below or above and we have no way of

>         distinguishing these two cases from the input, the limit is not

>         well defined, so we have a Domain error and return

>         NAN/EDOM/FE_INVALID.

> 

>         For the lgamma function, the limit for non-positive integers is

>         always +INFINITY because it returns ln(|Γ(x)|). This makes it

>         well defined and hence a Pole error rather than a Domain error

> 

>         Huge negative values (-1e308) are negative integers, and hence

>         treated that way, generating a Domain error for tgamma and Range

>         error for lgamma

> 

>         +INFINITY input and +INFINITY output generate *no* error

>         presumably because there's an assumption that the computation

>         which resulted in +INFINITY being provided to these functions

>         already raised a FE_OVERFLOW exception, and perhaps an ERANGE

>         errno.

> 

> Please review the table above to see if you agree about what errno and

> exception values each of the provided inputs should generate. I'd really

> like to know if you can suggest additional test cases to use to validate

> the correctness of these functions.


I don't get so don't do maths above my comfort level: but it is possible that
ORs in the specs could mean that some important implementations (e.g BSD and
glibc) may differ in errors reported depending on which, possibly different,
exceptional input values are passed. You could check the BSD and glibc specs and
tests for these functions to see what the expected outputs are for what inputs,
what exceptional values generate which errno values, and whether all
possibilities are allowed by all the specs. POSIX always defers to C where the
latter defines library implementations, but as long as they do not contradict
normative content, POSIX may add additional conditions or requirements.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in IEC units and prefixes, physical quantities in SI.]
Torbjorn SVENSSON via Newlib Aug. 28, 2020, 5:13 p.m. | #6
Brian Inglis <Brian.Inglis@SystematicSw.ab.ca> writes:

> I don't get so don't do maths above my comfort level: but it is possible that

> ORs in the specs could mean that some important implementations (e.g BSD and

> glibc) may differ in errors reported depending on which, possibly different,

> exceptional input values are passed. You could check the BSD and glibc specs and

> tests for these functions to see what the expected outputs are for what inputs,

> what exceptional values generate which errno values, and whether all

> possibilities are allowed by all the specs. POSIX always defers to C where the

> latter defines library implementations, but as long as they do not contradict

> normative content, POSIX may add additional conditions or

> requirements.


Yeah, my testing could be more permissive to allow any valid
implementation to pass. Right now, I'm interested in making newlib
compliant with the relevant specs and, where that doesn't conflict,
compatible with glibc. And that means having hard requirements that
match glibc -- I haven't found any places where glibc gets the wrong
answer at least...

There's one place where that's a bit hard -- the gamma function has
changed meaning over time and the current newlib gamma implementation
is closer to the BSD one than glibc, but implements neither exactly.

-- 
-keith
Joseph Myers Aug. 28, 2020, 6:29 p.m. | #7
On Thu, 27 Aug 2020, Keith Packard via Newlib wrote:

>         +INFINITY input and +INFINITY output generate *no* error

>         presumably because there's an assumption that the computation

>         which resulted in +INFINITY being provided to these functions

>         already raised a FE_OVERFLOW exception, and perhaps an ERANGE

>         errno.


The basic principles for infinite results from IEEE floating-point 
operations and libm functions are that overflow is when infinity is an 
*inexact* result from rounding (or, in some rounding modes, the largest 
finite value results from rounding but rounding with infinite exponent 
range would have produced an exponent that could not be represented in the 
actual format).  Divide by zero is when infinity is an *exact* result from 
*finite* arguments.  If infinity is an exact result and at least one 
argument is infinite, there is no error or exception (unless it's some of 
the new minimum/maximum operations in IEEE 754-2019 and the other argument 
is a signaling NaN).

-- 
Joseph S. Myers
joseph@codesourcery.com
Torbjorn SVENSSON via Newlib Aug. 28, 2020, 7:32 p.m. | #8
Joseph Myers <joseph@codesourcery.com> writes:

> If infinity is an exact result and at least one 

> argument is infinite, there is no error or exception (unless it's some of 

> the new minimum/maximum operations in IEEE 754-2019 and the other argument 

> is a signaling NaN).


Thanks for the clarification. That all seems consistent, and I think I
can now (mostly) understand how the POSIX error values relate back to
the IEEE 754-2019 specification.

I do wonder if POSIX will add the new minimum/maximum operations; they
seem more like what I expect wrt NaN handling than the existing
minimumNumber/fmin maximumNumber/fmax operations.

Any other concerns about the testing values I'm using? Should we adopt
the newlib patches necessary to have it pass those tests?

-- 
-keith
Joseph Myers Aug. 28, 2020, 7:53 p.m. | #9
On Fri, 28 Aug 2020, Keith Packard via Newlib wrote:

> I do wonder if POSIX will add the new minimum/maximum operations; they

> seem more like what I expect wrt NaN handling than the existing

> minimumNumber/fmin maximumNumber/fmax operations.


C bindings for the new operations are given in 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2532.pdf - but the target 
completion date for the next revision of POSIX (2022) is before that for 
C2x (publication aiming for 2023), so the next revision of POSIX is more 
likely to have only features from the 2011/2018 edition of ISO C.

-- 
Joseph S. Myers
joseph@codesourcery.com