Bug 277783 - libc fma() doesn't not return the correct zero sign
Summary: libc fma() doesn't not return the correct zero sign
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 14.0-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-03-18 13:04 UTC by Victor Stinner
Modified: 2024-06-17 22:41 UTC (History)
4 users (show)

See Also:


Attachments
Example reproducing the fma() zero sign issue (1.01 KB, text/plain)
2024-03-18 13:04 UTC, Victor Stinner
no flags Details
Test code that shows results compared to MPFR (1.35 KB, text/plain)
2024-06-12 19:00 UTC, Steve Kargl
no flags Details
testcase for long double (1.41 KB, text/plain)
2024-06-16 02:08 UTC, Steve Kargl
no flags Details
New patch (1.16 KB, patch)
2024-06-16 02:26 UTC, Steve Kargl
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Victor Stinner 2024-03-18 13:04:09 UTC
Created attachment 249266 [details]
Example reproducing the fma() zero sign issue

Hi,

I exposed the libc fma() function in Python:

* https://github.com/python/cpython/commit/8e3c953b3acd7c656f66696806c9ae917e816492
* https://github.com/python/cpython/pull/116667
* https://github.com/python/cpython/issues/73468

The problem is that the Python test suite fails on FreeBSD x86-64 on tests on the zero sign: test_fma_zero_result()
https://github.com/python/cpython/blob/cd2ed917801b93fb46d1dcf19dd480e5146932d8/Lib/test/test_math.py#L2694-L2748

Example:
---
#include <math.h>
#include <string.h>
#include <stdio.h>

void set_double(double *x, const char *bytes)
{
    memcpy(x, bytes, sizeof(*x));
}

int main(int argc, char **argv)
{
    double x = 0; //float.fromhex('0x1p-500')
    double y = 0; //float.fromhex('0x1p-550')
    double z = 0; //float.fromhex('0x1p-1000')
#if 1
    // condition always true, but trick the compiler, otherwise it will not
    // call fma() at runtime, but just compute the result at build time.
    if (argc) {
        set_double(&x, "\x00\x00\x00\x00\x00\x00\xb0\x20");
    }
    else {
        x = 123;
    }
#else
    set_double(&x, "\x00\x00\x00\x00\x00\x00\xb0\x20");
#endif
    set_double(&y, "\x00\x00\x00\x00\x00\x00\x90\x1d");
    set_double(&z, "\x00\x00\x00\x00\x00\x00p\x01");

    printf("x %g\n", x);
    printf("y %g\n", y);
    printf("z %g\n", z);

    double a, b, c, r;
    a = x-y;
    b = x+y;
    c = -z;
    r = fma(a, b, c);
    printf("fma(x-y, x+y, -z):\n");
    printf("fma(%+g, %+g, %+g) = %+g\n", a, b, c, r);
    return 0;
}
---

Output on x86-64:
---
$ clang x.c -Og -o x -lm && ./x
x 3.05494e-151
y 2.71333e-166
z 9.33264e-302
fma(x-y, x+y, -z):
fma(+3.05494e-151, +3.05494e-151, -9.33264e-302) = +0
---

fma() result is "+0" which is wrong. For example, if you replace "#if 1" with "#if 0", you get:
---
(...)
fma(+3.05494e-151, +3.05494e-151, -9.33264e-302) = -0
---

fma() result is now "-0" which is correct. This time, the result was computed at build time.


Extract of the Python tests which fail on FreeBSD:
---
        # Corner case where rounding the multiplication would
        # give the wrong result.
        x = float.fromhex('0x1p-500')
        y = float.fromhex('0x1p-550')
        z = float.fromhex('0x1p-1000')
        self.assertIsNegativeZero(math.fma(x-y, x+y, -z))
        self.assertIsPositiveZero(math.fma(y-x, x+y, z))
        self.assertIsNegativeZero(math.fma(y-x, -(x+y), -z))
        self.assertIsPositiveZero(math.fma(x-y, -(x+y), z))
---

For example, on Linux x86-64, the example works as expected with GCC and clang:
---
vstinner@mona$ gcc x.c -o x -O2 -lm && ./x
x 3.05494e-151
y 2.71333e-166
z 9.33264e-302
fma(x-y, x+y, -z):
fma(+3.05494e-151, +3.05494e-151, -9.33264e-302) = -0


vstinner@mona$ clang x.c -o x -O2 -lm && ./x
x 3.05494e-151
y 2.71333e-166
z 9.33264e-302
fma(x-y, x+y, -z):
fma(+3.05494e-151, +3.05494e-151, -9.33264e-302) = -0
---

fma() returns -0 which is correct.


On the bug, I selected "kern" component. I expect that the bug is more in the libc, but I cannot see "libc" choice.
Comment 1 Ed Maste freebsd_committer freebsd_triage 2024-03-18 18:33:47 UTC
Thank you for the report and C reproduction case. Do you know which other implementations are being tested? (macOS, musl, bionic?)

Steve would you be able to take an initial look at this?
Comment 2 Steve Kargl freebsd_committer freebsd_triage 2024-03-18 19:50:31 UTC
So, I don't loose a reduced version of the code:

#include <math.h>
#include <stdio.h>

int
main(void)
{
    volatile double x = 0x1p-500, y = 0x1p-550, z = 0x1p-1000;
    double a, b, c, r;

    a = x-y;
    b = x+y;
    c = -z;
    r = fma(a, b, c);
    /*
     * Should print -0 instead of +0.
     */
    printf("fma(%+a, %+a, %+a) = %+g\n", a, b, c, r);

    return 0;
}

% cc -o z -O3 a.c -lm && ./z
fma(+0x1.ffffffffffff8p-501, +0x1.0000000000004p-500, -0x1p-1000) = +0

% cc -o z -O3 a.c -mfma -lm && ./z
fma(+0x1.ffffffffffff8p-501, +0x1.0000000000004p-500, -0x1p-1000) = -0
Comment 3 Steve Kargl freebsd_committer freebsd_triage 2024-03-18 20:12:52 UTC
If one looks in src/s_fma.c, one sees

#ifdef USE_BUILTIN_FMA
double
fma(double x, double y, double z)
{
	return (__builtin_fma(x, y, z));
}
#else

If libm is built with -DUSE_BUILTIN_FMA, then one get -0.
Comment 4 Steve Kargl freebsd_committer freebsd_triage 2024-03-18 21:06:29 UTC
Ok, with the given input to fma, one arrives at line 271

 		return (xy.hi + vzs + ldexp(xy.lo, spread));

(gdb) p vzs
$12 = -0.5
(gdb) p xy.hi
$13 = 0.5
(gdb) p xy.lo
$14 = -3.944304526105059e-31
(gdb) p spread
$15 = -999

Turns out that ldexp(3.944304526105059e-31, -999) = -0x0p+0.  So, you have

0.5 - 0.5 - 0. = 0. - 0. = +0

So, a pessimestic patch is

Index: src/s_fma.c
===================================================================
--- src/s_fma.c	(revision 2834)
+++ src/s_fma.c	(working copy)
@@ -267,7 +267,9 @@
 		 */
 		fesetround(oround);
 		volatile double vzs = zs; /* XXX gcc CSE bug workaround */
-		return (xy.hi + vzs + ldexp(xy.lo, spread));
+		xs = ldexp(xy.lo, spread);
+		xy.hi += vzs;
+		return (xy.hi == 0 ? xs : xy.hi + xs);
 	}
 
 	if (oround != FE_TONEAREST) {
Comment 5 Victor Stinner 2024-03-18 21:40:03 UTC
Ed:
> Thank you for the report and C reproduction case. Do you know which other implementations are being tested? (macOS, musl, bionic?)

Python test_math checks fma() on Windows, Linux, macOS, WASI, FreeBSD, etc. The tests pass on all platforms but FreeBSD and WASI.

Apparently, the WASI libc is based musl, but I failed to reproduce the issue using musl-gcc on Linux.

bionic: Python does not officially support Android yet, see: https://peps.python.org/pep-0738/
Comment 6 Steve Kargl freebsd_committer freebsd_triage 2024-03-18 22:00:06 UTC
(In reply to Steve Kargl from comment #4)

A similar issue is present in src/s_fmal.c.  The patch in comment #4 can be adapted for fmal.
Comment 7 Ed Maste freebsd_committer freebsd_triage 2024-03-19 14:35:04 UTC
Proposed patch for fma and fmal in https://reviews.freebsd.org/D44433
We can switch to BUILTIN_FMA in a later change
Comment 8 Malcolm Smith 2024-03-25 13:57:12 UTC
> Python does not officially support Android yet, see: 
> https://peps.python.org/pep-0738/
I'm working on adding Android support to Python, and I've found that this issue exists on Android x86_64, but ARM64 is OK.
Comment 9 commit-hook freebsd_committer freebsd_triage 2024-06-08 15:57:20 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=888796ade2842486d3167067e8034254c38aadd3

commit 888796ade2842486d3167067e8034254c38aadd3
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-03-19 14:31:39 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-06-08 15:55:36 +0000

    libm: fma: correct zero sign with small inputs

    PR:             277783
    Reported by:    Victor Stinner
    Submitted by:   kargl
    MFC after:      1 week
    Differential Revision: https://reviews.freebsd.org/D44433

 lib/msun/src/s_fma.c  | 4 +++-
 lib/msun/src/s_fmal.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)
Comment 10 commit-hook freebsd_committer freebsd_triage 2024-06-12 01:38:53 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e77ad954bb825983b4346b9cc646c9c910b1be24

commit e77ad954bb825983b4346b9cc646c9c910b1be24
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-06-12 01:34:02 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-06-12 01:36:12 +0000

    Revert "libm: fma: correct zero sign with small inputs"

    This change introduced a test failure, so revert until that can be
    addressed.

    This reverts commit 888796ade2842486d3167067e8034254c38aadd3.

    PR:             277783
    Reported by:    rlibby
    Sponsored by:   The FreeBSD Foundation

 lib/msun/src/s_fma.c  | 4 +---
 lib/msun/src/s_fmal.c | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)
Comment 11 Steve Kargl freebsd_committer freebsd_triage 2024-06-12 18:58:03 UTC
Someone smarter than me may need to look at this!  With the original src/s_fma.c, I see

dble (x,y,z): -0x1p+0  0x1p+0  0x1p+0
mpfr (x,y,z): -0x1p+0  0x1p+0  0x1p+0
        mpfr    libm
RNDN:  0x0p+0  0x0p+0
RNDU:  0x0p+0  0x0p+0
RNDD: -0x0p+0 -0x0p+0
RNDZ:  0x0p+0  0x0p+0

dble (x,y,z):  0x1p+0 -0x1p+0  0x1p+0
mpfr (x,y,z):  0x1p+0 -0x1p+0  0x1p+0
        mpfr    libm
RNDN:  0x0p+0  0x0p+0
RNDU:  0x0p+0  0x0p+0
RNDD: -0x0p+0 -0x0p+0
RNDZ:  0x0p+0  0x0p+0

dble (x,y,z): -0x1p+0 -0x1p+0 -0x1p+0
mpfr (x,y,z): -0x1p+0 -0x1p+0 -0x1p+0
        mpfr    libm
RNDN:  0x0p+0  0x0p+0
RNDU:  0x0p+0  0x0p+0
RNDD: -0x0p+0 -0x0p+0
RNDZ:  0x0p+0  0x0p+0

dble (x,y,z):  0x1.ffffffffffff8p-501  0x1.0000000000004p-500 -0x1p-1000
mpfr (x,y,z):  0xf.fffffffffffcp-504  0x1.0000000000004p-500 -0x1p-1000
        mpfr    libm
RNDN: -0x0p+0  0x0p+0
RNDU: -0x0p+0  0x0p+0
RNDD: -0x1p-1074 -0x1p-1074
RNDZ: -0x0p+0  0x0p+0

The first groups of three are fma(-1.,1.,1.), fma(1.,-1.,1.), and fma(-1.,-1.,-1.) with the four rounding modes.  The group is Victor's example where the rounding is wrong.

If one looks in src/s_fma.c, there is a special case for addends that sum to zero.
This is the 'if (r.hi == 0.0) {}' block of code (lines 263-274).  If I comment 
out this block, I get wrong results for fma(-1.,1.,1.), fma(1.,-1.,1.), and
fma(-1.,-1.,-1.), but correct results for Victor's input. :(

dble (x,y,z): -0x1p+0  0x1p+0  0x1p+0
mpfr (x,y,z): -0x1p+0  0x1p+0  0x1p+0
        mpfr    libm
RNDN:  0x0p+0  0x0p+0
RNDU:  0x0p+0  0x0p+0
RNDD: -0x0p+0  0x0p+0
RNDZ:  0x0p+0  0x0p+0

dble (x,y,z):  0x1p+0 -0x1p+0  0x1p+0
mpfr (x,y,z):  0x1p+0 -0x1p+0  0x1p+0
        mpfr    libm
RNDN:  0x0p+0  0x0p+0
RNDU:  0x0p+0  0x0p+0
RNDD: -0x0p+0  0x0p+0
RNDZ:  0x0p+0  0x0p+0

dble (x,y,z): -0x1p+0 -0x1p+0 -0x1p+0
mpfr (x,y,z): -0x1p+0 -0x1p+0 -0x1p+0
        mpfr    libm
RNDN:  0x0p+0  0x0p+0
RNDU:  0x0p+0  0x0p+0
RNDD: -0x0p+0  0x0p+0
RNDZ:  0x0p+0  0x0p+0

dble (x,y,z):  0x1.ffffffffffff8p-501  0x1.0000000000004p-500 -0x1p-1000
mpfr (x,y,z):  0xf.fffffffffffcp-504  0x1.0000000000004p-500 -0x1p-1000
        mpfr    libm
RNDN: -0x0p+0 -0x0p+0
RNDU: -0x0p+0 -0x0p+0
RNDD: -0x1p-1074 -0x1p-1074
RNDZ: -0x0p+0 -0x0p+0
Comment 12 Steve Kargl freebsd_committer freebsd_triage 2024-06-12 19:00:39 UTC
Created attachment 251421 [details]
Test code that shows results compared to MPFR

This is the test code that includes the comparisons for the testsuite and Victor's input under the four rounding modes.  It requires MPFR and GMP to compile and execute.
Comment 13 Steve Kargl freebsd_committer freebsd_triage 2024-06-16 02:08:22 UTC
Created attachment 251488 [details]
testcase for long double

This contains a test that shows the issue for long double (on amd64).
Comment 14 Steve Kargl freebsd_committer freebsd_triage 2024-06-16 02:26:01 UTC
Created attachment 251489 [details]
New patch

The code for fma() and fmal() contain a special case when the addends sum to zero, but it seems that it mishandles small low order bits especially if a subnormal condition occurs.  One of the testcases in msun/tests (that I broke with the original patch) is fma(-1.,1.,1.) with the intermediate results

(x,y,z): -0x1p+0  0x1p+0  0x1p+0
xy = {-0x1p-2, 0x0p+0}
r  = {0x0p+0, 0x0p+0}
spread = 2
zs = 0x1p-2

fma returns (xy.hi + zs +ldexp(xy.lo,spread)), which is fine.

For the case that Victor submitted one has

(x,y,z):  0x1.ffffffffffff8p-501  0x1.0000000000004p-500 -0x1p-1000
xy = {5.000000e-01, -3.944305e-31}
r  = {0.000000e+00 0.000000e+00}
spread = -999 
zs = -5.000000e-01

For the original code, the special leads to ldexp(xy.lo,spread)=ldexp(-3.944305e-31,-999), is not good.

The new patch skips the special case if xy.lo != 0.  This then allows fma() to call its add_and_denormalize() function.
Comment 15 Ed Maste freebsd_committer freebsd_triage 2024-06-17 15:17:48 UTC
(In reply to Steve Kargl from comment #14)

I think the comment may need an update too:
        /*
         * When the addends cancel to 0, ensure that the result has
         * the correct sign.
         */
Comment 16 Ed Maste freebsd_committer freebsd_triage 2024-06-17 15:36:53 UTC
Also for reference I have staged this in my working tree as:

commit fa22569968175ac79486ee467e39f570fe4b09cd (HEAD -> wipbsd)
Author:     Steve Kargl <kargl@FreeBSD.org>
AuthorDate: Sun Jun 16 19:41:38 2024 -0400
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: Mon Jun 17 09:29:51 2024 -0400

    libm: fma: correct zero sign with small inputs
    
    This is a fixed version of 888796ade284.
    
    PR:             277783
    Reported by:    Victor Stinner
    Reviewed by:    emaste
    MFC after:      1 week
Comment 17 Steve Kargl freebsd_committer freebsd_triage 2024-06-17 22:41:42 UTC
Perhaps, the comments needs an update, but I'm not sure to what.

I did look at https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libm/src/s_fma.c?annotate=1.7 which is based on das@ code.  This code does not have the special, but if I read it correctly would fall through to the subnormal case in the lines 174-189 block of code with Victor's values.

netbsd uses essentially freebsd's code with a few minor changes, so likely has the same issue as in this bug report.