Bug 115581

Summary: [Makefile] [patch] -mfancy-math-387 has no effect
Product: Base System
Reporter: Šimun Mikecin <numisemis>
Component: amd64
Assignee: freebsd-amd64 (Nobody) <amd64>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.diff none

Description Šimun Mikecin 2007-08-16 17:50:01 UTC
The 32-bit compatibility libraries on FreeBSD/amd64 are compiled using the -mfancy-math-387 gcc option. As stated in gcc(1):

       -mno-fancy-math-387
           Some 387 emulators do not support the "sin", "cos" and "sqrt"
           instructions for the 387.  Specify this option to avoid generating
           those instructions.  This option is the default on FreeBSD, OpenBSD
           and NetBSD.  This option is overridden when -march indicates that
           the target cpu will always have an FPU and so the instruction will
           not need emulation.  As of revision 2.6.1, these instructions are
           not generated unless you also use the -funsafe-math-optimizations
           switch.

So, using just -mfancy-math-387 has no effect. It should be used in combination with -funsafe-math-optimizations or it should not be used.

Fix: Patch attached with submission follows:
How-To-Repeat:
#include <math.h>

double test(double x) {
  return sin(x);
}

Try to compile using:
gcc -m32 -S -mfancy-math-387 -funsafe-math-optimizations -O2 -march=athlon64 test.c

and with:
gcc -m32 -S -mfancy-math-387 -O2 -march=athlon64 test.c

The assembler output will differ: the first command emits the machine instruction 'fsin', while the second calls the libm routine 'sin'.
Comment 1 Bruce Evans freebsd_committer freebsd_triage 2007-08-17 07:49:04 UTC
On Thu, 16 Aug 2007, Simun Mikecin wrote:

> So, using just -mfancy-math-387 has no effect. It should be used in
> combination with -funsafe-math-optimizations or it should not be used.

It should not be used, especially on amd64 systems since basic FP
instructions are relatively fast compared with the fancy instructions
(except for sqrt).  The 64-bit amd64 libm intentionally never uses the
fancy instructions (except for sqrt), partly because they are not much
faster and partly because they are much less accurate.  The fancy
instructions are not used for float precision (unless you pessimize
things using -mfancy-math-387) since they are about 3 times slower
than the library versions on small args.
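
To illustrate why a routine built from basic FP instructions can compete with fsin for float precision on small arguments, here is a minimal sketch of a polynomial sine kernel. It is not the msun code: the name poly_sinf is hypothetical, the coefficients are plain Taylor coefficients rather than tuned ones, and the caller is assumed to have already reduced the argument to |x| <= pi/4.

```c
/*
 * Sketch only: float-precision sine for already-reduced arguments
 * (assumed |x| <= pi/4), evaluated in double with a few multiplies
 * and adds.  poly_sinf is a hypothetical name, and the coefficients
 * are plain Taylor coefficients, not the ones a real libm would use.
 */
float poly_sinf(float x)
{
    const double s1 = -1.0 / 6.0;        /* -1/3! */
    const double s2 =  1.0 / 120.0;      /*  1/5! */
    const double s3 = -1.0 / 5040.0;     /* -1/7! */
    const double s4 =  1.0 / 362880.0;   /*  1/9! */
    double xd = x;
    double z = xd * xd;

    /* Horner form of x - x^3/3! + x^5/5! - x^7/7! + x^9/9! */
    return (float)(xd * (1.0 + z * (s1 + z * (s2 + z * (s3 + z * s4)))));
}
```

On the assumed range the truncation error of this series is below 2e-9, which is smaller than float rounding error, and no hardware transcendental instruction is involved.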

Is -mno-fancy-math-387 still actually the default on FreeBSD (with FreeBSD's
config/i386/freebsd.h which is quite different (mostly gratuitously different)
from the distribution one)?  FreeBSD hasn't supported the math emulator for
about 10 years.

> --- Makefile.inc1.orig	Tue Jul 10 18:39:36 2007
> +++ Makefile.inc1	Thu Aug 16 18:30:44 2007
> @@ -238,7 +238,7 @@
> .else
> LIB32CPUTYPE=	${TARGET_CPUTYPE}
> .endif
> -LIB32FLAGS=	-m32 -march=${LIB32CPUTYPE} -mfancy-math-387 -DCOMPAT_32BIT \
> +LIB32FLAGS=	-m32 -march=${LIB32CPUTYPE} -mfancy-math-387 -funsafe-math-optimizations -DCOMPAT_32BIT \
> 		-iprefix ${LIB32TMP}/usr/ \
> 		-L${LIB32TMP}/usr/lib32 \
> 		-B${LIB32TMP}/usr/lib32

-funsafe-math-optimizations should be named
-broken-floating-point-optimizations and should almost never be used.  It
should never be used for compiling FreeBSD's math library, since the library
depends on floating point not being very broken.

gcc-4.2 still says the above, but doesn't actually do the above for
sqrt.  It inlines sqrt (but not cos or sin) without
-funsafe-math-optimizations.  I think the difference is just due to
inlining sqrt not actually being unsafe and the documentation of
-mfancy-math-387 being many years out of date (this difference is not
new).  Inlining cos and sin would be safe if the inline code were large
enough to detect the unsafe cases, but the inline code only checks for
the result being a NaN (?), which is more than enough for sqrt but not
enough for cos or sin.
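
As a rough sketch of the kind of check the inline code would need, the following tests whether an argument lies in the range that fsin can reduce at all (|x| < 2^63 according to the x87 documentation). The function names are hypothetical, and the raw fsin wrapper is GCC-style inline asm for x86 only. Passing the check does not make the result accurate; it only rules out the case where fsin leaves its operand unchanged.

```c
#include <stdbool.h>

/*
 * fsin accepts only |x| < 2^63.  Outside that range it sets the C2
 * status flag and leaves the operand unchanged, so an unchecked
 * inline "sin(1e22)" would return 1e22.
 */
bool fsin_arg_in_range(double x)
{
    return x > -0x1p63 && x < 0x1p63;   /* also false for NaN */
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Raw, unchecked fsin (hypothetical helper, x86 with x87 only). */
double x87_fsin(double x)
{
    double r;

    __asm__ ("fsin" : "=t" (r) : "0" (x));
    return r;
}
#endif
```

A caller would use x87_fsin(x) only when fsin_arg_in_range(x) holds and call the libm sin() otherwise.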

gcc (at least in 4.2) has a -fno-math-errno option which defaults to the
wrong thing for FreeBSD (-fmath-errno) but the correct thing on Darwin.
-funsafe-math-optimizations has the apparently-undocumented
effect of turning on -fno-math-errno.  So -march=pentium4 gives the
following inlining of the fancy functions:

Default:
 	sqrt: inlined, bogus errno handling
 	cos, sin: not inlined
with -fno-math-errno:
 	sqrt: inlined, optimal
 	cos, sin: not inlined
with -funsafe-math-optimizations:
 	sqrt: inlined, optimal
 	cos, sin: inlined, broken (additional breakage only for large args)
 	sqrtf: inlined, optimal
 	cosf, sinf: inlined, broken (for large args, and small args near a
 		    multiple of pi/2), pessimized (only for small args)

Bruce
Comment 2 Šimun Mikecin 2007-08-21 13:12:29 UTC
On Fri, 17 Aug 2007, Bruce Evans wrote:
> It should not be used, especially on amd64 systems since basic FP
> instructions are relatively fast compared with the fancy instructions
> (except for sqrt). The 64-bit amd64 libm intentionally never uses the
> fancy instructions (except for sqrt), partly because they are not much
> faster and partly because they are much less accurate. The fancy
> instructions are not used for float precision (unless you pessimize
> things using -mfancy-math-387) since they are about 3 times slower
> than the library versions on small args.

This PR is about -mfancy-math-387 usage when compiling the 32-bit
compatibility libraries that are going to be used on FreeBSD/amd64.
As far as I can see, FreeBSD's libm on i386 uses those fancy instructions
(for example /usr/src.current/lib/msun/i387/s_sin.S), so the same libm
will be used for running 32-bit apps on FreeBSD/amd64.
Is your statement about fancy instructions being 3 times slower also valid
for FreeBSD/i386 and 32-bit apps running on FreeBSD/amd64?
Comment 3 Bruce Evans freebsd_committer freebsd_triage 2007-08-21 13:47:15 UTC
On Tue, 21 Aug 2007, Simun Mikecin wrote:

> On Fri, 17 Aug 2007, Bruce Evans wrote:
>> It should not be used, especially on amd64 systems since basic FP
>> instructions are relatively fast compared with the fancy instructions
>> (except for sqrt). The 64-bit amd64 libm intentionally never uses the
>> fancy instructions (except for sqrt), partly because they are not much
>> faster and partly because they are much less accurate. The fancy
>> instructions are not used for float precision (unless you pessimize
>> things using -mfancy-math-387) since they are about 3 times slower
>> than the library versions on small args.
>
> This PR is about -mfancy-math-387 usage when compiling 32-bit compatibility 
> libraries that are going to be used on FreeBSD/amd64.
> As far as I can see FreeBSD's libm on i386 uses those fancy instructions (for 
> example /usr/src.current/lib/msun/i387/s_sin.S), so the same libm will be 
> used for running 32-bit apps on FreeBSD/amd64.

Well, that's in asm so it is not affected by compiler flags.  Compiler
flags can cause the library to be not used at all in some cases where
the library is better.

> Is your statement about fancy instructions being 3 times slower also valid for 
> FreeBSD/i386 and 32-bit apps running on FreeBSD/amd64?

In some cases -- not for most cases, but for float precision trig
functions on small args, except possibly on very old CPUs.  The i387
library intentionally doesn't use many hardware transcendental instructions
in float precision since they are slower and/or very inaccurate.  This
includes all trig instructions.

Bruce
Comment 4 Šimun Mikecin 2007-08-21 15:37:41 UTC
--- Bruce Evans <brde@optusnet.com.au> wrote:
> > This PR is about -mfancy-math-387 usage when compiling 32-bit compatibility 
> > libraries that are going to be used on FreeBSD/amd64.
> > As far as I can see FreeBSD's libm on i386 uses those fancy instructions (for 
> > example /usr/src.current/lib/msun/i387/s_sin.S), so the same libm will be 
> > used for running 32-bit apps on FreeBSD/amd64.
> Well, that's in asm so it is not affected by compiler flags.  Compiler
> flags can cause the library to be not used at all in some cases where
> the library is better.

I'm just wondering whether it would have a positive impact on performance if FreeBSD/i386 libm
were changed not to use that asm code with fancy instructions (just as on FreeBSD/amd64).

Sime



       
Comment 5 Bruce Evans freebsd_committer freebsd_triage 2007-08-22 10:50:17 UTC
On Tue, 21 Aug 2007, Simun Mikecin wrote:

> --- Bruce Evans <brde@optusnet.com.au> wrote:
>>> This PR is about -mfancy-math-387 usage when compiling 32-bit compatibility
>>> libraries that are going to be used on FreeBSD/amd64.
>>> As far as I can see FreeBSD's libm on i386 uses those fancy instructions (for
>>> example /usr/src.current/lib/msun/i387/s_sin.S), so the same libm will be
>>> used for running 32-bit apps on FreeBSD/amd64.
>> Well, that's in asm so it is not affected by compiler flags.  Compiler
>> flags can cause the library to be not used at all in some cases where
>> the library is better.
>
> I'm just wondering whether it would have a positive impact on performance if FreeBSD/i386 libm
> were changed not to use that asm code with fancy instructions (just as on FreeBSD/amd64).

Maybe someday.  The easy cases have already been looked at and resulted in
removing the asm implementations of functions like sinf() and asin().

Bruce
Comment 6 Andriy Gapon freebsd_committer freebsd_triage 2010-12-05 13:46:41 UTC
Can this PR be simply closed based on the discussion and its age?
Or is there any merit for looking into this again/further?

-- 
Andriy Gapon
Comment 7 Andriy Gapon freebsd_committer freebsd_triage 2010-12-05 16:56:45 UTC
State Changed
From-To: open->closed

Closing per the discussion.