Bug 28966 - [patch] math libraries in linux emulation do not return same results
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 4.3-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: David Schultz
Reported: 2001-07-14 18:40 UTC by Jim.Pirzyk
Modified: 2005-03-01 04:51 UTC (History)
Description Jim.Pirzyk 2001-07-14 18:40:16 UTC
	Math libraries under Linux emulation do not return the same results
	as under native FreeBSD nor under native Linux.  This is independent
	of shared libraries (they are the same under Linux emulation as under
	native Linux).

Fix: 

I have yet to find a fix.  It may have to do with the NPX control word
	in Linux emulation, but I am not sure.
How-To-Repeat: 	Compile this program on a Linux system and run it there and on
	the FreeBSD system.  Compare results.

#include <stdio.h>
#include <math.h>
#include <stdlib.h>

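int main (int argc, char **argv) {
        /* Default input; an optional first argument overrides it. */
        double res, x = 53.278500;

        if ( argc == 2 )
                x = atof(argv[1]);

        /* exp() is resolved from the math library (libm.so.6 on Linux). */
        res = exp(x);

        printf ("x = %f\n", x);
        printf ("exp(x) = %f\n", res);

        return 0;
}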
Comment 1 dwmalone 2001-07-14 23:18:30 UTC
On Sat, Jul 14, 2001 at 10:34:09AM -0700, Jim.Pirzyk@disney.com wrote:
> >Description:
> 	Math libraries under Linux emulation do not return the same results
> 	as under native FreeBSD nor under native Linux.  This is independent
> 	of shared libraries (they are the same under Linux emulation as under
> 	native Linux).

Did you try playing with fpgetround/fpsetround or any of the related
functions on the same man page? I'd suspect you could find out what's
going on with them.

	David.
Comment 2 Jim.Pirzyk 2001-07-15 00:01:13 UTC
On Saturday 14 July 2001 03:18 pm, David Malone wrote:
> On Sat, Jul 14, 2001 at 10:34:09AM -0700, Jim.Pirzyk@disney.com wrote:
> > >Description:
> >
> > 	Math libraries under Linux emulation do not return the same results
> > 	as under native FreeBSD nor under native Linux.  This is independent
> > 	of shared libraries (they are the same under Linux emulation as under
> > 	native Linux).
>
> Did you try playing with fpgetround/fpsetround or any of the related
> functions on the same man page? I'd suspect you could find out what's
> going on with them.

I just tried them and I see the FreeBSD box is set to 53-bit precision,
but the problem is that these functions do not exist on Linux.  This
means I cannot compile the program to test what the precision is
under Linux emulation.  The problem I have is that the Linux binaries
running under FreeBSD give a different result than the FreeBSD binary
or the Linux binary under Linux.

- JimP

-- 
--- @(#) $Id: dot.signature,v 1.10 2001/05/17 23:38:49 Jim.Pirzyk Exp $
    __o   Jim.Pirzyk@disney.com ------------- pirzyk@freebsd.org
 _'\<,_   Senior Systems Engineer, Walt Disney Feature Animation 
(*)/ (*)
Comment 3 Jim.Pirzyk 2001-07-15 00:21:05 UTC
> >Description:
>
> 	Math libraries under Linux emulation do not return the same results
> 	as under native FreeBSD nor under native Linux.  This is independent
> 	of shared libraries (they are the same under Linux emulation as under
> 	native Linux).
>

What I am finding is that the -OX (where X is not 0) optimizes the
exp() call out of the binary, so it does not call exp() in libm.so.6
This does not happen under FreeBSD's gcc compiler.

Comment 4 Jim.Pirzyk 2001-07-15 01:47:34 UTC
On Saturday 14 July 2001 04:30 pm, Jim Pirzyk wrote:
> The following reply was made to PR kern/28966; it has been noted by GNATS.
>
> From: Jim Pirzyk <Jim.Pirzyk@disney.com>
> To: FreeBSD-gnats-submit@FreeBSD.ORG
> Cc:
> Subject: Re: kern/28966: math libraries in linux emulation do not return
> same results Date: Sat, 14 Jul 2001 16:21:05 -0700
>
>  > >Description:
>  >
>  > 	Math libraries under Linux emulation do not return the same results
>  > 	as under native FreeBSD nor under native Linux.  This is independent
>  > 	of shared libraries (they are the same under Linux emulation as under
>  > 	native Linux).
>
>  What I am finding is that the -OX (where X is not 0) optimizes the
>  exp() call out of the binary, so it does not call exp() in libm.so.6
>  This does not happen under FreeBSD's gcc compiler.

This is only when I include the exp call in a separate .o file, not
when I use the one in libm.so.6.

So the problem stands that using exp() on a Linux box works differently
than it does under Linux emulation.

- JimP

Comment 5 Jim.Pirzyk 2001-07-15 05:03:13 UTC
So the solution to my problem was to set the __INITIAL_NPXCW__ to
0x37F.  What I can think of is that the freebsd binary sets
the Control Word to this before running but the linux binary 
does not (because it is assumed to already be set by the kernel
at boot time).  So I would think the linux kernel module would need
to set it also.

- JimP

Comment 6 Bruce Evans 2001-07-15 12:06:32 UTC
On Sat, 14 Jul 2001, Jim Pirzyk wrote:

>  So the solution to my problem was to set the __INITIAL_NPXCW__ to
>  0x37F.  What I can think of is that the freebsd binary sets
>  the Control Word to this before running but the linux binary 
>  does not (because it is assumed to already be set by the kernel
>  at boot time).

It's sort of the opposite.  The FreeBSD kernel sets the control
word to __INITIAL_NPXCW__.  Most FreeBSD binaries have never set it.
They depend on the kernel setting it.  Linux C binaries used to set
it to 0x37F in the C startup code (except that very old Linux C binaries
set it to 0x272 IIRC, and there at least used to be a linking option
to unmask exceptions (control word 0x372?)).  Linux C binaries stopped
setting it a few years ago.

>  So I would think the linux kernel module would need
>  to set it also.

Yes, all emulators have this bug.

Bruce
Comment 7 Bruce Evans 2001-07-15 12:19:42 UTC
On Sat, 14 Jul 2001, Jim Pirzyk wrote:

>  On Saturday 14 July 2001 03:18 pm, David Malone wrote:
>  > Did you try playing with fpgetround/fpsetround or any of the related
>  > functions on the same man page? I'd suspect you could find out what's
>  > going on with them.
>  
>  I just tried them and I see the FreeBSD box is set to 53 bit precision,
>  but the problem is that these functions do not exist on Linux.  This

Some form of them should exist.  They are spelled fegetround/fesetround
in C99.

>  means I cannot compile the program to test what the precision is
>  under Linux emulation.  The problem I have is that the Linux binaries
>  running under FreeBSD give a different result than the FreeBSD binary
>  or the Linux binary under Linux.

It is probably a bug for changing the rounding mode via a standard
interface to have any effect on the result of a math function, but in
practice I think you can rely on f[ep]setround() fixing them if it is
used to "restore" the correct default.

Bruce
Comment 8 Jim.Pirzyk 2001-07-17 22:02:10 UTC
On Sunday 15 July 2001 04:06 am, you wrote:
> On Sat, 14 Jul 2001, Jim Pirzyk wrote:
> >  So the solution to my problem was to set the __INITIAL_NPXCW__ to
> >  0x37F.  What I can think of is that the freebsd binary sets
> >  the Control Word to this before running but the linux binary
> >  does not (because it is assumed to already be set by the kernel
> >  at boot time).
>
> It's sort of the opposite.  The FreeBSD kernel sets the control
> word to __INITIAL_NPXCW__.  Most FreeBSD binaries have never set it.
> They depend on the kernel setting it.  Linux C binaries used to set
> it to 0x37F in the C startup code (except that very old Linux C binaries
> set it to 0x272 IIRC, and there at least used to be a linking option
> to unmask exceptions (control word 0x372?)).  Linux C binaries stopped
> setting it a few years ago.

Not sure if this patch is technically correct in that it does not
save off the existing value of cw, but just sets it to the
default.

*** ./sys/i386/linux/linux.h.orig	Tue Jul 17 13:59:10 2001
--- ./sys/i386/linux/linux.h	Tue Jul 17 13:27:26 2001
***************
*** 167,172 ****
--- 167,175 ----
  #define LINUX_SS_DISABLE	2
  
  
+ /* sigvec */ 
+ #define	__INITAL_LINUX_NPXCW__	0x37F
+ 
  int linux_to_bsd_sigaltstack(int lsa);
  int bsd_to_linux_sigaltstack(int bsa);
  
*** ./sys/i386/linux/linux_sysvec.c.orig	Sat Jul 14 22:32:48 2001
--- ./sys/i386/linux/linux_sysvec.c	Tue Jul 17 13:30:59 2001
***************
*** 429,434 ****
--- 429,441 ----
  
  	bzero(&frame.sf_fpstate, sizeof(struct linux_fpstate));
  
+ 	/* 
+ 	 * Need to set the NPX Control Word to match what Linux uses.  This used
+ 	 * to be in each Linux binary, but more recently, it was moved to
+ 	 * the kernel and so we need to emulate that here.
+ 	 */
+ 	frame.sf_fpstate.cw = __INITAL_LINUX_NPXCW__;
+ 
  	for (i = 0; i < (LINUX_NSIG_WORDS-1); i++)
  		frame.sf_extramask[i] = lmask.__bits[i+1];
  	
- JimP

Comment 9 Jim Pirzyk freebsd_committer freebsd_triage 2001-07-19 00:24:20 UTC
Responsible Changed
From-To: freebsd-bugs->pirzyk

Picked up another of my calls
Comment 10 Bruce Evans 2001-07-19 09:51:11 UTC
On Tue, 17 Jul 2001, Jim Pirzyk wrote:

>  Not sure if this patch is technically correct in that it does not
>  save off the existing value of cw, but just sets it to the
>  default.

>  *** ./sys/i386/linux/linux_sysvec.c.orig	Sat Jul 14 22:32:48 2001
>  --- ./sys/i386/linux/linux_sysvec.c	Tue Jul 17 13:30:59 2001
>  ***************
>  *** 429,434 ****
>  --- 429,441 ----
>    
>    	bzero(&frame.sf_fpstate, sizeof(struct linux_fpstate));
>    
>  + 	/* 
>  + 	 * Need to set the NPX Control Word to match what Linux uses.  This used
>  + 	 * to be in each Linux binary, but more recently, it was moved to
>  + 	 * the kernel and so we need to emulate that here.
>  + 	 */
>  + 	frame.sf_fpstate.cw = __INITAL_LINUX_NPXCW__;
>  + 
>    	for (i = 0; i < (LINUX_NSIG_WORDS-1); i++)
>    		frame.sf_extramask[i] = lmask.__bits[i+1];
>    	

Er, this only sets it (strictly: schedules setting of it) in linux_sendsig().
I think it only works if the application catches a signal and returns fairly
normally from the signal handler.  Then sigreturn() sets it.

You need to set it in setregs() in an emulator-specific way.  NetBSD sets it in
a sysent-specific setregs() named sv_setregs.

Bruce
Comment 11 colin.percival 2002-04-14 13:41:15 UTC
   Maybe I'm missing something, but I fail to see why this is considered a 
problem.  Library functions should never be expected to produce identical 
results across platforms; the only requirement is that arithmetic functions 
(with the same precision and rounding modes) are consistent across IEEE 754 
implementations.

Colin Percival
Comment 12 Jim.Pirzyk 2002-04-15 17:18:19 UTC
On Sunday 14 April 2002 05:41 am, Colin Percival wrote:
>    Maybe I'm missing something, but I fail to see why this is considered a
> problem.  Library functions should never be expected to produce identical
> results across platforms; the only requirement is that arithmetic functions
> (with the same precision and rounding modes) are consistent across IEEE 754
> implementations.
>
> Colin Percival

But this is NOT across platforms.  It SHOULD be consistent between
native Linux and the Linuxator (since it is meant to imitate Linux).

- JimP


-- 
--- @(#) $Id: dot.signature,v 1.11 2002/02/15 23:47:51 pirzyk Exp $
    __o   Jim.Pirzyk@disney.com --------------------------------
 _'\<,_   Senior Systems Engineer, Walt Disney Feature Animation 
(*)/ (*)
Comment 13 KAREN THODE 2003-07-03 20:44:04 UTC
Replace the old patch with this one:
void floadcw(unsigned short);  /*at the beginning of linux_sysvec.c*/

floadcw(__INITAL_LINUX_NPXCW__);

and compile this asm source file into the linux emulator.

# floadcw.s
# Lucas Thode
	.text
	.globl	floadcw
floadcw:
	pushl	%ebp
	movl	%esp, %ebp
	fldcw	8(%ebp)		# first argument: the new control word
	popl	%ebp
	ret
Comment 14 KAREN THODE 2003-07-16 17:29:42 UTC
Has anybody committed my patch yet?

Lucas
Comment 15 Tilman Keskinoz freebsd_committer freebsd_triage 2004-04-23 15:17:09 UTC
Responsible Changed
From-To: pirzyk->freebsd-bugs

Jim Pirzyk returned his commit bit one year ago
Comment 16 Mark Linimon freebsd_committer freebsd_triage 2004-09-01 07:19:02 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-emulation

Over to emulation mailing list -- this appears to be a bug in the Linux 
emulation code.
Comment 17 Gerald Pfeifer freebsd_committer freebsd_triage 2004-10-31 15:20:46 UTC
Responsible Changed
From-To: freebsd-emulation->emulation
Comment 18 David Schultz freebsd_committer freebsd_triage 2005-01-24 14:55:59 UTC
Responsible Changed
From-To: emulation->das

Over to me.
Comment 19 David Schultz freebsd_committer freebsd_triage 2005-02-06 17:29:43 UTC
State Changed
From-To: open->patched

Fixed in -CURRENT, awaiting MFC.
Comment 20 David Schultz freebsd_committer freebsd_triage 2005-03-01 04:51:10 UTC
State Changed
From-To: patched->closed

MFC'd.