Bug 47061

Summary: Conflicting system headers illustrated by build of graphics/cqcam
Product: Base System
Reporter: Bernard van Gastel <bvgastel>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   

Description Bernard van Gastel 2003-01-14 18:20:07 UTC
	Conflicting headers (/usr/include/machine/cpufunc.h and /usr/include/strings.h)
	while building the graphics/cqcam port.
	Each declares the function `int ffs(int)'; in cpufunc.h this function is extern,
	but in strings.h it's an inline function. I didn't have time to check whether the
	error is in the system headers or in the port, so I classified it as medium.

Fix: 

Commented out the declaration in string.h
How-To-Repeat: 	build ports/graphics/cqcam
Comment 1 Tom Hukins freebsd_committer freebsd_triage 2003-01-15 11:15:54 UTC
Responsible Changed
From-To: gnats-admin->freebsd-ports

Reassign misfiled Ports PR.
Comment 2 Kris Kennaway freebsd_committer freebsd_triage 2003-08-18 00:23:20 UTC
State Changed
From-To: open->suspended

Known problem, awaiting fix
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2003-12-24 00:58:02 UTC
Responsible Changed
From-To: freebsd-ports-bugs->linimon

I guess no one is going to fix this unless I do it ...
Comment 4 Mark Linimon 2003-12-24 04:24:20 UTC
This is really a kernel problem.  I am going to go ahead and commit a
workaround for this and the one or two other ports with this problem --
but the workaround is basically unacceptable.

The underlying problem is that machine/cpufunc.h for i386 has had
a definition for a machine function 'ffs' for, oh, say, about 9 years
now.  However, man ffs will show you that there is an ffs(3) function
as well.  Even after reading the source it's not clear to me if these
are supposed to have the same purpose -- someone with a more intimate
knowledge of i386 arch is going to have to rule for certain.

Back in 2002 a commit was done to create 'strings.h' to provide
better adherence to POSIX.  When this was done, a prototype for
ffs() was introduced for ffs(3).  These prototypes fight with each
other.  From user code, there appears to be no way (to me) to
allow access to both.  However, this port, among others, wishes
to use the strings.h definitions _and_ the inb() and outb()
functions which only cpufunc.h provides.

The only way to (correctly) fix this has to do with changes to
the include files, and that's outside the charter of the ports folks.

In the meantime, I'm going to hold my nose and commit an include
file to the port that is merely the inb/outb functions.  This is
clearly a hack that should go away once a "correct" solution is found.

mcl
Comment 5 Mark Linimon freebsd_committer freebsd_triage 2003-12-24 04:24:45 UTC
State Changed
From-To: suspended->open

This is really a kernel problem that can only be worked around 
in the ports collection with terrible hackery.  I'm going to go 
ahead and do that to unbreak graphics/cqcam and one or two other 
ports that suffer from this, but this is by no means an acceptable 
long-term solution for many reasons.  See extensive comments in 
the audit trail. 


Comment 6 Mark Linimon freebsd_committer freebsd_triage 2003-12-24 04:24:45 UTC
Responsible Changed
From-To: linimon->freebsd-bugs
Comment 7 Bruce Evans 2003-12-24 10:14:55 UTC
On Tue, 23 Dec 2003, Mark Linimon wrote:

>  This is really a kernel problem.  I am going to go ahead and commit a
>  workaround for this and the one or two other ports with this problem --
>  but the workaround is basically unacceptable.

Er, this is really a port[s] problem.  <machine/cpufunc.h> is not intended
to be included by applications.  There was never any conflict with <string.h>
in the kernel because the kernel never included <string.h>, and the kernel
now avoids bogus conflicts, if any, with gcc's builtin ffs() using
-fno-builtin.

>  The underlying problem is that machine/cpufunc.h for i386 has had
>  a definition for a machine function 'ffs' for, oh, say, about 9 years
>  now.  However, man ffs will show you that there is an ffs(3) function
>  as well.  Even after reading the source it's not clear to me if these
>  are supposed to have the same purpose -- someone with a more intimate
>  knowledge of i386 arch is going to have to rule for certain.

They are the same.  Last time I checked (less than a year ago), the gcc
builtin was still slower than the kernel inline except possibly when the
latter can use non-base-arch instructions like cmov.  amd64's always have
cmov and always use the builtin.
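
[Editorial sketch, not part of the original message.] To make "they are the same" concrete: ffs() returns the 1-based index of the least significant set bit, or 0 when the argument is 0. The names ref_ffs() and ffs_variants_agree() below are hypothetical helpers introduced only for illustration; the sketch assumes a GCC-compatible compiler (for __builtin_ffs) and an int of at least 32 bits.

```c
#include <limits.h>

/*
 * Reference ffs(): 1-based index of the least significant set bit,
 * or 0 when no bit is set.  Hypothetical helper, not a system function.
 */
static int
ref_ffs(int mask)
{
	unsigned int bits = (unsigned int)mask;
	int pos;

	if (bits == 0)
		return (0);
	for (pos = 1; (bits & 1u) == 0; pos++)
		bits >>= 1;
	return (pos);
}

/* Returns 1 if ref_ffs() and the compiler builtin agree on all samples. */
static int
ffs_variants_agree(void)
{
	static const int samples[] = { 0, 1, 2, 12, 0x80, INT_MAX, -1, INT_MIN };
	unsigned int i, k;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		if (ref_ffs(samples[i]) != __builtin_ffs(samples[i]))
			return (0);
	/* Every single-bit value below the sign bit maps to its index + 1. */
	for (k = 0; k < 31; k++)
		if (ref_ffs((int)(1u << k)) != (int)k + 1)
			return (0);
	return (1);
}
```

Any implementation with these semantics is interchangeable from the caller's point of view, so the trouble in this PR lies in the clashing declarations rather than in behaviour.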

... I checked again.  With the following slightly too simple test:

%%%
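#include <sys/types.h>
#include <machine/cpufunc.h>

#include <stdlib.h>		/* rand() */

int z[4096];

int
main(void)
{
	volatile int v;
	int i, j;

	/* Fill the table with single-bit values. */
	for (i = 0; i < 4096; i++)
		z[i] = 1 << rand();	/* Yes, this is sloppy. */
	/* Time 100000 passes over the table. */
	for (j = 0; j < 100000; j++)
		for (i = 0; i < 4096; i++)
#ifdef NOBUILTIN
			v = ffs(z[i]);		/* inline from <machine/cpufunc.h> */
#else
			v = __builtin_ffs(z[i]);	/* gcc builtin */
#endif
	return (0);
}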
%%%

Times on an Athlon XP1600 overclocked by 146/133:

cc -O -mcpu=pentiumpro -o foo foo.c (default from bsd.cpu.mk)
        3.49 real         3.47 user         0.00 sys
cc -O -mcpu=pentiumpro -DNOBUILTIN -o foo foo.c (default + kernel ffs())
        3.21 real         3.21 user         0.00 sys
cc -O -march=pentiumpro -o foo foo.c (gives cmov and works on Athlon XP too):
        3.21 real         3.21 user         0.00 sys

Here using cmov[e] gives the same amount of optimization as the kernel ffs()
gets by using a simple conditional branch instead of a slow instruction
sequence starting with "set"[e].  Mispredicted branches are expensive on
some arches, but apparently they aren't on Athlons.  The rand() in the
test was intended to cause mispredicted branches as well as lengthy
searches, but it doesn't actually.  The branch is never taken since
z[i] is never 0.  On changing the initialization of z[i] so that the
branch is taken every second time:

		if (i & 1)
			z[i] = 1 << rand();

the kernel version becomes much faster:

        2.01 real         2.00 user         0.00 sys

and the other times don't change significantly.  This is presumably
because the Athlon predicts taking the branch every second time
perfectly.  The bit-search instruction is very expensive (and always
takes the same time??) and by branching over it every second time the
cost per iteration is almost halved.
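
[Editorial sketch, not part of the original message.] The branching shape described above can be written roughly as follows. This is an illustration in the spirit of the kernel inline, not the actual <machine/cpufunc.h> source; ffs_branch() is a hypothetical name, and __builtin_ctz stands in for the bit-search instruction (assuming a GCC-compatible compiler). Whether the compiler really emits a conditional branch here depends on the target and flags.

```c
/*
 * Branching form of ffs(): test for zero first so that the bit search
 * is skipped entirely when no bit is set.  Hypothetical illustration.
 */
static inline int
ffs_branch(int mask)
{
	if (mask == 0)
		return (0);
	/* __builtin_ctz() is undefined for 0, hence the test above. */
	return (__builtin_ctz((unsigned int)mask) + 1);
}
```

With alternating zero arguments, half of the calls return before reaching the bit search, which is consistent with the near-halving of the run time reported above.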

A better benchmark might randomize the branches, but this might be
even further from real applications since an arg of 0 may be very
unlikely (or very likely).
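
[Editorial sketch, not part of the original message.] One way the table initialization could be randomized so that the zero test is not predictable. fill_randomized() and NVALUES are hypothetical names; the choice of roughly half zeros is an assumption made for illustration, not something measured in this thread. Unlike the original test, the shift count is reduced so that it stays within the width of int.

```c
#include <stdlib.h>

#define NVALUES 4096

static int z[NVALUES];

/*
 * Fill z[] so that each element is 0 with probability of about 1/2 and
 * a single-bit value otherwise.  Returns the number of zero elements.
 */
static int
fill_randomized(unsigned int seed)
{
	int i, zeros = 0;

	srand(seed);
	for (i = 0; i < NVALUES; i++) {
		/* Use a higher-order bit; low bits of rand() can be weak. */
		if (rand() & 0x1000) {
			z[i] = 0;
			zeros++;
		} else {
			/* Shift count kept in 0..30 to avoid overflow. */
			z[i] = (int)(1u << (rand() % 31));
		}
	}
	return (zeros);
}
```

Timing the same inner loop over a table filled this way would show how much of the kernel version's advantage depends on the branch being predictable.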

Times on a Celeron 366:
gcc builtin without cmov (very slow!):
       15.78 real        15.68 user         0.00 sys
gcc builtin with cmov:
        5.64 real         5.61 user         0.00 sys
kernel ffs():
        5.85 real         5.81 user         0.00 sys
kernel ffs() with alternating 0's (again, others not affected by alternating):
        5.62 real         5.58 user         0.00 sys

Times on an amd64 (sledge = Opteron 244 1804 MHz)

gcc builtin with cmov:
        2.73 real         2.72 user         0.00 sys
old kernel ffs():
        3.42 real         3.39 user         0.01 sys
kernel ffs() with alternating 0's (again, builtin affected by alternating):
        1.82 real         1.82 user         0.00 sys

So using cmov is actually significantly better than a simple branch on
amd64's, but only if the arg isn't often 0.

>  In the meantime, I'm going to hold my nose and commit an include
>  file to the port that is merely the inb/outb functions.  This is
>  clearly a hack that should go away once a "correct" solution is found.

This is approximately correct, not a hack.  The system could provide
a header that implements inb() and outb() functions for userland (*),
but <machine/cpufunc.h> is not this header.  It's just a bit much for
multiple applications to have to duplicate these interfaces.

(*) They shouldn't exist in the kernel.  Bus-space should be used.
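
[Editorial sketch, not part of the original message.] A port-local header limited to inb/outb, of the kind discussed above, might look roughly like this. It is not the file that was committed to the port: the names portio_inb(), portio_outb() and PORTIO_AVAILABLE are hypothetical, and the sketch assumes a GCC-compatible compiler targeting i386 or amd64. The functions are named differently from the system ones so that they cannot clash with anything in <machine/cpufunc.h>.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define PORTIO_AVAILABLE 1

/* Read one byte from an x86 I/O port. */
static inline uint8_t
portio_inb(uint16_t port)
{
	uint8_t data;

	/* "Nd": an 8-bit immediate port number, or the port in %dx. */
	__asm__ __volatile__("inb %1, %0" : "=a" (data) : "Nd" (port));
	return (data);
}

/* Write one byte to an x86 I/O port. */
static inline void
portio_outb(uint16_t port, uint8_t data)
{
	__asm__ __volatile__("outb %0, %1" : : "a" (data), "Nd" (port));
}

#else
/* No port I/O on this architecture; callers must check before use. */
#define PORTIO_AVAILABLE 0
#endif
```

Because such a header never declares ffs(), it can be included alongside <strings.h> without conflict. Executing these instructions from userland still requires I/O privilege, which the process has to obtain from the operating system first (on FreeBSD, typically by holding /dev/io open).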

Bruce
Comment 8 Mark Linimon 2003-12-24 10:54:12 UTC
>Er, this is really a port[s] problem [...]
>
>The system could provide
>a header that implements inb() and outb() functions for userland (*),
>but <machine/cpufunc.h> is not this header.

Other than duplicating the inb/outb code into places in the ports collection,
there is no way that I can see for the ports collection to fix this problem;
it involves some kind of change to the system headers.  So, I'm saying that
I agree with point (2) but that IMHO (2) is necessarily in conflict with (1).

If you have some other suggestion about getting inb/outb functionality into
the ports, please make it.  (Fair warning: "rewrite or delete the apps" is not
what I'm looking for :-) ... unless you're also willing to replace the junky
old parallel-port peripherals that these ports talk to).
Comment 9 Mark Linimon freebsd_committer freebsd_triage 2004-08-22 08:39:39 UTC
State Changed
From-To: open->closed

A workaround for this problem was committed around 2 months ago.