Bug 19245

Summary: -fexpensive-optimizations buggy (even with -O)
Product: Base System Reporter: Mikhail Teterin <mi>
Component: i386 Assignee: David E. O'Brien <obrien>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.0-STABLE   
Hardware: Any   
OS: Any   

Description Mikhail Teterin 2000-06-13 17:30:00 UTC
	The attached piece of code, when compiled with
	``-O -fexpensive-optimizations'', produces an incorrect
	binary on FreeBSD-4.0.

	I tested the same compiler line on Mandrake Linux (an
	identical machine hardware-wise) and it compiles correctly.

	Mandrake's cc is the same as on FreeBSD:

		Reading specs from
			/usr/lib/gcc-lib/i586-mandrake-linux/2.95.2/specs
		gcc version 2.95.2 19991024 (release)
			vs. our
		Using builtin specs.
		gcc version 2.95.2 19991024 (release)

	But their assembler is newer:

		GNU assembler version 2.9.5 (i686-pc-linux-gnu)
		using BFD version 2.9.5.0.16
			vs. our
		GNU assembler version 2.9.1 (i386-unknown-freebsdelf),
		using BFD version 2.9.1

Fix: 

Get the new assembler/binutils and add -fno-expensive-optimizations
	to all CFLAGS in the meantime. Anything else?
How-To-Repeat: 
	Save the C-code below into a file bug.c. Then compile it with
		cc -O -fexpensive-optimizations bug.c -o bug

	As you can see from the code, the hostname output by both
	printfs should be the same, and on Linux and on FreeBSD without
	the -fexpensive-optimizations flag it is:

	Calling rfc1035QuestionPack with hostname 0xbffffe32 (./bug)
	In rfc1035QuestionPack: hostname is 0xbffffe32 (./bug)

	Yet, with the -fexpensive-optimizations flag, the hostname
	argument is passed in the register, which, apparently, is
	sometimes not loaded with the value and remains zero, resulting
	in:

	Calling rfc1035QuestionPack with hostname 0xbfbff8f0 (./bug)
	In rfc1035QuestionPack: hostname is 0x0 ((null))

	The code is stripped from the squid23's lib/rfc1035.c (I found
	this because squid was crashing on every request and restarting)
	-- I tried to reduce it to the bare minimum needed to reproduce
	the bug.

	/* beginning of bug.c */

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <assert.h>
#include <string.h>	/* memcpy() */
#include <arpa/inet.h>	/* htons() */

static off_t
rfc1035QuestionPack(char *buf,
    size_t sz,
    const char *hostname,
    unsigned short class
    )
{
    off_t off = 0;
    unsigned short s;
    printf("In rfc1035QuestionPack: hostname is %p (%s)\n",
	hostname, hostname);
    s = htons(class);
    memcpy(buf + off, &s, sizeof(s));
    off += sizeof(s);
    assert(off <= sz);
    return off;
}

static unsigned short
rfc1035BuildAQuery(const char *hostname, char *buf, size_t sz)
{
    off_t offset = 0;
    printf("Calling rfc1035QuestionPack with hostname %p (%s)\n",
	hostname, hostname);
    offset += rfc1035QuestionPack(buf + offset,
	sz - offset,
	hostname,
	1
    );
    return 0;
}

int main(int argc, char *argv[]) {
	char buf[1024];
	rfc1035BuildAQuery(argv[argc - 1], buf, 1024);
	return 0;
}
	/* end of bug.c */
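
The reproducer above relies on reading two printf lines by eye. A minimal self-checking sketch follows, assuming the same call shape as bug.c; the names question_pack, hostname_survives_call and seen_hostname are hypothetical and not part of squid. Because the miscompilation depends on the exact code gcc generates, this variant may or may not trigger it on an affected system.

```c
#include <arpa/inet.h>   /* htons() */
#include <assert.h>
#include <stddef.h>
#include <string.h>      /* memcpy() */
#include <sys/types.h>   /* off_t */

/* Hypothetical helper state: the pointer the callee actually received. */
static const char *seen_hostname;

/* Same argument list and body shape as rfc1035QuestionPack in bug.c. */
static off_t
question_pack(char *buf, size_t sz, const char *hostname, unsigned short qclass)
{
    off_t off = 0;
    unsigned short s;

    seen_hostname = hostname;          /* record instead of printing */
    s = htons(qclass);
    memcpy(buf + off, &s, sizeof(s));
    off += sizeof(s);
    assert((size_t)off <= sz);
    return off;
}

/* Returns 1 if the callee saw the pointer the caller passed, 0 otherwise. */
int
hostname_survives_call(const char *hostname)
{
    char buf[1024];
    off_t offset = 0;

    seen_hostname = NULL;
    offset += question_pack(buf + offset, sizeof(buf) - offset, hostname, 1);
    return offset == (off_t)sizeof(unsigned short) &&
        seen_hostname == hostname;
}
```

Calling hostname_survives_call(argv[argc - 1]) from main and returning a non-zero exit status when it yields 0 would let a script compare builds with and without -fexpensive-optimizations.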
Comment 1 Bruce Evans 2000-06-14 10:37:28 UTC
On Tue, 13 Jun 2000, Mikhail Teterin wrote:

> >Description:
> 
> 	The attached piece of code, when compiled with
> 	``-O -fexpensive-optimizations'', produces incorrect
> 	binary on FreeBSD-4.0 .
> 
> 	I tested the same compiler line on Mandrake Linux (an
> 	identical machine hardware-wise) and it compiles correctly.

This is hard to explain, since the bug shown by your example is in gcc
(2.95.2), not in the assembler or linker.

> static off_t
> rfc1035QuestionPack(char *buf,
>     size_t sz,
>     const char *hostname,
>     unsigned short class
>     )
> {
>     off_t off = 0;
>     unsigned short s;
>     printf("In rfc1035QuestionPack: hostname is %p (%s)\n",
> 	hostname, hostname);
>     s = htons(class);
>     memcpy(buf + off, &s, sizeof(s));
>     off += sizeof(s);
>     assert(off <= sz);
>     return off;
> }

gcc -O -fexpensive-optimizations reuses the stack space for `hostname' and
`class', and zeros this space to initialize `off' before loading `hostname'
or `class'.
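
The ordering problem described above can be modelled in portable C. This is only an illustration under assumptions, not the code gcc emitted: it assumes four 32-bit argument slots laid out as on an i386 stack and a 64-bit `off', so that zeroing `off' in the reused space covers both the `hostname' and `class' slots. The struct arg_slots and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of the four argument slots of rfc1035QuestionPack (assumed layout). */
struct arg_slots {
    uint32_t buf;
    uint32_t sz;
    uint32_t hostname;
    uint32_t qclass;
};

/* Correct order: read the argument first, then reuse its slot for `off'. */
uint32_t
load_then_zero(uint32_t hostname_value)
{
    struct arg_slots a = { 0x1000, 1024, hostname_value, 1 };
    unsigned char *raw = (unsigned char *)&a;
    int64_t off = 0;
    uint32_t loaded;

    assert(sizeof(a) == 16);           /* no padding in the model */
    loaded = a.hostname;
    memcpy(raw + offsetof(struct arg_slots, hostname), &off, sizeof(off));
    return loaded;
}

/* Wrong order: the slot is zeroed for `off' before the argument is read. */
uint32_t
zero_then_load(uint32_t hostname_value)
{
    struct arg_slots a = { 0x1000, 1024, hostname_value, 1 };
    unsigned char *raw = (unsigned char *)&a;
    int64_t off = 0;

    assert(sizeof(a) == 16);
    memcpy(raw + offsetof(struct arg_slots, hostname), &off, sizeof(off));
    return a.hostname;                 /* the original value is gone */
}
```

In this model the second function hands back zero regardless of its input, which matches the 0x0 printed by the callee in the report.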

> 	Yet, with the -fexpensive-optimizations flag, the hostname
> 	argument is passed in the register, which, apparently, is
> 	sometimes not loaded with the value and remains zero, resulting
> 	in:

No, -fexpensive-optimizations doesn't affect the function call protocol.
Args are still passed on the stack.

> >Fix:
> 	Get the new assembler/binutils and add -fno-expensive-optimizations
> 	to all CFLAGS in the meantime. Anything else?

Don't use -O2 (which enables -fexpensive-optimizations) unless you want to
find bugs like this :-).

Bruce
Comment 2 Mikhail Teterin 2000-06-14 17:08:17 UTC
On 14 Jun, Bruce Evans wrote:

= > The attached piece of code, when compiled with ``-O
= > -fexpensive-optimizations'', produces incorrect binary on
= > FreeBSD-4.0.
= >
= > I tested the same compiler line on Mandrake Linux (an identical
= > machine hardware-wise) and it compiles correctly.
= 
= This is hard to explain, since the bug shown by your example is in gcc
= (2.95.2), not in the assembler or linker.

Oh, I know so little... But I can give an interested party an
account on the machine to verify this... Maybe it is the specs?

AFAIK, the function calls are done differently on Linux -- could that be
the reason (with gcc mostly developed on Linux (right?) -- they may not
have caught all the bugs on other OSes)?

= Don't use -O2 (which enables -fexpensive-optimizations) unless you
= want to find bugs like this :-).

That's the thing -- I only asked for -O but with the explicit
-fexpensive-optimizations. Will -ON ever be reliable for N>1?

	-mi
Comment 3 Mike Barcroft freebsd_committer freebsd_triage 2001-07-22 04:12:42 UTC
State Changed
From-To: open->closed


GCC has major problems with optimisation.  See bde's comments for 
details.
Comment 4 Mike Barcroft freebsd_committer freebsd_triage 2001-07-23 21:07:29 UTC
On Mon, Jul 23, 2001 at 12:21:58PM -0400, Mikhail Teterin wrote:
> On 21 Jul, mike@FreeBSD.org wrote:
> > Synopsis: -fexpensive-optimizations buggy (even with -O)
> > 
> > State-Changed-From-To: open->closed
> > State-Changed-By: mike
> > State-Changed-When: Sat Jul 21 20:12:42 PDT 2001
> > State-Changed-Why: 
> > 
> > GCC has major problems with optimisation.  See bde's comments for
> > details.
> 
> First, BDE did not close the PR himself, despite his comments. Second,
> the same "troublesome" GCC produces correct code on Linux, which was
> one of the major points of my PR -- something was/is wrong with the
> FreeBSD-specific part of gcc-2.95.2... But, it seems, you are on a
> PR-closing trip...

Sorry if this was made unclear.  GCC is known to do bad things when
using anything but -O for optimisation on FreeBSD.  This has been
discussed numerous times on the mailing lists.  Also, we directly
import vendor code, so this problem must be fixed by the GCC people.
A quick search of the GCC Gnats database reveals many bug reports
involving optimisation problems.  If your problem isn't addressed
in one of those PR's, may I recommend you open a new one there?

There could be any number of reasons why Bruce didn't close your
PR then and there.  Bruce is a very busy person and might not have
had time to close this PR.  I'm CC'ing Bruce on this message to
see if I missed something in his comments that would require this
PR to be left open.

It may seem as though I'm on a "PR-closing trip" to you, but I
assure you my intentions are to determine if problems from years
ago are still problems today.  There are many PRs in our database
that should have been closed a long time ago.  I feel this PR is
one of them.

Best regards,
Mike Barcroft
Comment 5 Bruce Evans 2001-08-29 13:27:24 UTC
On Mon, 23 Jul 2001, Mike Barcroft wrote:

> On Mon, Jul 23, 2001 at 12:21:58PM -0400, Mikhail Teterin wrote:
> > On 21 Jul, mike@FreeBSD.org wrote:
> > > GCC has major problems with optimisation.  See bde's comments for
> > > details.
> >
> > First, BDE did not close the PR himself, despite his comments. Second,
> > the same "troublesome" GCC produces correct code on Linux, which was
> > one of the major points of my PR -- something was/is wrong with the
> > FreeBSD-specific part of gcc-2.95.2... But, it seems, you are on a
> > PR-closing trip...
>
> Sorry if this was made unclear.  GCC is known to do bad things when
> using anything but -O for optimisation on FreeBSD.  This has been
> ...
> There could be any number of reasons why Bruce didn't close your
> PR then and there.  Bruce is a very busy person and might not have
> had time to close this PR.  I'm CC'ing Bruce on this message to
> see if I missed something in his comments that would require this

The main reason is that I wasn't sure that it was not a FreeBSD
bug.  There is now a near-duplicate of this PR (gnu/30181) which says
that the bug is in both the FreeBSD port and original GNU version of
gcc-2.95.3, and analyses generated code to locate the wrong instructions.
It's clear that it is a gcc bug.

I don't plan to fix this.  PR 30181 suggests mentioning the problem in
the release notes.  This seems reasonable.  I think the PRs should be
cross-referenced, and closed after doing this.

Bruce
Comment 6 Mikhail T. 2001-08-29 23:07:06 UTC
 
> The main reason is that I wasn't sure that it was not a FreeBSD
> bug. There is now a near-duplicate of this PR (gnu/30181) which says
> that the bug is in both the FreeBSD port and original GNU version
> of gcc-2.95.3, and analyses generated code to locate the wrong
> instructions. It's clear that it is a gcc bug.

But on Mandrake the same code compiles correctly -- that's what I
don't understand. In fact, that's the only reason I filed the PR at all
-- I know about optimization problems in gcc in general.

Did Mandrake patch the compiler somehow? Can we incorporate their fix?

	-mi
Comment 7 Bruce Evans 2001-08-30 14:45:16 UTC
On Wed, 29 Aug 2001, Mikhail Teterin wrote:

> > The main reason is that I wasn't sure that it was not a FreeBSD
> > bug. There is now a near-duplicate of this PR (gnu/30181) which says
> > that the bug is in both the FreeBSD port and original GNU version
> > of gcc-2.95.3, and analyses generated code to locate the wrong
> > instructions. It's clear that it is a gcc bug.
>
> But on Mandrake the same code compiles correctly -- that's what I
> don't understand. In fact, that's the only reason I filed the PR at all
> -- I know about optimization problems in gcc in general.
>
> Did Mandrake patch the compiler somehow? Can we incorporate their fix?

Maybe they just use obrien's hack to silently force -O2 down to -O1 :-).
(WANT_FORCE_OPTIMIZATION_DOWNGRADE=1 in /etc/make.conf in -current.)

Bruce
Comment 8 David E. O'Brien freebsd_committer freebsd_triage 2001-08-30 21:08:10 UTC
Responsible Changed
From-To: freebsd-bugs->obrien

Uggg.. someone really should have made me aware of this PR.
Comment 9 David E. O'Brien freebsd_committer freebsd_triage 2001-08-30 21:16:51 UTC
State Changed
From-To: closed->open

Uh, this should NEVER have been closed, but rather assigned to me.
Comment 10 ashp freebsd_committer freebsd_triage 2002-01-18 03:21:31 UTC
State Changed
From-To: open->closed

Now that obrien has stepped down as GCC maintainer, I feel this should be 
closed again.  We don't have the manpower to fix bugs within GCC itself.
Comment 11 David E. O'Brien freebsd_committer freebsd_triage 2002-01-23 12:19:16 UTC
State Changed
From-To: closed->open

DO NOT FSCKING EVER CLOSE A PR ASSIGNED TO ME W/O ASKING ME FIRST!!
Comment 12 Matteo Riondato freebsd_committer freebsd_triage 2005-08-27 13:09:26 UTC
Can this PR be closed?
-- 
Matteo Riondato
FreeBSD Volunteer (http://www.FreeBSD.org/)
GUFI Staff Member (http://www.GUFI.org/)
FreeSBIE Developer (http://www.FreeSBIE.org/)
Comment 13 Ceri Davies freebsd_committer freebsd_triage 2006-10-08 18:55:30 UTC
Is the problem in this PR still outstanding?  If not, can we close the
PR please?
Comment 14 Remko Lodder freebsd_committer freebsd_triage 2007-06-28 19:17:40 UTC
State Changed
From-To: open->closed

No reply for a long time, so I am closing this PR. If you have
feedback, obrien, please post it and consider reopening the PR; if not,
then the right thing is done :) hat: bugmeister