Bug 163710 - setjump in userboot.so causes stack corruption
Summary: setjump in userboot.so causes stack corruption
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: amd64 (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-amd64 (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-30 03:20 UTC by Russell Cattelan
Modified: 2020-10-22 13:15 UTC (History)
1 user (show)

See Also:


Attachments
file.diff (1.11 KB, patch)
2011-12-30 03:20 UTC, Russell Cattelan
no flags Details | Diff
cattelan.vcf (152 bytes, text/x-vcard; charset=utf-8)
2012-03-15 21:38 UTC, cattelan
no flags Details
cattelan.vcf (152 bytes, text/x-vcard; charset=utf-8)
2012-03-16 17:12 UTC, cattelan
no flags Details
cattelan.vcf (152 bytes, text/x-vcard; charset=utf-8)
2012-03-16 21:50 UTC, cattelan
no flags Details
cattelan.vcf (152 bytes, text/x-vcard; charset=utf-8)
2012-03-16 22:54 UTC, cattelan
no flags Details
cattelan.vcf (152 bytes, text/x-vcard; charset=utf-8)
2012-03-16 22:57 UTC, cattelan
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Russell Cattelan 2011-12-30 03:20:10 UTC
For some reason the forth interpreter is built and linked as 32bit even 
on amd64. 

The specific problem is that the header files used when compiling 32 are
the i386 variant which define the jmp_buf as such

#define	_JBLEN	11		/* Size of the jmp_buf on x86. */

typedef	struct _jmp_buf { int _jb[_JBLEN + 1]; } jmp_buf[1];

which results in a struct of 48 bytes

unfortunately libstand has a specific version of setjmp that wants to store
12 - 8 byte values

ENTRY(_setjmp)
	movq	%rdi,%rax
	movq	0(%rsp),%rdx		/* retval */
	movq	%rdx, 0(%rax)		/* 0; retval */
	movq	%rbx, 8(%rax)		/* 1; rbx */
	movq	%rsp,16(%rax)		/* 2; rsp */
	movq	%rbp,24(%rax)		/* 3; rbp */
	movq	%r12,32(%rax)		/* 4; r12 */
	movq	%r13,40(%rax)		/* 5; r13 */
	movq	%r14,48(%rax)		/* 6; r14 */
	movq	%r15,56(%rax)		/* 7; r15 */
	fnstcw	64(%rax)		/* 8; fpu cw */
	stmxcsr	68(%rax)		/*    and mxcsr */
	xorq	%rax,%rax
	ret
END(_setjmp)

What is happening then in the function ficlExecC 
the saveTib was getting corrupted due to the jmpbuf stack variable
getting over written. 
This causes bogus values to be places in the saveTib structure 
which eventually results in a fault.

This fix is to compile the forth interpreter as 64 bit when running on amd64

Fix: commit ad5840c9e38ba05bdcab3c9efea1c73272bbd9b7
Author: Russell Cattelan <cattelan@digitalelves.com>
Date:   Thu Dec 29 20:50:57 2011 -0600

    Change the build of forth interpreter to 64bit.
    The main problem is that the jmp_buf structure is
    half the size it needs to be 12 * 4 vs 12 * 8
    The setjump functions was then causing stack corruption
    since the buffer was not large enough.
Comment 1 peter 2011-12-30 07:33:37 UTC
On Thu, Dec 29, 2011 at 7:16 PM, Russell Cattelan
<cattelan@digitalelves.com> wrote:

>>Description:
> For some reason the forth interpreter is built and linked as 32bit even
> on amd64.

That's the catch. We use the same 32 bit loader on i386 and amd64.
The common loader understands both kernel formats.  This unfortunately
has meant that the libstand and sys/boot environment has had to be 32
bit.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
Comment 2 Kostik Belousov 2011-12-30 08:57:08 UTC
> For some reason the forth interpreter is built and linked as 32bit even
> on amd64.


The native amd64 boot loader is 32bit. Small trampoline is used to
switch to long mode before the control is passed to the booted kernel.

> What is happening then in the function ficlExecC
> the saveTib was getting corrupted due to the jmpbuf stack variable
> getting over written.
> This causes bogus values to be places in the saveTib structure
> which eventually results in a fault.


How do you build ? It seems that userboot build uses its own ficl
build from sys/boot/userboot/ficl.

> @@ -38,10 +37,6 @@ SOFTWORDS=   softcore.fr jhlocal.fr marker.fr freebsd.fr ficllocal.fr \
>  # Optional OO extension softwords
>  #SOFTWORDS+=   oo.fr classes.fr
>
> -.if ${MACHINE_CPUARCH} == "amd64"
> -CFLAGS+=   -m32 -march=i386 -I.
> -.endif
> -

I assume you never did even the buildworld with the change ?
Comment 3 cattelan 2011-12-30 20:24:49 UTC
On 12/30/11 1:33 AM, Peter Wemm wrote:
> On Thu, Dec 29, 2011 at 7:16 PM, Russell Cattelan
> <cattelan@digitalelves.com> wrote:
>
>>> Description:
>> For some reason the forth interpreter is built and linked as 32bit even
>> on amd64.
> That's the catch. We use the same 32 bit loader on i386 and amd64.
> The common loader understands both kernel formats.  This unfortunately
> has meant that the libstand and sys/boot environment has had to be 32
> bit.
>
Yes this is bit of an odd situation.
the loader is linked 32bit but userboot.so / libstand is built 64bit
but pulls in the 32bit ficl lib.
The 64bit libstand provides the setjmp call that needs 96 bytes
of space but since ficl is built with the 32bit jump_buf (48 byte )
structure
things go sideways.

I'm not sure the right overall fix is here.
I figured I would file the bug and see if anybody more
familiar with the loader/userboot build has any ideas.

The patch I posted fixes the issue with the forth interpreter crashing
when running under userboot.so, but yes it does break loader build.

-Russell
Comment 4 peter 2011-12-31 03:16:10 UTC
On Fri, Dec 30, 2011 at 12:24 PM, Russell Cattelan <cattelan@thebarn.com> w=
rote:
> On 12/30/11 1:33 AM, Peter Wemm wrote:
>> On Thu, Dec 29, 2011 at 7:16 PM, Russell Cattelan
>> <cattelan@digitalelves.com> wrote:
>>
>>>> Description:
>>> For some reason the forth interpreter is built and linked as 32bit even
>>> on amd64.
>> That's the catch. We use the same 32 bit loader on i386 and amd64.
>> The common loader understands both kernel formats. =A0This unfortunately
>> has meant that the libstand and sys/boot environment has had to be 32
>> bit.
>>
> Yes this is bit of an odd situation.
> the loader is linked 32bit but userboot.so / libstand is built 64bit
> but pulls in the 32bit ficl lib.
> The 64bit libstand provides the setjmp call that needs 96 bytes
> of space but since ficl is built with the 32bit jump_buf (48 byte )
> structure
> things go sideways.

We link the loader with the 32 bit libstand though.  It cannot work
any other way.  BTX is 32 bit only.

--=20
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
Comment 5 cattelan 2012-02-11 04:04:52 UTC
So here is a suggestion.

So the default build does do the right thing.

The problem is when doing userboot development the
symlink from the loader build ends up including the wrong
files.

the userboot build of fill does not need a sym link so make
sure the link does not exist when doing the build.

This make sure the right header files are used for userboot builds=20

diff --git a/sys/boot/userboot/ficl/Makefile =
b/sys/boot/userboot/ficl/Makefile
index 829ae1e..6249eb2 100644
--- a/sys/boot/userboot/ficl/Makefile
+++ b/sys/boot/userboot/ficl/Makefile
@@ -61,6 +61,10 @@ softcore.c: ${SOFTWORDS} softcore.awk
  (cd ${.CURDIR}/../../ficl/softwords; cat ${SOFTWORDS} \
      | awk -f softcore.awk -v datestamp=3D"`LC_ALL=3DC date`") > =
${.TARGET}
=20
+beforedepend ${OBJS}: no-machine
+
+no-machine:
+ rm -f   ${.CURDIR}/../../ficl/machine
 #.if ${MACHINE_CPUARCH} =3D=3D "amd64"
 #${SRCS:M*.c:R:S/$/.o/g}: machine
 #
Comment 6 cattelan 2012-03-15 21:38:59 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Does the last patch seem acceptable?

Can we close this issue out?

- -Russell
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9iYXMACgkQNRmM+OaGhBjX/QCdH2dzTGuYJdu3U7ZOAJKHnCwf
Oh8AmgL3EbpsbpI6+/qENoRaEokxn3Jc
=FvLp
-----END PGP SIGNATURE-----
Comment 7 peter 2012-03-16 16:56:55 UTC
On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan <cattelan@thebarn.com> wr=
ote:
> The following reply was made to PR amd64/163710; it has been noted by GNA=
TS.
[..]
> =A0Does the last patch seem acceptable?
>
> =A0Can we close this issue out?

Sadly not,

 +no-machine:
 + rm -f   ${.CURDIR}/../../ficl/machine

.. this is definitely bogus no matter what. This attempts to modify
the source tree which may be read only, and should never even have a
"machine->..." symlink in it to remove in the first place.

I see sys/boot/userboot/ficl/Makefile has commented out the code that
sets up the ./machine links in its ${.OBJDIR} and there's -I paths all
over the place so my guess is that it's picking up some of the i386
machine links rather than setting up its own.  You probably need to
look at the userboot/ficl/Makefile code and make sure its setting up
the correct links rather than accidently using one belonging to
something else.

Or your source tree is contaminated somehow with a machine-> link
somewhere that it isn't supposed to be.
--=20
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
Comment 8 cattelan 2012-03-16 17:12:27 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/12 11:56 AM, Peter Wemm wrote:
> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan
> <cattelan@thebarn.com> wrote:
>> The following reply was made to PR amd64/163710; it has been
>> noted by GNATS.
> [..]
>> Does the last patch seem acceptable?
>> 
>> Can we close this issue out?
> 
> Sadly not,
> 
> +no-machine: + rm -f   ${.CURDIR}/../../ficl/machine
> 
> .. this is definitely bogus no matter what. This attempts to
> modify the source tree which may be read only, and should never
> even have a "machine->..." symlink in it to remove in the first
> place.
The sym link is created by the build of ficl for the loader.
See: boot/ficl/Makefile
machine:
	ln -sf ${.CURDIR}/../../i386/include machine

Are you suggesting that is incorrect and should be fixed?

> 
> I see sys/boot/userboot/ficl/Makefile has commented out the code
> that sets up the ./machine links in its ${.OBJDIR} and there's -I
> paths all over the place so my guess is that it's picking up some
> of the i386 machine links rather than setting up its own.  You
> probably need to look at the userboot/ficl/Makefile code and make
> sure its setting up the correct links rather than accidently using
> one belonging to something else.
Well let me explain this again.
If the build is done from scratch things work because
boot/userboot/ficl is built before boot/ficl.
If an incremental build is done (e.g. when doing devel on the userboot
lib) boot/userboot/ficl will end up picking up i386 header files due
to the symlink that was created by boot/ficl/Makefile

I'll will grant you this bug isn't hit by a normal full build due
to way the build it ordered.
The problem is incremental builds, that IMHO shouldn't be creating
a bad userboot.so lib.


> 
> Or your source tree is contaminated somehow with a machine-> link 
> somewhere that it isn't supposed to be.
The problems exists in a stock freebsd src tree this is not a problem
created by anything I'm working on.

- -Russell


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9jdHsACgkQNRmM+OaGhBiKTQCdHdqVxq1KMr0sPEE2epxOZrHx
uTgAn0cDP90aJvsK6aFBZYtuAFpPWzEd
=oYfT
-----END PGP SIGNATURE-----
Comment 9 peter 2012-03-16 20:51:18 UTC
2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 3/16/12 11:56 AM, Peter Wemm wrote:
>> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan
>> <cattelan@thebarn.com> wrote:
>>> The following reply was made to PR amd64/163710; it has been
>>> noted by GNATS.
>> [..]
>>> Does the last patch seem acceptable?
>>>
>>> Can we close this issue out?
>>
>> Sadly not,
>>
>> +no-machine: + rm -f =A0 ${.CURDIR}/../../ficl/machine
>>
>> .. this is definitely bogus no matter what. This attempts to
>> modify the source tree which may be read only, and should never
>> even have a "machine->..." symlink in it to remove in the first
>> place.
> The sym link is created by the build of ficl for the loader.
> See: boot/ficl/Makefile
> machine:
> =A0 =A0 =A0 =A0ln -sf ${.CURDIR}/../../i386/include machine
>
> Are you suggesting that is incorrect and should be fixed?

No, you're reading it wrong:
"ln -sf ${.CURDIR}/../../i386/include machine"
creates ${.OBJDIR}/machine"

Your patch does a "rm -f   ${.CURDIR}/../../ficl/machine" which is in
the source tree, not the obj tree, so it would never exist.  And if it
does, then something is wrong with your build environment.

--=20
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
Comment 10 cattelan 2012-03-16 21:50:19 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/12 3:51 PM, Peter Wemm wrote:
> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>> 
>> On 3/16/12 11:56 AM, Peter Wemm wrote:
>>> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan 
>>> <cattelan@thebarn.com> wrote:
>>>> The following reply was made to PR amd64/163710; it has been 
>>>> noted by GNATS.
>>> [..]
>>>> Does the last patch seem acceptable?
>>>> 
>>>> Can we close this issue out?
>>> 
>>> Sadly not,
>>> 
>>> +no-machine: + rm -f   ${.CURDIR}/../../ficl/machine
>>> 
>>> .. this is definitely bogus no matter what. This attempts to 
>>> modify the source tree which may be read only, and should
>>> never even have a "machine->..." symlink in it to remove in the
>>> first place.
>> The sym link is created by the build of ficl for the loader. See:
>> boot/ficl/Makefile machine: ln -sf ${.CURDIR}/../../i386/include
>> machine
>> 
>> Are you suggesting that is incorrect and should be fixed?
> 
> No, you're reading it wrong: "ln -sf ${.CURDIR}/../../i386/include
> machine" creates ${.OBJDIR}/machine"
> 
> Your patch does a "rm -f   ${.CURDIR}/../../ficl/machine" which is
> in the source tree, not the obj tree, so it would never exist.  And
> if it does, then something is wrong with your build environment.
> 
This is pretty easy to reproduce.
cd /sys/boot
make

there will be a symlink in /sys/boot/ficl/machine that points to
i386/include.

If that link exists and userboot is rebuilt.
e.g.
cd /sys/boot/userboot
make

will end up with a userboot.so with an ficl that has been built with
32bit headers and thus have the wrong size structure for setjmp.

- -Russell





-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9jtZsACgkQNRmM+OaGhBgVZACggjJYocX+OfI/5Fh2s4nuKFAJ
xXQAnRXKoKqx1eM3enbv/ebTMIU7UIuQ
=GSzJ
-----END PGP SIGNATURE-----
Comment 11 peter 2012-03-16 22:35:14 UTC
2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 3/16/12 3:51 PM, Peter Wemm wrote:
>> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>
>>> On 3/16/12 11:56 AM, Peter Wemm wrote:
>>>> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan
>>>> <cattelan@thebarn.com> wrote:
>>>>> The following reply was made to PR amd64/163710; it has been
>>>>> noted by GNATS.
>>>> [..]
>>>>> Does the last patch seem acceptable?
>>>>>
>>>>> Can we close this issue out?
>>>>
>>>> Sadly not,
>>>>
>>>> +no-machine: + rm -f =A0 ${.CURDIR}/../../ficl/machine
>>>>
>>>> .. this is definitely bogus no matter what. This attempts to
>>>> modify the source tree which may be read only, and should
>>>> never even have a "machine->..." symlink in it to remove in the
>>>> first place.
>>> The sym link is created by the build of ficl for the loader. See:
>>> boot/ficl/Makefile machine: ln -sf ${.CURDIR}/../../i386/include
>>> machine
>>>
>>> Are you suggesting that is incorrect and should be fixed?
>>
>> No, you're reading it wrong: "ln -sf ${.CURDIR}/../../i386/include
>> machine" creates ${.OBJDIR}/machine"
>>
>> Your patch does a "rm -f =A0 ${.CURDIR}/../../ficl/machine" which is
>> in the source tree, not the obj tree, so it would never exist. =A0And
>> if it does, then something is wrong with your build environment.
>>
> This is pretty easy to reproduce.
> cd /sys/boot
> make

You don't do that without a 'make obj' first.

> there will be a symlink in /sys/boot/ficl/machine that points to
> i386/include.

And this is user error.  Don't do that.

--=20
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
Comment 12 peter 2012-03-16 22:49:36 UTC
On Fri, Mar 16, 2012 at 3:35 PM, Peter Wemm <peter@wemm.org> wrote:
> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 3/16/12 3:51 PM, Peter Wemm wrote:
>>> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>>
>>>> On 3/16/12 11:56 AM, Peter Wemm wrote:
>>>>> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan
>>>>> <cattelan@thebarn.com> wrote:
>>>>>> The following reply was made to PR amd64/163710; it has been
>>>>>> noted by GNATS.
>>>>> [..]
>>>>>> Does the last patch seem acceptable?
>>>>>>
>>>>>> Can we close this issue out?
>>>>>
>>>>> Sadly not,
>>>>>
>>>>> +no-machine: + rm -f =A0 ${.CURDIR}/../../ficl/machine
>>>>>
>>>>> .. this is definitely bogus no matter what. This attempts to
>>>>> modify the source tree which may be read only, and should
>>>>> never even have a "machine->..." symlink in it to remove in the
>>>>> first place.
>>>> The sym link is created by the build of ficl for the loader. See:
>>>> boot/ficl/Makefile machine: ln -sf ${.CURDIR}/../../i386/include
>>>> machine
>>>>
>>>> Are you suggesting that is incorrect and should be fixed?
>>>
>>> No, you're reading it wrong: "ln -sf ${.CURDIR}/../../i386/include
>>> machine" creates ${.OBJDIR}/machine"
>>>
>>> Your patch does a "rm -f =A0 ${.CURDIR}/../../ficl/machine" which is
>>> in the source tree, not the obj tree, so it would never exist. =A0And
>>> if it does, then something is wrong with your build environment.
>>>
>> This is pretty easy to reproduce.
>> cd /sys/boot
>> make
>
> You don't do that without a 'make obj' first.
>
>> there will be a symlink in /sys/boot/ficl/machine that points to
>> i386/include.
>
> And this is user error. =A0Don't do that.

More specifically..  You can't put code in the Makefiles that modifies
the source tree explicitly.

If sys/boot/userboot/ficl needs a different "#include <machine/*>"
then sys/boot/userboot/ficl needs to create the machine link there and
set the -I paths so that one has precedence.

But reaching over and trying to modify another tool's build area is
always wrong, aside from the obvious problem of it trying to modify
the source tree.  If you're pulling the wrong include files into
userboot/ficl, then that's because machine and -I aren't being set
correctly in userboot/ficl/Makefile.

--=20
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
Comment 13 cattelan 2012-03-16 22:54:56 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/12 5:35 PM, Peter Wemm wrote:
> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>> 
>> On 3/16/12 3:51 PM, Peter Wemm wrote:
>>> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>> 
>>>> On 3/16/12 11:56 AM, Peter Wemm wrote:
>>>>> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan 
>>>>> <cattelan@thebarn.com> wrote:
>>>>>> The following reply was made to PR amd64/163710; it has
>>>>>> been noted by GNATS.
>>>>> [..]
>>>>>> Does the last patch seem acceptable?
>>>>>> 
>>>>>> Can we close this issue out?
>>>>> 
>>>>> Sadly not,
>>>>> 
>>>>> +no-machine: + rm -f   ${.CURDIR}/../../ficl/machine
>>>>> 
>>>>> .. this is definitely bogus no matter what. This attempts
>>>>> to modify the source tree which may be read only, and
>>>>> should never even have a "machine->..." symlink in it to
>>>>> remove in the first place.
>>>> The sym link is created by the build of ficl for the loader.
>>>> See: boot/ficl/Makefile machine: ln -sf
>>>> ${.CURDIR}/../../i386/include machine
>>>> 
>>>> Are you suggesting that is incorrect and should be fixed?
>>> 
>>> No, you're reading it wrong: "ln -sf
>>> ${.CURDIR}/../../i386/include machine" creates
>>> ${.OBJDIR}/machine"
>>> 
>>> Your patch does a "rm -f   ${.CURDIR}/../../ficl/machine" which
>>> is in the source tree, not the obj tree, so it would never
>>> exist.  And if it does, then something is wrong with your build
>>> environment.
>>> 
>> This is pretty easy to reproduce. cd /sys/boot make
> 
> You don't do that without a 'make obj' first.
So this is the only supported build model?


> 
>> there will be a symlink in /sys/boot/ficl/machine that points to 
>> i386/include.
> 
> And this is user error.  Don't do that.
if an inplace build is not supported then shouldn't it just
be flat out disabled?!!
What is wrong with making the build more robust to odd error are
not possible vs says "ohh ya don't do that"?


- -Russell

fwiw this is what happens if ficl/machine exists


(gdb) print sizeof(jmp_buf)
$1 = 48
(gdb) ptype jmp_buf
type = struct _jmp_buf {
    int _jb[12];
} [1]
(gdb)

vs
(gdb) ptype jmp_buf
type = struct _jmp_buf {
    long int _jb[12];
} [1]
(gdb) print sizeof(jmp_buf)
$1 = 96



> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9jxMAACgkQNRmM+OaGhBiYTgCePFkTFRB78B9l/zgJ3xcV8JTe
5GgAn2yRdVK/vEWTSSRswCyU1E6j0jq5
=2JNI
-----END PGP SIGNATURE-----
Comment 14 cattelan 2012-03-16 22:57:23 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/12 5:49 PM, Peter Wemm wrote:
> On Fri, Mar 16, 2012 at 3:35 PM, Peter Wemm <peter@wemm.org>
> wrote:
>> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>> 
>>> On 3/16/12 3:51 PM, Peter Wemm wrote:
>>>> 2012/3/16 Russell Cattelan <cattelan@thebarn.com>:
>>>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>>>> 
>>>>> On 3/16/12 11:56 AM, Peter Wemm wrote:
>>>>>> On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan 
>>>>>> <cattelan@thebarn.com> wrote:
>>>>>>> The following reply was made to PR amd64/163710; it has
>>>>>>> been noted by GNATS.
>>>>>> [..]
>>>>>>> Does the last patch seem acceptable?
>>>>>>> 
>>>>>>> Can we close this issue out?
>>>>>> 
>>>>>> Sadly not,
>>>>>> 
>>>>>> +no-machine: + rm -f   ${.CURDIR}/../../ficl/machine
>>>>>> 
>>>>>> .. this is definitely bogus no matter what. This attempts
>>>>>> to modify the source tree which may be read only, and
>>>>>> should never even have a "machine->..." symlink in it to
>>>>>> remove in the first place.
>>>>> The sym link is created by the build of ficl for the
>>>>> loader. See: boot/ficl/Makefile machine: ln -sf
>>>>> ${.CURDIR}/../../i386/include machine
>>>>> 
>>>>> Are you suggesting that is incorrect and should be fixed?
>>>> 
>>>> No, you're reading it wrong: "ln -sf
>>>> ${.CURDIR}/../../i386/include machine" creates
>>>> ${.OBJDIR}/machine"
>>>> 
>>>> Your patch does a "rm -f   ${.CURDIR}/../../ficl/machine"
>>>> which is in the source tree, not the obj tree, so it would
>>>> never exist.  And if it does, then something is wrong with
>>>> your build environment.
>>>> 
>>> This is pretty easy to reproduce. cd /sys/boot make
>> 
>> You don't do that without a 'make obj' first.
>> 
>>> there will be a symlink in /sys/boot/ficl/machine that points
>>> to i386/include.
>> 
>> And this is user error.  Don't do that.
> 
> More specifically..  You can't put code in the Makefiles that
> modifies the source tree explicitly.
> 
> If sys/boot/userboot/ficl needs a different "#include <machine/*>" 
> then sys/boot/userboot/ficl needs to create the machine link there
> and set the -I paths so that one has precedence.
userboot/ficl does build correctly without sym links

It sounds like what you are saying it ficl/Makefile should
be fixed to *NOT* create the symlink like it is stand currently.


- -Russell

> 
> But reaching over and trying to modify another tool's build area
> is always wrong, aside from the obvious problem of it trying to
> modify the source tree.  If you're pulling the wrong include files
> into userboot/ficl, then that's because machine and -I aren't being
> set correctly in userboot/ficl/Makefile.
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9jxVMACgkQNRmM+OaGhBgfnACfbUGvWuaEoo/35nTUmWgNV4yz
vCcAoIKVu0/DnZiahKN9gZWnk9JeUzXR
=d1jj
-----END PGP SIGNATURE-----
Comment 15 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:11 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 16 DontWorry 2020-10-22 13:15:17 UTC
(In reply to peter from comment #7)
On Thu, Mar 15, 2012 at 2:40 PM, Russell Cattelan <cattelan@thebarn.com> wr=
ote:
> The following reply was made to PR amd64/163710; it has been noted by GNA=

TS.
[..]
> =A0Does the last patch seem acceptable?

>
> =A0Can we close this issue out?


Sadly not,

 +no-machine:
 + rm -f   ${.CURDIR}/../../ficl/machine

.. this is definitely bogus no matter what. This attempts to modify
the source tree which may be read only, and should never even have a
"machine->..." symlink in it to remove in the first place.

I see sys/boot/userboot/ficl/Makefile has commented out the code that
sets up the ./machine links in its ${.OBJDIR} and there's -I paths all
over the place so my guess is that it's picking up some of the i386
machine links rather than setting up its own.  You probably need to
look at the userboot/ficl/Makefile code and make sure its setting up
the correct links rather than accidently using one belonging to
something else.

Or your source tree is contaminated somehow with a machine-> link
somewhere that it isn't supposed to be.
--=20
Peter Wemm - peter@wemm.org; https://www.manganelo.link/; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell
---------------------------------------------------------------------------


Well let me explain this again.
If the build is done from scratch things work because
boot/userboot/ficl is built before boot/ficl.
If an incremental build is done (e.g. when doing devel on the userboot
lib) boot/userboot/ficl will end up picking up i386 header files due
to the symlink that was created by boot/ficl/Makefile

I'll will grant you this bug isn't hit by a normal full build due
to way the build it ordered.