Bug 260303 - lang/sdcc: seg fault during build (ASLR fallout)
Summary: lang/sdcc: seg fault during build (ASLR fallout)
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords:
Depends on:
Blocks: 259968
  Show dependency treegraph
 
Reported: 2021-12-10 07:28 UTC by Daniel O'Connor
Modified: 2022-02-16 18:00 UTC (History)
9 users (show)

See Also:


Attachments
Mark sdcc with elfctl do disable ASLR otherwise it blows up (647 bytes, patch)
2021-12-11 05:20 UTC, Daniel O'Connor
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel O'Connor 2021-12-10 07:28:25 UTC
I got a notice from a port builder it was crashing and reproduced the problem in a VM.

It crashes at:
   ../../bin/sdcc -I../../device/include -I../../device/include/mcs51 -mds390 --nostdinc --std-c11 -c _schar2fs.c -o ds390/_schar2fs.rel

It appears to be caused by ASLR - with it disabled via proccontrol the above command runs reliably all the time, without it crashes ~90% of the time.

I had a look at the core dumb but it seems quite uninformative:
[freebsd14 7:06] /usr/ports/lang/sdcc/work/sdcc-4.0.0/device/lib >sudo gdb ../../bin/sdcc sdcc.core
GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../../bin/sdcc...
[New LWP 100122]
Core was generated by `../../bin/sdcc -I../../device/include -I../../device/include/mcs51 -mds390 --nos'.
Program terminated with signal SIGSEGV, Segmentation fault.
Invalid permissions for mapped object.
#0  0x0000000804e3fbc0 in setrlimit () from /lib/libc.so.7
(gdb) info thread
  Id   Target Id         Frame
* 1    LWP 100122        0x0000000804e3fbc0 in setrlimit () from /lib/libc.so.7
(gdb) bt
#0  0x0000000804e3fbc0 in setrlimit () from /lib/libc.so.7
Backtrace stopped: Cannot access memory at address 0x7fffff87fd08
quit)
Comment 1 Dimitry Andric freebsd_committer freebsd_triage 2021-12-10 14:15:11 UTC
See also bug 253208, and commit https://cgit.freebsd.org/src/commit/?id=889b56c8cd84c9a9f2d9e3b019c154d6f14d9021 .

The problem appears to be the setrlimit() call in SDCCmain.c:

/*-------------------------------------------------------------*/
/* setStackSize - set the stack size of a running sdcc process */
/*-------------------------------------------------------------*/
static void                                                                                                                                                                                                                          setStackSize (void)
{
#if defined (HAVE_SETRLIMIT) && defined (RLIMIT_STACK)
  struct rlimit rl = {4 * 1024 * 1024, 4 * 1024 * 1024};
  setrlimit (RLIMIT_STACK, &rl);
#endif
}

It segfaults immediately upon return from the setrlimit() syscall wrapper.

Doing:

elfctl -e +noaslrstkgap ${WRKSRC}/bin/sdcc

makes the problem go away.
Comment 2 Dawid Gorecki 2021-12-10 14:45:04 UTC
Yeah, this problem is caused by stack gap. The program crashes immediately after calling setrlimit because it limits the stack to a very low value(4M). The stack gap in FreeBSD is often larger than that, for amd64 it can be by default as large as 15M. If the stack gap is larger than stack resource limit then you can see what happens.

The commit Dimitry linked is related to this issue. However, while we take into account the size of the stack gap when calculating stack limit, we only do so for rlim_cur, rlim_max acts as a hard limit, which is not adjusted. The only way to fix this issue currently is by either disabling the stack gap or by setting rlim_max to a larger value. 20M should be enough. In that situation rlim_cur would automatically adjust itself to 4M + stack_gap.
Comment 3 Dimitry Andric freebsd_committer freebsd_triage 2021-12-10 14:51:49 UTC
I'm testing Andrew's suggestion of:

diff --git a/sys/kern/kern_resource.c b/sys/kern/kern_resource.c
index 793ded63d91c..8ee98473159c 100644
--- a/sys/kern/kern_resource.c
+++ b/sys/kern/kern_resource.c
@@ -672,8 +672,12 @@ kern_proc_setrlimit(struct thread *td, struct proc *p, u_int which,
        if (limp->rlim_max < 0)
                limp->rlim_max = RLIM_INFINITY;

-       if (which == RLIMIT_STACK && limp->rlim_cur != RLIM_INFINITY)
-               limp->rlim_cur += p->p_vmspace->vm_stkgap;
+       if (which == RLIMIT_STACK) {
+               if (limp->rlim_cur != RLIM_INFINITY)
+                       limp->rlim_cur += p->p_vmspace->vm_stkgap;
+               if (limp->rlim_max != RLIM_INFINITY)
+                       limp->rlim_max += p->p_vmspace->vm_stkgap;
+       }

        oldssiz.rlim_cur = 0;
        newlim = lim_alloc();
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2021-12-10 20:41:23 UTC
(In reply to Dawid Gorecki from comment #2)
I did some digging into this earlier this week, and I think the problems go deeper.  I believe the stack gap randomization should be disabled in main until this is sorted out properly.

Right now on amd64 we have

---------  <---- top of user memory
VDSO page
---------  <---- address returned by KERN_USRSTACK sysctl
ps_strings
auxv
---------
stack gap
---------  <---- address returned by KERN_STACKTOP sysctl
argv, envv
main process stack
...
---------  <---- 
bottom of stack

Some problems:

1. The stack gap is still mapped, just unused.  So mlockall() will wire the stack gap region, wasting megabytes of memory.

2. The default RLIMIT_STACK value of 512 MB includes the stack gap.  But setrlimit(RLIMIT_STACK) adds the size of the gap to the soft limit.  So:
2a. The hard limit is clamped to the soft limit, so setting rlim_cur = rlim_max doesn't work.  That is the problem reported in this PR.
2b. The stack gap adjustment is visible to userspace.  The simplest example,

    getrlimit(RLIMIT_STACK, &lim);
    setrlimit(RLIMIT_STACK, &lim);

    does not work as expected, i.e., it actually changes the stored limit (assuming 2a) is fixed.

3. The changes made recently to singlethread_map_stacks_exec() and init_private() in libthr don't make sense to me.  In the former, we grab the KERN_STACKTOP and then mprotect() a region of size RLIMIT_STACK, but part of that region is unmapped (or mapped for something other than the stack) since RLIMIT_STACK includes the stack gap but KERN_STACKTOP is already adjusted by the stack gap.  Meanwhile, __libc_map_stacks_exec() was not similarly modified.  Moreover, old copies of libthr.so will continue to use KERN_USRSTACK.

Rather than including the gap in RLIMIT_STACK, can we map the full 512MB stack starting at the (randomized) argv/envv location (creating a separate mapping for ps_strings/auxv), or include ps_strings/auxv at the beginning of the randomized region?  The latter is more work, since the kernel uses the ABI-defined ps_strings address in some places.  It would break very old binaries that hard-code the ps_strings address, but aren't they effectively all 32-bit binaries anyway?

In any case, I propose disabling the stack gap randomization for now to address the regressions.  sdcc is not the only affected program.
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2021-12-10 20:48:19 UTC
(In reply to Mark Johnston from comment #4)
I also don't really like the co-existence of KERN_STACKTOP and KERN_USRSTACK and would like to remove the former.  The need for both implies that any userland code using KERN_USRSTACK is going to be subtly broken when running on a new kernel.
Comment 6 Daniel O'Connor 2021-12-11 05:20:38 UTC
Created attachment 230037 [details]
Mark sdcc with elfctl do disable ASLR otherwise it blows up
Comment 7 Daniel O'Connor 2021-12-11 05:30:28 UTC
(In reply to Daniel O'Connor from comment #6)
This patch papers over the issue to the extent that the build finishes fine for me and it compiles the 8051 code $work uses.
Comment 8 commit-hook freebsd_committer freebsd_triage 2021-12-14 21:20:30 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bfd451210e5972c1f2ec5200b6ca6ca70a9f24ae

commit bfd451210e5972c1f2ec5200b6ca6ca70a9f24ae
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-12-14 21:15:06 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-12-14 21:19:40 +0000

    imgact_elf: Disable the stack gap for now

    The integration with RLIMIT_STACK is still causing problems for some
    programs such as lang/sdcc and syzkaller's executor.  Until this is
    resolved by some work currently in progress, disable the stack gap by
    default.

    PR:             260303
    Reviewed by:    kib, emaste
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33438

 sys/kern/imgact_elf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2021-12-20 14:12:58 UTC
As a follow-up, I've been working on a change to the way the stack gap works: rather than maintaining a fixed address for the top of the stack and including the gap against the RLIMIT_STACK limit, I changed the ELF image activator and exec_copyout_strings() to locate the stack below a randomized gap.  This way, the RLIMIT_STACK implementation requires no knowledge of the stack gap.
Comment 10 Eric van Gyzen freebsd_committer freebsd_triage 2022-01-12 16:13:55 UTC
(In reply to Mark Johnston from comment #9)

> This way, the RLIMIT_STACK implementation requires no knowledge of the stack gap.

I like this.  Tag me on the review.
Comment 11 Mark Johnston freebsd_committer freebsd_triage 2022-01-12 16:14:53 UTC
(In reply to Eric van Gyzen from comment #10)
https://reviews.freebsd.org/D33704 for the record.
Comment 12 commit-hook freebsd_committer freebsd_triage 2022-01-17 21:13:31 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1811c1e957ee1250b08b3246fc0db37ddf64b736

commit 1811c1e957ee1250b08b3246fc0db37ddf64b736
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-01-17 16:42:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-01-17 21:12:36 +0000

    exec: Reimplement stack address randomization

    The approach taken by the stack gap implementation was to insert a
    random gap between the top of the fixed stack mapping and the true top
    of the main process stack.  This approach was chosen so as to avoid
    randomizing the previously fixed address of certain process metadata
    stored at the top of the stack, but had some shortcomings.  In
    particular, mlockall(2) calls would wire the gap, bloating the process'
    memory usage, and RLIMIT_STACK included the size of the gap so small
    (< several MB) limits could not be used.

    There is little value in storing each process' ps_strings at a fixed
    location, as only very old programs hard-code this address; consumers
    were converted decades ago to use a sysctl-based interface for this
    purpose.  Thus, this change re-implements stack address randomization by
    simply breaking the convention of storing ps_strings at a fixed
    location, and randomizing the location of the entire stack mapping.
    This implementation is simpler and avoids the problems mentioned above,
    while being unlikely to break compatibility anywhere the default ASLR
    settings are used.

    The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
    and is re-enabled by default.

    PR:             260303
    Reviewed by:    kib
    Discussed with: emaste, mw
    MFC after:      1 month
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33704

 share/man/man7/security.7     | 16 ++++----
 sys/i386/linux/imgact_linux.c |  4 ++
 sys/kern/imgact_aout.c        |  4 ++
 sys/kern/imgact_elf.c         | 27 +++++++++-----
 sys/kern/kern_exec.c          | 86 ++++++++++++++++++++++++++++++-------------
 sys/sys/exec.h                |  3 +-
 sys/sys/imgact.h              |  1 +
 sys/vm/vm_map.c               |  4 +-
 sys/vm/vm_map.h               |  9 +++--
 9 files changed, 103 insertions(+), 51 deletions(-)
Comment 13 commit-hook freebsd_committer freebsd_triage 2022-01-17 21:13:34 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5a8413e779a2258b02f92bad69a03d283c4bf2a6

commit 5a8413e779a2258b02f92bad69a03d283c4bf2a6
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-01-17 16:42:13 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-01-17 16:42:13 +0000

    setrlimit: Remove special handling for RLIMIT_STACK with a stack gap

    This will not be required with a forthcoming reimplementation of ASLR
    stack randomization.  Moreover, this change was not sufficient to enable
    the use of a stack size limit smaller than the stack gap itself.

    PR:             260303
    Reviewed by:    kib
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33704

 sys/kern/kern_resource.c | 3 ---
 1 file changed, 3 deletions(-)
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2022-01-17 21:36:05 UTC
I was able to build lang/sdcc several times back-to-back on main with stack randomization re-enabled, so I believe this is fixed.
Comment 15 Daniel O'Connor 2022-01-18 02:20:28 UTC
(In reply to Mark Johnston from comment #14)
Thanks Mark, now to wait for beefy18 to be updated so it stops spamming me :)
Comment 16 commit-hook freebsd_committer freebsd_triage 2022-01-24 14:21:42 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1a97674b4637d07189fca0d4d8cea29bcf0d0efa

commit 1a97674b4637d07189fca0d4d8cea29bcf0d0efa
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-01-17 16:42:13 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-01-24 14:16:52 +0000

    setrlimit: Remove special handling for RLIMIT_STACK with a stack gap

    This will not be required with a forthcoming reimplementation of ASLR
    stack randomization.  Moreover, this change was not sufficient to enable
    the use of a stack size limit smaller than the stack gap itself.

    PR:             260303
    Reviewed by:    kib
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 5a8413e779a2258b02f92bad69a03d283c4bf2a6)

 sys/kern/kern_resource.c | 3 ---
 1 file changed, 3 deletions(-)
Comment 17 commit-hook freebsd_committer freebsd_triage 2022-02-16 18:00:14 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5fa005e91560785dad5183db080209447afde3c2

commit 5fa005e91560785dad5183db080209447afde3c2
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-01-17 16:42:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-02-16 16:55:03 +0000

    exec: Reimplement stack address randomization

    The approach taken by the stack gap implementation was to insert a
    random gap between the top of the fixed stack mapping and the true top
    of the main process stack.  This approach was chosen so as to avoid
    randomizing the previously fixed address of certain process metadata
    stored at the top of the stack, but had some shortcomings.  In
    particular, mlockall(2) calls would wire the gap, bloating the process'
    memory usage, and RLIMIT_STACK included the size of the gap so small
    (< several MB) limits could not be used.

    There is little value in storing each process' ps_strings at a fixed
    location, as only very old programs hard-code this address; consumers
    were converted decades ago to use a sysctl-based interface for this
    purpose.  Thus, this change re-implements stack address randomization by
    simply breaking the convention of storing ps_strings at a fixed
    location, and randomizing the location of the entire stack mapping.
    This implementation is simpler and avoids the problems mentioned above,
    while being unlikely to break compatibility anywhere the default ASLR
    settings are used.

    The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
    and is re-enabled by default.

    PR:             260303
    Reviewed by:    kib
    Discussed with: emaste, mw
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 1811c1e957ee1250b08b3246fc0db37ddf64b736)

 share/man/man7/security.7     | 16 ++++-----
 sys/i386/linux/imgact_linux.c |  4 +++
 sys/kern/imgact_aout.c        |  4 +++
 sys/kern/imgact_elf.c         | 27 ++++++++------
 sys/kern/kern_exec.c          | 84 ++++++++++++++++++++++++++++++-------------
 sys/sys/exec.h                |  3 +-
 sys/sys/imgact.h              |  1 +
 sys/vm/vm_map.c               |  4 +--
 sys/vm/vm_map.h               |  9 ++---
 9 files changed, 102 insertions(+), 50 deletions(-)