I got a notice from a port builder it was crashing and reproduced the problem in a VM. It crashes at: ../../bin/sdcc -I../../device/include -I../../device/include/mcs51 -mds390 --nostdinc --std-c11 -c _schar2fs.c -o ds390/_schar2fs.rel It appears to be caused by ASLR - with it disabled via proccontrol the above command runs reliably all the time, without it crashes ~90% of the time. I had a look at the core dumb but it seems quite uninformative: [freebsd14 7:06] /usr/ports/lang/sdcc/work/sdcc-4.0.0/device/lib >sudo gdb ../../bin/sdcc sdcc.core GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD] Copyright (C) 2021 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-portbld-freebsd14.0". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from ../../bin/sdcc... [New LWP 100122] Core was generated by `../../bin/sdcc -I../../device/include -I../../device/include/mcs51 -mds390 --nos'. Program terminated with signal SIGSEGV, Segmentation fault. Invalid permissions for mapped object. #0 0x0000000804e3fbc0 in setrlimit () from /lib/libc.so.7 (gdb) info thread Id Target Id Frame * 1 LWP 100122 0x0000000804e3fbc0 in setrlimit () from /lib/libc.so.7 (gdb) bt #0 0x0000000804e3fbc0 in setrlimit () from /lib/libc.so.7 Backtrace stopped: Cannot access memory at address 0x7fffff87fd08 quit)
See also bug 253208, and commit https://cgit.freebsd.org/src/commit/?id=889b56c8cd84c9a9f2d9e3b019c154d6f14d9021 . The problem appears to be the setrlimit() call in SDCCmain.c: /*-------------------------------------------------------------*/ /* setStackSize - set the stack size of a running sdcc process */ /*-------------------------------------------------------------*/ static void setStackSize (void) { #if defined (HAVE_SETRLIMIT) && defined (RLIMIT_STACK) struct rlimit rl = {4 * 1024 * 1024, 4 * 1024 * 1024}; setrlimit (RLIMIT_STACK, &rl); #endif } It segfaults immediately upon return from the setrlimit() syscall wrapper. Doing: elfctl -e +noaslrstkgap ${WRKSRC}/bin/sdcc makes the problem go away.
Yeah, this problem is caused by stack gap. The program crashes immediately after calling setrlimit because it limits the stack to a very low value(4M). The stack gap in FreeBSD is often larger than that, for amd64 it can be by default as large as 15M. If the stack gap is larger than stack resource limit then you can see what happens. The commit Dimitry linked is related to this issue. However, while we take into account the size of the stack gap when calculating stack limit, we only do so for rlim_cur, rlim_max acts as a hard limit, which is not adjusted. The only way to fix this issue currently is by either disabling the stack gap or by setting rlim_max to a larger value. 20M should be enough. In that situation rlim_cur would automatically adjust itself to 4M + stack_gap.
I'm testing Andrew's suggestion of: diff --git a/sys/kern/kern_resource.c b/sys/kern/kern_resource.c index 793ded63d91c..8ee98473159c 100644 --- a/sys/kern/kern_resource.c +++ b/sys/kern/kern_resource.c @@ -672,8 +672,12 @@ kern_proc_setrlimit(struct thread *td, struct proc *p, u_int which, if (limp->rlim_max < 0) limp->rlim_max = RLIM_INFINITY; - if (which == RLIMIT_STACK && limp->rlim_cur != RLIM_INFINITY) - limp->rlim_cur += p->p_vmspace->vm_stkgap; + if (which == RLIMIT_STACK) { + if (limp->rlim_cur != RLIM_INFINITY) + limp->rlim_cur += p->p_vmspace->vm_stkgap; + if (limp->rlim_max != RLIM_INFINITY) + limp->rlim_max += p->p_vmspace->vm_stkgap; + } oldssiz.rlim_cur = 0; newlim = lim_alloc();
(In reply to Dawid Gorecki from comment #2) I did some digging into this earlier this week, and I think the problems go deeper. I believe the stack gap randomization should be disabled in main until this is sorted out properly. Right now on amd64 we have --------- <---- top of user memory VDSO page --------- <---- address returned by KERN_USRSTACK sysctl ps_strings auxv --------- stack gap --------- <---- address returned by KERN_STACKTOP sysctl argv, envv main process stack ... --------- <---- bottom of stack Some problems: 1. The stack gap is still mapped, just unused. So mlockall() will wire the stack gap region, wasting megabytes of memory. 2. The default RLIMIT_STACK value of 512 MB includes the stack gap. But setrlimit(RLIMIT_STACK) adds the size of the gap to the soft limit. So: 2a. The hard limit is clamped to the soft limit, so setting rlim_cur = rlim_max doesn't work. That is the problem reported in this PR. 2b. The stack gap adjustment is visible to userspace. The simplest example, getrlimit(RLIMIT_STACK, &lim); setrlimit(RLIMIT_STACK, &lim); does not work as expected, i.e., it actually changes the stored limit (assuming 2a) is fixed. 3. The changes made recently to singlethread_map_stacks_exec() and init_private() in libthr don't make sense to me. In the former, we grab the KERN_STACKTOP and then mprotect() a region of size RLIMIT_STACK, but part of that region is unmapped (or mapped for something other than the stack) since RLIMIT_STACK includes the stack gap but KERN_STACKTOP is already adjusted by the stack gap. Meanwhile, __libc_map_stacks_exec() was not similarly modified. Moreover, old copies of libthr.so will continue to use KERN_USRSTACK. Rather than including the gap in RLIMIT_STACK, can we map the full 512MB stack starting at the (randomized) argv/envv location (creating a separate mapping for ps_strings/auxv), or include ps_strings/auxv at the beginning of the randomized region? The latter is more work, since the kernel uses the ABI-defined ps_strings address in some places. It would break very old binaries that hard-code the ps_strings address, but aren't they effectively all 32-bit binaries anyway? In any case, I propose disabling the stack gap randomization for now to address the regressions. sdcc is not the only affected program.
(In reply to Mark Johnston from comment #4) I also don't really like the co-existence of KERN_STACKTOP and KERN_USRSTACK and would like to remove the former. The need for both implies that any userland code using KERN_USRSTACK is going to be subtly broken when running on a new kernel.
Created attachment 230037 [details] Mark sdcc with elfctl do disable ASLR otherwise it blows up
(In reply to Daniel O'Connor from comment #6) This patch papers over the issue to the extent that the build finishes fine for me and it compiles the 8051 code $work uses.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=bfd451210e5972c1f2ec5200b6ca6ca70a9f24ae commit bfd451210e5972c1f2ec5200b6ca6ca70a9f24ae Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2021-12-14 21:15:06 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-12-14 21:19:40 +0000 imgact_elf: Disable the stack gap for now The integration with RLIMIT_STACK is still causing problems for some programs such as lang/sdcc and syzkaller's executor. Until this is resolved by some work currently in progress, disable the stack gap by default. PR: 260303 Reviewed by: kib, emaste Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33438 sys/kern/imgact_elf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
As a follow-up, I've been working on a change to the way the stack gap works: rather than maintaining a fixed address for the top of the stack and including the gap against the RLIMIT_STACK limit, I changed the ELF image activator and exec_copyout_strings() to locate the stack below a randomized gap. This way, the RLIMIT_STACK implementation requires no knowledge of the stack gap.
(In reply to Mark Johnston from comment #9) > This way, the RLIMIT_STACK implementation requires no knowledge of the stack gap. I like this. Tag me on the review.
(In reply to Eric van Gyzen from comment #10) https://reviews.freebsd.org/D33704 for the record.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=1811c1e957ee1250b08b3246fc0db37ddf64b736 commit 1811c1e957ee1250b08b3246fc0db37ddf64b736 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-01-17 16:42:56 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-01-17 21:12:36 +0000 exec: Reimplement stack address randomization The approach taken by the stack gap implementation was to insert a random gap between the top of the fixed stack mapping and the true top of the main process stack. This approach was chosen so as to avoid randomizing the previously fixed address of certain process metadata stored at the top of the stack, but had some shortcomings. In particular, mlockall(2) calls would wire the gap, bloating the process' memory usage, and RLIMIT_STACK included the size of the gap so small (< several MB) limits could not be used. There is little value in storing each process' ps_strings at a fixed location, as only very old programs hard-code this address; consumers were converted decades ago to use a sysctl-based interface for this purpose. Thus, this change re-implements stack address randomization by simply breaking the convention of storing ps_strings at a fixed location, and randomizing the location of the entire stack mapping. This implementation is simpler and avoids the problems mentioned above, while being unlikely to break compatibility anywhere the default ASLR settings are used. The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack, and is re-enabled by default. PR: 260303 Reviewed by: kib Discussed with: emaste, mw MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33704 share/man/man7/security.7 | 16 ++++---- sys/i386/linux/imgact_linux.c | 4 ++ sys/kern/imgact_aout.c | 4 ++ sys/kern/imgact_elf.c | 27 +++++++++----- sys/kern/kern_exec.c | 86 ++++++++++++++++++++++++++++++------------- sys/sys/exec.h | 3 +- sys/sys/imgact.h | 1 + sys/vm/vm_map.c | 4 +- sys/vm/vm_map.h | 9 +++-- 9 files changed, 103 insertions(+), 51 deletions(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=5a8413e779a2258b02f92bad69a03d283c4bf2a6 commit 5a8413e779a2258b02f92bad69a03d283c4bf2a6 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-01-17 16:42:13 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-01-17 16:42:13 +0000 setrlimit: Remove special handling for RLIMIT_STACK with a stack gap This will not be required with a forthcoming reimplementation of ASLR stack randomization. Moreover, this change was not sufficient to enable the use of a stack size limit smaller than the stack gap itself. PR: 260303 Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33704 sys/kern/kern_resource.c | 3 --- 1 file changed, 3 deletions(-)
I was able to build lang/sdcc several times back-to-back on main with stack randomization re-enabled, so I believe this is fixed.
(In reply to Mark Johnston from comment #14) Thanks Mark, now to wait for beefy18 to be updated so it stops spamming me :)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=1a97674b4637d07189fca0d4d8cea29bcf0d0efa commit 1a97674b4637d07189fca0d4d8cea29bcf0d0efa Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-01-17 16:42:13 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-01-24 14:16:52 +0000 setrlimit: Remove special handling for RLIMIT_STACK with a stack gap This will not be required with a forthcoming reimplementation of ASLR stack randomization. Moreover, this change was not sufficient to enable the use of a stack size limit smaller than the stack gap itself. PR: 260303 Reviewed by: kib Sponsored by: The FreeBSD Foundation (cherry picked from commit 5a8413e779a2258b02f92bad69a03d283c4bf2a6) sys/kern/kern_resource.c | 3 --- 1 file changed, 3 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=5fa005e91560785dad5183db080209447afde3c2 commit 5fa005e91560785dad5183db080209447afde3c2 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-01-17 16:42:56 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-02-16 16:55:03 +0000 exec: Reimplement stack address randomization The approach taken by the stack gap implementation was to insert a random gap between the top of the fixed stack mapping and the true top of the main process stack. This approach was chosen so as to avoid randomizing the previously fixed address of certain process metadata stored at the top of the stack, but had some shortcomings. In particular, mlockall(2) calls would wire the gap, bloating the process' memory usage, and RLIMIT_STACK included the size of the gap so small (< several MB) limits could not be used. There is little value in storing each process' ps_strings at a fixed location, as only very old programs hard-code this address; consumers were converted decades ago to use a sysctl-based interface for this purpose. Thus, this change re-implements stack address randomization by simply breaking the convention of storing ps_strings at a fixed location, and randomizing the location of the entire stack mapping. This implementation is simpler and avoids the problems mentioned above, while being unlikely to break compatibility anywhere the default ASLR settings are used. The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack, and is re-enabled by default. PR: 260303 Reviewed by: kib Discussed with: emaste, mw Sponsored by: The FreeBSD Foundation (cherry picked from commit 1811c1e957ee1250b08b3246fc0db37ddf64b736) share/man/man7/security.7 | 16 ++++----- sys/i386/linux/imgact_linux.c | 4 +++ sys/kern/imgact_aout.c | 4 +++ sys/kern/imgact_elf.c | 27 ++++++++------ sys/kern/kern_exec.c | 84 ++++++++++++++++++++++++++++++------------- sys/sys/exec.h | 3 +- sys/sys/imgact.h | 1 + sys/vm/vm_map.c | 4 +-- sys/vm/vm_map.h | 9 ++--- 9 files changed, 102 insertions(+), 50 deletions(-)