Both fail with a "too much recursion" error message during start up. I'm guessing the feature confuses them about the depth of the stack somehow. Tested on 12.0-STABLE r351060. They both work fine with the stackgap sysctls set to 0. And they both have been working fine with the other ASLR features on ever since this was committed to 12-STABLE BTW.
hm, according to https://wiki.freebsd.org/ASLR the base ntpd also doesn't like stackgap..
Do we know how to make it working by using /usr/bin/proccontrol -m stackgap -s disable firefox This command still get the too much recursion error. Currently I could either disable aslr completely for firefox, or just globally not enabling stackgap and keep aslr.
(In reply to Thibault Payet from comment #2) Same problem here. Looks like the proccontrol stackgap toggle only affects the stack "guard page" (handled by vm_map_stack_locked() in sys/vm/vm_map.c), not the ASLR randomized stackgap. This patch makes it affect the ASLR stackgap too and that makes firefox work with proccontrol. diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c index fe71acabe0b..56623f29d4e 100644 --- a/sys/kern/imgact_elf.c +++ b/sys/kern/imgact_elf.c @@ -2766,6 +2766,9 @@ __elfN(stackgap)(struct image_params *imgp, uintptr_t *stack_base) if ((imgp->map_flags & MAP_ASLR) == 0) return; + if ((imgp->proc->p_flag2 & P2_STKGAP_DISABLE) != 0 || + (imgp->proc->p_fctl0 & NT_FREEBSD_FCTL_STKGAP_DISABLE) != 0) + return; pct = __elfN(aslr_stack_gap); if (pct == 0) return; Also if you mark firefox's binary with the new ELF feature flag to disable stackgap like so: # elfctl -e +stackgap /usr/local/bin/firefox Then firefox just works without needing to start with it proccontrol.
A commit references this bug: Author: kib Date: Fri Dec 18 23:14:41 UTC 2020 New revision: 368772 URL: https://svnweb.freebsd.org/changeset/base/368772 Log: Add ELF flag to disable ASLR stack gap. Also centralize and unify checks to enable ASLR stack gap in a new helper exec_stackgap(). PR: 239873 Sponsored by: The FreeBSD Foundation MFC after: 1 week Changes: head/sys/compat/freebsd32/freebsd32_misc.c head/sys/kern/imgact_elf.c head/sys/kern/kern_exec.c head/sys/sys/elf_common.h head/sys/sys/imgact.h head/usr.bin/elfctl/elfctl.c
(In reply to sigsys from comment #3) These are different stack gaps. One that is controlled by STKFGAP_DISABLE is at the bottom of the stacks and prevent stack overflow from stomping on the nearby mappings. ASLR stack gap is at top, and it only makes an impression of being useful. I just added an ELF feature control bit to disable it. I considered adding procctl(2) but decided that it is not very useful, ELF flag should be enough.
(In reply to Konstantin Belousov from comment #5) Ok. Yeah... I see it's much better to disable just the ASLR stackgap separately like this (since the "bottom" one isn't the problem). Firefox/Thunderbird work fine with this new "aslrstkgap" elfctl(1) toggle. Thanks for doing this! Now all that would be missing is to have the ports set it automatically on the installed binaries. Am I right to assume that setting unknown ELF flags like this should be harmless for older FreeBSD versions that don't support them?
(In reply to sigsys from comment #6) Unknown flags are ignored.
This does it for all USE_GECKO ports (www/firefox, www/firefox-esr and mail/thunderbird). diff --git a/Mk/bsd.commands.mk b/Mk/bsd.commands.mk index f1a229d04948..0d38d7b321bb 100644 --- a/Mk/bsd.commands.mk +++ b/Mk/bsd.commands.mk @@ -36,6 +36,7 @@ DIALOG4PORTS?= ${LOCALBASE}/bin/dialog4ports DIFF?= /usr/bin/diff DIRNAME?= /usr/bin/dirname EGREP?= /usr/bin/egrep +ELFCTL?= /usr/bin/elfctl EXPR?= /bin/expr FALSE?= false # Shell builtin FILE?= /usr/bin/file diff --git a/Mk/bsd.gecko.mk b/Mk/bsd.gecko.mk index b58e697c52a9..1881080a9d87 100644 --- a/Mk/bsd.gecko.mk +++ b/Mk/bsd.gecko.mk @@ -110,6 +110,7 @@ PLISTF?= ${WRKDIR}/plist_files MOZCONFIG?= ${WRKSRC}/.mozconfig MOZILLA_PLIST_DIRS?= bin lib share/pixmaps share/applications +MOZILLA_ELFCTLFIX_BINS?= lib/${MOZILLA}/${MOZILLA} lib/${MOZILLA}/${MOZILLA_BIN} # Adjust -C target-cpu if -march/-mcpu is set by bsd.cpu.mk .if ${ARCH} == amd64 || ${ARCH} == i386 @@ -376,7 +377,14 @@ pre-configure-script: @${SETENV} CC="${CC}" OPSYS="${OPSYS}" OSVERSION="${OSVERSION}" WRKDIR="${WRKDIR}" \ ${SH} ${SCRIPTSDIR}/rust-compat11-canary.sh -post-install-script: gecko-create-plist +post-install-script: gecko-elfctlfix gecko-create-plist + +gecko-elfctlfix: +# Avoids "too much recursion" errors when the ASLR "stackgap" is globally enabled. +.for bin in ${MOZILLA_ELFCTLFIX_BINS} + @if test -x ${ELFCTL} && ${ELFCTL} -l | ${GREP} -q aslrstkgap; then \ + ${ELFCTL} -e +aslrstkgap ${STAGEDIR}${PREFIX}/${bin}; fi +.endfor gecko-create-plist: # Create the plist
See also https://reviews.freebsd.org/D28139 where I propose prefixing the disable flags with "no" I've proposed an additional review to accept the flag with and without the "no" prefix for now, https://reviews.freebsd.org/D28140
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3dfcb70b6ae9bcb9fd6a66721bebdb8c6a53c329 commit 3dfcb70b6ae9bcb9fd6a66721bebdb8c6a53c329 Author: Ed Maste <emaste@FreeBSD.org> AuthorDate: 2021-01-13 19:21:38 +0000 Commit: Ed Maste <emaste@FreeBSD.org> CommitDate: 2021-01-14 20:09:08 +0000 elfctl: add backwards compatibility for "no" prefixes I am going to prefix opt-out ELF feature flag names with "no" to make their meaning more clear (review D28139), but there are some uses of the existing names already (e.g., the PR referenced below). For now accept the older, unprefixed name as well, and emit a warning. We can revert this after FreeBSD 13 branches. % elfctl -e +aslr foo elfctl: interpreting aslr as noaslr; please specify noaslr PR: 239873 (related) MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28140 usr.bin/elfctl/elfctl.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=49d3dcb041f058880486e3489ca79c9476ac7abf commit 49d3dcb041f058880486e3489ca79c9476ac7abf Author: Ed Maste <emaste@FreeBSD.org> AuthorDate: 2021-01-13 19:21:38 +0000 Commit: Ed Maste <emaste@FreeBSD.org> CommitDate: 2021-01-26 14:43:42 +0000 elfctl: add backwards compatibility for "no" prefixes I am going to prefix opt-out ELF feature flag names with "no" to make their meaning more clear (review D28139), but there are some uses of the existing names already (e.g., the PR referenced below). For now accept the older, unprefixed name as well, and emit a warning. We can revert this after FreeBSD 13 branches. % elfctl -e +aslr foo elfctl: interpreting aslr as noaslr; please specify noaslr PR: 239873 (related) MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28140 (cherry picked from commit 3dfcb70b6ae9bcb9fd6a66721bebdb8c6a53c329) usr.bin/elfctl/elfctl.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
I just came across this issues updating 4 of desktop systems, thanks for the fixes, everyone. (In reply to sigsys from comment #8) Any chance this will be committed?
^Triage: - What's left for this to proceed to resolution? - Who can/should be Assignee to coordinate resolution? Please self-assign, thanks!
Well, here's the patch updated to use the new feature names with the "no" prefix. Not using the new elfctl -i flag (which would have been perfect for this) because older 12.X might have an elfctl that doesn't support it yet. And I tested it again to confirm that firefox/thunderbird still don't work with the stackgap (same error as before) and that when built and installed with this change they do work. This seems very unlikely to cause any problems to me. This is the first use of elfctl in the ports but not the first use of ELF tampering (brandelf is used for example). diff --git a/Mk/bsd.commands.mk b/Mk/bsd.commands.mk index 620c62eb1533..221a04e4b8f2 100644 --- a/Mk/bsd.commands.mk +++ b/Mk/bsd.commands.mk @@ -34,6 +34,7 @@ DIALOG4PORTS?= ${LOCALBASE}/bin/dialog4ports DIFF?= /usr/bin/diff DIRNAME?= /usr/bin/dirname EGREP?= /usr/bin/egrep +ELFCTL?= /usr/bin/elfctl EXPR?= /bin/expr FALSE?= false # Shell builtin FILE?= /usr/bin/file diff --git a/Mk/bsd.gecko.mk b/Mk/bsd.gecko.mk index 3a48e802ea40..81ed6b7fc6dd 100644 --- a/Mk/bsd.gecko.mk +++ b/Mk/bsd.gecko.mk @@ -111,6 +111,7 @@ PLISTF?= ${WRKDIR}/plist_files MOZCONFIG?= ${WRKSRC}/.mozconfig MOZILLA_PLIST_DIRS?= bin lib share/pixmaps share/applications +MOZILLA_ELFCTLFIX_BINS?= lib/${MOZILLA}/${MOZILLA} lib/${MOZILLA}/${MOZILLA_BIN} # Adjust -C target-cpu if -march/-mcpu is set by bsd.cpu.mk .if ${ARCH} == amd64 || ${ARCH} == i386 @@ -385,7 +386,14 @@ pre-configure-script: @${SETENV} CC="${CC}" OPSYS="${OPSYS}" OSVERSION="${OSVERSION}" WRKDIR="${WRKDIR}" \ ${SH} ${SCRIPTSDIR}/rust-compat11-canary.sh -post-install-script: gecko-create-plist +post-install-script: gecko-elfctlfix gecko-create-plist + +gecko-elfctlfix: +# Avoids "too much recursion" errors when the ASLR "stackgap" is globally enabled. +.for bin in ${MOZILLA_ELFCTLFIX_BINS} + @if test -x ${ELFCTL} && ${ELFCTL} -l | ${GREP} -q noaslrstkgap; then \ + ${ELFCTL} -e +noaslrstkgap ${STAGEDIR}${PREFIX}/${bin}; fi +.endfor gecko-create-plist: # Create the plist
Kernel with these patches fixes this here on 14-CURRENT/amd64: https://reviews.freebsd.org/D31516 https://reviews.freebsd.org/D31692
Hi, I abandoned D31692 and created an updated solution, you can see all the patches needed to resolve the problem in revisions below: https://reviews.freebsd.org/D31516 https://reviews.freebsd.org/D31897 https://reviews.freebsd.org/D31898 I think those patches should resolve most, if not all, of the issues currently seen with stack gap.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=78df56ccfcb40013a3e6904bd6d39836220c3550 commit 78df56ccfcb40013a3e6904bd6d39836220c3550 Author: Dawid Gorecki <dgr@semihalf.com> AuthorDate: 2021-10-13 19:06:05 +0000 Commit: Marcin Wojtas <mw@FreeBSD.org> CommitDate: 2021-10-15 08:21:56 +0000 libthr: Use kern.stacktop for thread stack calculation. Use the new kern.stacktop sysctl to retrieve the address of stack top instead of kern.usrstack. kern.usrstack does not have any knowledge of the stack gap, so this can cause problems with thread stacks. Using kern.stacktop sysctl should fix most of those problems. kern.usrstack is used as a fallback when kern.stacktop cannot be read. Rename usrstack variables to stacktop to reflect this change. Fixes problems with firefox and thunderbird not starting with stack gap enabled. PR: 239873 Reviewed by: kib Obtained from: Semihalf Sponsored by: Stormshield MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D31898 lib/libthr/thread/thr_init.c | 19 +++++++++++-------- lib/libthr/thread/thr_private.h | 2 +- lib/libthr/thread/thr_stack.c | 22 ++++++++++++---------- 3 files changed, 24 insertions(+), 19 deletions(-)
Nice to see a complete fix for this. No issues that I can tell with firefox (or any other programs so far) with stackgap enabled after a few hours of use. It just works now.
(In reply to sigsys from comment #18) Thank you for testing and the follow-up!
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=d8247df24c1d3bacadd60a6633d76fb6cb56fd0e commit d8247df24c1d3bacadd60a6633d76fb6cb56fd0e Author: Dawid Gorecki <dgr@semihalf.com> AuthorDate: 2021-10-13 19:06:05 +0000 Commit: Marcin Wojtas <mw@FreeBSD.org> CommitDate: 2021-12-30 15:25:25 +0000 libthr: Use kern.stacktop for thread stack calculation. Use the new kern.stacktop sysctl to retrieve the address of stack top instead of kern.usrstack. kern.usrstack does not have any knowledge of the stack gap, so this can cause problems with thread stacks. Using kern.stacktop sysctl should fix most of those problems. kern.usrstack is used as a fallback when kern.stacktop cannot be read. Rename usrstack variables to stacktop to reflect this change. Fixes problems with firefox and thunderbird not starting with stack gap enabled. PR: 239873 Reviewed by: kib Obtained from: Semihalf Sponsored by: Stormshield MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D31898 (cherry picked from commit 78df56ccfcb40013a3e6904bd6d39836220c3550) lib/libthr/thread/thr_init.c | 19 +++++++++++-------- lib/libthr/thread/thr_private.h | 2 +- lib/libthr/thread/thr_stack.c | 22 ++++++++++++---------- 3 files changed, 24 insertions(+), 19 deletions(-)