Bug 239873 - www/firefox and mail/thunderbird don't like the new ASLR "stackgap" feature
Summary: www/firefox and mail/thunderbird don't like the new ASLR "stackgap" feature
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-gecko (Nobody)
URL:
Keywords: needs-qa
Depends on:
Blocks: 252629
  Show dependency treegraph
 
Reported: 2019-08-15 09:39 UTC by sigsys
Modified: 2021-12-30 15:37 UTC (History)
13 users (show)

See Also:
bugzilla: maintainer-feedback? (gecko)


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description sigsys 2019-08-15 09:39:44 UTC
Both fail with a "too much recursion" error message during start up.

I'm guessing the feature confuses them about the depth of the stack somehow.

Tested on 12.0-STABLE r351060.

They both work fine with the stackgap sysctls set to 0.

And they both have been working fine with the other ASLR features on ever since this was committed to 12-STABLE BTW.
Comment 1 Val Packett 2020-05-08 12:22:12 UTC
hm, according to https://wiki.freebsd.org/ASLR the base ntpd also doesn't like stackgap..
Comment 2 Thibault Payet 2020-08-11 17:46:57 UTC
Do we know how to make it working by using 

/usr/bin/proccontrol -m stackgap -s disable firefox

This command still get the too much recursion error.

Currently I could either disable aslr completely for firefox, or just globally not enabling stackgap and keep aslr.
Comment 3 sigsys 2020-08-12 20:15:04 UTC
(In reply to Thibault Payet from comment #2)
Same problem here.

Looks like the proccontrol stackgap toggle only affects the stack "guard page" (handled by vm_map_stack_locked() in sys/vm/vm_map.c), not the ASLR randomized stackgap.

This patch makes it affect the ASLR stackgap too and that makes firefox work with proccontrol.

diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c
index fe71acabe0b..56623f29d4e 100644
--- a/sys/kern/imgact_elf.c
+++ b/sys/kern/imgact_elf.c
@@ -2766,6 +2766,9 @@ __elfN(stackgap)(struct image_params *imgp, uintptr_t *stack_base)
 
 	if ((imgp->map_flags & MAP_ASLR) == 0)
 		return;
+	if ((imgp->proc->p_flag2 & P2_STKGAP_DISABLE) != 0 ||
+	    (imgp->proc->p_fctl0 & NT_FREEBSD_FCTL_STKGAP_DISABLE) != 0)
+		return;
 	pct = __elfN(aslr_stack_gap);
 	if (pct == 0)
 		return;

Also if you mark firefox's binary with the new ELF feature flag to disable stackgap like so:

# elfctl -e +stackgap /usr/local/bin/firefox

Then firefox just works without needing to start with it proccontrol.
Comment 4 commit-hook freebsd_committer freebsd_triage 2020-12-18 23:14:49 UTC
A commit references this bug:

Author: kib
Date: Fri Dec 18 23:14:41 UTC 2020
New revision: 368772
URL: https://svnweb.freebsd.org/changeset/base/368772

Log:
  Add ELF flag to disable ASLR stack gap.

  Also centralize and unify checks to enable ASLR stack gap in a new
  helper exec_stackgap().

  PR:	239873
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week

Changes:
  head/sys/compat/freebsd32/freebsd32_misc.c
  head/sys/kern/imgact_elf.c
  head/sys/kern/kern_exec.c
  head/sys/sys/elf_common.h
  head/sys/sys/imgact.h
  head/usr.bin/elfctl/elfctl.c
Comment 5 Konstantin Belousov freebsd_committer freebsd_triage 2020-12-18 23:17:24 UTC
(In reply to sigsys from comment #3)
These are different stack gaps.  One that is controlled by STKFGAP_DISABLE is at
the bottom of the stacks and prevent stack overflow from stomping on the
nearby mappings.

ASLR stack gap is at top, and it only makes an impression of being useful.
I just added an ELF feature control bit to disable it.

I considered adding procctl(2) but decided that it is not very useful,
ELF flag should be enough.
Comment 6 sigsys 2020-12-19 01:19:44 UTC
(In reply to Konstantin Belousov from comment #5)
Ok. Yeah... I see it's much better to disable just the ASLR stackgap separately like this (since the "bottom" one isn't the problem).

Firefox/Thunderbird work fine with this new "aslrstkgap" elfctl(1) toggle. Thanks for doing this!

Now all that would be missing is to have the ports set it automatically on the installed binaries.  Am I right to assume that setting unknown ELF flags like this should be harmless for older FreeBSD versions that don't support them?
Comment 7 Konstantin Belousov freebsd_committer freebsd_triage 2020-12-19 01:51:20 UTC
(In reply to sigsys from comment #6)
Unknown flags are ignored.
Comment 8 sigsys 2020-12-19 04:10:54 UTC
This does it for all USE_GECKO ports (www/firefox, www/firefox-esr and mail/thunderbird).

diff --git a/Mk/bsd.commands.mk b/Mk/bsd.commands.mk
index f1a229d04948..0d38d7b321bb 100644
--- a/Mk/bsd.commands.mk
+++ b/Mk/bsd.commands.mk
@@ -36,6 +36,7 @@ DIALOG4PORTS?=		${LOCALBASE}/bin/dialog4ports
 DIFF?=			/usr/bin/diff
 DIRNAME?=		/usr/bin/dirname
 EGREP?=			/usr/bin/egrep
+ELFCTL?=		/usr/bin/elfctl
 EXPR?=			/bin/expr
 FALSE?=			false	# Shell builtin
 FILE?=			/usr/bin/file
diff --git a/Mk/bsd.gecko.mk b/Mk/bsd.gecko.mk
index b58e697c52a9..1881080a9d87 100644
--- a/Mk/bsd.gecko.mk
+++ b/Mk/bsd.gecko.mk
@@ -110,6 +110,7 @@ PLISTF?=	${WRKDIR}/plist_files
 
 MOZCONFIG?=		${WRKSRC}/.mozconfig
 MOZILLA_PLIST_DIRS?=	bin lib share/pixmaps share/applications
+MOZILLA_ELFCTLFIX_BINS?=	lib/${MOZILLA}/${MOZILLA} lib/${MOZILLA}/${MOZILLA_BIN}
 
 # Adjust -C target-cpu if -march/-mcpu is set by bsd.cpu.mk
 .if ${ARCH} == amd64 || ${ARCH} == i386
@@ -376,7 +377,14 @@ pre-configure-script:
 	@${SETENV} CC="${CC}" OPSYS="${OPSYS}" OSVERSION="${OSVERSION}" WRKDIR="${WRKDIR}" \
 		${SH} ${SCRIPTSDIR}/rust-compat11-canary.sh
 
-post-install-script: gecko-create-plist
+post-install-script: gecko-elfctlfix gecko-create-plist
+
+gecko-elfctlfix:
+# Avoids "too much recursion" errors when the ASLR "stackgap" is globally enabled.
+.for bin in ${MOZILLA_ELFCTLFIX_BINS}
+	@if test -x ${ELFCTL} && ${ELFCTL} -l | ${GREP} -q aslrstkgap; then \
+		${ELFCTL} -e +aslrstkgap ${STAGEDIR}${PREFIX}/${bin}; fi
+.endfor
 
 gecko-create-plist:
 # Create the plist
Comment 9 Ed Maste freebsd_committer freebsd_triage 2021-01-13 19:26:28 UTC
See also https://reviews.freebsd.org/D28139 where I propose prefixing the disable flags with "no"

I've proposed an additional review to accept the flag with and without the "no" prefix for now, https://reviews.freebsd.org/D28140
Comment 10 commit-hook freebsd_committer freebsd_triage 2021-01-14 20:12:11 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3dfcb70b6ae9bcb9fd6a66721bebdb8c6a53c329

commit 3dfcb70b6ae9bcb9fd6a66721bebdb8c6a53c329
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2021-01-13 19:21:38 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-01-14 20:09:08 +0000

    elfctl: add backwards compatibility for "no" prefixes

    I am going to prefix opt-out ELF feature flag names with "no" to make
    their meaning more clear (review D28139), but there are some uses of the
    existing names already (e.g., the PR referenced below).

    For now accept the older, unprefixed name as well, and emit a warning.
    We can revert this after FreeBSD 13 branches.

    % elfctl -e +aslr foo
    elfctl: interpreting aslr as noaslr; please specify noaslr

    PR:             239873 (related)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D28140

 usr.bin/elfctl/elfctl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
Comment 11 commit-hook freebsd_committer freebsd_triage 2021-01-26 14:45:52 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=49d3dcb041f058880486e3489ca79c9476ac7abf

commit 49d3dcb041f058880486e3489ca79c9476ac7abf
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2021-01-13 19:21:38 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-01-26 14:43:42 +0000

    elfctl: add backwards compatibility for "no" prefixes

    I am going to prefix opt-out ELF feature flag names with "no" to make
    their meaning more clear (review D28139), but there are some uses of the
    existing names already (e.g., the PR referenced below).

    For now accept the older, unprefixed name as well, and emit a warning.
    We can revert this after FreeBSD 13 branches.

    % elfctl -e +aslr foo
    elfctl: interpreting aslr as noaslr; please specify noaslr

    PR:             239873 (related)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D28140

    (cherry picked from commit 3dfcb70b6ae9bcb9fd6a66721bebdb8c6a53c329)

 usr.bin/elfctl/elfctl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
Comment 12 Alastair Hogge 2021-03-05 01:43:43 UTC
I just came across this issues updating 4 of desktop systems, thanks for the fixes, everyone.

(In reply to sigsys from comment #8)
Any chance this will be committed?
Comment 13 Kubilay Kocak freebsd_committer freebsd_triage 2021-06-09 01:26:44 UTC
^Triage:

 - What's left for this to proceed to resolution?
 - Who can/should be Assignee to coordinate resolution? Please self-assign, thanks!
Comment 14 sigsys 2021-06-11 22:28:46 UTC
Well, here's the patch updated to use the new feature names with the "no" prefix. Not using the new elfctl -i flag (which would have been perfect for this) because older 12.X might have an elfctl that doesn't support it yet.

And I tested it again to confirm that firefox/thunderbird still don't work with the stackgap (same error as before) and that when built and installed with this change they do work.

This seems very unlikely to cause any problems to me.  This is the first use of elfctl in the ports but not the first use of ELF tampering (brandelf is used for example).


diff --git a/Mk/bsd.commands.mk b/Mk/bsd.commands.mk
index 620c62eb1533..221a04e4b8f2 100644
--- a/Mk/bsd.commands.mk
+++ b/Mk/bsd.commands.mk
@@ -34,6 +34,7 @@ DIALOG4PORTS?=		${LOCALBASE}/bin/dialog4ports
 DIFF?=			/usr/bin/diff
 DIRNAME?=		/usr/bin/dirname
 EGREP?=			/usr/bin/egrep
+ELFCTL?=		/usr/bin/elfctl
 EXPR?=			/bin/expr
 FALSE?=			false	# Shell builtin
 FILE?=			/usr/bin/file
diff --git a/Mk/bsd.gecko.mk b/Mk/bsd.gecko.mk
index 3a48e802ea40..81ed6b7fc6dd 100644
--- a/Mk/bsd.gecko.mk
+++ b/Mk/bsd.gecko.mk
@@ -111,6 +111,7 @@ PLISTF?=	${WRKDIR}/plist_files
 
 MOZCONFIG?=		${WRKSRC}/.mozconfig
 MOZILLA_PLIST_DIRS?=	bin lib share/pixmaps share/applications
+MOZILLA_ELFCTLFIX_BINS?=	lib/${MOZILLA}/${MOZILLA} lib/${MOZILLA}/${MOZILLA_BIN}
 
 # Adjust -C target-cpu if -march/-mcpu is set by bsd.cpu.mk
 .if ${ARCH} == amd64 || ${ARCH} == i386
@@ -385,7 +386,14 @@ pre-configure-script:
 	@${SETENV} CC="${CC}" OPSYS="${OPSYS}" OSVERSION="${OSVERSION}" WRKDIR="${WRKDIR}" \
 		${SH} ${SCRIPTSDIR}/rust-compat11-canary.sh
 
-post-install-script: gecko-create-plist
+post-install-script: gecko-elfctlfix gecko-create-plist
+
+gecko-elfctlfix:
+# Avoids "too much recursion" errors when the ASLR "stackgap" is globally enabled.
+.for bin in ${MOZILLA_ELFCTLFIX_BINS}
+	@if test -x ${ELFCTL} && ${ELFCTL} -l | ${GREP} -q noaslrstkgap; then \
+		${ELFCTL} -e +noaslrstkgap ${STAGEDIR}${PREFIX}/${bin}; fi
+.endfor
 
 gecko-create-plist:
 # Create the plist
Comment 15 Evgenii Khramtsov 2021-09-02 18:44:09 UTC
Kernel with these patches fixes this here on 14-CURRENT/amd64:
https://reviews.freebsd.org/D31516
https://reviews.freebsd.org/D31692
Comment 16 Dawid Gorecki 2021-09-10 15:44:27 UTC
Hi,

I abandoned D31692 and created an updated solution, you can see all the patches needed to resolve the problem in revisions below:
https://reviews.freebsd.org/D31516
https://reviews.freebsd.org/D31897
https://reviews.freebsd.org/D31898
I think those patches should resolve most, if not all, of the issues currently seen with stack gap.
Comment 17 commit-hook freebsd_committer freebsd_triage 2021-10-15 08:24:20 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=78df56ccfcb40013a3e6904bd6d39836220c3550

commit 78df56ccfcb40013a3e6904bd6d39836220c3550
Author:     Dawid Gorecki <dgr@semihalf.com>
AuthorDate: 2021-10-13 19:06:05 +0000
Commit:     Marcin Wojtas <mw@FreeBSD.org>
CommitDate: 2021-10-15 08:21:56 +0000

    libthr: Use kern.stacktop for thread stack calculation.

    Use the new kern.stacktop sysctl to retrieve the address of stack top
    instead of kern.usrstack. kern.usrstack does not have any knowledge
    of the stack gap, so this can cause problems with thread stacks.
    Using kern.stacktop sysctl should fix most of those problems.
    kern.usrstack is used as a fallback when kern.stacktop cannot be read.

    Rename usrstack variables to stacktop to reflect this change.

    Fixes problems with firefox and thunderbird not starting with
    stack gap enabled.

    PR: 239873
    Reviewed by: kib
    Obtained from: Semihalf
    Sponsored by: Stormshield
    MFC after: 1 month
    Differential Revision: https://reviews.freebsd.org/D31898

 lib/libthr/thread/thr_init.c    | 19 +++++++++++--------
 lib/libthr/thread/thr_private.h |  2 +-
 lib/libthr/thread/thr_stack.c   | 22 ++++++++++++----------
 3 files changed, 24 insertions(+), 19 deletions(-)
Comment 18 sigsys 2021-10-18 18:15:32 UTC
Nice to see a complete fix for this. No issues that I can tell with firefox (or any other programs so far) with stackgap enabled after a few hours of use. It just works now.
Comment 19 Ed Maste freebsd_committer freebsd_triage 2021-10-19 20:03:05 UTC
(In reply to sigsys from comment #18)
Thank you for testing and the follow-up!
Comment 20 commit-hook freebsd_committer freebsd_triage 2021-12-30 15:37:44 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d8247df24c1d3bacadd60a6633d76fb6cb56fd0e

commit d8247df24c1d3bacadd60a6633d76fb6cb56fd0e
Author:     Dawid Gorecki <dgr@semihalf.com>
AuthorDate: 2021-10-13 19:06:05 +0000
Commit:     Marcin Wojtas <mw@FreeBSD.org>
CommitDate: 2021-12-30 15:25:25 +0000

    libthr: Use kern.stacktop for thread stack calculation.

    Use the new kern.stacktop sysctl to retrieve the address of stack top
    instead of kern.usrstack. kern.usrstack does not have any knowledge
    of the stack gap, so this can cause problems with thread stacks.
    Using kern.stacktop sysctl should fix most of those problems.
    kern.usrstack is used as a fallback when kern.stacktop cannot be read.

    Rename usrstack variables to stacktop to reflect this change.

    Fixes problems with firefox and thunderbird not starting with
    stack gap enabled.

    PR: 239873
    Reviewed by: kib
    Obtained from: Semihalf
    Sponsored by: Stormshield
    MFC after: 1 month
    Differential Revision: https://reviews.freebsd.org/D31898

    (cherry picked from commit 78df56ccfcb40013a3e6904bd6d39836220c3550)

 lib/libthr/thread/thr_init.c    | 19 +++++++++++--------
 lib/libthr/thread/thr_private.h |  2 +-
 lib/libthr/thread/thr_stack.c   | 22 ++++++++++++----------
 3 files changed, 24 insertions(+), 19 deletions(-)