Bug 250707 - mail/rspamd: core dump on 12.2-RELEASE
Summary: mail/rspamd: core dump on 12.2-RELEASE
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Many People
Assignee: Kyle Evans
Reported: 2020-10-29 01:51 UTC by Thomas Morper
Modified: 2020-10-29 23:55 UTC
bugzilla: maintainer-feedback? (vsevolod)

Description Thomas Morper 2020-10-29 01:51:50 UTC
On 12.2-RELEASE rspamd coredumps on startup due to an issue with luajit.

There's a detailed bug report in https://github.com/rspamd/rspamd/issues/3386

Rebuilding luajit on 12.2 solves the issue, but the official package repository is still built on 12.1 and will be for some time, resulting in a broken rspamd package.

Although the issue originates in luajit, I've been unable to reproduce it with anything other than rspamd.
Comment 1 Andrew "RhodiumToad" Gierth 2020-10-29 04:26:11 UTC
I have a reproducer without rspamd:

#include <lua.h>
#include <lauxlib.h>

int main(void)
{
    lua_State *L = luaL_newstate();
    if (!luaL_loadstring(L, "error([[foo!]])"))
        lua_pcall(L, 0, 0, 0);
    return 0;
}

cc -g -I/usr/local/include/luajit-2.0 luaexc.c -L/usr/local/lib -lluajit-5.1

This segfaults if built against the 12.1-built luajit package, even when building on 12.2. But it works fine if built against a 12.2-built luajit.

The problem here is an alignment restriction on the _Unwind_Exception structure. The compiler (both on 12.1 and 12.2) and luajit itself declare this structure to require 16-byte alignment, but for whatever reason, the 12.1 build when running on 12.2 is getting an alignment of only 8 bytes from the thread-local storage pointer returned by __tls_get_addr. The misalignment causes a bus error when a Lua error is thrown. When luajit is rebuilt on 12.2, the alignment is correct.

Possibly relevantly, this is an entry from objdump of the program headers of the 12.1-built libluajit.so:

     TLS off    0x0000000000084578 vaddr 0x0000000000084580 paddr 0x0000000000084580 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000020 flags r--

Notice that the file offset is not 16-byte aligned (I have no idea whether this even matters), whereas the 12.2-built one has this:

     TLS off    0x0000000000082470 vaddr 0x0000000000083470 paddr 0x0000000000083470 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000020 flags r--
Comment 2 Kyle Evans freebsd_committer freebsd_triage 2020-10-29 04:30:29 UTC
Tagging kib@ as well for his wealth of knowledge on TLS/ELF stuff.
Comment 3 Kyle Evans freebsd_committer freebsd_triage 2020-10-29 05:50:52 UTC
On a hunch, I reverted rtld r360067 ("Make p_vaddr % p_align == p_offset % p_align for (some) TLS segments.") and the reproducer no longer gets hit by a SIGBUS. Did this commit require some compiler support that isn't in llvm 8.0.1?
Comment 4 Konstantin Belousov freebsd_committer freebsd_triage 2020-10-29 11:46:05 UTC
(In reply to Kyle Evans from comment #3)
It is a linker mis-feature (or bug).  See the long story in the LLVM review
mentioned in r359634, https://reviews.llvm.org/D64930.

I.e. the phdr requested 16-byte alignment, with offset 8 mod 16.  I do not see how
we could detect such broken binaries automatically.  Of course I can add a knob
to manually request pre-r359634 behavior, but I am not sure that users could
easily determine that this is the workaround for their issue.
Comment 5 Kyle Evans freebsd_committer freebsd_triage 2020-10-29 16:41:11 UTC
(In reply to Konstantin Belousov from comment #4)

I think there's no reason to add a knob until we come across some hard case where recompiling isn't an option at all. In this case, I suspect our best option is to tack an llvm10 dependency onto luajit for FreeBSD/12.1 to force a rebuild with an external linker that, in theory, won't produce a binary broken in this fashion.
Comment 6 commit-hook freebsd_committer freebsd_triage 2020-10-29 23:53:49 UTC
A commit references this bug:

Author: kevans
Date: Thu Oct 29 23:53:02 UTC 2020
New revision: 553656
URL: https://svnweb.freebsd.org/changeset/ports/553656

Log:
  lang/luajit: switch to LLVM10 from ports for 12.1/amd64

  12.1 shipped with LLVM 8.0.1 which links libluajit with a bogus (improperly
  aligned) TLS segment offset. Notably, this breaks under 12.2 rtld and causes
  a SIGBUS when an error is raised.

  Since the issue is technically a broken binary, the attached patch pins
  12.1/amd64 builds of luajit to devel/llvm10 so that they can be rebuilt with
  a linker that will handle this properly and stop breaking luajit-dependent
  applications on 12.2 while the packages are still built on 12.1. This will
  naturally fall away when portmgr goes to axe conditionals solely for FreeBSD
  12.1 after it goes EOL.

  The src/Makefile patch has been dropped in this version in favor of just
  supplying the variables it was unsetting via Make arguments as a minor
  cleanup.

  PR:		250707, 250726
  Reported by:	many
  Investigation by:	Andrew Gierth <andrew tao11 riddles org uk>
  Confirmation from:	kib
  Approved by:	osa (maintainer)
  MFH:		2020Q4 (blanket: runtime fix)

Changes:
  head/lang/luajit/Makefile
  head/lang/luajit/files/patch-src_Makefile
Comment 7 commit-hook freebsd_committer freebsd_triage 2020-10-29 23:53:53 UTC
A commit references this bug:

Author: kevans
Date: Thu Oct 29 23:53:36 UTC 2020
New revision: 553657
URL: https://svnweb.freebsd.org/changeset/ports/553657

Log:
  MFH: r553656

  lang/luajit: switch to LLVM10 from ports for 12.1/amd64

  12.1 shipped with LLVM 8.0.1 which links libluajit with a bogus (improperly
  aligned) TLS segment offset. Notably, this breaks under 12.2 rtld and causes
  a SIGBUS when an error is raised.

  Since the issue is technically a broken binary, the attached patch pins
  12.1/amd64 builds of luajit to devel/llvm10 so that they can be rebuilt with
  a linker that will handle this properly and stop breaking luajit-dependent
  applications on 12.2 while the packages are still built on 12.1. This will
  naturally fall away when portmgr goes to axe conditionals solely for FreeBSD
  12.1 after it goes EOL.

  The src/Makefile patch has been dropped in this version in favor of just
  supplying the variables it was unsetting via Make arguments as a minor
  cleanup.

  PR:		250707, 250726
  Reported by:	many
  Investigation by:	Andrew Gierth <andrew tao11 riddles org uk>
  Confirmation from:	kib
  Approved by:	osa (maintainer)

  Approved by:	ports-secteam (implicit, runtime fix)

Changes:
_U  branches/2020Q4/
  branches/2020Q4/lang/luajit/Makefile
  branches/2020Q4/lang/luajit/files/patch-src_Makefile
Comment 8 Kyle Evans freebsd_committer freebsd_triage 2020-10-29 23:55:56 UTC
The new package will be available within a few days. Thanks for the report and triage work!