Created attachment 199211: test program

Hi,

I'm having the following crash in rtld on aarch64 when a program uses dlopen, pthread and tls variables, with the test program available at [1]:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  free_tls (tcb=0x4028e010, tcbsize=16, tcbalign=16) at /usr/src/libexec/rtld-elf/rtld.c:4842
4842            dtvsize = dtv[1];
(gdb) bt
#0  free_tls (tcb=0x4028e010, tcbsize=16, tcbalign=16) at /usr/src/libexec/rtld-elf/rtld.c:4842
#1  0x0000000040235910 in _rtld_free_tls (tcb=0x4028e010, tcbsize=16, tcbalign=<optimized out>) at /usr/src/libexec/rtld-elf/rtld.c:5062
#2  0x00000000402acde4 in _thr_free (curthread=0x406c4000, thread=0x406c4500) at /usr/src/lib/libthr/thread/thr_list.c:199
#3  0x00000000402accf0 in _thr_gc (curthread=0x406c4000) at /usr/src/lib/libthr/thread/thr_list.c:129
#4  0x00000000402ad164 in _thr_alloc (curthread=0x406c4000) at /usr/src/lib/libthr/thread/thr_list.c:141
#5  0x00000000402a2124 in _pthread_create (thread=0xffffffffe948, attr=0x0, start_routine=0x406d906c <do_something>, arg=0x0) at /usr/src/lib/libthr/thread/thr_create.c:81
#6  0x0000000000210364 in main ()
(gdb) p *0x4028e010
$1 = 666

The tcb points to my __thread variable, which seems wrong. I don't have the knowledge to debug this problem further, so any help will be greatly appreciated.

It crashes on 11.2-RELEASE and 13.0-CURRENT r340197.

[1] http://mikael.urankar.free.fr/FreeBSD/aarch64/test.c
    http://mikael.urankar.free.fr/FreeBSD/aarch64/test_lib.c
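To make the faulting line easier to read, here is a small C model of the data that free_tls appears to walk. It is a sketch under assumptions, not the rtld source: it assumes the TLS Variant I layout, where the first word of the 16-byte TCB is a pointer to the DTV and dtv[1] holds the number of module slots. The names model_tcb, model_dtv_size and model_dtv_init are hypothetical. If the tcb argument instead points at the test's __thread variable, the word read as the DTV pointer would be derived from the value 666, and the dtv[1] load would then fault, which would match frame #0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model only (hypothetical names, not the rtld source).
 * Assumed TLS Variant I layout: the first word of the TCB points to
 * the DTV; the DTV starts with two bookkeeping words followed by one
 * pointer per module.
 */
struct model_tcb {
	uintptr_t *dtv;      /* tcb[0]: pointer to the DTV */
	void      *reserved; /* tcb[1]: second word of the 16-byte TCB */
};

/* Same access pattern as the faulting line: dtvsize = dtv[1]. */
static size_t
model_dtv_size(const struct model_tcb *tcb)
{
	assert(tcb != NULL);
	/* If tcb is not a real TCB, this pointer is garbage. */
	const uintptr_t *dtv = tcb->dtv;
	return ((size_t)dtv[1]);
}

/* Fill in a well-formed DTV for nmodules modules in caller storage. */
static void
model_dtv_init(uintptr_t *dtv, uintptr_t generation, size_t nmodules)
{
	dtv[0] = generation;          /* dtv[0]: generation counter */
	dtv[1] = (uintptr_t)nmodules; /* dtv[1]: number of module slots */
	for (size_t i = 0; i < nmodules; i++)
		dtv[2 + i] = 0;       /* dtv[2..]: per-module TLS pointers */
}
```

With a well-formed TCB, model_dtv_size returns the slot count stored by model_dtv_init. The model only shows why a wrong tcb value is enough to crash here; it says nothing about where the wrong value comes from.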
Is this strictly dlopen related? I'm seeing a weird segfault in LDC: https://github.com/ldc-developers/druntime/pull/146#issuecomment-412674323 — doesn't look similar, the crash is not *in* rtld, but it's still something related to aarch64+TLS
Yep, TLS relocation for dlopened libraries is broken on aarch64. Can you test this (still WIP) patch? https://github.com/strejda/freebsd/commit/981459604061136fc68c020ff6124fab0d1196aa
(In reply to Michal Meloun from comment #2)

Unfortunately it still crashes, the backtrace is different:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000406d9094 in do_something (arg=0x0) at test_lib.c:8
8               a = 666;
[Current thread is 1 (LWP 100642)]
(gdb) bt
#0  0x00000000406d9094 in do_something (arg=0x0) at test_lib.c:8
#1  0x00000000402a28c4 in thread_start (curthread=0x406c5900) at /usr/src/lib/libthr/thread/thr_create.c:292
#2  0x00000000402a2470 in _pthread_create (thread=0xffffffffe988, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at /usr/src/lib/libthr/thread/thr_create.c:188
#3  0x0000000000210394 in main () at test.c:33
your current wip patch fixes bug #232149 :)
Final (and much more complex) version of this patch is under review now: https://reviews.freebsd.org/D18417 Michal
A commit references this bug:

Author: mmel
Date: Sat Dec 15 10:38:10 UTC 2018
New revision: 342113
URL: https://svnweb.freebsd.org/changeset/base/342113

Log:
  Improve R_AARCH64_TLSDESC relocation.
  The original code did not support dynamically loaded libraries and used suboptimal access to TLS variables.
  New implementation removes lazy resolving of TLS relocation - due to flaw in TLSDESC design is impossible to switch resolver function at runtime without expensive locking.

  Due to this, 3 specialized resolvers are implemented:
  - load time resolver for TLS relocation from libraries loaded with main executable (thus with known TLS offset).
  - resolver for undefined thread weak symbols.
  - slower lazy resolver for dynamically loaded libraries with fast path for already resolved symbols.

  PR: 228892, 232149, 233204, 232311
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D18417

Changes:
  head/libexec/rtld-elf/aarch64/reloc.c
  head/libexec/rtld-elf/aarch64/rtld_start.S
  head/libexec/rtld-elf/amd64/reloc.c
  head/libexec/rtld-elf/arm/reloc.c
  head/libexec/rtld-elf/i386/reloc.c
  head/libexec/rtld-elf/mips/reloc.c
  head/libexec/rtld-elf/powerpc/reloc.c
  head/libexec/rtld-elf/powerpc64/reloc.c
  head/libexec/rtld-elf/riscv/reloc.c
  head/libexec/rtld-elf/rtld.c
  head/libexec/rtld-elf/rtld.h
  head/libexec/rtld-elf/sparc64/reloc.c
Is the issue solved after the commit? Then please close the PR. Currently it blocks 232311, which is already solved by this commit but cannot be closed until this issue is closed. ;)