w and uptime are broken on i386

gdb trace:

(gdb) r
Starting program: /usr/obj/usr/src/usr.bin/w/w.full
Program received signal SIGSEGV, Segmentation fault.
ifree (tsd=0x28000000) at arena.h:799
799 return (*mapbitsp);
Current language: auto; currently minimal
(gdb) where
#0 ifree (tsd=0x28000000) at arena.h:799
#1 0x28155316 in __free (ptr=0x280601ef) at tsd.h:716
#2 0x28095b07 in xo_do_emit_fields () at /usr/src/contrib/libxo/libxo/libxo.c:6419
#3 0x28093a1c in xo_do_emit (xop=<value optimized out>, flags=<value optimized out>, fmt=0x804ad4d "{:time-of-day/%s} ") at /usr/src/contrib/libxo/libxo/libxo.c:6470
#4 0x28093b61 in xo_emit (fmt=0x804ad4d "{:time-of-day/%s} ") at /usr/src/contrib/libxo/libxo/libxo.c:6541
#5 0x08049f50 in main (argc=<value optimized out>, argv=<value optimized out>) at /usr/src/usr.bin/w/w.c:475
(gdb)
From what I can see, it only dumps core on my i386 machine, not on my amd64 one.
Last Changed Rev: 331722
Last Changed Date: 2018-03-29 04:50:57 +0200 (Thu, 29 Mar 2018)
I have now also tried an 11.2-PRERELEASE memstick image: https://download.freebsd.org/ftp/snapshots/i386/i386/ISO-IMAGES/11.2/FreeBSD-11.2-PRERELEASE-i386-20180420-r332802-memstick.img.xz With that image, w and uptime are broken as well. Just FYI.
Nothing useful to add to diagnosis, just "bump" and "me too".
It's still there. Look:

Starting program: /usr/obj/usr/src/usr.bin/w/w.full
Program received signal SIGSEGV, Segmentation fault.
ifree (tsd=0x28000000) at arena.h:799
799 return (*mapbitsp);
Current language: auto; currently minimal
(gdb) bt
#0 ifree (tsd=0x28000000) at arena.h:799
#1 0x28155506 in __free (ptr=0x280601ef) at tsd.h:716
#2 0x28095b07 in xo_do_emit_fields() at /usr/src/contrib/libxo/libxo/libxo.c:6419
#3 0x28093a1c in xo_do_emit (xop=<value optimized out>, flags=<value optimized out>, fmt=0x804ad4d "{:time-of-day/%s} ") at /usr/src/contrib/libxo/libxo/libxo.c:6470
#4 0x28093b61 in xo_emit (fmt=0x804ad4d "{:time-of-day/%s} ") at /usr/src/contrib/libxo/libxo/libxo.c:6541
#5 0x08049f50 in main (argc=<value optimized out>, argv=<value optimized out>) at /usr/src/usr.bin/w/w.c:475

xo_do_emit_fields is from libxo, isn't it? And w.c has become quite complex in the meantime, not only because of libxo. So, what now?
Confirmed. Looks like the problem is in libxo. Either xo_default_handle is not properly zero-initialized or there is memory corruption.
(In reply to Oleksandr Tymoshenko from comment #7) Not in libxo itself, there were no recent changes there but the bug manifests itself in libxo
xo_default_handle is passed as an argument to xo_init_handle so I added a breakpoint and checked its content. Since it's static it's supposed to be zero-initialized but instead there are a lot of garbage values. xo_default_handle is thread-local variable so it might be a contributing factor. root@freebsd:/home/gonzo # /usr/local/bin/gdb w GNU gdb (GDB) 8.1 [GDB v8.1 for FreeBSD] Copyright (C) 2018 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i386-portbld-freebsd11.1". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from w...Reading symbols from /usr/lib/debug//usr/bin/w.debug...done. done. (gdb) break xo_init_handle Function "xo_init_handle" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 1 (xo_init_handle) pending. 
(gdb) run Starting program: /usr/bin/w Breakpoint 1, xo_init_handle (xop=0x2806aff0) at /usr/src/contrib/libxo/libxo/libxo.c:640 640 xop->xo_opaque = stdout; (gdb) p *xop $1 = {xo_flags = 0, xo_iflags = 2883994737386192896, xo_style = 45072, xo_indent = 10246, xo_indent_by = 41472, xo_write = 0x1, xo_close = 0x280601ef <__pthread_cleanup_push_imp_int+31>, xo_flush = 0x2806b020, xo_formatter = 0x2806a400, xo_checkpointer = 0x5d, xo_opaque = 0x280601ef <__pthread_cleanup_push_imp_int+31>, xo_data = {xb_bufp = 0x2806b030 "@\260\006(", xb_curp = 0x2806a600 "z\270P\325\001", xb_size = 161}, xo_fmt = {xb_bufp = 0x280601ef <__pthread_cleanup_push_imp_int+31> "\377\220\344\001", xb_curp = 0x2806b040 "", xb_size = 671524864}, xo_attrs = { xb_bufp = 0x147 <error: Cannot access memory at address 0x147>, xb_curp = 0x280601ef <__pthread_cleanup_push_imp_int+31> "\377\220\344\001", xb_size = 0}, xo_predicate = {xb_bufp = 0x2806aa00 "z\270P\325\001", xb_curp = 0x164 <error: Cannot access memory at address 0x164>, xb_size = 671482351}, xo_stack = 0x2806c000, xo_depth = 0, xo_stack_size = 671506032, xo_info = 0x280601ef <__pthread_cleanup_push_imp_int+31>, xo_info_count = 671527024, xo_vap = 0x2806ac00 "z\270P\325\001", xo_leading_xpath = 0x421 <error: Cannot access memory at address 0x421>, xo_mbstate = { __mbstate8 = "\357\001\006(\000\000\000\000\000\252\006(-\004\000\000\357\001\006(\000\000\000\000\000\252\006(\377\001\000\000\357\001\006(\240\260\006(\000\250\006(f\t\000\000\357\001\006(\000\000\000\000\000\252\006(s\t\000\000\357\001\006(\000\000\000\000\000\252\006(\030\n\000\000\357\001\006(\000\000\000\000\000\252\006(q\005\000\000\357\001\006(\340\260\006(\000\240\006(\000\000\000\000\357\001\006(\360\260\006(\000\242\006(\000\000\000", _mbstateL = 671482351}, xo_anchor_offset = 671482351, xo_anchor_columns = 671527168, xo_anchor_min_width = 671523840, xo_units_offset = 0, xo_columns = 671482351, xo_color_map_fg = "\020\261\006(\000\246\006(", xo_color_map_bg = 
"\000\000\000\357\001\006( \261", xo_colors = {xoc_effects = 6 '\006', xoc_col_fg = 40 '(', xoc_col_bg = 0 '\000'}, xo_color_buf = {xb_bufp = 0x0, xb_curp = 0x280601ef <__pthread_cleanup_push_imp_int+31> "\377\220\344\001", xb_size = 671527216}, xo_version = 0x2806aa00 "z\270P\325\001", xo_errno = 0, xo_gt_domain = 0x280601ef <__pthread_cleanup_push_imp_int+31> "\377\220\344\001", xo_encoder = 0x0, xo_private = 0x2806ac00} (gdb)
(In reply to Oleksandr Tymoshenko from comment #9) > xo_default_handle is passed as an argument to xo_init_handle so I added a > breakpoint and checked its content. Since it's static it's supposed to be > zero-initialized but instead there are a lot of garbage values. > xo_default_handle is thread-local variable so it might be a contributing > factor. ... > (gdb) p *xop > $1 = {xo_flags = 0, xo_iflags = 2883994737386192896, xo_style = 45072, > xo_indent = 10246, xo_indent_by = 41472, xo_write = 0x1, > xo_close = 0x280601ef <__pthread_cleanup_push_imp_int+31>, xo_flush = It definitely seems to have something to do with TLS. The libxo.so.0 file shipped in the FreeBSD-11.2-PRERELEASE-i386-20180420-r332802 snapshot has: Program Header: LOAD off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12 filesz 0x00017160 memsz 0x00017160 flags r-x LOAD off 0x00017160 vaddr 0x00018160 paddr 0x00018160 align 2**12 filesz 0x00000604 memsz 0x00000654 flags rw- DYNAMIC off 0x00017264 vaddr 0x00018264 paddr 0x00018264 align 2**2 filesz 0x000000d8 memsz 0x000000d8 flags rw- TLS off 0x00017160 vaddr 0x00018764 paddr 0x00018764 align 2**3 filesz 0x00000000 memsz 0x00000050 flags r-- STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2 filesz 0x00000000 memsz 0x00000000 flags rw- but if I install this snapshot onto a machine, check out stable/11 r332802 and rebuild lib/libxo, the resulting libxo.so.0 has: Program Header: LOAD off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12 filesz 0x00017160 memsz 0x00017160 flags r-x LOAD off 0x00017160 vaddr 0x00018160 paddr 0x00018160 align 2**12 filesz 0x00000604 memsz 0x00000654 flags rw- DYNAMIC off 0x00017264 vaddr 0x00018264 paddr 0x00018264 align 2**2 filesz 0x000000d8 memsz 0x000000d8 flags rw- TLS off 0x00017160 vaddr 0x00018160 paddr 0x00018160 align 2**3 filesz 0x00000000 memsz 0x00000658 flags r-- STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2 filesz 0x00000000 memsz 0x00000000 flags rw- 
I.e., the shipped version has a TLS section of just 0x50 bytes, while the recompiled version has 0x658 bytes. The recompiled version also works just fine with every test I throw at it. I don't know how the shipped versions are built, but I suspect there is something off there.
I'm now not so sure anymore about the TLS section being a problem. On a stable/11 i386 box with r332318 (as of 2018-04-09), I do *not* see crashes in w or uptime, even though the TLS section appears to be 0x50 bytes: $ ldd /usr/bin/uptime /usr/bin/uptime: libkvm.so.7 => /lib/libkvm.so.7 (0x28070000) libsbuf.so.6 => /lib/libsbuf.so.6 (0x2807d000) libxo.so.0 => /lib/libxo.so.0 (0x28080000) libutil.so.9 => /lib/libutil.so.9 (0x28099000) libc.so.7 => /lib/libc.so.7 (0x280ab000) libelf.so.2 => /lib/libelf.so.2 (0x2820a000) $ uptime 2:36PM up 21 mins, 1 user, load averages: 0.32, 0.26, 0.23 $ readelf -l /lib/libxo.so.0 | grep 'Type\|TLS' Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align TLS 0x017160 0x00018764 0x00018764 0x00000 0x00050 R 0x8 $ readelf -l /usr/obj/usr/src/lib/libxo/libxo.so.0.full | grep 'Type\|TLS' Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align TLS 0x017160 0x00018160 0x00018160 0x00000 0x00658 R 0x8 $ readelf -l /usr/obj/usr/src/lib/libxo/libxo.so.0 | grep 'Type\|TLS' Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align TLS 0x017160 0x00018160 0x00018160 0x00000 0x00654 R 0x8 So, libxo.so.0.full, the actual output of the link stage, has a TLS MemSize of 0x658 bytes, libxo.so.0, which is produced by: objcopy --strip-debug --add-gnu-debuglink=libxo.so.0.debug libxo.so.0.full libxo.so.0 has a TLS MemSize of 0x654 bytes, and the final version installed by installworld, and stripped during that time, has a TLS MemSize of 0x50 bytes. However, at this revision, r332318, it does not crash.
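The "readelf -l ... | grep TLS" step above boils down to scanning the program header table for the PT_TLS entry and reading its MemSiz. Here is a minimal sketch of that scan, assuming the standard 32-bit ELF program header layout. demo_phdr_t, demo_find_tls and demo_shipped_phdrs are hypothetical names made up for illustration (they are not libelf or readelf APIs), and the table simply restates the first four program headers quoted for the shipped libxo.so.0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 32-bit ELF program header field order. */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} demo_phdr_t;

#define DEMO_PT_LOAD    1
#define DEMO_PT_DYNAMIC 2
#define DEMO_PT_TLS     7

/* Program headers of the shipped libxo.so.0, copied from the dump above. */
const demo_phdr_t demo_shipped_phdrs[4] = {
    { DEMO_PT_LOAD,    0x00000, 0x00000, 0x00000, 0x17160, 0x17160, 5, 0x1000 },
    { DEMO_PT_LOAD,    0x17160, 0x18160, 0x18160, 0x00604, 0x00654, 6, 0x1000 },
    { DEMO_PT_DYNAMIC, 0x17264, 0x18264, 0x18264, 0x000d8, 0x000d8, 6, 0x4 },
    { DEMO_PT_TLS,     0x17160, 0x18764, 0x18764, 0x00000, 0x00050, 4, 0x8 },
};

/* Return the first PT_TLS entry among n headers, or NULL if there is none. */
const demo_phdr_t *
demo_find_tls(const demo_phdr_t *ph, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (ph[i].p_type == DEMO_PT_TLS)
            return (&ph[i]);
    return (NULL);
}
```

Calling demo_find_tls(demo_shipped_phdrs, 4) and reading p_memsz from the result gives the same 0x50 that readelf reports for the installed library.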
I bisected, and it turns out r331838 (the merge of clang 6.0.0 and follow-up fixes) is the first revision with those segfaults: # ulimit -c 0; for i in /jail/test-r*; do echo "Using jail: $i"; chroot $i /usr/bin/w; done Using jail: /jail/test-r331837 12:00PM up 13:47, 0 users, load averages: 0.23, 0.24, 0.60 USER TTY FROM LOGIN@ IDLE WHAT Using jail: /jail/test-r331838 Segmentation fault Since all of the jail in r331837 has been compiled with clang 5.0.1, and all of r331838 with clang 6.0.0, it is hard to say what is the exact cause. Interestingly, moving around the libraries used by w seems to influence the crash, at least for me. So for example: $ ldd /usr/bin/w /usr/bin/w: libkvm.so.7 => /lib/libkvm.so.7 (0x28070000) libsbuf.so.6 => /lib/libsbuf.so.6 (0x2807d000) libxo.so.0 => /lib/libxo.so.0 (0x28080000) libutil.so.9 => /lib/libutil.so.9 (0x28099000) libc.so.7 => /lib/libc.so.7 (0x280ab000) libelf.so.2 => /lib/libelf.so.2 (0x2820a000) $ /usr/bin/w 2:05PM up 13:53, 2 users, load averages: 2.31, 0.76, 0.66 USER TTY FROM LOGIN@ IDLE WHAT dim pts/2 coleburn.home.andric.com 2:02PM - w $ mkdir ~/foo $ cp /lib/libkvm.so.7 /lib/libsbuf.so.6 /lib/libxo.so.0 /lib/libutil.so.9 /lib/libc.so.7 /lib/libelf.so.2 ~/foo $ LD_LIBRARY_PATH=~/foo /usr/bin/w Segmentation fault (core dumped) Meaning, the exact same .so files, but in a different path, crash! Currently, I'm thinking that this may be something in the dynamic linker, but I'm still not sure.
Note that libc has a 16 byte aligned TLS section because of JEMALLOC_ALIGNED(16) in contrib/jemalloc/include/jemalloc/internal/tsd.h while the size of the TLS section is not a multiple of 16. I reported a problem with this when that was added. I suspect that rtld doesn't allocate enough extra bytes if it needs to realign the section causing overlap between sections, but I never investigated that and simply made the jemalloc struct 8 byte aligned.
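The padding concern above can be made concrete with the usual round-up arithmetic. This is only a sketch of the general technique, not code taken from rtld: demo_round_up and demo_next_tls_offset are hypothetical helpers, and it is an assumption that TLS blocks are stacked at increasing offsets with each end rounded up to the block's alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of two). */
size_t
demo_round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((size + align - 1) & ~(align - 1));
}

/*
 * Offset of the end of a block of `size` bytes stacked after prev_offset,
 * rounded up so that the block can honour `align`.
 */
size_t
demo_next_tls_offset(size_t prev_offset, size_t size, size_t align)
{
    return (demo_round_up(prev_offset + size, align));
}
```

For a sample block of 0x658 bytes, demo_round_up(0x658, 8) leaves the size unchanged, while demo_round_up(0x658, 16) needs 8 bytes of padding. An allocator that reserves only the unpadded size would come up short by exactly that amount, which is the kind of overlap suspected above.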
(In reply to Tijl Coosemans from comment #13) > Note that libc has a 16 byte aligned TLS section because of > JEMALLOC_ALIGNED(16) in contrib/jemalloc/include/jemalloc/internal/tsd.h > while the size of the TLS section is not a multiple of 16. I reported a > problem with this when that was added. I suspect that rtld doesn't allocate > enough extra bytes if it needs to realign the section causing overlap > between sections, but I never investigated that and simply made the jemalloc > struct 8 byte aligned. Were there any updates to rtld in head for this alignment stuff, that you recall?
(In reply to Dimitry Andric from comment #14) Not that I recall, but I just tried to reproduce the problem I had back then and everything seems fine now, so it's possible that it was fixed.
Not sure if this sheds light on the bug, but I tracked it down to revision r331838. With r331837, uptime and w both work fine. Something in the Clang/LLVM update?
(In reply to Paul Boehmer from comment #16) Derp, didn't notice comment 12. Apologies for the noise.
Adding my recent email to freebsd-arch@:

From: Phil Shafer <phil@juniper.net>
To: <freebsd-arch@freebsd.org>
Subject: initialization problem w/ thread-specific .tbss data on i386
Date: Mon, 07 May 2018 17:27:03 -0400

I have a problem reported with libxo-based applications running under FreeBSD-11-stable on i386 boxes that I think is related to rtld: When I breakpoint on main() and dump the contents of my uninitialized thread-specific variable, it has not been initialized to zeroes. I don't see this problem on 64-bit systems, only on i386 ones.

When I look at the rtld code, it appears to memset the .tbss to zero (/usr/src/libexec/rtld-elf/rtld.c:allocate_tls) in the non-arch-specific code so the arch shouldn't matter, but something is not working right.

So I'm looking for a helpful clue, such as how to debug rtld to see why this isn't being zeroed. I thought I'd use:

gdb /libexec/ld-elf.so.1
run /usr/bin/uptime

but this doesn't work for me (SEGV with a callstack that doesn't make sense).

For this instance, the workaround is to initialize the contents of xo_default_handle to zero so it's not in the .tbss, but I'd like to understand what's failing.

In truth, I just have a hard time blaming rtld, even though this issue is an obscure intersection of weird things (.tbss on i386). Perhaps it's something wrong with how the library is built or similar. But given that it's not zeroed when main() gets control, something's clearly broken.

Details follow:

I declare my variable as:

#define THREAD_LOCAL(_x) __thread _x
...
static THREAD_LOCAL(xo_handle_t) xo_default_handle; To help debug this issue, I made the following change to the sources to help with gdb's inability to show thread-local variables ("Cannot find thread-local variables on this target"): --- contrib/libxo/libxo/libxo.c.save 2018-05-04 17:26:29.079500000 -0400 +++ contrib/libxo/libxo/libxo.c 2018-05-04 17:28:06.570875000 -0400 @@ -8349,3 +8349,11 @@ xop->xo_style = XO_STYLE_ENCODER; xop->xo_encoder = encoder; } + +void xo_print_handle (void); +void +xo_print_handle (void) +{ + fprintf(stderr, "xo_default_handle: %p %d\n", + &xo_default_handle, sizeof(xo_handle_t)); +} When I run the failing command (uptime) under gdb and breakpoint on main, my thread-local variable is not set to zeroes: % gdb uptime GNU gdb 6.1.1 [FreeBSD] ... This GDB was configured as "i386-marcel-freebsd"... (gdb) b main Breakpoint 1 at 0x8049be5: file /usr/src/usr.bin/w/w.c, line 145. (gdb) run Starting program: /usr/home/phil/work/lib/uptime Breakpoint 1, main (argc=1, argv=0xbfbfe60c) at /usr/src/usr.bin/w/w.c:145 145 (void)setlocale(LC_ALL, ""); Current language: auto; currently minimal (gdb) call xo_print_handle() xo_default_handle: 0x2806aff0 328 $1 = 34 (gdb) x/82x 0x2806aff0 0x2806aff0: 0x00000000 0x00000000 0x00000000 0x280601ef 0x2806b000: 0x2806b010 0x2806a200 0x00000001 0x280601ef 0x2806b010: 0x2806b020 0x2806a400 0x0000005d 0x280601ef 0x2806b020: 0x2806b030 0x2806a600 0x000000a1 0x280601ef 0x2806b030: 0x2806b040 0x2806a800 0x00000147 0x280601ef 0x2806b040: 0x00000000 0x2806aa00 0x00000164 0x280601ef 0x2806b050: 0x2806c000 0x00000000 0x28065e70 0x280601ef 0x2806b060: 0x2806b070 0x2806ac00 0x00000421 0x280601ef 0x2806b070: 0x00000000 0x2806aa00 0x0000042d 0x280601ef 0x2806b080: 0x00000000 0x2806aa00 0x000001ff 0x280601ef 0x2806b090: 0x2806b0a0 0x2806a800 0x00000976 0x280601ef 0x2806b0a0: 0x00000000 0x2806aa00 0x00000983 0x280601ef 0x2806b0b0: 0x00000000 0x2806aa00 0x00000a18 0x280601ef 0x2806b0c0: 0x00000000 0x2806aa00 0x00000571 
0x280601ef 0x2806b0d0: 0x2806b0e0 0x2806a000 0x00000000 0x280601ef 0x2806b0e0: 0x2806b0f0 0x2806a200 0x00000000 0x280601ef 0x2806b0f0: 0x2806b100 0x2806a400 0x00000000 0x280601ef 0x2806b100: 0x2806b110 0x2806a600 0x00000000 0x280601ef 0x2806b110: 0x2806b120 0x2806a800 0x00000000 0x280601ef 0x2806b120: 0x2806b130 0x2806aa00 0x00000000 0x280601ef 0x2806b130: 0x00000000 0x2806ac00 (gdb) objdump shows the lib does have a .tbss: 14 .tbss 00000658 000181f8 000181f8 000171f8 2**3 ALLOC, THREAD_LOCAL Thanks, Phil
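The check done by hand in the email (dump the thread-local variable at the start of main() and see whether it is all zeroes) can also be done from inside the program. Here is a minimal sketch, assuming a compiler that accepts the __thread keyword used by the THREAD_LOCAL macro. demo_handle_t, demo_default_handle and demo_handle_is_zeroed are hypothetical stand-ins and not part of libxo.

```c
#include <stddef.h>

#define THREAD_LOCAL(_x) __thread _x

/* Hypothetical stand-in for xo_handle_t; the real structure is larger. */
typedef struct demo_handle_s {
    unsigned long dh_flags;
    void *dh_opaque;
    char dh_buf[64];
} demo_handle_t;

/* No initializer, so the object is expected to be zero-filled at startup. */
static THREAD_LOCAL(demo_handle_t) demo_default_handle;

/*
 * Return 1 if every byte of the calling thread's copy is zero, 0 otherwise.
 * Only meaningful if called before anything has written to the handle.
 */
int
demo_handle_is_zeroed(void)
{
    const unsigned char *p = (const unsigned char *)&demo_default_handle;
    size_t i;

    for (i = 0; i < sizeof(demo_default_handle); i++)
        if (p[i] != 0)
            return (0);
    return (1);
}
```

Calling demo_handle_is_zeroed() as the first statement of main() should return 1 on a healthy system. A return of 0 would correspond to the garbage seen in the x/82x dump above.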
The workaround is:

@@ -1376,8 +1380,8 @@
     xo_retain_entry_t *xr_bucket[RETAIN_HASH_SIZE];
 } xo_retain_t;
 
-static THREAD_LOCAL(xo_retain_t) xo_retain;
-static THREAD_LOCAL(unsigned) xo_retain_count;
+static THREAD_LOCAL(xo_retain_t) xo_retain = { 0 };
+static THREAD_LOCAL(unsigned) xo_retain_count = 0;
 
 /*
  * Simple hash function based on Thomas Wang's paper. The original is

Thanks,
 Phil
Hmm, now that we've identified .tbss as a contributor to the problem, it looks relevant that the r331838 version of libxo.so.0 (compiled with the clang 6.0.0 update) does NOT have a "section to segment mapping" for .tbss: ====================================================================== File: libxo.so.0.r331837 [...] Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x00000000 0x00000000 0x17224 0x17224 R E 0x1000 LOAD 0x017228 0x00018228 0x00018228 0x0074d 0x007a0 RW 0x1000 DYNAMIC 0x017324 0x00018324 0x00018324 0x000d8 0x000d8 RW 0x4 TLS 0x017228 0x00018228 0x00018228 0x00000 0x00658 R 0x8 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4 Section to Segment mapping: Segment Sections... 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .gnu_debuglink .shstrtab 01 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt 04 There are 28 section headers, starting at offset 0x17b4c: Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [...] [15] .tbss NOBITS 00018228 017228 000658 00 WAT 0 0 8 ====================================================================== File: /jail/test-r331838/lib/libxo.so.0.r331838 [...] Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x00000000 0x00000000 0x17160 0x17160 R E 0x1000 LOAD 0x017160 0x00018160 0x00018160 0x00604 0x00654 RW 0x1000 DYNAMIC 0x017264 0x00018264 0x00018264 0x000d8 0x000d8 RW 0x4 TLS 0x017160 0x00018764 0x00018764 0x00000 0x00050 R 0x8 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4 Section to Segment mapping: Segment Sections... 
00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .gnu_debuglink .shstrtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .bss 04 There are 28 section headers, starting at offset 0x1793c: Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [...] [15] .tbss NOBITS 00018160 017160 000658 00 WAT 0 0 8 In both r331837 and r331838 worlds, /usr/bin/ld is the GNU BFD ld, so it can't be caused by lld being updated from 5.0 to 6.0. It must be something in an object file that is being linked into libxo.so.0.
(In reply to Dimitry Andric from comment #20) > it > looks relevant that the r331838 version of libxo.so.0 (compiled with the > clang 6.0.0 update) does NOT have a "section to segment mapping" for .tbss Interestingly, with the non-stripped versions of libxo.so, this is not the case: ====================================================================== File: libxo.so.0.full.r331837 [...] Section to Segment mapping: Segment Sections... 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .debug_pubnames .debug_info .debug_abbrev .debug_line .debug_frame .debug_str .debug_loc .debug_macinfo .debug_pubtypes .debug_ranges .shstrtab .symtab .strtab 01 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt [...] Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [...] [15] .tbss NOBITS 00018228 017228 000658 00 WAT 0 0 8 ====================================================================== File: libxo.so.0.full.r331838 [...] Section to Segment mapping: Segment Sections... 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .debug_pubnames .debug_info .debug_abbrev .debug_line .debug_frame .debug_str .debug_loc .debug_macinfo .debug_pubtypes .debug_ranges .shstrtab .symtab .strtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss [...] Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [...] [15] .tbss NOBITS 00018160 017160 000658 00 WAT 0 0 8 So in case of r331837, segments 01 *and* 03 have a .tbss mapping, but in case of r331838, only segment 03 has it. 
And after stripping, the r331838 version even misses the .tbss mappings completely: ====================================================================== File: libxo.so.0.r331838 [...] Section to Segment mapping: Segment Sections... 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .gnu_debuglink .shstrtab .symtab .strtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss It seems elftoolchain strip completely eradicates the mapping, for some reason?
I can confirm that "uptime" functions correctly with the non-stripped libxo:

% ln -s /usr/obj/usr/src/usr.bin/w/w.full /tmp/uptime
% /tmp/uptime
Segmentation fault (core dumped)
% env LD_LIBRARY_PATH=/usr/obj/usr/src/lib/libxo/ ldd /tmp/uptime
/tmp/uptime:
 libkvm.so.7 => /lib/libkvm.so.7 (0x28070000)
 libsbuf.so.6 => /lib/libsbuf.so.6 (0x2807d000)
 libxo.so.0 => /usr/obj/usr/src/lib/libxo//libxo.so.0 (0x28080000)
 libutil.so.9 => /lib/libutil.so.9 (0x28099000)
 libc.so.7 => /lib/libc.so.7 (0x280ab000)
 libelf.so.2 => /lib/libelf.so.2 (0x2820a000)
% env LD_LIBRARY_PATH=/usr/obj/usr/src/lib/libxo/ /tmp/uptime
4:37PM up 4 days, 8:32, 3 users, load averages: 0.69, 0.65, 0.54

Thanks,
 Phil
Another interesting data point, not that I'm sure what it means:

% env LD_PRELOAD=/usr/lib/libpthread.so /tmp/uptime
5:26PM up 4 days, 9:22, 3 users, load averages: 0.55, 0.52, 0.51

(where /tmp/uptime is a symlink to /usr/obj/.../w/w).

(see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227552)

Does this mean that the use of __thread requires -lpthread? My understanding was that the startup code handled thread-specific data for the main thread of execution.

Thanks,
 Phil
I'm looking into why readelf output differs between the stripped and unstripped versions of the library, per comment #20. readelf.c:2381 has the following code: 2371 printf("\n Section to Segment mapping:\n"); 2372 printf(" Segment Sections...\n"); 2373 for (i = 0; (size_t)i < phnum; i++) { 2374 if (gelf_getphdr(re->elf, i, &phdr) != &phdr) { 2375 warnx("gelf_getphdr failed: %s", elf_errmsg(-1)); 2376 continue; 2377 } 2378 printf(" %2.2d ", i); 2379 /* skip NULL section. */ 2380 for (j = 1; (size_t)j < re->shnum; j++) 2381 if (re->sl[j].addr >= phdr.p_vaddr && 2382 re->sl[j].addr + re->sl[j].sz <= 2383 phdr.p_vaddr + phdr.p_memsz) 2384 printf("%s ", re->sl[j].name); 2385 printf("\n"); For the unstripped library, the output is: Section to Segment mapping: Segment Sections... 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .debug_pubnames .debug_info .debug_abbrev .debug_line .debug_frame .debug_str .debug_loc .debug_macinfo .debug_pubtypes .debug_ranges .shstrtab .symtab .strtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 04 where the stripped library says: Section to Segment mapping: Segment Sections... 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .shstrtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic 03 .bss 04 So I breakpointed on line 2381 when i == 3 and j == 15. 
For the unstripped library (the working one):

(gdb) p re->sl[j]
$18 = {name = 0x28626087 ".tbss", scn = 0x28621780, off = 94712, sz = 1624, entsize = 0, align = 8, type = 8, flags = 1027, addr = 98808, link = 0, info = 0}
(gdb) p phdr
$19 = {p_type = 7, p_flags = 4, p_offset = 94712, p_vaddr = 98808, p_paddr = 98808, p_filesz = 0, p_memsz = 1624, p_align = 8}
(gdb) p (re->sl[j].addr >= phdr.p_vaddr)
$20 = 1
(gdb) p (re->sl[j].addr + re->sl[j].sz <= phdr.p_vaddr + phdr.p_memsz)
$21 = 1

Both conditions are true. For the stripped library (the failing one):

(gdb) p re->sl[j]
$13 = {name = 0x28621077 ".tbss", scn = 0x2861d780, off = 94712, sz = 1624, entsize = 0, align = 8, type = 8, flags = 1027, addr = 98808, link = 0, info = 0}
(gdb) p phdr
$15 = {p_type = 7, p_flags = 4, p_offset = 94712, p_vaddr = 100340, p_paddr = 100340, p_filesz = 0, p_memsz = 80, p_align = 8}
(gdb) p (re->sl[j].addr >= phdr.p_vaddr)
$14 = 0

The section's address (98808) is less than the segment's (100340), so the section is no longer listed. Perhaps strip is not updating the addresses as it removes sections? Or is there a disagreement between clang-6 and binutils about ELF layout?

Thanks,
 Phil
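The containment test from the quoted readelf loop can be replayed outside gdb with the numbers printed above. This is a small sketch. demo_section_in_segment is a hypothetical helper that restates the two comparisons from readelf.c, and the arguments in the usage note are the decimal values from the gdb sessions.

```c
#include <stdint.h>

/*
 * Same test as the quoted readelf loop: a section is listed under a
 * segment only if it starts at or after p_vaddr and ends at or before
 * p_vaddr + p_memsz.
 */
int
demo_section_in_segment(uint64_t sec_addr, uint64_t sec_size,
    uint64_t p_vaddr, uint64_t p_memsz)
{
    return (sec_addr >= p_vaddr &&
        sec_addr + sec_size <= p_vaddr + p_memsz);
}
```

With the unstripped values, demo_section_in_segment(98808, 1624, 98808, 1624) is true, so .tbss is listed under the TLS segment. With the stripped program header, demo_section_in_segment(98808, 1624, 100340, 80) is false, which matches .tbss disappearing from segment 03 in the readelf output.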
FWIW, here's the diff between unstripped and stripped readelf output: @@ -10,14 +10,14 @@ Version: 0x1 Entry point address: 0x2e60 Start of program headers: 52 (bytes into file) - Start of section headers: 287684 (bytes into file) + Start of section headers: 96676 (bytes into file) Flags: 0 Size of this header: 52 (bytes) Size of program headers: 32 (bytes) Number of program headers: 5 Size of section headers: 40 (bytes) - Number of section headers: 39 - Section header string table index: 36 + Number of section headers: 27 + Section header string table index: 26 Elf file type is DYN (Shared object file) Entry point 0x2e60 @@ -28,17 +28,17 @@ LOAD 0x000000 0x00000000 0x00000000 0x171f8 0x171f8 R E 0x1000 LOAD 0x0171f8 0x000181f8 0x000181f8 0x005fc 0x0064c RW 0x1000 DYNAMIC 0x0172f4 0x000182f4 0x000182f4 0x000d8 0x000d8 RW 0x4 - TLS 0x0171f8 0x000181f8 0x000181f8 0x00000 0x00658 R 0x8 + TLS 0x0171f8 0x000187f4 0x000187f4 0x00000 0x00050 R 0x8 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4 Section to Segment mapping: Segment Sections... 
- 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .debug_pubnames .debug_info .debug_abbrev .debug_line .debug_frame .debug_str .debug_loc .debug_macinfo .debug_pubtypes .debug_ranges .shstrtab .symtab .strtab + 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .shstrtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic - 03 .tbss .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss + 03 .bss 04 -There are 39 section headers, starting at offset 0x463c4: +There are 27 section headers, starting at offset 0x179a4: Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al @@ -49,8 +49,8 @@ [ 4] .dynstr STRTAB 00001648 001648 0009c2 00 A 0 0 1 [ 5] .gnu.version SUNW_versym 0000200a 00200a 000188 02 A 3 0 2 [ 6] .gnu.version_r SUNW_verneed 00002194 002194 000030 00 A 4 1 4 - [ 7] .rel.dyn REL 000021c4 0021c4 000438 08 A 3 0 4 - [ 8] .rel.plt REL 000025fc 0025fc 0002c0 08 A 3 10 4 + [ 7] .rel.dyn REL 000021c4 0021c4 000438 08 AI 3 0 4 + [ 8] .rel.plt REL 000025fc 0025fc 0002c0 08 AI 3 10 4 [ 9] .init PROGBITS 000028bc 0028bc 000011 00 AX 0 0 4 [10] .plt PROGBITS 000028d0 0028d0 000590 04 AX 0 0 4 [11] .text PROGBITS 00002e60 002e60 0129c0 00 AX 0 0 16 @@ -68,19 +68,7 @@ [23] .data PROGBITS 00018578 017578 00027c 00 WA 0 0 4 [24] .bss NOBITS 000187f4 0177f4 000050 00 WA 0 0 4 [25] .comment PROGBITS 00000000 0177f4 0000e6 01 MS 0 0 1 - [26] .debug_pubnames PROGBITS 00000000 0178da 0018dc 00 0 0 1 - [27] .debug_info PROGBITS 00000000 0191b6 00e557 00 0 0 1 - [28] .debug_abbrev PROGBITS 00000000 02770d 000951 00 0 0 1 - [29] .debug_line PROGBITS 00000000 02805e 00bb0b 00 0 0 1 - [30] .debug_frame PROGBITS 00000000 033b6c 001498 00 0 0 4 - [31] .debug_str PROGBITS 00000000 035004 0023d9 01 MS 0 0 1 - [32] .debug_loc PROGBITS 00000000 0373dd 00ce49 00 0 0 1 - [33] 
.debug_macinfo PROGBITS 00000000 044226 000003 00 0 0 1 - [34] .debug_pubtypes PROGBITS 00000000 044229 0009b5 00 0 0 1 - [35] .debug_ranges PROGBITS 00000000 044bde 001688 00 0 0 1 - [36] .shstrtab STRTAB 00000000 046266 00015e 00 0 0 1 - [37] .symtab SYMTAB 00000000 0469dc 000eb0 10 38 40 4 - [38] .strtab STRTAB 00000000 04788c 000c9d 00 0 0 1 + [26] .shstrtab STRTAB 00000000 0178da 0000c8 00 0 0 1 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings) I (info), L (link order), G (group), x (unknown) Of particular interest is the TLS line, which changes from: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align ... TLS 0x0171f8 0x000181f8 0x000181f8 0x00000 0x00658 R 0x8 to: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align ... TLS 0x0171f8 0x000187f4 0x000187f4 0x00000 0x00050 R 0x8 The size changes from 0x658 to 0x50 and given that the xo_default_handle is 328 bytes (0x148), this is likely broken. Not that this explains the starting address being below segment #3's starting address...... Thanks, Phil
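Two things can be read directly off the numbers in this diff. First, the stripped TLS segment is too small for a 328-byte xo_default_handle. Second, its new address and size (0x187f4, 0x50) are exactly those of the .bss section in the section header list. The sketch below only restates that arithmetic. demo_range_t and the helper names are hypothetical, and "can hold" is a necessary condition only, since other thread-local variables share the same segment.

```c
#include <stdint.h>

/* An address range: start address plus size in bytes. */
typedef struct {
    uint32_t addr;
    uint32_t size;
} demo_range_t;

/* Values copied from the readelf diff above. */
const demo_range_t demo_tls_before = { 0x181f8, 0x658 }; /* TLS, unstripped */
const demo_range_t demo_tls_after  = { 0x187f4, 0x050 }; /* TLS, stripped */
const demo_range_t demo_bss        = { 0x187f4, 0x050 }; /* [24] .bss */

/* Return 1 if an object of obj_size bytes could fit in the range at all. */
int
demo_can_hold(const demo_range_t *seg, uint32_t obj_size)
{
    return (obj_size <= seg->size);
}

/* Return 1 if both ranges have the same start address and size. */
int
demo_same_range(const demo_range_t *a, const demo_range_t *b)
{
    return (a->addr == b->addr && a->size == b->size);
}
```

Here demo_can_hold(&demo_tls_after, 328) is false while demo_can_hold(&demo_tls_before, 328) is true, and demo_same_range(&demo_tls_after, &demo_bss) is true. That is consistent with strip rewriting the TLS program header from .bss rather than from .tbss, though the numbers alone do not prove that this is what strip does.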
Looks to be a "strip" issue: Jimi [lib/test]% mkdir works fails Jimi [lib/test]% install -s /usr/obj/usr/src/lib/libxo/libxo.so.0.full works/libxo.so.0 Jimi [lib/test]% install -s /usr/obj/usr/src/lib/libxo/libxo.so.0.full fails/libxo.so.0 Jimi [lib/test]% ll */*0 -rwxr-xr-x 1 phil phil 97756 May 11 16:43 fails/libxo.so.0* -rwxr-xr-x 1 phil phil 97756 May 11 16:43 works/libxo.so.0* Jimi [lib/test]% env LD_LIBRARY_PATH=works /tmp/uptime 4:45PM up 7 days, 8:40, 3 users, load averages: 0.55, 0.45, 0.43 Jimi [lib/test]% env LD_LIBRARY_PATH=fails /tmp/uptime 4:45PM up 7 days, 8:40, 3 users, load averages: 0.51, 0.44, 0.43 Jimi [lib/test]% strip fails/libxo.so.0 Jimi [lib/test]% env LD_LIBRARY_PATH=fails /tmp/uptime Segmentation fault (core dumped) Jimi [lib/test]% readelf -e works/libxo.so.0 > works/out Jimi [lib/test]% readelf -e fails/libxo.so.0 > fails/out Jimi [lib/test]% diff -u works/out fails/out --- works/out 2018-05-11 16:45:46.660037000 -0400 +++ fails/out 2018-05-11 16:45:56.004434000 -0400 @@ -28,7 +28,7 @@ LOAD 0x000000 0x00000000 0x00000000 0x171f8 0x171f8 R E 0x1000 LOAD 0x0171f8 0x000181f8 0x000181f8 0x005fc 0x0064c RW 0x1000 DYNAMIC 0x0172f4 0x000182f4 0x000182f4 0x000d8 0x000d8 RW 0x4 - TLS 0x0171f8 0x000181f8 0x000181f8 0x00000 0x0064c R 0x8 + TLS 0x0171f8 0x000187f4 0x000187f4 0x00000 0x00050 R 0x8 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4 Section to Segment mapping: @@ -36,7 +36,7 @@ 00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .shstrtab 01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 02 .dynamic - 03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss + 03 .bss 04 There are 27 section headers, starting at offset 0x179a4: Jimi [lib/test]% which strip /usr/bin/strip Jimi [lib/test]% So "strip" (but not "install -s"?) doctors the TLS header, reducing the length and causing TLS bss data to be uninitialized. 
Both versions have the .tbss section removed from the "Section to Segment" mapping.

Thanks,
 Phil
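The before/after comparison done by hand with `diff -u` above can also be scripted. The following is a minimal sketch that pulls the TLS row out of `readelf -l` style text; it assumes the row layout shown in this thread (a type name followed by five hex columns), and the names `ProgramHeader`, `parse_program_headers` and `tls_header` are hypothetical helpers, not part of any tool.

```python
from typing import List, NamedTuple, Optional


class ProgramHeader(NamedTuple):
    type: str
    offset: int
    vaddr: int
    paddr: int
    filesz: int
    memsz: int


def parse_program_headers(readelf_text: str) -> List[ProgramHeader]:
    """Collect program header rows from `readelf -l` style text output."""
    headers = []
    for line in readelf_text.splitlines():
        fields = line.split()
        # A row is a type name (LOAD, TLS, ...) followed by five hex numbers.
        if len(fields) < 6 or not fields[0].isupper():
            continue
        try:
            numbers = [int(field, 16) for field in fields[1:6]]
        except ValueError:
            continue  # not a program header row
        headers.append(ProgramHeader(fields[0], *numbers))
    return headers


def tls_header(readelf_text: str) -> Optional[ProgramHeader]:
    """Return the first TLS program header row, or None if there is none."""
    for header in parse_program_headers(readelf_text):
        if header.type == "TLS":
            return header
    return None
```

Running `tls_header()` on the saved `works/out` and `fails/out` files and comparing the `vaddr` and `memsz` fields would show the same change as the diff above.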
Even more odd, running "strip" twice on the same target gives the same TLS length change:

Jimi [lib/test]% strip -o mine /usr/obj/usr/src/lib/libxo/libxo.so.0.full
Jimi [lib/test]% readelf -e mine > before.elf
Jimi [lib/test]% strip mine
Jimi [lib/test]% readelf -e mine > after.elf
Jimi [lib/test]% diff -u before.elf after.elf
--- before.elf 2018-05-11 16:56:33.492235000 -0400
+++ after.elf 2018-05-11 16:56:40.876225000 -0400
@@ -28,7 +28,7 @@
   LOAD      0x000000 0x00000000 0x00000000 0x171f8 0x171f8 R E 0x1000
   LOAD      0x0171f8 0x000181f8 0x000181f8 0x005fc 0x0064c RW  0x1000
   DYNAMIC   0x0172f4 0x000182f4 0x000182f4 0x000d8 0x000d8 RW  0x4
-  TLS       0x0171f8 0x000181f8 0x000181f8 0x00000 0x0064c R   0x8
+  TLS       0x0171f8 0x000187f4 0x000187f4 0x00000 0x00050 R   0x8
   GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  Section to Segment mapping:
@@ -36,7 +36,7 @@
   00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .shstrtab
   01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
   02 .dynamic
-  03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
+  03 .bss
   04
 There are 27 section headers, starting at offset 0x179a4:

I see the same issue when "strip" is used twice in a row (both with "-o"):

Jimi [lib/test]% strip -o mine /usr/obj/usr/src/lib/libxo/libxo.so.0.full
Jimi [lib/test]% readelf -e mine > before.elf
Jimi [lib/test]% strip -o never mine
Jimi [lib/test]% readelf -e never > after.elf
Jimi [lib/test]% diff -u before.elf after.elf
--- before.elf 2018-05-11 16:58:07.845980000 -0400
+++ after.elf 2018-05-11 16:58:44.398731000 -0400
@@ -28,7 +28,7 @@
   LOAD      0x000000 0x00000000 0x00000000 0x171f8 0x171f8 R E 0x1000
   LOAD      0x0171f8 0x000181f8 0x000181f8 0x005fc 0x0064c RW  0x1000
   DYNAMIC   0x0172f4 0x000182f4 0x000182f4 0x000d8 0x000d8 RW  0x4
-  TLS       0x0171f8 0x000181f8 0x000181f8 0x00000 0x0064c R   0x8
+  TLS       0x0171f8 0x000187f4 0x000187f4 0x00000 0x00050 R   0x8
   GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  Section to Segment mapping:
@@ -36,7 +36,7 @@
   00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .shstrtab
   01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
   02 .dynamic
-  03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
+  03 .bss
   04
 There are 27 section headers, starting at offset 0x179a4:
----------

Off to look at strip..... Please holler if this sounds familiar....

Thanks,
 Phil
Created attachment 193345 [details] Fix from Kai Wang (kaiw@) for elfcopy

Many thanks to Kai Wang (kaiw@) for this fix!

Thanks,
 Phil
Created attachment 193346 [details] Fix from Kai Wang (kaiw@) for readelf Companion fix from Kai for readelf. Thanks, Phil
Building head now; will commit fix tomorrow. Thanks, Phil
A commit references this bug:

Author: phil
Date: Mon May 14 05:21:19 UTC 2018
New revision: 333600
URL: https://svnweb.freebsd.org/changeset/base/333600

Log:
  Handle thread-local storage (TLS) segments correctly when copying (objcopy) and displaying (readelf) them.

  PR: 227552
  Submitted by: kaiw (maintainer)
  Reported by: jachmann@unitix.org
  Reviewed by: phil
  MFC after: 1 day

Changes:
  head/contrib/elftoolchain/elfcopy/elfcopy.h
  head/contrib/elftoolchain/elfcopy/sections.c
  head/contrib/elftoolchain/elfcopy/segments.c
  head/contrib/elftoolchain/readelf/readelf.c
Fix from kaiw@ is in head at r333600. Building 11-stable now; will MFC tomorrow. Thanks, Phil
I'm still seeing this bug in 11-STABLE with i386, even after these patches.

root@fbsd-stable-vm:/usr/src/tools/tools/nanobsd # /usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime
Segmentation fault (core dumped)

(lldb) target create "/usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime"
Current executable set to '/usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime' (i386).
(lldb) r
Process 51608 launching
Process 51608 launched: '/usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime' (i386)
Process 51608 stopped
* thread #1, name = 'uptime', stop reason = signal SIGBUS: hardware error
    frame #0: 0xffffffff
error: Bad address
(lldb) bt
* thread #1, name = 'uptime', stop reason = signal SIGBUS: hardware error
  * frame #0: 0xfffffff

(compiling i386 nanobsd images on amd64).
(In reply to Adam Stylinski from comment #33)
> I'm still seeing this bug in 11-STABLE with i386, even after these patches.
>
> root@fbsd-stable-vm:/usr/src/tools/tools/nanobsd #
> /usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime
> Segmentation fault (core dumped)

It's probably loading a bad copy of libxo.so.0, from /lib. What is the output of:

ldd /usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime

and for the libxo.so.0 file listed there, show the output of:

readelf -lW <path from ldd output above>/libxo.so.0
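The TLS fields that `readelf -lW` prints can also be read straight from the file, which helps on a machine where readelf itself may be the unfixed version. The sketch below handles only 32-bit little-endian ELF (the i386 case in this report) and follows the standard ELF32 header and program header layouts; `find_tls_segment` is a hypothetical helper name.

```python
import struct
from typing import Optional, Tuple

PT_TLS = 7  # program header type for thread-local storage in the ELF spec

# ELF32 file header (52 bytes) and program header (32 bytes), little-endian.
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
ELF32_PHDR = struct.Struct("<IIIIIIII")


def find_tls_segment(data: bytes) -> Optional[Tuple[int, int, int]]:
    """Return (vaddr, filesz, memsz) of the PT_TLS header, or None if absent."""
    if len(data) < ELF32_EHDR.size:
        raise ValueError("file too short for an ELF32 header")
    fields = ELF32_EHDR.unpack_from(data, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        (p_type, _p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_flags, _p_align) = ELF32_PHDR.unpack_from(data, offset)
        if p_type == PT_TLS:
            return (p_vaddr, p_filesz, p_memsz)
    return None
```

A typical call would be `find_tls_segment(open(path, "rb").read())`, with `path` set to the libxo.so.0 location reported by ldd.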
(In reply to Dimitry Andric from comment #34)

root@fbsd-stable-vm:/usr/src/tools/tools/nanobsd # ldd /usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime
/usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime:
        libkvm.so.7 => /usr/lib32/libkvm.so.7 (0x28071000)
        libsbuf.so.6 => /usr/lib32/libsbuf.so.6 (0x2807e000)
        libxo.so.0 => /usr/lib32/libxo.so.0 (0x28081000)
        libutil.so.9 => /usr/lib32/libutil.so.9 (0x2809a000)
        libc.so.7 => /usr/lib32/libc.so.7 (0x280ac000)
        libelf.so.2 => /usr/lib32/libelf.so.2 (0x28213000)

root@fbsd-stable-vm:/usr/src/tools/tools/nanobsd # readelf -lW /usr/lib32/libxo.so.0

Elf file type is DYN (Shared object file)
Entry point 0x2e40
There are 5 program headers, starting at offset 52

Program Headers:
  Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD      0x000000 0x00000000 0x00000000 0x171cc 0x171cc R E 0x1000
  LOAD      0x0171d0 0x000181d0 0x000181d0 0x00604 0x00654 RW  0x1000
  DYNAMIC   0x0172d4 0x000182d4 0x000182d4 0x000d8 0x000d8 RW  0x4
  TLS       0x0171d0 0x000187d4 0x000187d4 0x00000 0x00050 R   0x8
  GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .gnu_debuglink .shstrtab
   01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
   02 .dynamic
   03 .bss
   04

And on the hardware in question:

root@wap1:~ # ldd `which uptime`
/usr/bin/uptime:
        libkvm.so.7 => /lib/libkvm.so.7 (0x28070000)
        libsbuf.so.6 => /lib/libsbuf.so.6 (0x2807d000)
        libxo.so.0 => /lib/libxo.so.0 (0x28080000)
        libutil.so.9 => /lib/libutil.so.9 (0x28099000)
        libc.so.7 => /lib/libc.so.7 (0x280ab000)
        libelf.so.2 => /lib/libelf.so.2 (0x2820a000)

root@wap1:~ # readelf -lW /lib/libxo.so.0

Elf file type is DYN (Shared object file)
Entry point 0x2e40
There are 5 program headers, starting at offset 52

Program Headers:
  Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD      0x000000 0x00000000 0x00000000 0x17160 0x17160 R E 0x1000
  LOAD      0x017160 0x00018160 0x00018160 0x00604 0x00654 RW  0x1000
  DYNAMIC   0x017264 0x00018264 0x00018264 0x000d8 0x000d8 RW  0x4
  TLS       0x017160 0x00018764 0x00018764 0x00000 0x00050 R   0x8
  GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame .comment .gnu_debuglink .shstrtab
   01 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
   02 .dynamic
   03 .bss
   04

root@wap1:~ # uptime
Segmentation fault
(In reply to Adam Stylinski from comment #35)
> (In reply to Dimitry Andric from comment #34)
>
> root@fbsd-stable-vm:/usr/src/tools/tools/nanobsd # ldd
> /usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime
> /usr/obj/nanobsd.ALIX/_.w/usr/bin/uptime:
>         libkvm.so.7 => /usr/lib32/libkvm.so.7 (0x28071000)
>         libsbuf.so.6 => /usr/lib32/libsbuf.so.6 (0x2807e000)
>         libxo.so.0 => /usr/lib32/libxo.so.0 (0x28081000)
>         libutil.so.9 => /usr/lib32/libutil.so.9 (0x2809a000)
>         libc.so.7 => /usr/lib32/libc.so.7 (0x280ac000)
>         libelf.so.2 => /usr/lib32/libelf.so.2 (0x28213000)

Hmm, this is weird, it should not link to 32-bit libraries, unless uptime itself is a 32-bit executable? Are you doing a cross-build here?

> root@fbsd-stable-vm:/usr/src/tools/tools/nanobsd # readelf -lW
> /usr/lib32/libxo.so.0
>
> Elf file type is DYN (Shared object file)
> Entry point 0x2e40
> There are 5 program headers, starting at offset 52
>
> Program Headers:
>   Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>   LOAD      0x000000 0x00000000 0x00000000 0x171cc 0x171cc R E 0x1000
>   LOAD      0x0171d0 0x000181d0 0x000181d0 0x00604 0x00654 RW  0x1000
>   DYNAMIC   0x0172d4 0x000182d4 0x000182d4 0x000d8 0x000d8 RW  0x4
>   TLS       0x0171d0 0x000187d4 0x000187d4 0x00000 0x00050 R   0x8

Yeah, this is definitely a messed up TLS section, produced by the buggy version of strip.

> And on the hardware in question:
>
> root@wap1:~ # ldd `which uptime`
> /usr/bin/uptime:
>         libkvm.so.7 => /lib/libkvm.so.7 (0x28070000)
>         libsbuf.so.6 => /lib/libsbuf.so.6 (0x2807d000)
>         libxo.so.0 => /lib/libxo.so.0 (0x28080000)
>         libutil.so.9 => /lib/libutil.so.9 (0x28099000)
>         libc.so.7 => /lib/libc.so.7 (0x280ab000)
>         libelf.so.2 => /lib/libelf.so.2 (0x2820a000)

This looks more normal...

> root@wap1:~ # readelf -lW /lib/libxo.so.0
>
> Elf file type is DYN (Shared object file)
> Entry point 0x2e40
> There are 5 program headers, starting at offset 52
>
> Program Headers:
>   Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>   LOAD      0x000000 0x00000000 0x00000000 0x17160 0x17160 R E 0x1000
>   LOAD      0x017160 0x00018160 0x00018160 0x00604 0x00654 RW  0x1000
>   DYNAMIC   0x017264 0x00018264 0x00018264 0x000d8 0x000d8 RW  0x4
>   TLS       0x017160 0x00018764 0x00018764 0x00000 0x00050 R   0x8

But that is still messed up. For some reason, it still used the buggy version of strip.
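In every affected readelf output quoted in this thread, the TLS header starts exactly where the file-backed part of the RW LOAD segment ends, and its MemSiz equals that segment's zero-filled tail (the .bss). A small sketch of that comparison follows; it is a heuristic drawn only from the numbers posted here, not an established test for the strip bug, and `Segment` and `tls_matches_bss_tail` are hypothetical names.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    vaddr: int
    filesz: int
    memsz: int


def tls_matches_bss_tail(load: Segment, tls: Segment) -> bool:
    """Return True if the TLS header covers exactly the zero-filled tail of the LOAD segment."""
    bss_start = load.vaddr + load.filesz   # first address past the file-backed data
    bss_size = load.memsz - load.filesz    # bytes that exist only in memory
    return tls.vaddr == bss_start and tls.memsz == bss_size


if __name__ == "__main__":
    # RW LOAD and TLS rows from /lib/libxo.so.0 on the affected hardware.
    load = Segment(vaddr=0x18160, filesz=0x604, memsz=0x654)
    tls = Segment(vaddr=0x18764, filesz=0x0, memsz=0x50)
    print("TLS coincides with .bss tail:", tls_matches_bss_tail(load, tls))
```

The same relation holds for the /usr/lib32 copy and for the stripped library in the earlier diff, and it does not hold for the TLS row shown before stripping.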
(In reply to Dimitry Andric from comment #36) The build machine in question is on amd64 and is using nanobsd scripts to build images for i386. The install from which this build is happening is on 11-stable, updated as of yesterday afternoon (with make delete-old and make delete-old-libs run). The same source tree (updated via svnup) is used to build the nanobsd image.
A commit references this bug:

Author: marius
Date: Thu May 17 21:49:35 UTC 2018
New revision: 333770
URL: https://svnweb.freebsd.org/changeset/base/333770

Log:
  MFC: r333600 (phil)

  Handle thread-local storage (TLS) segments correctly when copying (objcopy) and displaying (readelf) them.

  PR: 227552
  Submitted by: kaiw (maintainer)
  Approved by: re (gjb)

Changes:
_U  stable/11/
  stable/11/contrib/elftoolchain/elfcopy/elfcopy.h
  stable/11/contrib/elftoolchain/elfcopy/sections.c
  stable/11/contrib/elftoolchain/elfcopy/segments.c
  stable/11/contrib/elftoolchain/readelf/readelf.c
(In reply to commit-hook from comment #38) Ahh, had this not been MFC'd yet? I thought I saw it had in the web svn frontend but maybe I was mistakenly browsing /base.
The fix is not MFC'd to stable/11 yet. I was waiting for the 3-day minimum MFC delay, but am away on business. I'll try to get it in tomorrow. If not, it will be MFC'd next week.

Thanks,
 Phil
(In reply to Phil Shafer from comment #40) It looks like kaiw did it already.
Fix implemented in head r333600, merged to stable/11 in r333770, and also available in 11.2-BETA2.