Created attachment 162523 [details]
Sample object file with bogus .SUNW_ctf due to SHT_REL mishandling
When using dtrace on ARM and I tracked it down to what I think is an issue in libdwarf and how it is handling relocations.
From what I've been able to tell from reading various specifications and other documents, relocations of the DWARF sections are supposed to be handled as follows:
1. Given the following:
P is the address we want to do relocations on
S is the value of the symbol referenced in the relocation entry
A is the addendum in the relocation entry
2. For SHT_RELA type, we need to take S, add A, then store the resulting value at the location P
3. For SHT_REL type, we need to take the value at the location P, add S, then store the resulting value at the location P
From what I was able to read in the libdwarf code (contrib/elftoolchain/libdwarf/libdwarf_elf_init.c), it looks like SHT_RELA type is handled correctly. However, for SHT_REL type, there seems to be a missing step. It looks like it is just taking S and storing that at the location P, which ends up clobbering the value that was already at the location P.
This ends up being a disaster for ctfconvert, as you end up with very few to no actual symbols in that binary able to be used in probe statements.
See the attached arc4random.o file. This was built with the clang 3.5.1 for ARM. You should be able to use ctfdump on it to see the bogus .SUNW_ctf section data generated by ctfconvert as the code currently exists (with the flawed REL relocation handling.)
Created attachment 162532 [details]
arc4random.o built with Clang 3.7
Your analysis looks good - I suspect it works on Clang 3.6+ because the required relocations have 0 as the A value at P.
A commit references this bug:
Date: Fri Mar 11 16:24:39 UTC 2016
New revision: 296663
libdwarf: fix SHT_REL relocation processing
Relocation of type SHT_REL must use the current value as addend.
Obtained from: NetBSD libdwarf_elf_init.c v1.4
Proposed rework in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204084
HEAD has NetBSD's version of the fix, and a different version is now committed upstream as r3438: https://sourceforge.net/p/elftoolchain/code/3438/
I anticipate importing an updated ELF Tool Chain version before 11.0, and where possible merging libelf and libdwarf fixes to stable/10.
Steve, can you confirm the fix (either with HEAD or upstream ELF Tool Chain)?
Steve confirmed that this fixes the issue; need to investigate the merge to stable/10 still.
libdwarf in stable/10 is rather old compared to HEAD; it does not even support REL relocations and thus the bug in REL relocation handling does not exist there. This is fragile and toolchain changes in 10.x could turn the lack of REL relocation handling into a real problem on stable/10. Since the toolchain is unlikely to be changed there just leave things as they are.