Bug 204084 - libdwarf does not handle SHT_REL relocations properly
Summary: libdwarf does not handle SHT_REL relocations properly
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: standards (show other bugs)
Version: CURRENT
Hardware: arm Any
: --- Affects Only Me
Assignee: Ed Maste
Depends on:
Reported: 2015-10-28 06:17 UTC by Steve Kiernan
Modified: 2016-06-13 14:49 UTC (History)
1 user (show)

See Also:
emaste: mfc-stable10-

Sample object file with bogus .SUNW_ctf due to SHT_REL mishandling (12.11 KB, application/x-object)
2015-10-28 06:17 UTC, Steve Kiernan
no flags Details
arc4random.o built with Clang 3.7 (17.38 KB, application/x-object)
2015-10-28 13:11 UTC, Ed Maste
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Steve Kiernan 2015-10-28 06:17:31 UTC
Created attachment 162523 [details]
Sample object file with bogus .SUNW_ctf due to SHT_REL mishandling

When using dtrace on ARM and I tracked it down to what I think is an issue in libdwarf and how it is handling relocations.

From what I've been able to tell from reading various specifications and other documents, relocations of the DWARF sections are supposed to be handled as follows:
1. Given the following:
   P is the address we want to do relocations on
   S is the value of the symbol referenced in the relocation entry
   A is the addendum in the relocation entry
2. For SHT_RELA type, we need to take S, add A, then store the resulting value at the location P
3. For SHT_REL type, we need to take the value at the location P, add S, then store the resulting value at the location P

From what I was able to read in the libdwarf code (contrib/elftoolchain/libdwarf/libdwarf_elf_init.c), it looks like SHT_RELA type is handled correctly. However, for SHT_REL type, there seems to be a missing step. It looks like it is just taking S and storing that at the location P, which ends up clobbering the value that was already at the location P.

This ends up being a disaster for ctfconvert, as you end up with very few to no actual symbols in that binary able to be used in probe statements.

See the attached arc4random.o file. This was built with the clang 3.5.1 for ARM. You should be able to use ctfdump on it to see the bogus .SUNW_ctf section data generated by ctfconvert as the code currently exists (with the flawed REL relocation handling.)
Comment 1 Ed Maste freebsd_committer 2015-10-28 13:11:36 UTC
Created attachment 162532 [details]
arc4random.o built with Clang 3.7
Comment 2 Ed Maste freebsd_committer 2015-10-28 14:19:25 UTC
Your analysis looks good - I suspect it works on Clang 3.6+ because the required relocations have 0 as the A value at P.
Comment 3 commit-hook freebsd_committer 2016-03-11 16:24:51 UTC
A commit references this bug:

Author: emaste
Date: Fri Mar 11 16:24:39 UTC 2016
New revision: 296663
URL: https://svnweb.freebsd.org/changeset/base/296663

  libdwarf: fix SHT_REL relocation processing

  Relocation of type SHT_REL must use the current value as addend.

  PR:		204084
  Obtained from:	NetBSD libdwarf_elf_init.c v1.4

Comment 4 Ed Maste freebsd_committer 2016-03-14 02:04:00 UTC
Proposed rework in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204084
Comment 5 Ed Maste freebsd_committer 2016-04-05 17:51:12 UTC
HEAD has NetBSD's version of the fix, and a different version is now committed upstream as r3438: https://sourceforge.net/p/elftoolchain/code/3438/

I anticipate importing an updated ELF Tool Chain version before 11.0, and where possible merging libelf and libdwarf fixes to stable/10.

Steve, can you confirm the fix (either with HEAD or upstream ELF Tool Chain)?
Comment 6 Ed Maste freebsd_committer 2016-06-10 17:26:28 UTC
Steve confirmed that this fixes the issue; need to investigate the merge to stable/10 still.
Comment 7 Ed Maste freebsd_committer 2016-06-13 14:49:45 UTC
libdwarf in stable/10 is rather old compared to HEAD; it does not even support REL relocations and thus the bug in REL relocation handling does not exist there. This is fragile and toolchain changes in 10.x could turn the lack of REL relocation handling into a real problem on stable/10. Since the toolchain is unlikely to be changed there just leave things as they are.