Bug 232176 - elftoolchain elfcopy/strip incorrectly strips relocations
Summary: elftoolchain elfcopy/strip incorrectly strips relocations
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords: patch
Depends on:
Blocks: 231882
  Show dependency treegraph
 
Reported: 2018-10-11 12:57 UTC by Ed Maste
Modified: 2018-10-25 15:19 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ed Maste freebsd_committer 2018-10-11 12:57:04 UTC
When using ifuncs in static binaries we'll have R_X86_64_IRELATIVE relocations in .rela.plt:

nuc% readelf -r ifunc_reproducer/make.full 

Relocation section with addend (.rela.plt):
r_offset     r_info       r_type              st_value         st_name + r_addend
0000002edaa0 000000000025 R_X86_64_IRELATIVE  0000000000000000  + 2e91f0

Relevant sections:

% readelf -S ifunc_reproducer/make.full 
There are 36 section headers, starting at offset 0x434420:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 3] .rela.plt         RELA             00000000002181a0  000181a0
       0000000000000018  0000000000000018   A      33    11     8
  [ 9] .plt              PROGBITS         00000000002e9c20  000e9c20
       0000000000000010  0000000000000000  AX       0     0     16
  [11] .got.plt          PROGBITS         00000000002edaa0  000edaa0
       0000000000000008  0000000000000000  WA       0     0     8
  [33] .symtab           SYMTAB           0000000000000000  004148f8
       0000000000012d38  0000000000000018          35   1582     8
  [34] .shstrtab         STRTAB           0000000000000000  00427630
       000000000000016f  0000000000000000           0     0     1
  [35] .strtab           STRTAB           0000000000000000  0042779f
       000000000000cc7b  0000000000000000           0     0     1

Stripping with objcopy:
% objcopy --strip-all ifunc_reproducer/make.full make.stripped

results in a broken binary:
% ./make.stripped
<jemalloc>: jemalloc_arena.c:230: Failed assertion: "!bitmap_full(slab_data->bitmap, &bin_info->bitmap_info)"
zsh: abort (core dumped)  ./make.stripped

because the relocation has been removed:
% readelf -r make.stripped

Relocation section with addend (.rela.plt):
r_offset     r_info       r_type              st_value         st_name + r_addend

Due to this snippet in elfcopy/sections.c::filter_reloc()
        /* We don't want to touch relocation info for dynamic symbols. */
        if ((ecp->flags & SYMTAB_EXIST) == 0) {
                if (ish.sh_link == 0 || ecp->secndx[ish.sh_link] == 0) {
                        /*
                         * This reloc section applies to the symbol table
                         * that was stripped, so discard whole section.
                         */
                        s->nocopy = 1;
                        s->sz = 0;
                }
                return;

This is invalid - we can remove the reloc section if the section to which it refers is being stripped, as in create_scn():

                if (ish.sh_type == SHT_REL || ish.sh_type == SHT_RELA)
                        if (ish.sh_info != 0 &&
                            is_remove_reloc_sec(ecp, ish.sh_info))
                                continue;

but removing based on the string table results in broken output.
Comment 1 Ed Maste freebsd_committer 2018-10-11 13:08:05 UTC
Workaround is to just disable filter_reloc():

diff --git a/elfcopy/sections.c b/elfcopy/sections.c
index 037f9288..78b06ca6 100644
--- a/elfcopy/sections.c
+++ b/elfcopy/sections.c
@@ -689,6 +689,8 @@ filter_reloc(struct elfcopy *ecp, struct section *s)
        uint64_t         cap, n, nrels;
        int              elferr, i;
 
+       return; /* PR232176 */
+
        if (gelf_getshdr(s->is, &ish) == NULL)
                errx(EXIT_FAILURE, "gelf_getehdr() failed: %s",
                    elf_errmsg(-1));
Comment 2 Ed Maste freebsd_committer 2018-10-11 13:49:43 UTC
In fact it's even worse; from filter_reloc():

        for(i = 0; (uint64_t)i < n; i++) {
                if (s->type == SHT_REL) {
                        if (gelf_getrel(id, i, &rel) != &rel)
                                errx(EXIT_FAILURE, "gelf_getrel failed: %s",
                                    elf_errmsg(-1));
                } else {
                        if (gelf_getrela(id, i, &rela) != &rela)
                                errx(EXIT_FAILURE, "gelf_getrel failed: %s",
                                    elf_errmsg(-1));
                }
                name = elf_strptr(ecp->ein, elf_ndxscn(ecp->strtab->is),
                    GELF_R_SYM(rel.r_info));

rel.r_info is uninitialized if s->type == SHT_RELA.
Comment 3 Ed Maste freebsd_committer 2018-10-11 14:56:56 UTC
Proposed removing filter_reloc() in https://reviews.freebsd.org/D17519
Comment 4 commit-hook freebsd_committer 2018-10-13 21:26:29 UTC
A commit references this bug:

Author: emaste
Date: Sat Oct 13 21:26:07 UTC 2018
New revision: 339350
URL: https://svnweb.freebsd.org/changeset/base/339350

Log:
  elfcopy: delete filter_reloc, it is broken and unnecessary

  elfcopy contained logic to filter individual relocations in STRIP_ALL
  mode.  However, this is not valid; relocations emitted by the linker are
  required, unless they apply to an entire section being removed (which is
  handled by other logic in elfcopy).

  Note that filter_reloc was also buggy: for RELA relocation sections it
  operated on uninitialized rel.r_info resulting in invalid operation.

  The logic most likely needs to be inverted: instead of removing
  relocations because their associated symbols are being removed, we must
  keep symbols referenced by relocations.  That said, in practice we do
  not encounter this code path today: objects being stripped are either
  dynamically linked binaries which retain .dynsym, or static binaries
  with no relocations.

  Just remove filter_reloc.  This fixes certain cases including statically
  linked binaries containing ifuncs.  Stripping binaries with relocations
  referencing removed symbols was already broken, and after this change
  may still be broken in a different way.

  PR:		232176
  Reviewed by:	kaiw, kib, markj
  Approved by:	re (rgrimes)
  MFC after:	1 month
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D17519

Changes:
  head/contrib/elftoolchain/elfcopy/sections.c
Comment 5 commit-hook freebsd_committer 2018-10-20 17:28:39 UTC
A commit references this bug:

Author: emaste
Date: Sat Oct 20 17:27:53 UTC 2018
New revision: 339451
URL: https://svnweb.freebsd.org/changeset/base/339451

Log:
  objcopy: restore behaviour required by GCC's build

  In r339350 filter_reloc() was removed, to fix the case of stripping
  statically linked binaries with relocations (which may come from ifunc
  use, for example).  As a side effect this changed the behaviour when
  stripping object files - the output was broken both before and after
  r339350, in different ways.  Unfortunately GCC's build process relies
  on the previous behaviour, so:

  - Revert r339350, restoring filter_reloc().
  - Fix an unitialized variable use (commited as r3638 in ELF Tool Chain).
  - Change filter_reloc() to omit relocations referencing removed
    symbols, while retaining relocations with no symbol reference.
  - Retain the entire relocation section if it references the dynamic
    symbol table (fix from kaiw in D17596).

  PR:		232176
  Reported by:	antoine
  Reviewed by:	kaiw
  MFC with:	r339350
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D17596

Changes:
  head/contrib/elftoolchain/elfcopy/sections.c
Comment 6 commit-hook freebsd_committer 2018-10-25 13:46:47 UTC
A commit references this bug:

Author: emaste
Date: Thu Oct 25 13:46:28 UTC 2018
New revision: 339710
URL: https://svnweb.freebsd.org/changeset/base/339710

Log:
  elfcopy: avoid stripping relocations from static binaries

  MFC r339350: elfcopy: delete filter_reloc, it is broken and unnecessary

  elfcopy contained logic to filter individual relocations in STRIP_ALL
  mode.  However, this is not valid; relocations emitted by the linker are
  required, unless they apply to an entire section being removed (which is
  handled by other logic in elfcopy).

  Note that filter_reloc was also buggy: for RELA relocation sections it
  operated on uninitialized rel.r_info resulting in invalid operation.

  The logic most likely needs to be inverted: instead of removing
  relocations because their associated symbols are being removed, we must
  keep symbols referenced by relocations.  That said, in practice we do
  not encounter this code path today: objects being stripped are either
  dynamically linked binaries which retain .dynsym, or static binaries
  with no relocations.

  Just remove filter_reloc.  This fixes certain cases including statically
  linked binaries containing ifuncs.  Stripping binaries with relocations
  referencing removed symbols was already broken, and after this change
  may still be broken in a different way.

  MFC r339451: objcopy: restore behaviour required by GCC's build

  In r339350 filter_reloc() was removed, to fix the case of stripping
  statically linked binaries with relocations (which may come from ifunc
  use, for example).  As a side effect this changed the behaviour when
  stripping object files - the output was broken both before and after
  r339350, in different ways.  Unfortunately GCC's build process relies
  on the previous behaviour, so:

  - Revert r339350, restoring filter_reloc().
  - Fix an unitialized variable use (commited as r3638 in ELF Tool Chain).
  - Change filter_reloc() to omit relocations referencing removed
    symbols, while retaining relocations with no symbol reference.
  - Retain the entire relocation section if it references the dynamic
    symbol table (fix from kaiw in D17596).

  PR:		232176
  Sponsored by:	The FreeBSD Foundation

Changes:
_U  stable/11/
  stable/11/contrib/elftoolchain/elfcopy/sections.c
Comment 7 commit-hook freebsd_committer 2018-10-25 15:19:07 UTC
A commit references this bug:

Author: emaste
Date: Thu Oct 25 15:18:54 UTC 2018
New revision: 339726
URL: https://svnweb.freebsd.org/changeset/base/339726

Log:
  MFC r339451: objcopy: restore behaviour required by GCC's build

  In r339350 filter_reloc() was removed, to fix the case of stripping
  statically linked binaries with relocations (which may come from ifunc
  use, for example).  As a side effect this changed the behaviour when
  stripping object files - the output was broken both before and after
  r339350, in different ways.  Unfortunately GCC's build process relies
  on the previous behaviour, so:

  - Revert r339350, restoring filter_reloc().
  - Fix an unitialized variable use (commited as r3638 in ELF Tool Chain).
  - Change filter_reloc() to omit relocations referencing removed
    symbols, while retaining relocations with no symbol reference.
  - Retain the entire relocation section if it references the dynamic
    symbol table (fix from kaiw in D17596).

  PR:		232176
  Approved by:	re (gjb, kib)
  Sponsored by:	The FreeBSD Foundation

Changes:
_U  stable/12/
  stable/12/contrib/elftoolchain/elfcopy/sections.c