Bug 234949 - Strip is enormously slow when operating on Haskell binaries
Summary: Strip is enormously slow when operating on Haskell binaries
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 12.0-RELEASE
Hardware: Any Any
: --- Affects Some People
Assignee: Mark Johnston
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2019-01-14 17:14 UTC by Gleb Popov
Modified: 2019-06-26 16:43 UTC (History)
4 users (show)

See Also:


Attachments
proposed patch (18.07 KB, patch)
2019-05-27 04:51 UTC, Mark Johnston
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gleb Popov freebsd_committer 2019-01-14 17:14:19 UTC
Somewhere since 12.0-CURRENT, strip started to take much longer when working on Haskell static libraries. This makes `install` stage of GHC take about 10 mins on my machine.

Steps to reproduce:

# fetch http://distcache.FreeBSD.org/local-distfiles/arrowd/stack-bindists/ghc-8.6.2-x86_64-portbld-freebsd.tar.xz
# tar -xzf ghc-8.6.2-x86_64-portbld-freebsd.tar.xz
# /usr/bin/time -h /usr/bin/strip ghc-8.6.2/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1.a
        1m12,41s real           1m12,24s user           0,14s sys
# /usr/bin/time -h /usr/bin/strip ghc-8.6.2/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1_p.a
        4m2,88s real            4m2,43s user            0,36s sys
Comment 1 Ed Maste freebsd_committer 2019-01-14 18:21:38 UTC
I'm not quite sure what you mean by "Somewhere since 12.0-CURRENT"; could you describe the pre-slowdown FreeBSD version in more detail?

As a reference point my laptop has a -CURRENT kernel from end of April 2018 and with your first example strip took:

1m53.95s real           1m51.04s user           0.57s sys

GNU strip from binutils 2.30_7,1 took:

4.31s real              0.96s user              3.31s sys

(Note output was to /dev/null with our strip, and an actual tempfile with GNU strip as -o /dev/null failed.)
Comment 2 Gleb Popov freebsd_committer 2019-01-14 18:43:48 UTC
(In reply to Ed Maste from comment #1)
I'm nut quite sure, maybe spring, maybe summer. Did we switch strip from some old implementation to elftoolchain one somewhere in this time span?
Comment 3 commit-hook freebsd_committer 2019-01-16 08:04:42 UTC
A commit references this bug:

Author: arrowd
Date: Wed Jan 16 08:04:04 UTC 2019
New revision: 490456
URL: https://svnweb.freebsd.org/changeset/ports/490456

Log:
  lang/ghc: Since we already depend on binutils on ARM, use their strip, because our strip is incredibly slow.

  PR:		234949
  Submitted by:	Mikael Urankar <mikael.urankar@gmail.com>
  Approved by:	tcberner (mentor)
  Differential Revision:	https://reviews.freebsd.org/D18842

Changes:
  head/lang/ghc/bsd.ghc.mk
Comment 4 Ed Maste freebsd_committer 2019-01-16 15:17:42 UTC
(In reply to Gleb Popov from comment #2)
We've been using ELF Tool Chain's elfcopy as objcopy/strip since FreeBSD 11. An option existed to revert to GNU objcopy/strip (WITHOUT_ELFCOPY_AS_OBJCOPY) but it was removed in r306649. So if this started in spring/summer 2018 then it appears to be an ELF Tool Chain regression.
Comment 5 Gleb Popov freebsd_committer 2019-05-19 18:10:05 UTC
I compiled strip from elftoolchain upstream and ran it under google-perftools.

Results are here:
http://arrowd.name/strip.prof.1.ps
http://arrowd.name/strip.prof.2.ps

Most of time (53%) is spent in elf_getscn() function.
Comment 6 Ed Maste freebsd_committer 2019-05-20 20:26:28 UTC
in elf_getscn:

   151          STAILQ_FOREACH(s, &e->e_u.e_elf.e_scn, s_next)
   152                  if (s->s_ndx == index)
   153                          return (s);

Well, there's your problem... (or at least a big part of the problem)
Comment 7 Gleb Popov freebsd_committer 2019-05-21 05:40:32 UTC
(In reply to Ed Maste from comment #6)
By saying "your problem", you mean that I should go report it upstream? Who should fix this?
Comment 8 Ed Maste freebsd_committer 2019-05-21 15:40:09 UTC
(In reply to Gleb Popov from comment #7)
Sorry, "Well, there's your problem" is a quote originally from the Mythbusters TV show; here it is commentary on the linked list traversal in elf_getscn.  You could submit a ticket on https://sourceforge.net/p/elftoolchain/ but we could just make a change in FreeBSD under this PR 234949 and then upstream it.
Comment 9 Steve Wills freebsd_committer 2019-05-26 21:06:22 UTC
I'm seeing this as well. What's the status of this issue? Waiting on a patch?
Comment 10 Mark Johnston freebsd_committer 2019-05-26 22:10:12 UTC
(In reply to Ed Maste from comment #6)
I tried replacing the section list with a red-black tree, ordered by section index.  It reduced the runtime of "strip libHSCabal-2.4.0.1.a" from 1m20s to 52s, so this isn't the only problem.

From reading the elfcopy code, it looks like the strip implementation is just naive.  We copy each elf file in the archive into a temp archive, during which strip does its filtering, and finally rename the tempfile over the original.  So even if strip has nothing to do, it ends up doing a lot of work.
Comment 11 Mark Johnston freebsd_committer 2019-05-26 22:35:02 UTC
(In reply to Mark Johnston from comment #10)
Hmm, GNU strip seems to do something similar, but manages to be much faster.

Some profiling points at nested loops in create_scn->is_remove_reloc_scn().  Changing is_remove_reloc_scn() to use elf_getscn() instead of searching the section list gets me down to 38s.

Profiling also points at insert_to_sec_list(), which would appear to benefit from a binary search tree as well.
Comment 12 Mark Johnston freebsd_committer 2019-05-26 23:31:56 UTC
It looks like most of the rest of the slowdown comes from inefficient implementations in insert_to_strtab() and lookup_string().  The latter can be replaced with memmem(), which helps a bit.  In a lot of cases when we call lookup_string() we should in principle be able to reuse the shstrtab index from the old section and thus avoid the lookup entirely.  However, insert_to_strtab() may modify the table in place if it finds a string in the table that is a suffix of the string to be inserted.
Comment 13 Mark Johnston freebsd_committer 2019-05-27 04:51:44 UTC
Created attachment 204641 [details]
proposed patch

It turns out that libelftc provides some ELF string table routines which are more efficient than what strip uses today.  With all of my diffs combined I get:

> time strip ghc-8.6.2/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1.a

real    0m6.817s
user    0m6.428s
sys     0m0.135s
> time strip ghc-8.6.2/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1_p.a

real    0m8.674s
user    0m8.185s
sys     0m0.155s

vs. GNU strip:

> time /usr/local/bin/strip ghc-8.6.2/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1.a

real    0m2.671s
user    0m2.334s
sys     0m0.245s

> time /usr/local/bin/strip ghc-8.6.2/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1_p.a

real    0m4.161s
user    0m3.599s
sys     0m0.351s

So there's clearly still room for improvement.

I did not test my changes extensively yet, but I attached a combined patch if anyone else is interested in testing them with ports builds.
Comment 14 Mark Johnston freebsd_committer 2019-05-27 05:02:40 UTC
I should note that elftc_string_table does not do suffix matching, unlike the implementation in strip.  However in this case we're only dealing with the section name string table, where this kind of optimization is not important.
Comment 15 commit-hook freebsd_committer 2019-05-30 15:29:42 UTC
A commit references this bug:

Author: markj
Date: Thu May 30 15:28:49 UTC 2019
New revision: 348431
URL: https://svnweb.freebsd.org/changeset/base/348431

Log:
  elfcopy: Optimize for insertions at the end of the section list.

  This is the common case when strip(1) is creating the output file.
  The change provides a significant speedup when running on ELF files with
  many sections.

  PR:		234949
  Reviewed by:	emaste
  MFC after:	1 week
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D20444

Changes:
  head/contrib/elftoolchain/elfcopy/elfcopy.h
  head/contrib/elftoolchain/elfcopy/sections.c
Comment 16 Gleb Popov freebsd_committer 2019-06-01 09:04:23 UTC
I can confirm that last commit improved things a bit:

# /usr/bin/time -h /usr/bin/strip ghc-8.6.4/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1.a
        59,70s real             59,53s user             0,16s sys
# /usr/bin/time -h /usr/bin/strip ghc-8.6.4/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1_p.a 
        3m17,14s real           3m16,82s user           0,29s sys
Comment 18 commit-hook freebsd_committer 2019-06-04 18:27:01 UTC
A commit references this bug:

Author: markj
Date: Tue Jun  4 18:26:31 UTC 2019
New revision: 348652
URL: https://svnweb.freebsd.org/changeset/base/348652

Log:
  libelf: Use a red-black tree to manage the section list.

  The tree is indexed by section number.  This speeds up elf_getscn()
  and its callers, which previously had to traverse a linked list. In
  particular, since .shstrtab is often the last section in a file,
  elf_strptr() would have to traverse the entire list.

  PR:		234949
  Reviewed by:	emaste
  MFC after:	2 weeks
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D20443

Changes:
  head/contrib/elftoolchain/libelf/_libelf.h
  head/contrib/elftoolchain/libelf/elf_end.c
  head/contrib/elftoolchain/libelf/elf_scn.c
  head/contrib/elftoolchain/libelf/elf_update.c
  head/contrib/elftoolchain/libelf/libelf_allocate.c
  head/contrib/elftoolchain/libelf/libelf_ehdr.c
  head/contrib/elftoolchain/libelf/libelf_extended.c
Comment 19 commit-hook freebsd_committer 2019-06-04 18:30:09 UTC
A commit references this bug:

Author: markj
Date: Tue Jun  4 18:29:09 UTC 2019
New revision: 348654
URL: https://svnweb.freebsd.org/changeset/base/348654

Log:
  elfcopy: Use elf_getscn() instead of iterating over all sections.

  When removing a section, we would loop over all sections looking for
  a corresponding relocation section.  With r348652 it is much faster
  to just use elf_getscn().

  PR:		234949
  Reviewed by:	emaste
  MFC after:	2 weeks
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D20471

Changes:
  head/contrib/elftoolchain/elfcopy/sections.c
Comment 20 commit-hook freebsd_committer 2019-06-04 18:34:35 UTC
A commit references this bug:

Author: markj
Date: Tue Jun  4 18:34:06 UTC 2019
New revision: 348657
URL: https://svnweb.freebsd.org/changeset/base/348657

Log:
  elfcopy: Use libelftc's string table routines to build .shstrtab.

  This replaces some hand-rolled routines and is substantially faster
  since libelftc uses a hash table for lookups and insertions, whereas
  elfcopy would perform a linear scan of the table.

  PR:		234949
  Reviewed by:	emaste
  MFC after:	2 weeks
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D20473

Changes:
  head/contrib/elftoolchain/elfcopy/ascii.c
  head/contrib/elftoolchain/elfcopy/binary.c
  head/contrib/elftoolchain/elfcopy/elfcopy.h
  head/contrib/elftoolchain/elfcopy/main.c
  head/contrib/elftoolchain/elfcopy/sections.c
Comment 21 Ed Maste freebsd_committer 2019-06-06 01:54:08 UTC
arrowd@, can you try again now that the changes have been committed?
Comment 22 commit-hook freebsd_committer 2019-06-06 03:04:31 UTC
A commit references this bug:

Author: markj
Date: Thu Jun  6 03:03:55 UTC 2019
New revision: 348720
URL: https://svnweb.freebsd.org/changeset/base/348720

Log:
  MFC r348431:
  elfcopy: Optimize for insertions at the end of the section list.

  PR:	234949

Changes:
_U  stable/12/
  stable/12/contrib/elftoolchain/elfcopy/elfcopy.h
  stable/12/contrib/elftoolchain/elfcopy/sections.c
Comment 23 Gleb Popov freebsd_committer 2019-06-06 08:20:45 UTC
(In reply to Ed Maste from comment #21)


Much better:

# /usr/bin/time -h /usr/bin/strip ghc-8.6.5/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1.a
        37,35s real             37,11s user             0,23s sys
# /usr/bin/time -h /usr/bin/strip ghc-8.6.5/libraries/Cabal/Cabal/dist-install/build/libHSCabal-2.4.0.1_p.a
        2m12,54s real           2m12,21s user           0,32s sys
Comment 24 Mark Johnston freebsd_committer 2019-06-06 14:46:59 UTC
(In reply to Gleb Popov from comment #23)
Err, it should be much better than that.  What revision are you running?
Comment 25 Gleb Popov freebsd_committer 2019-06-06 14:57:05 UTC
(In reply to Mark Johnston from comment #24)

I've done svn update only for /contrib/elftoolchain:

# cd /usr/src
# svn info
Path: .
Working Copy Root Path: /usr/src
URL: svn+ssh://repo.freebsd.org/base/head
Relative URL: ^/head
Repository Root: svn+ssh://repo.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 346440
Node Kind: directory
Schedule: normal
Last Changed Author: markj
Last Changed Rev: 346440
Last Changed Date: 2019-04-20 15:34:53 +0400 (сб, 20 апр. 2019)

# cd contrib/elftoolchain/
# svn info
Path: .
Working Copy Root Path: /usr/src
URL: svn+ssh://repo.freebsd.org/base/head/contrib/elftoolchain
Relative URL: ^/head/contrib/elftoolchain
Repository Root: svn+ssh://repo.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 348728
Node Kind: directory
Schedule: normal
Last Changed Author: markj
Last Changed Rev: 348657
Last Changed Date: 2019-06-04 22:34:05 +0400 (вт, 04 июня 2019)
Comment 26 Mark Johnston freebsd_committer 2019-06-06 15:00:19 UTC
(In reply to Gleb Popov from comment #25)
Did you rebuild both strip and libelf?
Comment 27 Gleb Popov freebsd_committer 2019-06-06 15:01:36 UTC
(In reply to Mark Johnston from comment #26)
Ah, sorry. I only did

# make -C /usr/src/usr.bin/objcopy install

Do I need something else?
Comment 28 Mark Johnston freebsd_committer 2019-06-06 15:02:05 UTC
(In reply to Gleb Popov from comment #27)
Try also rebuilding lib/libelf.
Comment 29 Gleb Popov freebsd_committer 2019-06-06 15:04:02 UTC
Yep, that helped. Now stripping _p version of the library takes only

11,16s real             10,85s user             0,29s sys

Awesome! Thanks for fixing this.
Comment 30 Mark Johnston freebsd_committer 2019-06-06 15:12:06 UTC
(In reply to Gleb Popov from comment #29)
Hmm, we can do even better.  In r348657 I did not specify a hint for the number of hash table entries: the elftc code automatically resizes the hash table.  I think it's a bit conservative in how it does that though; if I do

diff --git a/contrib/elftoolchain/elfcopy/sections.c b/contrib/elftoolchain/elfcopy/sections.c
index 93dbf4422863..25f3c8c0bac8 100644
--- a/contrib/elftoolchain/elfcopy/sections.c
+++ b/contrib/elftoolchain/elfcopy/sections.c
@@ -1410,7 +1410,7 @@ init_shstrtab(struct elfcopy *ecp)
        s->loadable = 0;
        s->type = SHT_STRTAB;
        s->vma = 0;
-       s->strtab = elftc_string_table_create(0);
+       s->strtab = elftc_string_table_create(1024 * 1024);
 
        add_to_shstrtab(ecp, "");
        add_to_shstrtab(ecp, ".symtab");

then the runtime drops further.  (And we end up being slightly faster than GNU strip on the _p library.)
Comment 31 Mark Johnston freebsd_committer 2019-06-06 21:23:00 UTC
https://reviews.freebsd.org/D20544
Comment 32 commit-hook freebsd_committer 2019-06-18 16:29:57 UTC
A commit references this bug:

Author: markj
Date: Tue Jun 18 16:29:48 UTC 2019
New revision: 349171
URL: https://svnweb.freebsd.org/changeset/base/349171

Log:
  MFC r348652:
  libelf: Use a red-black tree to manage the section list.

  PR:	234949

Changes:
_U  stable/12/
  stable/12/contrib/elftoolchain/libelf/_libelf.h
  stable/12/contrib/elftoolchain/libelf/elf_end.c
  stable/12/contrib/elftoolchain/libelf/elf_scn.c
  stable/12/contrib/elftoolchain/libelf/elf_update.c
  stable/12/contrib/elftoolchain/libelf/libelf_allocate.c
  stable/12/contrib/elftoolchain/libelf/libelf_ehdr.c
  stable/12/contrib/elftoolchain/libelf/libelf_extended.c
Comment 33 commit-hook freebsd_committer 2019-06-18 16:31:00 UTC
A commit references this bug:

Author: markj
Date: Tue Jun 18 16:30:35 UTC 2019
New revision: 349172
URL: https://svnweb.freebsd.org/changeset/base/349172

Log:
  MFC r348654:
  elfcopy: Use elf_getscn() instead of iterating over all sections.

  PR:	234949

Changes:
_U  stable/12/
  stable/12/contrib/elftoolchain/elfcopy/sections.c
Comment 34 commit-hook freebsd_committer 2019-06-18 16:32:03 UTC
A commit references this bug:

Author: markj
Date: Tue Jun 18 16:31:07 UTC 2019
New revision: 349173
URL: https://svnweb.freebsd.org/changeset/base/349173

Log:
  MFC r348657:
  elfcopy: Use libelftc's string table routines to build .shstrtab.

  PR:	234949

Changes:
_U  stable/12/
  stable/12/contrib/elftoolchain/elfcopy/ascii.c
  stable/12/contrib/elftoolchain/elfcopy/binary.c
  stable/12/contrib/elftoolchain/elfcopy/elfcopy.h
  stable/12/contrib/elftoolchain/elfcopy/main.c
  stable/12/contrib/elftoolchain/elfcopy/sections.c
Comment 35 Mark Johnston freebsd_committer 2019-06-26 16:43:15 UTC
Closing since this seems to be sufficiently addressed by my testing, and I've merged the changes to stable/12.  Please reopen if you disagree.