Bug 246537 - lib.libexecinfo.backtrace_test.backtrace_fmt_basic starts failing on amd64 after r360915
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: tests
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-testing mailing list
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2020-05-18 12:13 UTC by Li-Wen Hsu
Modified: 2020-05-20 20:21 UTC
CC List: 4 users

See Also:


Description Li-Wen Hsu freebsd_committer 2020-05-18 12:13:18 UTC
This test case has been failing since build https://ci.freebsd.org/job/FreeBSD-head-amd64-test/15169/
I still need to check which commit caused this.

Error Message

/usr/src/contrib/netbsd-tests/lib/libexecinfo/t_backtrace.c:93: nptrs >= ncalls + 2 + min_frames not met

Standard Output

got nptrs=17 ncalls=12 (min_frames: 4, max_frames: 9)
backtrace is:
#0: myfunc3
#1: myfunc2
#2: myfunc1
#3: myfunc1
#4: myfunc1
#5: myfunc1
#6: myfunc1
#7: myfunc1
#8: myfunc1
#9: myfunc1
#10: myfunc1
#11: myfunc1
#12: myfunc1
#13: myfunc1
#14: myfunc
#15: atfu_backtrace_fmt_basic_body
#16: atf_tc_run
Comment 1 commit-hook freebsd_committer 2020-05-18 12:37:21 UTC
A commit references this bug:

Author: lwhsu
Date: Mon May 18 12:36:29 UTC 2020
New revision: 361210
URL: https://svnweb.freebsd.org/changeset/base/361210

Log:
  Temporarily disable failing case in CI of amd64:

  - lib.libexecinfo.backtrace_test.backtrace_fmt_basic

  PR:		246537
  Sponsored by:	The FreeBSD Foundation

Changes:
  head/contrib/netbsd-tests/lib/libexecinfo/t_backtrace.c
Comment 2 Li-Wen Hsu freebsd_committer 2020-05-18 15:09:21 UTC
After more testing, it turns out the test started failing on amd64 after https://svnweb.freebsd.org/changeset/base/360915
i386 is not affected.
Comment 3 Conrad Meyer freebsd_committer 2020-05-18 15:51:32 UTC
I suspect r360915 is just wrong and should be reverted.  We're missing the frame below/above atf_tc_run (probably "_start" and "main").
Comment 4 Conrad Meyer freebsd_committer 2020-05-18 15:56:58 UTC
$ nm crt1.o:
..
0000000000000000 T _start

If we had one more frame (_start or main), the test would be:

18 >= 12 + 2 + 4 (pass).
Comment 5 Conrad Meyer freebsd_committer 2020-05-18 15:58:35 UTC
(Our backtrace / libexecinfo (llvm-libunwind) relies on asynchronous unwind tables and does not know about frame pointers.  Sure, maybe it should!  That would be nice.)
Comment 6 Dimitry Andric freebsd_committer 2020-05-18 16:02:50 UTC
(In reply to Conrad Meyer from comment #3)
This was done to fix bug 246322. As explained there, it is to work around a BFD ld assertion or bug. I'm fine with reverting, as long as we either fix BFD ld, or ban the use of it. :)
Comment 7 Dimitry Andric freebsd_committer 2020-05-18 16:04:17 UTC
(In reply to Dimitry Andric from comment #6)
> work around a BFD ld assertion or bug.

For more explanation see base r209294. It still hits the assertion with binutils master as of last week.
Comment 8 Conrad Meyer freebsd_committer 2020-05-18 16:05:14 UTC
We switched to both ld.lld and llvm-libunwind in CURRENT, so maybe leave it be in stable/ and just revert in CURRENT?
Comment 9 Conrad Meyer freebsd_committer 2020-05-18 16:07:52 UTC
> places the special CIE into the .eh_frame indicating the end of section,
> that is located before generated unwind table. New ld has assertion that
> verifies that closing CIE is indeed the last CIE,

Would it be possible to (?)fix LLVM to not mark the CIE as end of section?  I guess I'm unclear on if ld.bfd's assertion is reasonable or what.  Why doesn't this happen for other objects built with async unwind tables?
Comment 10 Dimitry Andric freebsd_committer 2020-05-18 16:12:22 UTC
(In reply to Conrad Meyer from comment #9)
Maybe, but I'm not sure that is the right way to go. In base r209294 the problem in BFD ld was first spotted, and at that time it was likely triggered by newer versions of gcc. But kan@ may know a bit more about this.

E.g. with the information I have now, I think this is a BFD ld bug/misfeature, and it should be fixed there.
Comment 11 Conrad Meyer freebsd_committer 2020-05-20 14:38:17 UTC
Any idea why this trips for crt1.o but not any other .o?
Comment 12 Dimitry Andric freebsd_committer 2020-05-20 19:54:00 UTC
(In reply to Conrad Meyer from comment #11)
You mean the ld.bfd assertion? Maybe it is because crt1.o is linked using -r from crt1_c.o and crt1_s.o?
Comment 13 Conrad Meyer freebsd_committer 2020-05-20 20:04:01 UTC
(In reply to Dimitry Andric from comment #12)
Yep, the ld.bfd assertion.
Comment 14 Konstantin Belousov freebsd_committer 2020-05-20 20:15:52 UTC
Trips exactly what?
crt1.o is specially added to the linker command line to create the final binary.  It must go first among all object files supplied to the linker.
Comment 15 Conrad Meyer freebsd_committer 2020-05-20 20:21:04 UTC
(In reply to Konstantin Belousov from comment #14)
Trips bug 246322:
> /usr/bin/ld: error in /usr/lib/crt1.o(.eh_frame); no .eh_frame_hdr table will be created.