Bug 246537

Summary: lib.libexecinfo.backtrace_test.backtrace_fmt_basic starts failing on amd64 after r360915
Product: Base System Reporter: Li-Wen Hsu <lwhsu>
Component: testsAssignee: freebsd-testing (Nobody) <testing>
Status: Closed FIXED    
Severity: Affects Only Me CC: cem, dchagin, dim, emaste, kan, kib
Priority: --- Keywords: regression
Version: CURRENT   
Hardware: amd64   
OS: Any   

Description Li-Wen Hsu freebsd_committer freebsd_triage 2020-05-18 12:13:18 UTC
This test case starts failing since this build https://ci.freebsd.org/job/FreeBSD-head-amd64-test/15169/
Still need to check which commit caused this.

Error Message

/usr/src/contrib/netbsd-tests/lib/libexecinfo/t_backtrace.c:93: nptrs >= ncalls + 2 + min_frames not met

Standard Output

got nptrs=17 ncalls=12 (min_frames: 4, max_frames: 9)
backtrace is:
#0: myfunc3
#1: myfunc2
#2: myfunc1
#3: myfunc1
#4: myfunc1
#5: myfunc1
#6: myfunc1
#7: myfunc1
#8: myfunc1
#9: myfunc1
#10: myfunc1
#11: myfunc1
#12: myfunc1
#13: myfunc1
#14: myfunc
#15: atfu_backtrace_fmt_basic_body
#16: atf_tc_run
Comment 1 commit-hook freebsd_committer freebsd_triage 2020-05-18 12:37:21 UTC
A commit references this bug:

Author: lwhsu
Date: Mon May 18 12:36:29 UTC 2020
New revision: 361210
URL: https://svnweb.freebsd.org/changeset/base/361210

Log:
  Temporarily disable failing case in CI of amd64:

  - lib.libexecinfo.backtrace_test.backtrace_fmt_basic

  PR:		246537
  Sponsored by:	The FreeBSD Foundation

Changes:
  head/contrib/netbsd-tests/lib/libexecinfo/t_backtrace.c
Comment 2 Li-Wen Hsu freebsd_committer freebsd_triage 2020-05-18 15:09:21 UTC
After more testing, it starts failing on amd64 after https://svnweb.freebsd.org/changeset/base/360915
i386 is not affected.
Comment 3 Conrad Meyer freebsd_committer freebsd_triage 2020-05-18 15:51:32 UTC
I suspect r360915 is just wrong and should be reverted.  We're missing the frame below /above atf_tc_run (probably "_start" and "main").
Comment 4 Conrad Meyer freebsd_committer freebsd_triage 2020-05-18 15:56:58 UTC
$ nm crt1.o:
..
0000000000000000 T _start

If we had one more frame (_start or main), the test would be:

18 >= 12 + 2 + 4 (pass).
Comment 5 Conrad Meyer freebsd_committer freebsd_triage 2020-05-18 15:58:35 UTC
(Our backtrace / libexecinfo (llvm-libunwind) relies on asynchronous unwind tables and does not know about frame pointers.  Sure, maybe it should!  That would be nice.)
Comment 6 Dimitry Andric freebsd_committer freebsd_triage 2020-05-18 16:02:50 UTC
(In reply to Conrad Meyer from comment #3)
This was done to fix bug 246322. As explained there, it is to work around a BFD ld assertion or bug. I'm fine with reverting, as long as we either fix BFD ld, or ban the use of it. :)
Comment 7 Dimitry Andric freebsd_committer freebsd_triage 2020-05-18 16:04:17 UTC
(In reply to Dimitry Andric from comment #6)
> work around a BFD ld assertion or bug.

For more explanation see base r209294. It still hits the assertion with binutils master as of last week.
Comment 8 Conrad Meyer freebsd_committer freebsd_triage 2020-05-18 16:05:14 UTC
We switched to both ld.lld and llvm-libunwind in CURRENT, so maybe leave it be in stable/ and just revert in CURRENT?
Comment 9 Conrad Meyer freebsd_committer freebsd_triage 2020-05-18 16:07:52 UTC
> places the special CIE into the .eh_frame indicating the end of section,
> that is located before generated unwind table. New ld has assertion that
> verifies that closing CIE is indeed the last CIE,

Would it be possible to (?)fix LLVM to not mark the CIE as end of section?  I guess I'm unclear on if ld.bfd's assertion is reasonable or what.  Why doesn't this happen for other objects built with async unwind tables?
Comment 10 Dimitry Andric freebsd_committer freebsd_triage 2020-05-18 16:12:22 UTC
(In reply to Conrad Meyer from comment #9)
Maybe, but I'm not sure that is the right way to go. In base r209294 the problem in BFD ld was first spotted, and at that time it was likely triggered by newer versions of gcc. But kan@ may know a bit more about this.

E.g. with the information I have now, I think this is a BFD ld bug/misfeature, and it should be fixed there.
Comment 11 Conrad Meyer freebsd_committer freebsd_triage 2020-05-20 14:38:17 UTC
Any idea why this trips for crt1.o but not any other .o?
Comment 12 Dimitry Andric freebsd_committer freebsd_triage 2020-05-20 19:54:00 UTC
(In reply to Conrad Meyer from comment #11)
You mean the ld.bfd assertion? Maybe it is because crt1.o is linked using -r from crt1_c.o and crt1_s.o?
Comment 13 Conrad Meyer freebsd_committer freebsd_triage 2020-05-20 20:04:01 UTC
(In reply to Dimitry Andric from comment #12)
Yep, the ld.bfd assertion.
Comment 14 Konstantin Belousov freebsd_committer freebsd_triage 2020-05-20 20:15:52 UTC
Trips exactly what ?
crt1.o is specially added to the linker command line to create the final binary.  It mist go first among all object files supplied to the linker.
Comment 15 Conrad Meyer freebsd_committer freebsd_triage 2020-05-20 20:21:04 UTC
(In reply to Konstantin Belousov from comment #14)
Trips bug 246322:
> /usr/bin/ld: error in /usr/lib/crt1.o(.eh_frame); no .eh_frame_hdr table will be created.
Comment 16 Conrad Meyer freebsd_committer freebsd_triage 2020-12-02 00:33:21 UTC
I would like to fix this for 13.0.  I think we can just revert the change that broke this now, some discussion over in another bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246322#c14
Comment 17 Konstantin Belousov freebsd_committer freebsd_triage 2020-12-02 12:43:30 UTC
(In reply to Conrad Meyer from comment #16)
So if reverting, does gcc+binutils toolchain work still ?

If yes, then there should be no reason to not revert.  Might be an exp run
for amd64+i386 is due.
Comment 18 Conrad Meyer freebsd_committer freebsd_triage 2020-12-02 18:01:07 UTC
Yeah, in the other bug we are unable to reproduce the original issue with current binutils (comments #15 and 16).  And the original reporter has not responded for ~4 months.
Comment 19 commit-hook freebsd_committer freebsd_triage 2023-06-29 16:39:34 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c969310c992a12459ed4025c1cd8b22f29c763b5

commit c969310c992a12459ed4025c1cd8b22f29c763b5
Author:     Dmitry Chagin <dchagin@FreeBSD.org>
AuthorDate: 2023-06-29 16:34:39 +0000
Commit:     Dmitry Chagin <dchagin@FreeBSD.org>
CommitDate: 2023-06-29 16:34:39 +0000

    csu: Implement _start using as to satisfy unwinders on x86_64

    The right unwinding stop indicator should be CFI-undefined PC.
    https://dwarfstd.org/doc/Dwarf3.pdf - page 118:
    If a Return Address register is defined in the virtual unwind table,
    and its rule is undefined (for example, by DW_CFA_undefined), then
    there is no return address and no call address, and the virtual
    unwind of stack activations is complete.

    This requires the crt code be built with unwind tables, for that remove
    -fno-asynchronous-unwind-tables to enable unwind tables generation.

    PR:                     241562, 246322, 246537
    Reviewed by:            kib
    Differential Revision:  https://reviews.freebsd.org/D40780

 lib/csu/Makefile.inc         | 20 +++++-----
 lib/csu/amd64/Makefile       |  3 ++
 lib/csu/amd64/crt1_c.c       | 22 -----------
 lib/csu/amd64/crt1_s.S (new) | 88 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 102 insertions(+), 31 deletions(-)
Comment 20 commit-hook freebsd_committer freebsd_triage 2023-06-29 16:53:39 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a18b956b73cee784e5c422d20fd0e4dabebd7eee

commit a18b956b73cee784e5c422d20fd0e4dabebd7eee
Author:     Dmitry Chagin <dchagin@FreeBSD.org>
AuthorDate: 2023-06-29 16:53:07 +0000
Commit:     Dmitry Chagin <dchagin@FreeBSD.org>
CommitDate: 2023-06-29 16:53:07 +0000

    libexecinfo: Enable backtrace_test.backtrace_fmt_basic on amd64 again

    Due to unwind tables generation enabled after c969310c for csu.

    PR:                     241562, 246322, 246537
    Reviewed by:            kib, ngie
    Differential Revision:  https://reviews.freebsd.org/D40758

 contrib/netbsd-tests/lib/libexecinfo/t_backtrace.c | 5 -----
 1 file changed, 5 deletions(-)