It was reported on the mailing list [1] and on the builder [2], and it also fails on my board. RUST_BACKTRACE=1 seems to be the culprit.

[1] https://lists.freebsd.org/archives/freebsd-arm/2021-June/000255.html
[2] http://ampere2.nyi.freebsd.org/data/main-arm64-default/pc9afda5a14a3_sb43d600c83/logs/errors/rust-1.52.1.log
Has something changed on main? It (rust-1.53.0) was fine on ref13-aarch64. I also started a build on ref14-aarch64 just now and it seems fine too (currently in the LLVM stage), but base seems kind of outdated:

ref14-aarch64$ uname -a
FreeBSD ref14-aarch64.freebsd.org 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n246034-e6ab1e365c0: Tue Apr 13 11:01:50 UTC 2021 root@build-14.freebsd.org:/usr/obj/arm64.aarch64/usr/src/sys/CLUSTER14 arm64
ref14-aarch64$ grep __FreeBSD_version /usr/include/sys/param.h
#define __FreeBSD_version 1400008 /* Master, propagated to newvers */
Created attachment 226084
lang/rust 1.53.0 poudriere failing on aarch64

I start with a fresh install and recent world and no-debug kernel:

# uname -a
FreeBSD asn 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n247543-33e1287b6a54: Fri Jun 25 15:23:11 CEST 2021 root@asn:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-NODEBUG arm64
# cc -v
FreeBSD clang version 12.0.1 (git@github.com:llvm/llvm-project.git llvmorg-12.0.1-rc2-0-ge7dac564cd0e)
Target: aarch64-unknown-freebsd14.0
Thread model: posix
InstalledDir: /usr/bin

As suggested by Mikael in comment #0, I removed RUST_BACKTRACE=1 from the Makefile, and compilation fails after cmake with 9.2M of logs ending with:

<LOG>
-- Generating done
-- Build files have been written to: /wrkdirs/usr/ports/lang/rust/work/rustc-1.53.0-src/build/aarch64-unknown-freebsd/llvm/build
ninja: error: manifest 'build.ninja' still dirty after 100 tries
thread 'main' panicked at '
command did not execute successfully, got: exit code: 1

build script failed, must exit now', /wrkdirs/usr/ports/lang/rust/work/rustc-1.53.0-src/vendor/cmake/src/lib.rs:885:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
finished in 1533.597 seconds
Traceback (most recent call last):
  File "x.py", line 27, in <module>
    bootstrap.main()
  File "/wrkdirs/usr/ports/lang/rust/work/rustc-1.53.0-src/src/bootstrap/bootstrap.py", line 1191, in main
    bootstrap(help_triggered)
  File "/wrkdirs/usr/ports/lang/rust/work/rustc-1.53.0-src/src/bootstrap/bootstrap.py", line 1177, in bootstrap
    run(args, env=env, verbose=build.verbose)
  File "/wrkdirs/usr/ports/lang/rust/work/rustc-1.53.0-src/src/bootstrap/bootstrap.py", line 153, in run
    raise RuntimeError(err)
RuntimeError: failed to run: /wrkdirs/usr/ports/lang/rust/work/rustc-1.53.0-src/build/bootstrap/debug/bootstrap build --jobs=16
*** Error code 1

Stop.
make: stopped in /usr/ports/lang/rust
</LOG>

As per comment #1 by Tobias, it seems the breakage was introduced somewhere between his environment (April 13) and a more recent one. Can someone with a recent environment reproduce? Thank you.
(In reply to pr from comment #2) Yes, my RTC does not work, the build was attempted a few minutes ago, not in March 2011...
(In reply to Tobias Kortkamp from comment #1)
It "seems" to work with an old kernel:

main-n247244-61814702398c-dirty: Tue Jun 8 14:29:29 CEST 2021

I will try to bisect when I have time.
(In reply to Mikael Urankar from comment #4)
Here is the stack trace:

gdb --args env RUST_BACKTRACE=1 /usr/ports/lang/rust/work/bootstrap/bin/cargo build --manifest-path /usr/ports/lang/rust/work/rustc-1.53.0-src/src/bootstrap/Cargo.toml --verbose --frozen
Reading symbols from env...
Reading symbols from /usr/lib/debug//usr/bin/env.debug...
(gdb) r
Starting program: /usr/bin/env RUST_BACKTRACE=1 /usr/ports/lang/rust/work/bootstrap/bin/cargo build --manifest-path /usr/ports/lang/rust/work/rustc-1.53.0-src/src/bootstrap/Cargo.toml --verbose --frozen
process 17398 is executing new program: /usr/ports/lang/rust/work/bootstrap/bin/cargo

Program received signal SIGSEGV, Segmentation fault.
libunwind::DwarfInstructions<libunwind::LocalAddressSpace, libunwind::Registers_arm64>::getSavedRegister (addressSpace=..., registers=..., cfa=cfa@entry=64, savedReg=...) at /usr/src/contrib/llvm-project/libunwind/src/DwarfInstructions.hpp:84
warning: Source file is more recent than executable.
84        return (pint_t)addressSpace.getRegister(cfa + (pint_t)savedReg.value);
(gdb) bt
#0  libunwind::DwarfInstructions<libunwind::LocalAddressSpace, libunwind::Registers_arm64>::getSavedRegister (addressSpace=..., registers=..., cfa=cfa@entry=64, savedReg=...) at /usr/src/contrib/llvm-project/libunwind/src/DwarfInstructions.hpp:84
#1  0x0000000040dce838 in libunwind::DwarfInstructions<libunwind::LocalAddressSpace, libunwind::Registers_arm64>::stepWithDwarf (addressSpace=..., pc=<optimized out>, fdeStart=<optimized out>, registers=..., isSignalFrame=@0xffffffff8ec9: false) at /usr/src/contrib/llvm-project/libunwind/src/DwarfInstructions.hpp:192
#2  0x0000000040dce2f8 in libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_arm64>::stepWithDwarfFDE (this=0xffffffff8c60) at /usr/src/contrib/llvm-project/libunwind/src/UnwindCursor.hpp:954
#3  libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_arm64>::step (this=0xffffffff8c60) at /usr/src/contrib/llvm-project/libunwind/src/UnwindCursor.hpp:2090
#4  0x0000000040dcbcfc in _Unwind_Backtrace (callback=0xc0e5a0 <std::backtrace_rs::backtrace::libunwind::trace::trace_fn>, ref=<optimized out>) at /usr/src/contrib/llvm-project/libunwind/src/UnwindLevel1-gcc-ext.c:131
#5  0x0000000000be9a8c in std::backtrace::Backtrace::create ()
#6  0x00000000007cfc78 in cargo::util::restricted_names::validate_package_name ()
#7  0x00000000007c7db4 in cargo::util::config::Config::get_registry_index ()
#8  0x00000000006d7d54 in cargo::util::toml::TomlManifest::patch ()
#9  0x00000000006cbf14 in cargo::util::toml::read_manifest ()
#10 0x00000000008011d8 in cargo::core::workspace::Packages::load ()
#11 0x00000000007fd6a4 in cargo::core::workspace::Workspace::find_root ()
#12 0x00000000007f9340 in cargo::core::workspace::Workspace::new ()
#13 0x0000000000483dac in cargo::util::command_prelude::ArgMatchesExt::workspace ()
#14 0x0000000000484828 in cargo::commands::build::exec ()
#15 0x000000000042b398 in cargo::cli::main ()
#16 0x00000000004338b4 in cargo::main ()
#17 0x000000000042664c in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#18 0x000000000047f154 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h82294d672672995f ()
#19 0x0000000000c0df80 in std::rt::lang_start_internal ()
#20 0x00000000004353c0 in main ()

I wonder if it's the recent llvm12 upgrade that's causing this issue.
I do not normally build rust, but in a context with two poudriere jails, one for main [so: 14] and one for releng/13 (with -p2) and the host system booted in main . . .

# poudriere bulk -jmain-CA72 -w lang/rust
. . .
[00:00:17] [01] [00:00:00] Building lang/rust | rust-1.52.1
[00:01:52] [01] [00:01:35] Saving lang/rust | rust-1.52.1 wrkdir
[00:08:22] [01] [00:08:05] Saved lang/rust | rust-1.52.1 wrkdir to: /usr/local/poudriere/data/wrkdirs/main-CA72-default/default/rust-1.52.1.tbz
[00:08:25] [01] [00:08:08] Finished lang/rust | rust-1.52.1: Failed: build

and it produced:

/wrkdirs/usr/ports/lang/rust/work/rustc-1.52.1-src/cargo.core

but in the releng/13 jail . . .

# poudriere bulk -j13_0R-CA72 -w lang/rust
[00:00:20] [01] [00:00:00] Building lang/rust | rust-1.52.1
load: 17.14 cmd: sh 64405 [nanslp] 588.33r 0.42u 2.79s 0% 3944k
mi_switch+0xf4
sleepq_catch_signals+0x458
sleepq_timedwait_sig+0x14
_sleep+0x17c
kern_clock_nanosleep+0x1c4
sys_nanosleep+0x3c
do_el0_sync+0x4ac
handle_el0_sync+0x90
[13_0R-CA72-default] [2021-06-28_08h41m27s] [parallel_build:] Queued: 1 Built: 0 Failed: 0 Skipped: 0 Ignored: 0 Tobuild: 1 Time: 00:09:48
[01]: lang/rust | rust-1.52.1 build (00:08:20 / 00:09:35)
[00:09:55] Logs: /usr/local/poudriere/data/logs/bulk/13_0R-CA72-default/2021-06-28_08h41m27s

it did not crash in the first few minutes, unlike the main jail. Both use the same ports tree.

For reference:

The host and jail main-CA72:

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #5 main-n247562-66aec14a5391-dirty: Thu Jun 24 21:36:23 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400024 1400024

The 13_0R-CA72 jail (on the same system):

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #5 main-n247562-66aec14a5391-dirty: Thu Jun 24 21:36:23 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400024 1300139
(In reply to Mikael Urankar from comment #5)
Well, building lang/rust-bootstrap from 12.2 with llvm 11 and hoping it will work on CURRENT with llvm 12 is very optimistic, to say the least.

# file /usr/ports/lang/rust/work/bootstrap/bin/cargo
/usr/ports/lang/rust/work/bootstrap/bin/cargo: ELF 64-bit LSB shared object, ARM aarch64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 12.2, FreeBSD-style, with debug_info, not stripped

The workaround seems to require rebuilding lang/rust-bootstrap on CURRENT, but this port depends on rust, creating a chicken-and-egg problem.

Can you/Mark build a package for lang/rust-bootstrap on your not-so-current CURRENT kernel? That way we can use it to build rust on a CURRENT kernel and settle this.

If there is a bug in llvm12, should it be filed separately?
(In reply to pr from comment #7)
> Well, building lang/rust-bootstrap from 12.2 with llvm 11 and hoping
> it will work on CURRENT with llvm 12 is very optimistic, to say the
> least.

How so? Please explain this.
Ok, so can we assume this is an llvm12 + CURRENT problem? I ran some tests:

1) Rewinding to d21c884 (before llvm 12 landed in CURRENT);
2) Building packages for 1.53 rust-bootstrap on CURRENT with llvm 11.0.1;

...and lang/rust builds.

Can we have the right people looking at this bug? Maybe it should be re-assigned?
^Triage: Request feedback from Dimitry.

Additionally, I am not sure if we have a tracking (meta) bug for LLVM12 issues.
Hm, I've no idea what action(s) I could take here. I don't have access to aarch64 hardware, at least not where I could build ports. Did somebody manage to reduce this crash to some sort of self-contained test case?
(In reply to Dimitry Andric from comment #11) I am a rust illiterate and, honestly, not willing to learn rust as of now. I can give you ssh access to an aarch64 machine. Contact me by email if you are interested.
(In reply to Dimitry Andric from comment #11)
Here is a simple reproducer.

$ cat bt.c
#include <execinfo.h>

int main()
{
	void *addrlist[100];
	backtrace(addrlist, 100);
}

Compile it on FreeBSD 12.2 (ref12-aarch64.freebsd.org):

$ cc -o bt bt.c -lexecinfo
$ ./bt
$

Copy it to an install of FreeBSD 14 with LLVM12 and run it (using the 20210701-c5f4772c66d-247671 aarch64 snapshot running in QEMU):

$ ./bt
Segmentation Fault (core dumped)

It does not crash when LD_PRELOADing libgcc_s.so.1 from 12.2:

$ env LD_PRELOAD=libgcc_s.so.1.12.2 ./bt
$
(In reply to Tobias Kortkamp from comment #13)
I need some handholding here. Which of the images did you download? There are a lot of them on e.g. https://download.freebsd.org/ftp/snapshots/arm64/aarch64/ISO-IMAGES/14.0/, and I have no idea which one to choose.

The qemu command line I've stolen from https://wiki.freebsd.org/arm64/QEMU is:

qemu-system-aarch64 \
    -m 4096M \
    -cpu cortex-a57 \
    -M virt \
    -bios edk2-aarch64-code.fd \
    -serial telnet::4444,server \
    -nographic \
    -drive if=none,file=VMDISK,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -device virtio-net-device,netdev=net0 \
    -netdev user,id=net0

but the one thing that's missing is: where do you get a prepopulated disk from? :)
(In reply to Dimitry Andric from comment #14) There are ready to use vm images for aarch64 too ;-). I used this one: https://download.freebsd.org/ftp/snapshots/VM-IMAGES/14.0-CURRENT/aarch64/20210701/FreeBSD-14.0-CURRENT-arm64-aarch64-20210701-c5f4772c66d-247671.qcow2.xz
(In reply to Tobias Kortkamp from comment #13)
FYI: I tried the small test sequence using a releng/13 (-p3) chroot instead of 12.2 and did not get a crash.

I guess the implication is that the bootstrap involved for rust was built on 12 and used as a pre-built binary on 14? I've not had any variant of 12 around for a rather long time, so anything tied to 12 in building rust would have to have come from outside my system.

For reference:

# file bt
bt: ELF 64-bit LSB executable, ARM aarch64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 13.0 (1300139), FreeBSD-style, with debug_info, not stripped

# uname -apKU
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #7 main-n247651-a00d703f2f43-dirty: Wed Jun 30 15:11:11 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400024 1400024

# ./bt
#
I have been able to reproduce the segfault (thanks Tobias!), and it appears to be a regression due to https://github.com/llvm/llvm-project/commit/23bef7ee9923b1262326981960397e8cd95d6923 ("[libunwind] Support for leaf function unwinding").

Not sure what is going wrong exactly, though. I will have to take it up with upstream to get it properly sorted out, but I guess it makes sense to revert the commit in the main branch for now.

Upstream also reverted this later in https://github.com/llvm/llvm-project/commit/5831adb8c38f3fd1b17ff52984c514fc32e893f6, then reapplied it in https://github.com/llvm/llvm-project/commit/22b615a96593f13109a27cabfd1764ec4f558c7a.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5866c369e4fd917c0d456f0f10b92ee354b82279

commit 5866c369e4fd917c0d456f0f10b92ee354b82279
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2021-07-02 22:35:42 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2021-07-02 22:35:49 +0000

    Revert libunwind change to fix backtrace segfault on aarch64

    Revert commit 22b615a96593 from llvm git (by Daniel Kiss):

      [libunwind] Support for leaf function unwinding.

      Unwinding leaf function is useful in cases when the backtrace finds a
      leaf function for example when it caused a signal.
      This patch also add the support for the DW_CFA_undefined because it marks
      the end of the frames.

      Ryan Prichard provided code for the tests.

      Reviewed By: #libunwind, mstorsjo

      Differential Revision: https://reviews.llvm.org/D83573

      Reland with limit the test to the x86_64-linux target.

    Bisection has shown that this particular upstream commit causes programs
    using backtrace(3) on aarch64 to segfault. This affects the lang/rust
    port, for instance. Until we can upstream to fix this problem, revert
    the commit for now.

    Reported by:    mikael
    PR:             256864

 contrib/llvm-project/libunwind/src/DwarfInstructions.hpp | 9 +--------
 contrib/llvm-project/libunwind/src/DwarfParser.hpp       | 3 +--
 2 files changed, 2 insertions(+), 10 deletions(-)
(In reply to Dimitry Andric from comment #17)
Thanks Dim.

If there is (or will be) a base tracking issue associated with this regression, please add this issue to its See Also field.

If there is (or will be) a meta tracking issue for LLVM12 issues/regressions, please add this issue to its Depends On field.

^Triage: Assign to committer that resolves and close
Reported upstream as https://bugs.llvm.org/show_bug.cgi?id=50972 although I hope I can also reproduce it somehow on Linux :)
(In reply to Dimitry Andric from comment #20) Bob Prohaska started a poudriere bulk on an RPi3B (used as aarch64) before 2021-Jul-02 and its build of lang/rust started before then too. http://www.zefox.org/~bob/poudriere/data/logs/bulk/main-default/2021-07-01_19h00m04s/build.html shows that rust-1.53.0 built in 38:53:19 . No build error. (As I write this the overall bulk run is still going.) I'm not so sure about the libunwind change vs. the building of rust. The change was not needed for his build of the later 1.53.0 rust version in/for an aarch64 context.
(In reply to Dimitry Andric from comment #20)
Thanks Dimitry!

(In reply to Mark Millard from comment #21)
This is not very surprising given that the Poudriere jail is clearly still using an older base with LLVM11. You can see this in the ports_env.sh output:

#### /usr/ports/Mk/Scripts/ports_env.sh ####
_CCVERSION_921dbbb2=FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe) Target: aarch64-unknown-freebsd14.0 Thread model: posix InstalledDir: /usr/bin
...
OSVERSION=1400019

and also in the CMake output later:

-- The C compiler identification is Clang 11.0.1
-- The CXX compiler identification is Clang 11.0.1

http://www.zefox.org/~bob/poudriere/data/logs/bulk/main-default/2021-07-01_19h00m04s/logs/rust-1.53.0.log
(In reply to Tobias Kortkamp from comment #22) Thanks for that note. I'd not even thought to check on that distinction, implicitly expecting that things were tracking. It likely also explains a problem Bob has been having building devel/llvm10 in poudriere.
(In reply to Dimitry Andric from comment #20)
I have the same failure in the Ceph port. There is an explicit backtrace test: unittest_back_trace. It fails with exactly the same backtrace in gdb.

But this is on amd64, so I guess it is not only a problem on aarch64.
(In reply to Willem Jan Withagen from comment #24) But, is this solved by https://cgit.FreeBSD.org/src/commit/?id=5866c369e4fd917c0d456f0f10b92ee354b82279 ? Otherwise it is a different bug.
(In reply to Dimitry Andric from comment #25)
I'm running:

FreeBSD quad-b.digiware.nl 14.0-CURRENT FreeBSD 14.0-CURRENT #2 main-n247748-c5d6dd80b54: Mon Jul 5 18:37:29 CEST 2021 root@quad-b.digiware.nl:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64

But I'll git-pull a new version, and fully rebuild.

Is there a chance that make buildworld does not rebuild libunwind??
(In reply to Willem Jan Withagen from comment #26)
> But I'll git-pull a new version, and fully rebuild.
>
> Is there a chance that make buildworld does not rebuild libunwind??

It should, but there is always a possibility that stuff goes wrong. In any case, the original bug is about aarch64, so your case might be something completely different.

What you can easily try is running (supposing that your source is in /usr/src):

cd /usr/src/lib/libgcc_s
make cleandir
make obj
make depend
make

then run your test(s) with LD_LIBRARY_PATH set to the objdir, which is usually:

/usr/obj/usr/src/amd64.amd64/lib/libgcc_s

(you can double-check that it is loading libgcc_s.so from that dir by running "ldd" on your test case executable first)
(In reply to Dimitry Andric from comment #27)
You suggest:

cd /usr/src/lib/libgcc_s
make cleandir
make obj
make depend
make

That does not quite work:

/usr/local/bin/x86_64-unknown-freebsd14.0-ld: /usr/obj/usr/src/amd64.amd64/lib/libc/libc.a(strftime.o): relocation R_X86_64_32 against symbol `__xlocale_global_locale' can not be used when making a shared object; recompile with -fPIC
/usr/local/bin/x86_64-unknown-freebsd14.0-ld: /usr/obj/usr/src/amd64.amd64/lib/libc/libc.a(fix_grouping.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/local/bin/x86_64-unknown-freebsd14.0-ld: libgcc_s.so.1.full: version node not found for symbol _malloc_options@FBSD_1.0
/usr/local/bin/x86_64-unknown-freebsd14.0-ld: failed to set dynamic section sizes: bad value
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** Error code 1

Stop.
make: stopped in /usr/src/lib/libgcc_s
(In reply to Willem Jan Withagen from comment #28) I submitted a new bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257081
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=03c75960497c7a8aa6ff836be3a7be8a3b081d39

commit 03c75960497c7a8aa6ff836be3a7be8a3b081d39
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2021-07-02 22:35:42 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2021-12-25 11:51:10 +0000

    Revert libunwind change to fix backtrace segfault on aarch64

    Revert commit 22b615a96593 from llvm git (by Daniel Kiss):

      [libunwind] Support for leaf function unwinding.

      Unwinding leaf function is useful in cases when the backtrace finds a
      leaf function for example when it caused a signal.
      This patch also add the support for the DW_CFA_undefined because it marks
      the end of the frames.

      Ryan Prichard provided code for the tests.

      Reviewed By: #libunwind, mstorsjo

      Differential Revision: https://reviews.llvm.org/D83573

      Reland with limit the test to the x86_64-linux target.

    Bisection has shown that this particular upstream commit causes programs
    using backtrace(3) on aarch64 to segfault. This affects the lang/rust
    port, for instance. Until we can upstream to fix this problem, revert
    the commit for now.

    Reported by:    mikael
    PR:             256864

    (cherry picked from commit 5866c369e4fd917c0d456f0f10b92ee354b82279)

 contrib/llvm-project/libunwind/src/DwarfInstructions.hpp | 9 +--------
 contrib/llvm-project/libunwind/src/DwarfParser.hpp       | 3 +--
 2 files changed, 2 insertions(+), 10 deletions(-)