Bug 257419 - lang/rust: panic on armv7 at 'capacity overflow', library/alloc/src/raw_vec.rs:546:5
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: arm Any
Importance: --- Affects Many People
Assignee: FreeBSD Rust Team
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2021-07-25 22:04 UTC by Robert Clausecker
Modified: 2021-09-19 16:40 UTC

See Also:
jbeich: maintainer-feedback+
tobik: merge-quarterly-


Attachments
devel/rust-cbindgen-0.19.0_2 build log on armv7 (144.84 KB, text/plain)
2021-07-25 22:04 UTC, Robert Clausecker
devel/cargo-c 0.9.1 build log on armv7 (275.29 KB, text/plain)
2021-07-25 22:10 UTC, Robert Clausecker
sysutils/potnet 0.4.4_13 armv7 build log (394.78 KB, text/plain)
2021-07-27 06:28 UTC, Robert Clausecker
net-im/libsignal-client 0.8.2 armv7 build log (320.72 KB, text/plain)
2021-07-27 12:02 UTC, Robert Clausecker
security/sequoia-0.19.0_7 build log on armv7 FreeBSD 13.0 (699.14 KB, text/plain)
2021-08-04 21:54 UTC, Robert Clausecker
editors/xi-core-0.3.0_15 build log on armv7 (67.97 KB, text/plain)
2021-08-07 09:45 UTC, Robert Clausecker
x11-wm/leftwm-0.2.7.40_1 build log on armv7 (123.20 KB, text/plain)
2021-08-08 21:55 UTC, Robert Clausecker
security/vaultwarden-1.21.0_3 build log on armv7 (243.63 KB, text/plain)
2021-08-09 08:14 UTC, Robert Clausecker
x11/alacritty-0.8.0_1 build log on armv7 (197.55 KB, text/plain)
2021-08-09 15:27 UTC, Robert Clausecker

Description Robert Clausecker 2021-07-25 22:04:52 UTC
Created attachment 226690 [details]
devel/rust-cbindgen-0.19.0_2 build log on armv7

This port consistently fails to build in an armv7 FreeBSD 13.0 jail on an arm64 FreeBSD 13.0 host.  The symptom is always the same: after processing a number of files, the compiler crashes with a “capacity overflow” and then panics again during the generation of the backtrace.  This somehow leads to an infinite backtrace (see attached log where I truncated this backtrace to the first 100 entries).

If left running, eventually Poudriere kills the port for being in a runaway state, but not before generating several gigabytes of logs.

The attached log was generated on the quarterly branch, but I imagine it applies to the main branch the same way.  Will test shortly.
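
For readers unfamiliar with the message: "capacity overflow" is the panic raised by Rust's Vec allocation code (library/alloc/src/raw_vec.rs) when a requested capacity cannot be represented, that is, when its byte size overflows usize or exceeds isize::MAX. On a 32-bit target such as armv7, isize::MAX is 2^31 - 1 (2147483647). The following is only a minimal sketch that models that check with explicit 32-bit arithmetic so it behaves the same on any host. The helper name alloc_size_32 is hypothetical and this is not the code in raw_vec.rs. Nothing in this report establishes why rustc asked for such a capacity.

```rust
/// isize::MAX on a 32-bit target such as armv7.
const MAX_ALLOC_32: u32 = i32::MAX as u32;

/// Hypothetical helper: byte size of `count` elements of `elem_size` bytes
/// on a 32-bit target, or None in the cases where a Vec would report
/// "capacity overflow".
fn alloc_size_32(elem_size: u32, count: u32) -> Option<u32> {
    // The multiplication itself may wrap a 32-bit usize.
    let bytes = elem_size.checked_mul(count)?;
    // Even without wrapping, the size must not exceed isize::MAX.
    if bytes > MAX_ALLOC_32 {
        None
    } else {
        Some(bytes)
    }
}

fn main() {
    // 2^20 elements of 8 bytes each: representable.
    println!("{:?}", alloc_size_32(8, 1 << 20));
    // 2^29 elements of 8 bytes each: the product wraps a 32-bit usize.
    println!("{:?}", alloc_size_32(8, 1 << 29));
    // 2^31 single-byte elements: no wrap, but larger than isize::MAX.
    println!("{:?}", alloc_size_32(1, 1 << 31));
}
```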
Comment 1 Robert Clausecker 2021-07-25 22:10:44 UTC
Created attachment 226691 [details]
devel/cargo-c 0.9.1 build log on armv7

The same issue also seems to affect devel/cargo-c.  See attached build log.
Comment 2 Kubilay Kocak freebsd_committer freebsd_triage 2021-07-26 02:57:52 UTC
@Robert Thank you for the report. Can you test builds on other !x86 archs too?
Comment 3 Robert Clausecker 2021-07-26 07:43:46 UTC
(In reply to Kubilay Kocak from comment #2)

I may be able to test riscv64 in the near future but as of now I lack hardware with other ISAs.
Comment 4 Jan Beich freebsd_committer 2021-07-26 11:34:33 UTC
Same panic in 2 ports. For more details:
- Build lang/rust WITH_DEBUG=1 then run with RUST_BACKTRACE=full to get readable stacktrace
- Test all lang/rust consumers (150 according to freshports) in order to identify other affected ports

Over to rust@ as debugging is hard due to bug 221185. armv7 is missing on https://www.freebsd.org/internal/machines/ and uses qemu-user-static (QEMU_EMULATING in poudriere log) on https://pkg-status.freebsd.org/ while riscv64 is not supported by lang/rust yet.
Comment 5 Mikael Urankar freebsd_committer 2021-07-26 11:40:23 UTC
rust-1.51.0 is not affected
rust-1.52.0 is affected, maybe it's due to the llvm12 upgrade
Comment 6 Robert Clausecker 2021-07-26 12:28:03 UTC
(In reply to Jan Beich from comment #4)

> - Build lang/rust WITH_DEBUG=1 then run with RUST_BACKTRACE=full to get readable stacktrace

Can do that, but it'll take me a while to get to that point.  A full rust build takes a lot of time and at the same time, the machine in question is currently busy building all quarterly packages so I can feed updates to my remaining armv7 machines.

> - Test all lang/rust consumers (150 according to freshports) in order to identify other affected ports

Can you attach a list of all rust consumers to this issue so I can feed them to Poudriere?  Alternatively, tell me how to generate such a list.  Due to the nature of this failure condition, systematically testing for it requires some hand holding as the log file explosion causes Poudriere to spend a potentially unbounded time just grepping the log files for standard error sources, regardless of how high I set the runaway timeout.
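
One way to keep a runaway log bounded is to pass it through a filter that keeps only the first N lines, much like the manual truncation to 100 entries in the first attachment. The following is a rough sketch of such a filter, not something Poudriere provides. The function name copy_capped and the marker text are made up for illustration. It keeps draining its input after the cap is reached, so it limits the log size but does not stop the runaway build itself, and it assumes the log is valid UTF-8.

```rust
use std::io::{self, BufRead, Cursor, Write};

/// Copy at most `max_lines` lines from `input` to `output`, then keep
/// reading and discarding the rest. Returns the number of lines dropped.
fn copy_capped<R: BufRead, W: Write>(
    input: R,
    mut output: W,
    max_lines: usize,
) -> io::Result<usize> {
    let mut dropped = 0;
    for (i, line) in input.lines().enumerate() {
        let line = line?;
        if i < max_lines {
            writeln!(output, "{}", line)?;
        } else {
            dropped += 1;
        }
    }
    if dropped > 0 {
        // Leave a marker so readers know the log was cut.
        writeln!(output, "[{} further lines dropped]", dropped)?;
    }
    Ok(dropped)
}

fn main() -> io::Result<()> {
    // Stand-in for a runaway log: 250 placeholder lines.
    let runaway = "log line\n".repeat(250);
    let mut kept = Vec::new();
    let dropped = copy_capped(Cursor::new(runaway), &mut kept, 100)?;
    let text = String::from_utf8_lossy(&kept);
    println!(
        "kept {} lines (including the marker), dropped {}",
        text.lines().count(),
        dropped
    );
    Ok(())
}
```

In real use the input would be the build's output stream rather than an in-memory string.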

> armv7 is missing on https://www.freebsd.org/internal/machines/ and uses qemu-user-static (QEMU_EMULATING in poudriere log) on https://pkg-status.freebsd.org/ while riscv64 is not supported by lang/rust yet.

It is possible to build armv7 ports on arm64 machines in an armv7 jail.  In fact, I use just this mechanism to build the armv7 packages.
Comment 7 Robert Clausecker 2021-07-27 06:28:19 UTC
Created attachment 226730 [details]
sysutils/potnet 0.4.4_13 armv7 build log

sysutils/potnet is affected, too.  See attached log.
Comment 8 Robert Clausecker 2021-07-27 12:02:35 UTC
Created attachment 226735 [details]
net-im/libsignal-client 0.8.2 armv7 build log

net-im/libsignal-client is another case.
Comment 9 Robert Clausecker 2021-08-04 21:54:43 UTC
Created attachment 226957 [details]
security/sequoia-0.19.0_7 build log on armv7 FreeBSD 13.0

Also affects security/sequoia.
Comment 10 Robert Clausecker 2021-08-05 06:45:57 UTC
Also happens on the main branch with rust-cbindgen-0.20.0.  Unfortunately I didn't manage to keep the logs.
Comment 11 Robert Clausecker 2021-08-07 09:45:06 UTC
Created attachment 227005 [details]
editors/xi-core-0.3.0_15 build log on armv7

editors/xi-core is affected, too.
Comment 12 Robert Clausecker 2021-08-08 21:55:19 UTC
Created attachment 227026 [details]
x11-wm/leftwm-0.2.7.40_1 build log on armv7

Also affects x11-wm/leftwm.
Comment 13 Robert Clausecker 2021-08-09 08:14:02 UTC
Created attachment 227039 [details]
security/vaultwarden-1.21.0_3 build log on armv7

Also affects security/vaultwarden.
Comment 14 Robert Clausecker 2021-08-09 15:27:01 UTC
Created attachment 227050 [details]
x11/alacritty-0.8.0_1 build log on armv7

Also affects x11/alacritty.
Comment 15 Mikael Urankar freebsd_committer 2021-08-09 15:36:13 UTC
No need to mention the failing ports, we are aware of the issue.
Comment 16 Robert Clausecker 2021-08-09 16:09:05 UTC
(In reply to Mikael Urankar from comment #15)

It was specifically requested that I identify all affected ports (see comment #4).  I can stop if that's no longer needed.
Comment 17 Robert Clausecker 2021-08-09 23:12:59 UTC
Also affected:

gifski-1.4.3_2
git-absorb-0.5.0_23
Comment 18 Robert Clausecker 2021-09-10 11:42:24 UTC
(In reply to Jan Beich from comment #4)

I have now identified all ports for which this issue occurs (on branch 2021Q3).  These are:

audio/spotify-tui
devel/cargo-c
devel/cargo-generate
devel/git-absorb
devel/git-delta
devel/grcov
devel/rust-analyzer
devel/rust-cbindgen
devel/sentry-cli
devel/tokei
editors/parinfer-rust
editors/xi-core
finance/tickrs
games/abstreet
games/dose-response
graphics/gifski
graphics/viu
java/icedtea-web
lang/gleam
multimedia/scte35dump
net/proby
net/rabbiteer
net-im/libsignal-client
net-mgmt/bandwhich
net-mgmt/nfs-exporter
shells/ion
security/acmed
security/cloak
security/rustscan
security/sequoia
security/solana
security/vaultwarden
sysutils/diskonaut
sysutils/flowgger
sysutils/fselect
sysutils/mcfly
sysutils/onefetch
sysutils/jail_exporter
sysutils/potnet
sysutils/rsfetch
sysutils/tealdeer
sysutils/vector
textproc/angle-grinder
textproc/mdbook
textproc/sd
x11/alacritty
x11-wm/leftwm
www/deno
www/ffsend
www/geckodriver
www/lychee
www/websocat
www/xh
www/zola
x11/wezterm
Comment 19 Mark Millard 2021-09-11 23:52:07 UTC
(In reply to Robert Clausecker from comment #8)

I've seen this under main [so: 14] in a poudriere bulk -a attempt
on aarch64 targeting armv7, so far for rust-cbindgen, cargo-c, potnet,
and libsignal-client, but also for textproc/mdbook.

It is not limited to FreeBSD 13.0.
Comment 20 Mark Millard 2021-09-11 23:59:07 UTC
(In reply to Mark Millard from comment #19)

For reference:

# poudriere jail -i -jmain-CA7
Jail name:         main-CA7
Jail version:      14.0-CURRENT
Jail arch:         arm.armv7
Jail method:       null
Jail mount:        /usr/obj/DESTDIRs/main-CA7-poud
Jail fs:           
Jail updated:      2021-06-27 17:58:33
Jail pkgbase:      disabled
Comment 21 Mark Millard 2021-09-13 00:32:16 UTC
(In reply to Jan Beich from comment #4)

So far every example that I've seen in the ongoing bulk -a that I have
in progress is running (using cargo-c as an example):

|   `-- /usr/bin/make -C /usr/ports/devel/cargo-c build
|     `-- /usr/local/bin/cargo build --manifest-path /wrkdirs/usr/ports/devel/cargo-c/work/cargo-c-0.9.2+cargo-0.55/C
|       `-- /usr/local/bin/rustc --crate-name im_rc --edition=2018 

There can be multiple cargo subprocesses instead of just one /usr/local/bin/rustc.

The process getting the huge CPU time is the one running /usr/local/bin/cargo.
For example:

342:31.37 /usr/local/bin/cargo . . .
 33:21.39 /usr/local/bin/rustc . . .

Given the (effectively unbounded?) number of messages about pthread_peekjoin_np
that seem to be accumulating in the log files, the problem may be tied to cargo
vs. its subprocesses.
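
To put the two CPU times above on a common scale, here is a small sketch that converts them to seconds. It assumes the fields are minutes:seconds with a fractional part, as in ps(1) TIME output; the function name cpu_seconds is made up for illustration. Under that assumption cargo has used about ten times the CPU time of the rustc subprocess.

```rust
/// Parse a "MMM:SS.ss" CPU-time field into seconds.
/// Assumption: the part before the colon is minutes.
fn cpu_seconds(field: &str) -> Option<f64> {
    let (minutes, seconds) = field.trim().split_once(':')?;
    let minutes: f64 = minutes.parse().ok()?;
    let seconds: f64 = seconds.parse().ok()?;
    Some(minutes * 60.0 + seconds)
}

fn main() {
    // Values copied from the process listing above.
    let cargo = cpu_seconds("342:31.37").expect("valid time field");
    let rustc = cpu_seconds(" 33:21.39").expect("valid time field");
    println!(
        "cargo: {:.2} s, rustc: {:.2} s, ratio: {:.1}",
        cargo,
        rustc,
        cargo / rustc
    );
}
```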
Comment 22 Mikael Urankar freebsd_committer 2021-09-13 12:36:52 UTC
Can someone try with rust 1.55.0 please?
https://reviews.freebsd.org/D31872
Comment 23 Mark Millard 2021-09-13 17:58:26 UTC
(In reply to Mark Millard from comment #21)

The ongoing bulk -a has had an example that instead involved:

/usr/local/bin/cargo test

(not build).
Comment 24 Mark Millard 2021-09-13 22:43:40 UTC
(In reply to Robert Clausecker from comment #18)

So far the ongoing bulk -a has also had the problem for:

ncspot-0.6.0_3
ripgrep-all-0.9.6_1
Comment 25 Mikael Urankar freebsd_committer 2021-09-14 12:26:03 UTC
(In reply to Mark Millard from comment #24)
can you try with the following bootstrap (rebuild lang/rust):
http://mikael.urankar.free.fr/rust-std-1.54.0-armv7-unknown-freebsd.tar.xz
http://mikael.urankar.free.fr/cargo-1.54.0-armv7-unknown-freebsd.tar.xz
http://mikael.urankar.free.fr/rustc-1.54.0-armv7-unknown-freebsd.tar.xz

You'll have to update lang/rust/distinfo
SHA256 (cargo-1.54.0-armv7-unknown-freebsd.tar.xz) = b0d89e13cc35a943ba3da5de5247d97d6b3dac0abcd331736bc9176e413e8eee
SHA256 (rust-std-1.54.0-armv7-unknown-freebsd.tar.xz) = bc80b15a9ba60c66250d1b31fb610941205b3a2d2787021a5bdb96e667b28d45
SHA256 (rustc-1.54.0-armv7-unknown-freebsd.tar.xz) = 3927fc48020b3b94bdbaf160720664b4a6cdeb6f96b3415d7026326873e20368
Comment 26 Mark Millard 2021-09-14 19:25:05 UTC
(In reply to Mark Millard from comment #24)

My test bulk -a completed in somewhat under 87.25 hours,
building 26995 ports successfully.


Here is a more complete list of differences, going in both
directions.

Ports that the bulk -a of my ports tree hit the problem in:

librespot-0.2.0_5
ncspot-0.6.0_3
openethereum-3.2.6_2
ripgrep-all-0.9.6_1
spotifyd-0.3.0_5

But the following built to completion without
a huge log file:

gifski-1.5.0_1

This may suggest that some sort of race condition
is involved.

The following failed for other reasons:

alacritty-0.9.0_1

The log reports:

memory allocation of 1879048192 bytes failed
error: could not compile `smithay-client-toolkit`

Caused by:
  process didn't exit successfully: `CARGO=/usr/local/bin/cargo CARGO_CRATE_NAME=smithay_client_toolkit CARGO_MANIFEST_DIR=/wrkdirs/usr/ports/x11/alacritty/work/alacritty-0.9.0/cargo-crates/smithay-client-toolkit-0.14.0 CARGO_PKG_AUTHORS='Victor Berger <victor.berger@m4x.org>' CARGO_PKG_DESCRIPTION='Toolkit for making client wayland applications.' CARGO_PKG_HOMEPAGE='' CARGO_PKG_LICENSE=MIT CARGO_PKG_LICENSE_FILE='' CARGO_PKG_NAME=smithay-client-toolkit CARGO_PKG_REPOSITORY='https://github.com/smithay/client-toolkit' CARGO_PKG_VERSION=0.14.0 CARGO_PKG_VERSION_MAJOR=0 CARGO_PKG_VERSION_MINOR=14 CARGO_PKG_VERSION_PATCH=0 CARGO_PKG_VERSION_PRE='' LD_LIBRARY_PATH='/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps:/usr/local/lib' /usr/local/bin/rustc --crate-name smithay_client_toolkit --edition=2018 /wrkdirs/usr/ports/x11/alacritty/work/alacritty-0.9.0/cargo-crates/smithay-client-toolkit-0.14.0/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C linker-plugin-lto -C debuginfo=1 -C metadata=a5c00debf47fe1dc -C extra-filename=-a5c00debf47fe1dc --out-dir /wrkdirs/usr/ports/x11/alacritty/work/target/release/deps -L dependency=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps --extern bitflags=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libbitflags-97fcf63a98d67f7d.rmeta --extern dlib=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libdlib-58bd5533d8e2f399.rmeta --extern lazy_static=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/liblazy_static-262b437c84c2b97f.rmeta --extern log=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/liblog-7978d3bf9bb63659.rmeta --extern memmap2=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libmemmap2-89036ac41b83b1e3.rmeta --extern nix=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libnix-92fa465df208e236.rmeta --extern 
wayland_client=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libwayland_client-14e61e0514099b56.rmeta --extern wayland_cursor=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libwayland_cursor-c84b26efb1b8c9ce.rmeta --extern wayland_protocols=/wrkdirs/usr/ports/x11/alacritty/work/target/release/deps/libwayland_protocols-7f484596d8de4b38.rmeta --cap-lints warn -C target-cpu=cortex-a7 -C linker=cc -C link-arg=-fstack-protector-strong -C link-arg=-L/usr/local/lib` (signal: 6, SIGABRT: process abort signal)
*** Error code 101
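
A note on the size in this log: "memory allocation of N bytes failed" followed by an abort is what Rust's default allocation-error handling reports when the allocator cannot satisfy a request, which matches the SIGABRT above. That is a different failure from the "capacity overflow" panic. The sketch below only does the arithmetic on the reported number; the helper name describe is made up for illustration. The request is 1.75 GiB, which still fits below the 32-bit isize::MAX, so it is representable on armv7 but is a very large share of a 32-bit address space.

```rust
/// isize::MAX on a 32-bit target such as armv7.
const ISIZE_MAX_32: u64 = i32::MAX as u64;

/// Hypothetical helper: render a byte count in decimal, hex, and GiB.
fn describe(bytes: u64) -> String {
    let gib = bytes as f64 / (1u64 << 30) as f64;
    format!("{} bytes = {:#x} = {:.2} GiB", bytes, bytes, gib)
}

fn main() {
    // Size copied from the log line above.
    let requested: u64 = 1_879_048_192;
    println!("{}", describe(requested));
    println!("within 32-bit isize::MAX: {}", requested <= ISIZE_MAX_32);
}
```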



Finally, the following was not attempted at all:

wezterm

This was because of:

Skipping x11/wezterm | wezterm-20210814.124438.54.e29167_4: Dependent port textproc/mdbook | mdbook-0.4.12_1 build
Comment 27 Mark Millard 2021-09-14 19:30:14 UTC
(In reply to Mark Millard from comment #26)

My prior note (26) should have said:

(In reply to Robert Clausecker from comment #18)
Comment 28 Mark Millard 2021-09-14 20:10:48 UTC
(In reply to Mikael Urankar from comment #25)

I'd take a stab at it but everything else is based on
*-1.53.0-*-unknown-freebsd.tar.xz in the distinfo for
what my bulk -a was based on. (1.54 uses the 1.53 bootstrap,
and ports is still at 1.54 building based on 1.53, not
1.55 building based on 1.54.)

So the result appears to involve mixed versions.

Did you intend for this to be based on:

https://reviews.freebsd.org/D31872

? If so, maybe you should be more explicit about the
full set of steps you are after so I do not guess wrong.
Comment 29 Mikael Urankar freebsd_committer 2021-09-14 20:25:44 UTC
(In reply to Mark Millard from comment #28)
Yes, based on rust 1.55.0
Comment 30 Mark Millard 2021-09-14 22:21:23 UTC
(In reply to Mikael Urankar from comment #25)

I applied the patch from https://reviews.freebsd.org/D31872 and then
had the port fetch. Then I substituted into:

/usr/ports/distfiles/rust/2021-07-29/

the 3 files from your area. Finally I substituted the following into
/usr/ports/lang/rust/distinfo (note the rust/2021-07-29/
prefixes in the SHA256 lines' relative paths and the updated
sizes):

SHA256 (rust/2021-07-29/rustc-1.54.0-armv7-unknown-freebsd.tar.xz) = 3927fc48020b3b94bdbaf160720664b4a6cdeb6f96b3415d7026326873e20368
SIZE (rust/2021-07-29/rustc-1.54.0-armv7-unknown-freebsd.tar.xz) = 46448620
SHA256 (rust/2021-07-29/rust-std-1.54.0-armv7-unknown-freebsd.tar.xz) = bc80b15a9ba60c66250d1b31fb610941205b3a2d2787021a5bdb96e667b28d45
SIZE (rust/2021-07-29/rust-std-1.54.0-armv7-unknown-freebsd.tar.xz) = 18206828
SHA256 (rust/2021-07-29/cargo-1.54.0-armv7-unknown-freebsd.tar.xz) = b0d89e13cc35a943ba3da5de5247d97d6b3dac0abcd331736bc9176e413e8eee
SIZE (rust/2021-07-29/cargo-1.54.0-armv7-unknown-freebsd.tar.xz) = 4542272
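
Since a hand-edited distinfo is easy to get subtly wrong, here is a minimal sketch of a format check for lines like the ones above. It only validates the shape of each line (a 64-digit hex digest, a numeric size); it does not compute any checksum, which the ports framework does itself when fetching. The type and function names (DistinfoEntry, parse_distinfo_line) are made up for illustration.

```rust
/// One entry from a ports distinfo file.
#[derive(Debug, PartialEq)]
enum DistinfoEntry {
    Sha256 { file: String, digest: String },
    Size { file: String, bytes: u64 },
}

/// Parse a line of the form `SHA256 (file) = hex` or `SIZE (file) = n`.
/// Any other line yields None.
fn parse_distinfo_line(line: &str) -> Option<DistinfoEntry> {
    let (lhs, value) = line.trim().split_once(" = ")?;
    let (kind, rest) = lhs.split_once(" (")?;
    let file = rest.strip_suffix(')')?.to_string();
    match kind {
        "SHA256" => {
            // A SHA-256 digest is 64 hexadecimal digits.
            let ok = value.len() == 64 && value.bytes().all(|b| b.is_ascii_hexdigit());
            if ok {
                Some(DistinfoEntry::Sha256 { file, digest: value.to_string() })
            } else {
                None
            }
        }
        "SIZE" => value
            .parse::<u64>()
            .ok()
            .map(|bytes| DistinfoEntry::Size { file, bytes }),
        _ => None,
    }
}

fn main() {
    // Two of the lines listed above.
    let lines = [
        "SHA256 (rust/2021-07-29/rustc-1.54.0-armv7-unknown-freebsd.tar.xz) = 3927fc48020b3b94bdbaf160720664b4a6cdeb6f96b3415d7026326873e20368",
        "SIZE (rust/2021-07-29/rustc-1.54.0-armv7-unknown-freebsd.tar.xz) = 46448620",
    ];
    for line in lines {
        println!("{:?}", parse_distinfo_line(line));
    }
}
```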

I've got lang/rust building in poudriere. So far it has not rejected
what I did. Anything else will be a separate poudriere bulk run.

(I was not familiar with any internals of rust's build structure.
Hopefully I noticed everything appropriate that needed to be done.)

Note: I found that /usr/ports/distfiles/rust/crates/ had accumulated
many versions of materials going back into 2015. (I do not normally
build rust.) Apparently nothing cleans up the old content in this
area as things progress. So I cleared it out, hoping it would be
filled in as needed.
Comment 31 Mark Millard 2021-09-14 23:47:06 UTC
(In reply to Mark Millard from comment #30)

Rust is still building but so far there have been no
problems with:

/wrkdirs/usr/ports/lang/rust/work/bootstrap/bin/cargo build . . .

operation during the build.
Comment 32 Mark Millard 2021-09-15 01:32:44 UTC
(In reply to Mark Millard from comment #31)

rust-1.55.0 built just fine so I started a poudriere bulk -a to
retry building everything that failed to complete in the first
bulk -a . It took about 30 min to get to the point of builders
starting to do builds. So far:

[00:40:47] [01] [00:10:39] Finished devel/rust-cbindgen | rust-cbindgen-0.20.0_1: Success

[00:46:10] [16] [00:13:11] Finished sysutils/potnet | potnet-0.4.4_14: Success

None of the in-process cargo-using builds have shown evidence of the huge
log file generation yet.
Comment 33 Mark Millard 2021-09-15 07:28:43 UTC
(In reply to Mark Millard from comment #32)

That 2nd bulk -a has had builders active for 6 hours, building
589 successfully. So far, there have been no examples of
indefinitely growing log files (of mostly pthread_peekjoin_np
messages).

It may be something like another 16 hours before the bulk -a
finishes.
Comment 34 Mikael Urankar freebsd_committer 2021-09-15 07:54:47 UTC
(In reply to Mark Millard from comment #33)
Thanks for the test. No need to build the remaining ports.
Comment 35 Mark Millard 2021-09-15 08:15:06 UTC
(In reply to Mikael Urankar from comment #34)

I'll be letting the bulk -a complete for reasons of setting
up a test of memory and disk use for specific parallel port
rebuilds.
Comment 36 Mark Millard 2021-09-16 01:52:45 UTC
(In reply to Mark Millard from comment #35)

The 2nd bulk -a finished with no huge log files. 1309 built and
349 failed (for other issues).
Comment 37 commit-hook freebsd_committer 2021-09-19 09:16:19 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=6f1fefb50e755d727f471aeb75ebe4e28f876b4b

commit 6f1fefb50e755d727f471aeb75ebe4e28f876b4b
Author:     Tobias Kortkamp <tobik@FreeBSD.org>
AuthorDate: 2021-09-07 08:14:14 +0000
Commit:     Tobias Kortkamp <tobik@FreeBSD.org>
CommitDate: 2021-09-19 09:03:21 +0000

    lang/rust: Update to 1.55.0

    - Set codegen-units=1 [1]
    - Add hack to skip cargo update on git sources as a step towards solving [2]
    - Fix 'capacity overflow' panics on armv* [3]

    Changes:        https://blog.rust-lang.org/2021-09-09/Rust-1.55.0.html
    PR:             258337
    PR:             256099 [1]
    PR:             256581 [2]
    PR:             257419 [3]
    Reviewed by:    mikael, pkubaj
    Exp-run by:     antoine
    Differential Revision:  https://reviews.freebsd.org/D31872
    With hat:       rust

 Mk/Uses/cargo.mk                                   |   2 +-
 Mk/bsd.gecko.mk                                    |   2 +-
 lang/rust-bootstrap/Makefile                       |   8 +-
 lang/rust-bootstrap/distinfo                       |   6 +-
 lang/rust/Makefile                                 |  12 +--
 lang/rust/distinfo                                 | 114 ++++++++++-----------
 ...m-project_compiler-rt_lib_builtins_cpu__model.c |  21 ++--
 ...ols_cargo_src_cargo_sources_git_source.rs (new) |  45 ++++++++
 ...rc_tools_cargo_src_cargo_util_toml_mod.rs (new) |  22 ++++
 .../patch-vendor_openssl-sys_build_main.rs (gone)  |  19 ----
 ..._src_unix_bsd_freebsdlike_freebsd_mod.rs (gone) |  12 ---
 ..._unix_bsd_freebsdlike_freebsd_powerpc.rs (gone) |  50 ---------
 .../powerpc64-elfv1/patch-src_bootstrap_native.rs  |  10 +-
 ...h-compiler_rustc__target_src_spec_mod.rs (gone) |  10 --
 ...rc_spec_powerpc64le__unknown__freebsd.rs (gone) |  19 ----
 15 files changed, 154 insertions(+), 198 deletions(-)
Comment 38 Tobias Kortkamp freebsd_committer 2021-09-19 09:23:30 UTC
Can we close this?

NACK to merge-quarterly. Merging lang/rust is a lot of extra work and the
quarter is almost over.
Comment 39 Robert Clausecker 2021-09-19 16:40:37 UTC
(In reply to Tobias Kortkamp from comment #38)

I'll give it a try and report back if it works.