Bug 238031 - [panic] non-zero write count (during heavy consistent poudriere cross-arch builds)
Summary: [panic] non-zero write count (during heavy consistent poudriere cross-arch bu...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2019-05-21 18:29 UTC by Dave Cottlehuber
Modified: 2019-06-30 14:31 UTC (History)
3 users (show)

See Also:
koobs: mfc-stable12?
koobs: mfc-stable11?


Attachments
auto-ddb output attached (48.00 KB, text/plain)
2019-05-21 18:29 UTC, Dave Cottlehuber
no flags Details
dmesg from first panic report (18.88 KB, text/plain)
2019-06-03 07:55 UTC, Dave Cottlehuber
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dave Cottlehuber freebsd_committer freebsd_triage 2019-05-21 18:29:42 UTC
Created attachment 204519 [details]
auto-ddb output attached

uname: FreeBSD 13.0-CURRENT r348047+1c489956cbe7(D10335) GENERIC
arch: amd64

while under reasonably heavy continuous load (poudriere bulk building on 6 architectures (11.2 + 12.0 + 13.0 vs x86, amd64, aarch64).

[14771] VNASSERT failed
[14771] 0xfffff81a671413c0: tag none, type VBAD
[14771]     usecount 0, writecount -14, refcount 0
[14771]     flags (VI_DOOMED)
[14771]  VI_LOCKed    lock type tmpfs: UNLOCKED
[14771] panic: Non-zero write count
[14771] cpuid = 5
[14771] time = 1558448598
[14771] KDB: stack backtrace:
[14771] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02f61c1350
[14771] vpanic() at vpanic+0x19d/frame 0xfffffe02f61c13a0
[14771] panic() at panic+0x43/frame 0xfffffe02f61c1400
[14771] _vdrop() at _vdrop+0x5c4/frame 0xfffffe02f61c1430
[14771] vnlru_free_locked() at vnlru_free_locked+0x42a/frame 0xfffffe02f61c14a0
[14771] getnewvnode() at getnewvnode+0xc7/frame 0xfffffe02f61c14e0
[14771] null_nodeget() at null_nodeget+0xd2/frame 0xfffffe02f61c1540
[14771] null_lookup() at null_lookup+0x196/frame 0xfffffe02f61c15c0
[14771] VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x60/frame 0xfffffe02f61c15e0
[14771] lookup() at lookup+0x5f1/frame 0xfffffe02f61c1680
[14771] namei() at namei+0x4e9/frame 0xfffffe02f61c1770
[14771] kern_statat() at kern_statat+0x7f/frame 0xfffffe02f61c1890
[14771] sys_fstatat() at sys_fstatat+0x2f/frame 0xfffffe02f61c1990
[14771] amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe02f61c1ab0
[14771] fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe02f61c1ab0
[14771] --- syscall (552, FreeBSD ELF64, sys_fstatat), rip = 0x80046741a, rsp = 0x7fffffffc258, rbp = 0x7fffffffc300 ---
[14771] KDB: enter: panic



db:0:kdb.enter.panic>  run lockinfo
db:1:lockinfo> show locks
exclusive sleep mutex vnode interlock (vnode interlock) r = 0 (0xfffff81a67141458) locked @ /usr/src/sys/kern/vfs_subr.c:2910
exclusive lockmgr zfs (zfs) r = 0 (0xfffff815ab0cd068) locked @ /usr/src/sys/kern/vfs_subr.c:2598
shared lockmgr zfs (zfs) r = 0 (0xfffff808771a2068) locked @ /usr/src/sys/kern/vfs_subr.c:2598
db:1:lockinfo>  show alllocks
Process 24845 (sed) thread 0xfffff819b3c9d000 (102075)
shared lockmgr tmpfs (tmpfs) r = 0 (0xfffff81e1a3227e8) locked @ /usr/src/sys/kern/vfs_vnops.c:789
Process 35362 (pkg-static) thread 0xfffff80eca6ce000 (101610)
exclusive lockmgr tmpfs (tmpfs) r = 0 (0xfffff815664bd9c8) locked @ /usr/src/sys/fs/tmpfs/tmpfs_subr.c:609
exclusive lockmgr tmpfs (tmpfs) r = 0 (0xfffff814e10a39c8) locked @ /usr/src/sys/kern/vfs_subr.c:2598
Process 78980 (qemu-aarch64-static) thread 0xfffff81322d145a0 (103054)
exclusive sx vm map (user) (vm map (user)) r = 0 (0xfffff80d178f7070) locked @ /usr/src/sys/vm/vm_map.c:1841
Process 51907 (ccache) thread 0xfffff806552725a0 (101778)
exclusive sleep mutex vnode interlock (vnode interlock) r = 0 (0xfffff81907617bd8) locked @ /usr/src/sys/fs/nullfs/null_vnops.c:843
exclusive lockmgr zfs (zfs) r = 0 (0xfffff801677469c8) locked @ /usr/src/sys/fs/nullfs/null_vnops.c:672
exclusive lockmgr zfs (zfs) r = 0 (0xfffff8132c9eb428) locked @ /usr/src/sys/kern/vfs_subr.c:2598
shared lockmgr zfs (zfs) r = 0 (0xfffff808ba33f428) locked @ /usr/src/sys/kern/vfs_subr.c:2598
Process 90930 (ccache) thread 0xfffff802cd477000 (101887)
exclusive sleep mutex vnode interlock (vnode interlock) r = 0 (0xfffff81a67141458) locked @ /usr/src/sys/kern/vfs_subr.c:2910
exclusive lockmgr zfs (zfs) r = 0 (0xfffff815ab0cd068) locked @ /usr/src/sys/kern/vfs_subr.c:2598
shared lockmgr zfs (zfs) r = 0 (0xfffff808771a2068) locked @ /usr/src/sys/kern/vfs_subr.c:2598
Process 91799 (ccache) thread 0xfffff81f5f6a45a0 (102628)
shared lockmgr zfs (zfs) r = 0 (0xfffff808771a2068) locked @ /usr/src/sys/kern/vfs_subr.c:2598
Process 9517 (ccache) thread 0xfffff806ba8e0000 (102771)
shared lockmgr zfs (zfs) r = 0 (0xfffff808771a2068) locked @ /usr/src/sys/kern/vfs_subr.c:2598
Process 16512 (ccache) thread 0xfffff81f5feac5a0 (102174)

more info from ddb.txt is attached.
Comment 1 Kurt Jaeger freebsd_committer freebsd_triage 2019-05-31 09:06:07 UTC
Same here with r348371M on a

AMD Ryzen Threadripper 2950X 16-Core Processor  (3493.52-MHz K8-class CPU)

box.

For screenshots and dmesg.boot see https://people.freebsd.org/~pi/pr238031/

Works approx. 3-5min after starting poudriere.
Comment 2 Kurt Jaeger freebsd_committer freebsd_triage 2019-05-31 09:07:58 UTC
The vnassert can be found in sys/kern/vfs_subr.c, _vdrop(), line 2982
Comment 3 Kurt Jaeger freebsd_committer freebsd_triage 2019-05-31 17:25:03 UTC
https://lists.freebsd.org/pipermail/freebsd-current/2019-May/073538.html

has a patch, which needs to be tested.

Unfortunatly, after upgrading to r348454, it's more difficult to reproduce.
Comment 4 Kubilay Kocak freebsd_committer freebsd_triage 2019-06-01 07:18:59 UTC
@Kurt Could you attach the dmesg and screenshot here in the issue please. External URL references become stale/go missing over time, and its good to have all information self contained in the issue
Comment 5 Dave Cottlehuber freebsd_committer freebsd_triage 2019-06-03 07:55:08 UTC
Created attachment 204796 [details]
dmesg from first panic report

I haven't had a repeat of the original 2 panics yet, but applying patch on r348527 / c7cdb4a80779a0451dc2c04c3d6b30769049d402 and will report back in the next week if I get anything.
Comment 6 Kurt Jaeger freebsd_committer freebsd_triage 2019-06-30 13:42:54 UTC
I no longer see the crash, so I suggest to close with unable to reproduce.
Comment 7 Konstantin Belousov freebsd_committer freebsd_triage 2019-06-30 14:31:56 UTC
Should have been fixed by r348698.