Bug 265695

Summary: Kernel panic on ZFS service
Product: Base System
Reporter: bugreporter
Component: kern
Assignee: freebsd-fs (Nobody) <fs>
Status: New ---
Severity: Affects Some People
CC: grahamperrin, pi
Priority: ---
Keywords: crash
Version: 13.1-STABLE
Hardware: powerpc
OS: Any

Description bugreporter 2022-08-07 20:00:11 UTC
Hello awesome FreeBSD people.

I can reliably crash 13.1-STABLE by starting the ZFS service on a fresh install. The build is stable/13 as of commitid 3fbe3365df59f9b973c7b5bc8e82e13199ab5057 (the last commit that had llvm-14.0.4), cross-built on an amd64 host from an earlier commitid on stable/13.

Hardware is a Raptor Computing Talos II, 128GB RAM, 36-core, etc.

To trigger the crash:

/etc/rc.d/zfs onestart

Results on console are:

FreeBSD/powerpc (machinename) (ttyu0)

login: ZFS filesystem version: 5
ZFS storage pool version: features support (5000)

fatal kernel trap:

   exception       = 0x400 (instruction storage interrupt)
   virtual address = 0x3abd29ae1c9b0f88
   srr0            = 0x3abd29ae1c9b0f88 (0x7abd29ae1a6c0f88)
   srr1            = 0x9000000040009032
   current msr     = 0x9000000000009032
   lr              = 0xc0080001ef181758 (0x80001ece91758)
   frame           = 0xc0080001f24c9eb0
   curthread       = 0xc00800000bf68b00
          pid = 1599, comm = zfs

panic: instruction storage interrupt trap
cpuid = 96
time = 1659901415
KDB: stack backtrace:
#0 0xc000000002bda030 at kdb_backtrace+0x90
#1 0xc000000002b6af4c at vpanic+0x1f0
#2 0xc000000002b6ad48 at panic+0x44
#3 0xc000000003035900 at trap+0x304
#4 0xc000000003029b54 at powerpc_interrupt+0x1b4
Uptime: 1h41m52s

Dump failed. Partition too small.
aacraid0: shutting down controller...done
[84664.102624126,5] OPAL: Reboot request...
[84664.103037173,5] RESET: Initiating fast reboot 12...
[rest of OBMC-mediated reboot process]
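
For cross-referencing the console output with other logs, the "time" value in the panic message can be read as seconds since the Unix epoch (UTC), and the "Uptime" line gives the time since boot. A minimal Python sketch, assuming that reading of the two fields; the variable and function names are illustrative only:

```python
from datetime import datetime, timedelta, timezone

# Values copied from the console output above.
PANIC_TIME = 1659901415  # "time = 1659901415"
UPTIME = timedelta(hours=1, minutes=41, seconds=52)  # "Uptime: 1h41m52s"


def epoch_to_utc(epoch_seconds):
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


panic_at = epoch_to_utc(PANIC_TIME)
booted_at = panic_at - UPTIME  # approximate boot time, derived from uptime

print("panic at:", panic_at.isoformat())   # 2022-08-07T19:43:35+00:00
print("booted at:", booted_at.isoformat())  # 2022-08-07T18:01:43+00:00
```

Under that assumption the panic occurred roughly 16 minutes before this report was filed.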

I don't see anything at the stable/13 tip (in my tree that's f95569fafcba5ed3cd119f8d177622fe0e64bbf6) that suggests this is fixed in a later commit. That said, due to another bug which I'll file shortly, I currently can't boot anything after the llvm-14.0.5 import on stable/13 on this machine; it panics instantly.

However! I would be glad to spend time helping anyone chase this down and I'm good at following instructions. If you (whoever you might be) would like me to instrument with debugging and explore data structures, feel free to tell me what to do and I'll get on it.