When using GELI with autodetach enabled (which it is by default when using geli_devices in rc.conf), ZFS can trigger a panic.

Steps to reproduce:

- Create a VM with a FreeBSD 12.0-RELEASE rootfs and two attached drives, which I'll call ada0 and ada1 for simplicity.
- Format the devices with geli init and no passphrase:

  # dd if=/dev/random of=/root/k bs=64 count=1
  # geli init -PK /root/k ada0
  # geli init -PK /root/k ada1

- Attach the devices and set up a mirrored zpool (I don't know whether the mirroring is needed; this is just what my setup was when I discovered the problem):

  # geli attach -pk /root/k ada0
  # geli attach -pk /root/k ada1
  # zpool create pool mirror ada0.eli ada1.eli

- Ensure zfs and geli load at boot:

  # cat >> /boot/loader.conf <<END
  zfs_load="YES"
  geom_eli_load="YES"
  END
  # cat >> /etc/rc.conf <<END
  geli_devices="ada0 ada1"
  geli_ada0_flags="-p -k /root/k"
  geli_ada1_flags="-p -k /root/k"
  END

- Reboot the VM and run `zpool status`.

Expected results:
- GELI and ZFS work and the status of `pool` is shown.

Actual results:
- Kernel panic in vdev_dtl_reassess.

I don't seem to have a way to gather the crash log from this VM (I'm using VMware Player on Linux to reproduce at the moment), otherwise I'd attach it.

Configuration:
- Stock amd64 FreeBSD 12.0-RELEASE on a fresh installation.
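For anyone hitting the same panic: a crash dump can usually be captured even in a VM, provided the swap device is at least as large as RAM. A minimal sketch, assuming the default installer layout with a swap partition (device names and defaults may vary):

  # sysrc dumpdev="AUTO"        # let rc(8) use the configured swap device as the dump device
  # sysrc dumpdir="/var/crash"  # where savecore(8) places vmcore.N
  # service dumpon start        # apply immediately, no reboot needed

After the next panic and reboot, savecore(8) extracts the dump and crashinfo(8) (enabled by default) writes the human-readable core.txt.N under /var/crash, which can then be attached here.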
Created attachment 203438 [details]
Core dump log

Found a core.txt from the live system where I first ran into this. I also tried reproducing it with only one geli device/vdev, which crashes similarly.
Just ran into a similar panic, but with ggate and `ggate destroy -f`, i.e. force-destroying the ggate device that a zfs/zpool command wants to write to.

There's a ggate-specific wrinkle here: because of a bug in ggatec, it's possible for a read or write request to end up dangling forever, so, for example, a zpool create will never finish. To get out of that sticky situation you might be inclined to run `ggatec destroy -f -u 0`. That triggers the kernel panic.

Unread portion of the kernel message buffer:

code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 6 (solthread 0xfffffff)
trap number = 12
panic: page fault
cpuid = 0
time = 1627668025
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00511a8720
vpanic() at vpanic+0x17b/frame 0xfffffe00511a8770
panic() at panic+0x43/frame 0xfffffe00511a87d0
trap_fatal() at trap_fatal+0x391/frame 0xfffffe00511a8830
trap_pfault() at trap_pfault+0x66/frame 0xfffffe00511a8880
trap() at trap+0x4f7/frame 0xfffffe00511a8990
calltrap() at calltrap+0x8/frame 0xfffffe00511a8990
--- trap 0xc, rip = 0xffffffff803e838c, rsp = 0xfffffe00511a8a60, rbp = 0xfffffe00511a8ad0 ---
vdev_dtl_reassess() at vdev_dtl_reassess+0x11c/frame 0xfffffe00511a8ad0
vdev_dtl_reassess() at vdev_dtl_reassess+0x89/frame 0xfffffe00511a8b50
spa_vdev_state_exit() at spa_vdev_state_exit+0x127/frame 0xfffffe00511a8b80
spa_async_thread_vd() at spa_async_thread_vd+0xe0/frame 0xfffffe00511a8bb0
fork_exit() at fork_exit+0x85/frame 0xfffffe00511a8bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00511a8bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

This is on stable/12, commit 7fa95d69f10827d0b02607682a2c4a1513d658e5, with a custom stripped-down kernel on amd64, built with DIAGNOSTIC, INVARIANTS, and the like. I have kgdb on this box/VM. It's very reproducible.
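Since kgdb is available and the panic is easy to trigger, it may help to record what vdev_dtl_reassess was dereferencing at the time of the fault. A rough sketch, assuming savecore has written a dump under /var/crash and that debug symbols for the custom kernel are available; the paths, frame number, and variable name are illustrative and not verified against this exact dump:

  # kgdb /boot/kernel/kernel /var/crash/vmcore.last
  (kgdb) bt              # should match the backtrace above
  (kgdb) frame 8         # select the inner vdev_dtl_reassess frame
  (kgdb) info args
  (kgdb) print *vd       # 'vd' is the vdev being reassessed; a NULL or freed
                         # pointer here would fit a race with the forced destroy

If the pointer turns out to be stale, that would point at the async thread walking a vdev tree that the forced ggate/GELI detach has already torn down.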
This is amongst the bug reports that need special attention; see <https://lists.freebsd.org/archives/freebsd-fs/2023-April/002047.html>. Please: is either of the panics reproducible with a currently supported RELEASE or branch of the OS?
^Triage: clear stale flags.