Bug 117158 - [zfs] [panic] zpool scrub causes panic if geli vdevs detach on last close
Summary: [zfs] [panic] zpool scrub causes panic if geli vdevs detach on last close
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 7.0-CURRENT
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-fs (Nobody)
Depends on:
Reported: 2007-10-13 20:40 UTC by Fabian Keil
Modified: 2019-11-18 20:45 UTC

See Also:


Description Fabian Keil 2007-10-13 20:40:00 UTC
With a zpool based on two geli vdevs
configured with the following /etc/rc.conf lines:

   geli_devices="ad0s1 ad0s3f ad0s2"
   geli_ad0s1_flags="-k /root/ad0s1.key"
   geli_ad0s2_flags="-k /root/ad0s2.key"
   geli_ad0s3f_flags="-k /root/ad0s3f.key"

zpool scrub causes the following panic:

Unread portion of the kernel message buffer:
GEOM_ELI: Detached ad0s2.eli on last close.
GEOM_LABEL: Label for provider ad0s2 is msdosfs/ÒA.Û,{(#0.
panic: Function g_eli_orphan_spoil_assert() called for ad0s3f.eli.
KDB: enter: panic
panic: from debugger
Uptime: 5m27s
Physical memory: 1014 MB
Dumping 120 MB: 105 89 73 57 41 25 9

#0  doadump () at pcpu.h:195
195     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt   
#0  doadump () at pcpu.h:195
#1  0xc05db8f3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc05dbb1c in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#3  0xc04a40c7 in db_panic (addr=Could not find the frame base for "db_panic".
) at /usr/src/sys/ddb/db_command.c:433
#4  0xc04a4825 in db_command_loop () at /usr/src/sys/ddb/db_command.c:401
#5  0xc04a6255 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:222
#6  0xc06014a4 in kdb_trap (type=3, code=0, tf=0xf4bcac1c) at /usr/src/sys/kern/subr_kdb.c:502
#7  0xc0838d2b in trap (frame=0xf4bcac1c) at /usr/src/sys/i386/i386/trap.c:621
#8  0xc082120b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#9  0xc0601602 in kdb_enter (msg=0xc0889693 "panic") at cpufunc.h:60
#10 0xc05dbb05 in panic (fmt=0xc3e9a0e8 "Function %s() called for %s.") at /usr/src/sys/kern/kern_shutdown.c:547
#11 0xc3e92d65 in ?? ()
#12 0xc3e9a0e8 in ?? ()
#13 0xc3e99d84 in ?? ()
#14 0xc3e0d890 in ?? ()
#15 0xf4bcacac in ?? ()
#16 0xc058eff5 in g_spoil_event (arg=0xc3c92940, flag=-945983104) at /usr/src/sys/geom/geom_subr.c:903
(kgdb) f 16
#16 0xc058eff5 in g_spoil_event (arg=0xc3c92940, flag=-945983104) at /usr/src/sys/geom/geom_subr.c:903
903                     cp->geom->spoiled(cp);
(kgdb) l
898                     if (!cp->spoiled)
899                             continue;
900                     cp->spoiled = 0;
901                     if (cp->geom->spoiled == NULL)
902                             continue;
903                     cp->geom->spoiled(cp);
904                     g_topology_assert();
905             }
906     }

Problem first reported in:


Quoting Pawel Jakub Dawidek's response to my initial report:

|GELI's detach-on-last-close mechanism is a general purpose mechanism, it
|may not work correctly with ZFS, because ZFS sometimes closes and reopen
|providers, which will make GELI to detach. In other words you shouldn't
|configure detach-on-last-close for ZFS components. It shouldn't panic

Adding geli_autodetach="NO" to /etc/rc.conf indeed prevents the panic.

I previously wasn't aware that this option existed,
so maybe it should be mentioned in geli(8).
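For reference, the full workaround can be sketched as the following /etc/rc.conf fragment (the geli device names and key paths are the ones from this report; adjust for your own disks):

```
# geli providers backing the zpool (as configured in this report)
geli_devices="ad0s1 ad0s3f ad0s2"
geli_ad0s1_flags="-k /root/ad0s1.key"
geli_ad0s2_flags="-k /root/ad0s2.key"
geli_ad0s3f_flags="-k /root/ad0s3f.key"

# Workaround: do not detach the .eli providers on last close,
# so ZFS can safely close and reopen them during a scrub.
geli_autodetach="NO"
```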
How-To-Repeat: Create a zpool with more than one geli vdev
that detaches on last close and run zpool scrub.

Actually, I am not sure whether a zpool with only one
such vdev is guaranteed to work; at least for me
the problem only started to show after I added
the second one.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2009-05-28 23:20:42 UTC
Responsible Changed
From-To: pjd->freebsd-fs

With pjd's permission, reassigning ZFS-related PRs to freebsd-fs.
Comment 2 Pawel Jakub Dawidek freebsd_committer 2014-06-01 06:41:13 UTC
Responsible Changed
From-To: freebsd-bugs->pjd

I'll take this one.
Comment 3 Ben Woods freebsd_committer 2015-03-13 15:19:27 UTC
This bug bit me today - it is quite unnerving to watch your zpool go offline due to unavailable devices, all from initiating a simple scrub!

However, I can confirm that a reboot recovered the zpool successfully, and adding geli_autodetach="NO" to /etc/rc.conf (the workaround from the original report) prevented it from happening again during the next zpool scrub.

Note: I rebooted once more after adding that to /etc/rc.conf and before initiating the next zpool scrub, just in case that configuration parameter is only read at boot time or when the geli device is attached. I am not sure whether that was required.
Comment 4 Ben Woods freebsd_committer 2015-03-13 15:20:25 UTC
I should add that I am running FreeBSD 10.1 amd64 with generic kernel/world.
Comment 5 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:43:38 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
- Untouched since 2018-01-01.
- Affects Base System OR Documentation


Reset to open status.

I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Comment 6 Andriy Gapon freebsd_committer 2018-05-29 09:46:18 UTC
Is this still reproducible?
There have been a number of fixes related to geom spoiling of partitions and labels.
Comment 7 Fabian Keil 2019-11-18 20:45:42 UTC
Looks like this is no longer an issue on more recent systems, as ZFS
now fails to import pools on geli devices that detach on last close:

# geli attach -d /dev/md0
Enter passphrase: 
GEOM_ELI: Device md0.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: software
# geli attach -d /dev/md1
Enter passphrase: 
GEOM_ELI: Device md1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: software
# zpool import
   pool: test
     id: 10011777910752807569
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.

        test        ONLINE
          md0.eli   ONLINE
          md1.eli   ONLINE
# zpool import test
GEOM_ELI: Device md0.eli destroyed.
GEOM_ELI: Detached md0.eli on last close.
GEOM_ELI: Device md1.eli destroyed.
GEOM_ELI: Detached md1.eli on last close.
g_access(944): provider md0.eli has error 6 set
g_access(944): provider md1.eli has error 6 set
cannot import 'test': one or more devices is currently unavailable