Bug 117158 - [zfs] [panic] zpool scrub causes panic if geli vdevs detach on last close
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 7.0-CURRENT
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-fs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-10-13 20:40 UTC by Fabian Keil
Modified: 2015-03-13 15:20 UTC

See Also:


Description Fabian Keil 2007-10-13 20:40:00 UTC
With a zpool based on two of the geli vdevs
configured with the /etc/rc.conf lines:

   geli_devices="ad0s1 ad0s3f ad0s2"
   geli_ad0s1_flags="-k /root/ad0s1.key"
   geli_ad0s2_flags="-k /root/ad0s2.key"
   geli_ad0s3f_flags="-k /root/ad0s3f.key"

zpool scrub causes the following panic:

Unread portion of the kernel message buffer:
GEOM_ELI: Detached ad0s2.eli on last close.
GEOM_LABEL: Label for provider ad0s2 is msdosfs/ÒA.Û,{(#0.
panic: Function g_eli_orphan_spoil_assert() called for ad0s3f.eli.
KDB: enter: panic
panic: from debugger
Uptime: 5m27s
Physical memory: 1014 MB
Dumping 120 MB: 105 89 73 57 41 25 9

#0  doadump () at pcpu.h:195
195     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt   
#0  doadump () at pcpu.h:195
#1  0xc05db8f3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc05dbb1c in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#3  0xc04a40c7 in db_panic (addr=Could not find the frame base for "db_panic".
) at /usr/src/sys/ddb/db_command.c:433
#4  0xc04a4825 in db_command_loop () at /usr/src/sys/ddb/db_command.c:401
#5  0xc04a6255 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:222
#6  0xc06014a4 in kdb_trap (type=3, code=0, tf=0xf4bcac1c) at /usr/src/sys/kern/subr_kdb.c:502
#7  0xc0838d2b in trap (frame=0xf4bcac1c) at /usr/src/sys/i386/i386/trap.c:621
#8  0xc082120b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#9  0xc0601602 in kdb_enter (msg=0xc0889693 "panic") at cpufunc.h:60
#10 0xc05dbb05 in panic (fmt=0xc3e9a0e8 "Function %s() called for %s.") at /usr/src/sys/kern/kern_shutdown.c:547
#11 0xc3e92d65 in ?? ()
#12 0xc3e9a0e8 in ?? ()
#13 0xc3e99d84 in ?? ()
#14 0xc3e0d890 in ?? ()
#15 0xf4bcacac in ?? ()
#16 0xc058eff5 in g_spoil_event (arg=0xc3c92940, flag=-945983104) at /usr/src/sys/geom/geom_subr.c:903
(kgdb) f 16
#16 0xc058eff5 in g_spoil_event (arg=0xc3c92940, flag=-945983104) at /usr/src/sys/geom/geom_subr.c:903
903                     cp->geom->spoiled(cp);
(kgdb) l
898                     if (!cp->spoiled)
899                             continue;
900                     cp->spoiled = 0;
901                     if (cp->geom->spoiled == NULL)
902                             continue;
903                     cp->geom->spoiled(cp);
904                     g_topology_assert();
905             }
906     }
907

Problem first reported in:
http://lists.freebsd.org/pipermail/freebsd-current/2007-October/078105.html

Fix: 

Quoting Pawel Jakub Dawidek's response to my initial report:

|GELI's detach-on-last-close mechanism is a general purpose mechanism, it
|may not work correctly with ZFS, because ZFS sometimes closes and reopen
|providers, which will make GELI to detach. In other words you shouldn't
|configure detach-on-last-close for ZFS components. It shouldn't panic
|still.

Adding geli_autodetach="NO" to /etc/rc.conf indeed prevents the panic.
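
For illustration, the rc.conf fragment from the description with that setting added might look like the following. This is a minimal sketch that assumes the same providers and key file paths as in the report; only the last assignment is new.

```shell
# /etc/rc.conf fragment (sketch; providers and key paths are those from the report)
geli_devices="ad0s1 ad0s3f ad0s2"
geli_ad0s1_flags="-k /root/ad0s1.key"
geli_ad0s2_flags="-k /root/ad0s2.key"
geli_ad0s3f_flags="-k /root/ad0s3f.key"

# Do not mark the providers for detach on last close, so that ZFS can
# close and reopen them during a scrub without GELI detaching them.
geli_autodetach="NO"
```

Presumably the setting only takes effect when the rc script next attaches the providers, so providers that are already attached may need to be re-attached (or the system rebooted). That is an assumption and is not confirmed in this report.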

I previously wasn't aware that this option exists,
so maybe it should be mentioned in geli(8).

How-To-Repeat:

Create a zpool with more than one geli vdev
that detaches on last close and run zpool scrub.

I am not sure whether a zpool with only one
such vdev is guaranteed to work, but for me
the problem only appeared after I added the
second one.
Comment 1 Mark Linimon freebsd_committer 2009-05-28 23:20:42 UTC
Responsible Changed
From-To: pjd->freebsd-fs

With pjd's permission, reassign ZFS-related PRs to freebsd-fs.
Comment 2 Pawel Jakub Dawidek freebsd_committer 2014-06-01 06:41:13 UTC
Responsible Changed
From-To: freebsd-bugs->pjd

I'll take this one.
Comment 3 Ben Woods freebsd_committer 2015-03-13 15:19:27 UTC
This bug bit me today - it is quite unnerving to watch your zpool go offline due to unavailable devices, all from initiating a simple scrub!

However, I can confirm that a reboot recovered the zpool successfully, and adding the following to /etc/rc.conf prevented it from happening again during the next zpool scrub:
geli_autodetach="NO"

Note: I rebooted once more after adding that to /etc/rc.conf and before initiating the next zpool scrub, in case that setting is only read at boot time or when the geli provider is attached. I am not sure whether that was required.
Comment 4 Ben Woods freebsd_committer 2015-03-13 15:20:25 UTC
I should add that I am running FreeBSD 10.1 amd64 with generic kernel/world.