Summary:   [zfs] [panic] panic booting after removing zil
Product:   Base System
Component: kern
Version:   12.1-RELEASE
Hardware:  amd64
OS:        Any
Status:    New
Severity:  Affects Only Me
Priority:  ---
Keywords:  crash
Reporter:  will
Assignee:  freebsd-fs (Nobody) <fs>
CC:        junovitch
Description
will
2020-04-15 20:11:51 UTC
> Has the disk been physically removed? Can you show zpool status -v and
> gpart show output after importing the pool into mfsbsd?

At first, I tried booting with the disk still attached, since I was going to
repurpose it as a cache device. However, now the disk is entirely removed.
Note that I also have 2 USB disks (mfsbsd itself, and an 18TB external drive)
attached. They're at the bottom of the gpart output.

gpart show:

  =>          34  7814037101  diskid/DISK-WD-WCC4EKKD0A8P  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>          40  15628053088  diskid/DISK-VAHDA7WL  GPT  (7.3T)
              40         1024  1  freebsd-boot  (512K)
            1064      4194304  2  freebsd-swap  (2.0G)
         4195368  15623857760  3  freebsd-zfs   (7.3T)

  =>          34  7814037101  diskid/DISK-WD-WCC4E0478835  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>          34  7814037101  diskid/DISK-WD-WCC4E1262418  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>          34  7814037101  diskid/DISK-WD-WCC4E2VZV3E1  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>          34  7814037101  ada5  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>          34  7814037101  diskid/DISK-WD-WCC4E1965981  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>          34  7814037101  diskid/DISK-WD-WCC4E2050088  GPT  (3.6T)
              34           6     - free -      (3.0K)
              40        1024  1  freebsd-boot  (512K)
            1064     4194304  2  freebsd-swap  (2.0G)
         4195368  7809841760  3  freebsd-zfs   (3.6T)
      7814037128           7     - free -      (3.5K)

  =>      40  655344  da0  GPT  (7.5G)  [CORRUPT]
          40     472  1  freebsd-boot  (236K)
         512  654872  2  freebsd-ufs   (320M)

  =>      40  655344  diskid/DISK-07AA16081C285D19  GPT  (7.5G)  [CORRUPT]
          40     472  1  freebsd-boot  (236K)
         512  654872  2  freebsd-ufs   (320M)

  =>      40  39065624496  da1  GPT  (18T)
          40  39065624496  1  freebsd-zfs  (18T)

  =>      40  39065624496  diskid/DISK-575542533239343130393639  GPT  (18T)
          40  39065624496  1  freebsd-zfs  (18T)

zpool status -v:

    pool: tank
   state: ONLINE
  status: One or more devices are configured to use a non-native block size.
          Expect reduced performance.
  action: Replace affected devices with devices that support the configured
          block size, or migrate data to a properly configured pool.
    scan: scrub repaired 0 in 0 days 06:42:00 with 0 errors on Sat Apr 11 08:36:28 2020
  config:

          NAME                                            STATE     READ WRITE CKSUM
          tank                                            ONLINE       0     0     0
            mirror-0                                      ONLINE       0     0     0
              diskid/DISK-WD-WCC4E2VZV3E1p3               ONLINE       0     0     0
              gptid/c74650ad-c61c-11e3-8b42-d0509909d8a6  ONLINE       0     0     0
            mirror-1                                      ONLINE       0     0     0
              diskid/DISK-WD-WCC4E0478835p3               ONLINE       0     0     0
              diskid/DISK-WD-WCC4E1262418p3               ONLINE       0     0     0
            mirror-2                                      ONLINE       0     0     0
              diskid/DISK-WD-WCC4E1965981p3               ONLINE       0     0     0  block size: 512B configured, 4096B native
              diskid/DISK-VAHDA7WLp3                      ONLINE       0     0     0  block size: 512B configured, 4096B native
            mirror-4                                      ONLINE       0     0     0
              diskid/DISK-WD-WCC4E2050088p3               ONLINE       0     0     0
              diskid/DISK-WD-WCC4EKKD0A8Pp3               ONLINE       0     0     0

  errors: No known data errors

I am wondering if DISK-VAHDA7WL could be a problem. It has a 7+ TB partition
mirrored with a 3+ TB partition in the pool.
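For the stray-label concern discussed just below, a minimal sketch of how one
could inspect what ZFS actually sees on that oversized partition, assuming the
diskid device names shown in the gpart output above:

  # Print any ZFS vdev labels found at the standard label locations of the
  # 7.3T zfs partition; a stale or foreign label would show a pool name/guid
  # that does not match tank.
  zdb -l /dev/diskid/DISK-VAHDA7WLp3

  # From mfsbsd, listing importable pools would also reveal any phantom pool
  # that stray labels make visible.
  zpool import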
If there's any garbage that looks like a valid ZFS label in the unused portion
of the larger partition, that might confuse ZFS. Is there a way that I can
verify that? While that's one of the newer disks, I have rebooted with that
disk installed previously. I can also try breaking the mirror and rebooting,
but only if that's the only way to verify it.

I have now tried removing the larger hard disk and rebooting, and I still get
the same panic.

I came across a similar panic:

  VERIFY(nvlist_lookup_uint64(configs[i], ZPOOL_CONFIG_POOL_TXG, &txg) == 0) failed

on a newer OpenZFS system running 13.0-CURRENT from 31 Dec. In that case, the
panic did not occur after removing the ZIL device from the pool; there was only
a panic on executing bectl list after removing the ZIL. However, if I tried to
add the ZIL back into the pool, I saw the panic on that assertion. Should this
be related, the test case in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252396 will cause a similar
fault after re-adding the ZIL to the pool.
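For reference, the sequence described above maps onto the standard commands
below. This is only a sketch of the steps, and LOGDEV (ada6p1 here) is a
placeholder rather than the device that was actually used as the ZIL:

  # Placeholder for the device that served as the separate log (ZIL) vdev.
  LOGDEV=ada6p1

  # Remove the log device; on the 13.0-CURRENT system this alone did not panic.
  zpool remove tank $LOGDEV

  # Listing boot environments afterwards is what triggered the panic there.
  bectl list

  # Re-adding the device as a log vdev hit the same VERIFY assertion.
  zpool add tank log $LOGDEV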