Bug 199405 - Panic trying to mount ZFS pool after 10.1 update
Summary: Panic trying to mount ZFS pool after 10.1 update
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.1-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2015-04-12 23:48 UTC by Chris Smith
Modified: 2022-10-21 05:04 UTC

See Also:


Attachments
Core dump text file (82.13 KB, text/plain)
2015-04-12 23:48 UTC, Chris Smith

Description Chris Smith 2015-04-12 23:48:14 UTC
Created attachment 155528
Core dump text file

My home FreeBSD server is panicking when trying to mount one of its ZFS pools.  It is a VM running on vSphere 5.5 with two LSI SAS9211-8i HBAs presented to it using PCI passthrough, and it has been quite stable for a couple of years in this configuration.

It was recently updated to 10.1 (from 10), but ran successfully for a week or two before the current problem.  It was also relatively recently (within the last month or two) updated from 9.3-STABLE.

The ZFS pool was not upgraded to the latest features when the system was.

Trying to "zpool import -f" on freshly built 10.1 or 9.3 systems causes the same panic.  I have also tried importing with readonly=on.  I haven't tried systems earlier than 9.3.

A second zpool from the same server is working without problems (i.e., I was able to mount it on the freshly built 10.1 box with zpool import -f).

The text from the panic is:

FreeBSD freebsd 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0: Tue Apr  7 01:09:46 UTC 2015     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

panic: solaris assert: range_tree_space(rt) == space (0x6b34cb000 == 0x6b3525000), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line: 130

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: solaris assert: range_tree_space(rt) == space (0x6b34cb000 == 0x6b3525000), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line: 130
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80963000 at kdb_backtrace+0x60
#1 0xffffffff80928125 at panic+0x155
#2 0xffffffff81b7c22f at assfail3+0x2f
#3 0xffffffff81a836e5 at space_map_load+0x3d5
#4 0xffffffff81a69b0e at metaslab_load+0x2e
#5 0xffffffff81a6b609 at metaslab_alloc+0x6b9
#6 0xffffffff81aa9ca6 at zio_dva_allocate+0x76
#7 0xffffffff81aa7382 at zio_execute+0x162
#8 0xffffffff80971475 at taskqueue_run_locked+0xe5
#9 0xffffffff80971f08 at taskqueue_thread_loop+0xa8
#10 0xffffffff808f8b6a at fork_exit+0x9a
#11 0xffffffff80d0acfe at fork_trampoline+0xe
Uptime: 4m26s
Dumping 418 out of 8168 MB:..4%..12%..23%..31%..43%..54%..62%..73%..81%..92%


I have attached the core.txt file produced; I also have a core dump.  This is from trying to import on the fresh 10.1 system.

This may be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193875 ?

It would be great if I could even get this pool mounted in a readonly state to pull some of the data off it. Nothing is irreplaceable, but restoring ~9T over the internet takes a long time.
Comment 1 Steven Hartland freebsd_committer freebsd_triage 2015-04-14 07:39:05 UTC
Have you tried importing with 10-STABLE or -CURRENT to see if they help?
Comment 2 Chris Smith 2015-04-14 21:02:52 UTC
(In reply to Steven Hartland from comment #1)

Hi, 

Tried with 10-STABLE this morning and no luck, though it's a different line reported in space_map.c.

root@freebsd:~ # cat /var/crash/core.txt.0
freebsd dumped core - see /var/crash/vmcore.0

Tue Apr 14 20:58:16 UTC 2015

FreeBSD freebsd 10.1-STABLE FreeBSD 10.1-STABLE #0 r281528: Tue Apr 14 20:40:23 UTC 2015     root@freebsd:/usr/obj/usr/src/sys/GENERIC  amd64

panic: solaris assert: range_tree_space(rt) == space (0x6b34cb000 == 0x6b3525000), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line: 131

Unread portion of the kernel message buffer:
panic: solaris assert: range_tree_space(rt) == space (0x6b34cb000 == 0x6b3525000), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line: 131
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80973b90 at kdb_backtrace+0x60
#1 0xffffffff80937c15 at panic+0x155
#2 0xffffffff81c0922f at assfail3+0x2f
#3 0xffffffff81a8c5a5 at space_map_load+0x3d5
#4 0xffffffff81a721be at metaslab_load+0x2e
#5 0xffffffff81a73da7 at metaslab_alloc+0x777
#6 0xffffffff81ab4166 at zio_dva_allocate+0x76
#7 0xffffffff81ab1542 at zio_execute+0x162
#8 0xffffffff80981f75 at taskqueue_run_locked+0xe5
#9 0xffffffff80982a08 at taskqueue_thread_loop+0xa8
#10 0xffffffff80906d6a at fork_exit+0x9a
#11 0xffffffff80d1d5de at fork_trampoline+0xe
Uptime: 54s
Dumping 408 out of 8167 MB:..4%..12%..24%..32%..44%..51%..63%..71%..83%..91%



Will compile -CURRENT and try that tonight.
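
Although the reported line moved from 130 to 131, the two panics appear to be the same failure. A rough way to check is to compare the function+offset pairs of the two backtraces while ignoring the load addresses. The sketch below does this for frames #2 to #6 as pasted in this report; the helper name `frames` and the variable names are only for illustration.

```python
import re

# Frames #2-#6 copied from the two backtraces in this report.
release_p9 = """\
#2 0xffffffff81b7c22f at assfail3+0x2f
#3 0xffffffff81a836e5 at space_map_load+0x3d5
#4 0xffffffff81a69b0e at metaslab_load+0x2e
#5 0xffffffff81a6b609 at metaslab_alloc+0x6b9
#6 0xffffffff81aa9ca6 at zio_dva_allocate+0x76
"""
stable_r281528 = """\
#2 0xffffffff81c0922f at assfail3+0x2f
#3 0xffffffff81a8c5a5 at space_map_load+0x3d5
#4 0xffffffff81a721be at metaslab_load+0x2e
#5 0xffffffff81a73da7 at metaslab_alloc+0x777
#6 0xffffffff81ab4166 at zio_dva_allocate+0x76
"""


def frames(text):
    """Return (function, offset) pairs, ignoring the load addresses."""
    return re.findall(r"at (\w+)\+(0x[0-9a-f]+)", text)


a, b = frames(release_p9), frames(stable_r281528)

# True if both kernels went through the same sequence of functions.
same_path = [f for f, _ in a] == [f for f, _ in b]

# Frames whose offset within the function differs between the two kernels.
changed = [(fa, oa, ob) for (fa, oa), (_, ob) in zip(a, b) if oa != ob]

print(same_path, changed)
# prints: True [('metaslab_alloc', '0x6b9', '0x777')]
```

Both kernels reach the assertion through the same call path, and only the offset inside metaslab_alloc differs, which is consistent with the same check firing in slightly different builds.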
Comment 3 Chris Smith 2015-04-15 00:29:27 UTC
(In reply to Chris Smith from comment #0)

Also tried importing the pool to a fresh 8.4 install and had the same panic.

I was able to determine (based on the pool that's still OK) that the zpool version is from 8.x.
Comment 4 Xin LI freebsd_committer freebsd_triage 2015-04-15 00:44:32 UTC
(In reply to Chris Smith from comment #3)
The backtrace suggests that you have a space map corruption, but the validation code is only nominally different so it's not clear to me why it won't panic on 8.x.

Will the system allow you to import the pool read-only (zpool import -o readonly=on)?
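
For reference, the two values in the failed assertion can be compared directly. The snippet below is only a sketch of that arithmetic; the variable names are taken from the assertion text for readability and are not meant to mirror the ZFS source.

```python
# Values copied from the panic message:
#   range_tree_space(rt) == space (0x6b34cb000 == 0x6b3525000)
range_tree_space = 0x6B34CB000  # left-hand side of the assertion
space = 0x6B3525000             # right-hand side of the assertion

# How far apart the two sides of the check are.
diff = space - range_tree_space

print(f"difference: {diff:#x} bytes = {diff} bytes = {diff / 1024:.0f} KiB")
# prints: difference: 0x5a000 bytes = 368640 bytes = 360 KiB
```

The two sides differ by 0x5a000 bytes (360 KiB), and the same pair of values appears in both the 10.1-RELEASE-p9 and the 10.1-STABLE panic messages.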
Comment 5 Chris Smith 2015-04-15 00:50:16 UTC
(In reply to Xin LI from comment #4)

Sorry if that wasn't clear.

Trying to mount the pool on a fresh 8.4 system DOES cause a panic, same as 9.3 and 10.1.

I did try -o readonly on 10.1 and 9.3, but not 8.4.

I'm at work at the moment and can't access the system.  I'll try -o readonly on 8.4 when I get home.

Is there any chance that using the -F "Recovery mode" switch to zpool import may help?  The data in this pool is primarily archival, so I don't mind losing recent changes.
Comment 6 Chris Smith 2015-04-15 06:51:46 UTC
Success!  I was able to mount the pool read-only on the 8.4 system.

I'll back up my data off the pool first, but after that, if you want me to dump any info out of the pool to help you debug the problem, I'm happy to leave it a few days before nuking and recreating the pool.
Comment 7 Chris Smith 2015-04-18 14:25:38 UTC
I've retrieved all my data from the pool.  Does anyone want a dump of any data before I nuke it and recreate it?

I will wait 24 hours.
Comment 8 Steven Hartland freebsd_committer freebsd_triage 2015-04-18 16:04:20 UTC
As with any upgrade where the on-disk state is the cause of the issue, it's not going to be clear whether the issue occurred due to old code which has already been fixed.

Given this I'm not sure how helpful a dump would be tbh.

Even so what size are we talking about?
Comment 9 Chris Smith 2015-04-18 22:07:37 UTC
(In reply to Steven Hartland from comment #8)

The pool itself is about 15T (~10T of actual data).

I thought there may be some way for you to pull out the metadata that's causing the panic.  Happy to do that if it can help.

Otherwise I'll just nuke and recreate. :)
Comment 10 Xin LI freebsd_committer freebsd_triage 2022-10-21 05:04:42 UTC
This _might_ be fixed by https://github.com/openzfs/zfs/commit/c7a4255f128cc493df8383cb9f1ed650191b2081 but unfortunately we were unable to tell with the available information, so marking this as overcome by events.