Bug 162008 - [zfs] Latest 9-STABLE and 10-CURRENT fail to boot from ZFS v15 root [regression]
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Pawel Jakub Dawidek
Reported: 2011-10-25 18:00 UTC by Robert Millan
Modified: 2012-01-05 10:00 UTC

Description Robert Millan 2011-10-25 18:00:19 UTC
With recent builds of both 9-STABLE and 10-CURRENT, the kernel is no longer
able to boot from my ZFS pool as root file system.

The on-disk pool is ZFS version 15 and was created with 8.2 kernel.

I've bisected the problem in stable/9/sys/ and found that it'd been
introduced by r226405 (commit that disables debug options in GENERIC),
which is obviously just exposing the bug and not causing it.

Ironically, in head/sys/ the same problem is present but disappears
when removing the debug options.

If I attempt to replicate the disk (by creating a new v15 pool and zfs
send/receive'ing the data), the destination ZFS pool is bootable unlike
the source one. This makes me suspect the problem has something to do
with /boot/zfs/zpool.cache.

I'm currently dd'ing the raw partition to another disk to check if the
pool can be imported/exported manually, and if "zpool upgrade" has any
effect on the problem (I don't want to risk losing the testcase). Please
let me know if there's anything else I can try.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2011-10-26 05:20:39 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Andriy Gapon freebsd_committer freebsd_triage 2011-10-26 07:35:08 UTC
Please let us know how _exactly_ your "kernel is no longer
able to boot from my ZFS pool as root file system".
That is, what boot stage fails and what output you see - (gpt)zfsboot,
zfsloader, kernel, root fs mounting, something else...

-- 
Andriy Gapon
Comment 3 Robert Millan 2011-10-26 18:34:15 UTC
2011/10/26 Andriy Gapon <avg@freebsd.org>:
>
> Please let us know how _exactly_ your "kernel is no longer
> able to boot from my ZFS pool as root file system".
> That is, what boot stage fails and what output you see - (gpt)zfsboot,
> zfsloader, kernel, root fs mounting, something else...

I'm sorry, I thought there was no meaningful error, but on closer inspection I noticed:

  Mounting from zfs:eeepc/root failed with error 6.

Assuming this means ENXIO, could it be a race condition?

-- 
Robert Millan
Comment 4 Andriy Gapon freebsd_committer freebsd_triage 2011-10-30 11:06:21 UTC
on 26/10/2011 20:34 Robert Millan said the following:
> 2011/10/26 Andriy Gapon <avg@freebsd.org>:
>>
>> Please let us know how _exactly_ your "kernel is no longer
>> able to boot from my ZFS pool as root file system".
>> That is, what boot stage fails and what output you see - (gpt)zfsboot,
>> zfsloader, kernel, root fs mounting, something else...
> 
> I'm sorry, I thought there was no meaningful error, but on closer inspection I noticed:
> 
>   Mounting from zfs:eeepc/root failed with error 6.
> 
> Assuming this means ENXIO, could it be a race condition?
> 

IMO, not likely.
Please try setting vfs.zfs.debug=1 via loader.conf.
Maybe additional debug information will make the situation clearer.

-- 
Andriy Gapon
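
Editorial note: for reference, the tunable suggested above would be set with a line like the following in /boot/loader.conf, using the usual name="value" form. It takes effect on the next boot.

```
vfs.zfs.debug="1"
```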
Comment 5 Robert Millan 2011-11-04 23:11:34 UTC
2011/10/30 Andriy Gapon <avg@freebsd.org>:
> IMO, not likely.
> Please try setting vfs.zfs.debug=1 via loader.conf.
> Maybe additional debug information will make the situation clearer.

Strangely, the system boots now, but kernel panics as soon as "zfs
volinit" is attempted:

vdev_geom_open_by_guid:352[1]: Searching by guid [13849114725133984793].
panic: _sx_xlock_hard: recursed on non-recursive sx spa_namespace_lock
@ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c:877

It also drops me to a debug prompt. Backtrace:

kdb_enter
panic
_sx_xlock_hard
_sx_xlock
zvol_geom_access
g_access
vdev_geom_open
vdev_open
vdev_open_children
vdev_root_open
vdev_open
spa_load
spa_load_best
spa_open_common
spa_get_stats
zfs_ioc_pool_stats
zfsdev_ioctl
devfs_ioctl_f
kern_ioctl

(this happened with 9-STABLE, SVN r226626)

-- 
Robert Millan
Comment 6 dfilter service freebsd_committer freebsd_triage 2011-11-05 16:29:14 UTC
Author: pjd
Date: Sat Nov  5 16:29:03 2011
New Revision: 227110
URL: http://svn.freebsd.org/changeset/base/227110

Log:
  In zvol_open() if the spa_namespace_lock is already held, it means that
  ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
  so return an error instead of panicing on spa_namespace_lock recursion.
  
  Reported by:	Robert Millan <rmh@debian.org>
  PR:		kern/162008
  MFC after:	3 days

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Sat Nov  5 16:04:57 2011	(r227109)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Sat Nov  5 16:29:03 2011	(r227110)
@@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
 	zvol_state_t *zv;
 	int err = 0;
 
+	if (MUTEX_HELD(&spa_namespace_lock)) {
+		/*
+		 * If the spa_namespace_lock is being held, it means that ZFS
+		 * is trying to open ZVOL as its VDEV. This i not supported.
+		 */
+		return (EOPNOTSUPP);
+	}
+
 	mutex_enter(&spa_namespace_lock);
 
 	zv = pp->private;
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 7 dfilter service freebsd_committer freebsd_triage 2011-11-24 07:25:53 UTC
Author: pjd
Date: Thu Nov 24 07:25:43 2011
New Revision: 227923
URL: http://svn.freebsd.org/changeset/base/227923

Log:
  MFC r227110,r227111:
  
  r227110:
  
  In zvol_open() if the spa_namespace_lock is already held, it means that
  ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
  so return an error instead of panicing on spa_namespace_lock recursion.
  
  Reported by:	Robert Millan <rmh@debian.org>
  PR:		kern/162008
  
  r227111:
  
  Correct typo in comment.
  
  Reported by:	Fabian Keil <fk@fabiankeil.de>
  
  Approved by:	re (kib)

Modified:
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
Directory Properties:
  stable/9/sys/   (props changed)
  stable/9/sys/amd64/include/xen/   (props changed)
  stable/9/sys/boot/   (props changed)
  stable/9/sys/boot/i386/efi/   (props changed)
  stable/9/sys/boot/ia64/efi/   (props changed)
  stable/9/sys/boot/ia64/ski/   (props changed)
  stable/9/sys/boot/powerpc/boot1.chrp/   (props changed)
  stable/9/sys/boot/powerpc/ofw/   (props changed)
  stable/9/sys/cddl/contrib/opensolaris/   (props changed)
  stable/9/sys/conf/   (props changed)
  stable/9/sys/contrib/dev/acpica/   (props changed)
  stable/9/sys/contrib/octeon-sdk/   (props changed)
  stable/9/sys/contrib/pf/   (props changed)
  stable/9/sys/contrib/x86emu/   (props changed)

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 06:27:47 2011	(r227922)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 07:25:43 2011	(r227923)
@@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
 	zvol_state_t *zv;
 	int err = 0;
 
+	if (MUTEX_HELD(&spa_namespace_lock)) {
+		/*
+		 * If the spa_namespace_lock is being held, it means that ZFS
+		 * is trying to open ZVOL as its VDEV. This is not supported.
+		 */
+		return (EOPNOTSUPP);
+	}
+
 	mutex_enter(&spa_namespace_lock);
 
 	zv = pp->private;
Comment 8 dfilter service freebsd_committer freebsd_triage 2011-11-24 07:39:16 UTC
Author: pjd
Date: Thu Nov 24 07:39:01 2011
New Revision: 227927
URL: http://svn.freebsd.org/changeset/base/227927

Log:
  MFC r227110,r227111:
  
  r227110:
  
  In zvol_open() if the spa_namespace_lock is already held, it means that
  ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
  so return an error instead of panicing on spa_namespace_lock recursion.
  
  Reported by:	Robert Millan <rmh@debian.org>
  PR:		kern/162008
  
  r227111:
  
  Correct typo in comment.
  
  Reported by:	Fabian Keil <fk@fabiankeil.de>
  
  Approved by:	re (kib)

Modified:
  releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
Directory Properties:
  releng/9.0/sys/   (props changed)
  releng/9.0/sys/amd64/include/xen/   (props changed)
  releng/9.0/sys/boot/   (props changed)
  releng/9.0/sys/boot/i386/efi/   (props changed)
  releng/9.0/sys/boot/ia64/efi/   (props changed)
  releng/9.0/sys/boot/ia64/ski/   (props changed)
  releng/9.0/sys/boot/powerpc/boot1.chrp/   (props changed)
  releng/9.0/sys/boot/powerpc/ofw/   (props changed)
  releng/9.0/sys/cddl/contrib/opensolaris/   (props changed)
  releng/9.0/sys/conf/   (props changed)
  releng/9.0/sys/contrib/dev/acpica/   (props changed)
  releng/9.0/sys/contrib/octeon-sdk/   (props changed)
  releng/9.0/sys/contrib/pf/   (props changed)
  releng/9.0/sys/contrib/x86emu/   (props changed)

Modified: releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
==============================================================================
--- releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 07:37:19 2011	(r227926)
+++ releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 07:39:01 2011	(r227927)
@@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
 	zvol_state_t *zv;
 	int err = 0;
 
+	if (MUTEX_HELD(&spa_namespace_lock)) {
+		/*
+		 * If the spa_namespace_lock is being held, it means that ZFS
+		 * is trying to open ZVOL as its VDEV. This is not supported.
+		 */
+		return (EOPNOTSUPP);
+	}
+
 	mutex_enter(&spa_namespace_lock);
 
 	zv = pp->private;
Comment 9 dfilter service freebsd_committer freebsd_triage 2012-01-05 09:51:01 UTC
Author: mm
Date: Thu Jan  5 09:50:47 2012
New Revision: 229567
URL: http://svn.freebsd.org/changeset/base/229567

Log:
  MFC r227110, r227111:
  
  MFC r227110 (pjd) [1]:
  In zvol_open() if the spa_namespace_lock is already held, it means that
  ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
  so return an error instead of panicing on spa_namespace_lock recursion.
  
  MFC r227111 (pjd) [2]:
  Correct typo in comment.
  
  PR:		kern/162008
  Reported by:	Robert Millan <rmh@debian.org> [1]
  		Fabian Keil <fk@fabiankeil.de> [2]

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Jan  5 09:39:29 2012	(r229566)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Jan  5 09:50:47 2012	(r229567)
@@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
 	zvol_state_t *zv;
 	int err = 0;
 
+	if (MUTEX_HELD(&spa_namespace_lock)) {
+		/*
+		 * If the spa_namespace_lock is being held, it means that ZFS
+		 * is trying to open ZVOL as its VDEV. This is not supported.
+		 */
+		return (EOPNOTSUPP);
+	}
+
 	mutex_enter(&spa_namespace_lock);
 
 	zv = pp->private;
Comment 10 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2014-06-01 07:37:38 UTC
State Changed
From-To: open->patched

Fix committed to HEAD. Thanks for the report!
Comment 11 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2014-06-01 07:37:38 UTC
Responsible Changed
From-To: freebsd-fs->pjd

I'll take this one.
Comment 12 Pawel Jakub Dawidek freebsd_committer freebsd_triage 2014-06-01 07:37:38 UTC
State Changed
From-To: patched->closed

Fix merged to stable/9 and releng/9.0. Thanks.