After upgrade 8.0-RELEASE system to 8-STABLE, the machine no longer boots from degraded raidz. Booting stops with a message: error 1 lba 32 error 1 lba 1 Booting from a non-degraded raidz still works fine. After encountering a problem I prepared an isolated (qemu) environment to test this issue. In 8.0-RELEASE everything works fine. I can remove one disk and the system boots. My configuration: - gpart config: => 34 6291389 ad0 GPT (3.0G) 34 128 1 freebsd-boot (64K) 162 6291261 2 freebsd-zfs (3.0G) => 34 6291389 ad1 GPT (3.0G) 34 128 1 freebsd-boot (64K) 162 6291261 2 freebsd-zfs (3.0G) => 34 6291389 ad3 GPT (3.0G) 34 128 1 freebsd-boot (64K) 162 6291261 2 freebsd-zfs (3.0G) (with GPT labeling) - zpool (v13) config: NAME STATE READ WRITE CKSUM bijou ONLINE 0 0 0 raidz1 ONLINE 0 0 0 gpt/bijou0 ONLINE 0 0 0 gpt/bijou1 ONLINE 0 0 0 gpt/bijou2 ONLINE 0 0 0 - loader.conf: zfs_load="YES" vfs.root.mountfrom="zfs:bijou" - rc.conf: zfs_enable="YES" After upgrade to 8-STABLE (world and kernel) and BEFORE reinstalling bootcode booting from degraded raidz no longer works: http://img28.imageshack.us/img28/9118/raidz8stable.png . Note: the installworld upgrades /boot/loader. After reinstalling bootcode: gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad0 gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad1 gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad3 booting stops with a message: error 1 lba 32 error 1 lba 1 ( http://img17.imageshack.us/img17/36/raidz8stablereinstalled.png ) Upgrading zpool to v14 doesn't help. I think this is serious regression -- disks often fail during reboot. The whole idea of raidz doesn't make any sense. How-To-Repeat: 1. Have FreeBSD 8.0-RELEASE "RootOnZFS" raidz1 installation (like the one described here: http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/RAIDZ1 ) 2. Upgrade to 8-STABLE. 3. Physically remove one disk from raidz. 4. Boot the machine.
Responsible Changed From-To: freebsd-bugs->freebsd-fs Over to maintainer(s).
Another data point. I had a similar problem with mirrored boot disks. I installed 8.1-RELEASE onto two identical drives in a mirror, following the "Installing FreeBSD Root on ZFS (Mirror) using GPT" guide here: http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror I can boot from either disk by changing the boot priority in the BIOS, so long as both disks are connected. I swapped the sata ports, and I can still boot from either disk. I can disconnect drive #2 and boot from drive #1 just fine, whether drive #1 is plugged into sata0 or sata1. However when I disconnect drive #1 and try to boot from drive #2, on either sata0 or sata1, I get errors: error 1 lba 32 error 1 lba 1 error 1 lba 32 error 1 lba 1 error 1 lba 32 error 1 lba 1 error 1 lba 32 error 1 lba 1 error 1 lba 32 error 1 lba 1 error 1 lba 32 error 1 lba 1 No ZFS pools located, can't boot
Hi, I was able to reproduce this on a VM with ZFS GPT mirror setup running = also 8.1 release. Regards, George=
A few more people are also confirming this issue here: http://forums.freebsd.org/showthread.php?t=16556
I have the exact same problem with fresh installed 8.1. also reproduced = it in vmware fusion=
I would like to ask those who can reproduce the problem to try to use head (CURRENT) version, manually compile sys/boot/zfs/zfstest.c and try to use it to debug what exactly fails. -- Andriy Gapon
I can confirm this behaviour, easily reproducable in virtualbox. The problem must be in the internal logic of zfsboot, because the lba errors reported are from function: drvread() in sys/boot/i386/zfsboot/zfsboot.c, line #1079: if (V86_CY(v86.efl)) { printf("error %u lba %u\n", v86.eax >> 8 & 0xff, lba); return -1; } drvread() is called from vdev_read() (line #315)
on 05/08/2010 11:19 Martin Matuska said the following: > I can confirm this behaviour, easily reproducable in virtualbox. > > The problem must be in the internal logic of zfsboot, because the lba > errors reported are from function: > > drvread() in sys/boot/i386/zfsboot/zfsboot.c, line #1079: > if (V86_CY(v86.efl)) { > printf("error %u lba %u\n", v86.eax >> 8 & 0xff, lba); > return -1; > } > > drvread() is called from vdev_read() (line #315) Right, and this is the reason why I asked to try zfstest, because it would be interesting to see the whole stack trace to determine which high-level zfs operation fails. Thanks! -- Andriy Gapon
So I have done more code reading and debugging with mfsBSD in virtualbox and I came to the following conclusion: sys/boot/zfs/zfsimpl.c reads vdev information from the pool but there is no check if these vdevs do exist on physical devices. In other words, if the pool has last seen its vdevs as HEALTHY, gptzfsboot assumes all of them are available. So this way e.g. in case of a mirror, the vdev_mirror_read() tries to read from the first "healthy" vdev in its list. If the first vdev is the missing vdev (e.g. a disconnected or failed drive), it just cannot read from it so you are unable to boot. In my test setup, vdev_mirror_read() reported two healty kids and tried to read from the non-existing vdev. I think in the boot case, we should first scan for all physically available vdevs, then scan for children from their configuration. All child vdevs that cannot be physically opened (do not have a representation from the previous scan) should be set to state VDEV_STATE_CANT_OPEN and not assumed as VDEV_STATE_HEALTHY.
on 05/08/2010 14:53 Martin Matuska said the following: > In my test setup, vdev_mirror_read() reported two healty kids and tried > to read from the non-existing vdev. What happened next? If I read vdev_mirror_read() code correctly it should continue to the next device if reading from the current device fails. -- Andriy Gapon
A proposed patch is attached. The function vdev_read_phys() (sys/boot/zfs/zfsimpl.c, #325) does call vdev->v_phys_read() without checking if that function is registered. This check should be done in vdev_read_phys before doing anything else. vdev_create initializes vdev->v_phys_read as 0 and unavailable vdevs keep this value.
on 05/08/2010 19:23 Martin Matuska said the following: > A proposed patch is attached. > > The function vdev_read_phys() (sys/boot/zfs/zfsimpl.c, #325) does call > vdev->v_phys_read() without checking if that function is registered. > > This check should be done in vdev_read_phys before doing anything else. > > vdev_create initializes vdev->v_phys_read as 0 and unavailable vdevs > keep this value. Looks very good. Thanks! -- Andriy Gapon
Responsible Changed From-To: freebsd-fs->mm I am taking this PR.
Author: mm Date: Mon Aug 9 06:36:11 2010 New Revision: 211091 URL: http://svn.freebsd.org/changeset/base/211091 Log: Return EIO if vdev->v_phys_read is NULL. This fixes booting from a ZFS mirror with a unavailable primary device. PR: kern/148655 Reviewed by: avg Approved by: delphij (mentor) MFC after: 3 days Modified: head/sys/boot/zfs/zfsimpl.c Modified: head/sys/boot/zfs/zfsimpl.c ============================================================================== --- head/sys/boot/zfs/zfsimpl.c Mon Aug 9 06:02:23 2010 (r211090) +++ head/sys/boot/zfs/zfsimpl.c Mon Aug 9 06:36:11 2010 (r211091) @@ -328,6 +328,9 @@ vdev_read_phys(vdev_t *vdev, const blkpt size_t psize; int rc; + if (!vdev->v_phys_read) + return (EIO); + if (bp) { psize = BP_GET_PSIZE(bp); } else { _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
On Thu, 05 Aug 2010 18:23:43 +0200, Martin Matuska <mm@freebsd.org> wrote: > A proposed patch is attached. It works for me on 8.1-PRERELEASE. Many thanks! (Sorry for the delay, I was on a short vacation.) -- am
Author: mm Date: Thu Aug 12 05:59:55 2010 New Revision: 211205 URL: http://svn.freebsd.org/changeset/base/211205 Log: MFC r211091: Return EIO if vdev->v_phys_read is NULL. This fixes booting from a ZFS mirror with a unavailable primary device. PR: kern/148655 Reviewed by: avg Approved by: delphij (mentor) Modified: stable/8/sys/boot/zfs/zfsimpl.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/amd64/include/xen/ (props changed) stable/8/sys/cam/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) stable/8/sys/contrib/dev/acpica/ (props changed) stable/8/sys/contrib/pf/ (props changed) stable/8/sys/dev/xen/xenpci/ (props changed) Modified: stable/8/sys/boot/zfs/zfsimpl.c ============================================================================== --- stable/8/sys/boot/zfs/zfsimpl.c Thu Aug 12 01:08:50 2010 (r211204) +++ stable/8/sys/boot/zfs/zfsimpl.c Thu Aug 12 05:59:55 2010 (r211205) @@ -328,6 +328,9 @@ vdev_read_phys(vdev_t *vdev, const blkpt size_t psize; int rc; + if (!vdev->v_phys_read) + return (EIO); + if (bp) { psize = BP_GET_PSIZE(bp); } else { _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
So is this going into an errata patch for 8.1 anytime soon? My 8.0 system is a "root on zfs mirror", so this bug is pretty much holding me from upgrading to 8.1. - Sincerely, Dan Naumov
State Changed From-To: open->closed Fixed in 9-CURRENT and 8-STABLE.