The bug is hard to trigger. All of the following must match:
 * i386 arch,
 * `make installkernel`,
 * RELENG_8 after zfs v14 import,
 * zpool with zfs v14,
 * zfsboot installed.

The boot partition is NOT compressed, and changing the boot partition does nothing, even creating a new one. After that, loading part of the kernel or other modules sometimes yields "ZFS: gang block detected".

Fix: Bootability can be restored with:

    rsync -lrptygoWSH --delete /boot /somewhere/boot
    rm -rf /boot
    rsync -lrptygoWSH --delete /somewhere/boot /boot

How-To-Repeat: Any installkernel can do the trick.
Responsible Changed From-To: freebsd-i386->freebsd-fs Reassign to fs team
Hi!

Fix:

# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad0

--
Best regards, Andrei V. Lavreniyuk.
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
gpart: No such geom: ada0

Could you explain what exactly it should do? Using gptzfsboot instead of simple zfsboot?

--
Sphinx of black quartz judge my vow.
Hi! http://wiki.freebsd.org/ZFS http://blogs.freebsdish.org/lulf/2008/12/16/setting-up-a-zfs-only-system/ -- Best regards, Andrei V. Lavreniyuk.
26.02.2010 11:42, Andrei V. Lavreniyuk wrote:
> http://wiki.freebsd.org/ZFS
>
> http://blogs.freebsdish.org/lulf/2008/12/16/setting-up-a-zfs-only-system/

This is not my case. I have no partitions at all and I'm booting from a ZFS dedicated disk. And if I'm not missing something, gpart requires a separate partition with gptzfsboot to work.

--
Sphinx of black quartz judge my vow.
> I have no partitions at all and I'm booting from a ZFS dedicated disk.

Sure you do have partitions. Booting off a purely ZFS disk without a valid partition table is not and has never been possible in either FreeBSD or Solaris. What you need to do is update your bootcode (because it has changed for zfs v14). How you do this depends on whether you use zfsboot or gptzfsboot. The links above describe the process.

- Sincerely, Dan Naumov
Hello, this bug seems to be still present in stable/8. The proposed workaround seems to work. You may find the console screenshot at http://danger.rulez.sk/dockdrop/144214.png -- S pozdravom / Best regards Daniel Gerzo, FreeBSD committer
Hi, note that the HDD has been almost full (97%) when the box died (ca. 2GB free). -- S pozdravom / Best regards Daniel Gerzo
This bug is still present. We had to forcibly reboot a server with a 97% full zpool (~2GB free space) and we ended up with

    ZFS: gang block detected

The workaround of re-creating and re-populating /boot worked.
Just to be on the safe side: have you guys actually updated the bootblocks on your system? I.e. the code that runs before the loader and that resides outside the filesystems.

--
Andriy Gapon
I had a private conversation with Daniel Gerzo (danger@) and neither he nor mm@ is sure that the system for which they reported the problem had the latest boot blocks that are supposed to actually support zfs gang blocks.

P.S. gang block support seems to have been added to stable/8 by rnoland@ on 21 Nov 2009 in r199634, so anything before that is not expected to work.

--
Andriy Gapon
It seems that I have been misunderstanding the problem. "ZFS: gang block detected" won't even appear if boot code is too old. Having briefly glanced over the code and comparing it to the code in osol and in zio_gang_tree_issue(), I think the following change is needed. But I am not sure if it is a real fix for the issue at hand. If anyone can reproduce the problem, could you please test this change? Thanks! -- Andriy Gapon
Andriy Gapon wrote:
> It seems that I have been misunderstanding the problem.
> "ZFS: gang block detected" won't even appear if boot code is too old.
>
> Having briefly glanced over the code and comparing it to the code in osol and in
> zio_gang_tree_issue(), I think the following change is needed.
> But I am not sure if it is a real fix for the issue at hand.
>
> If anyone can reproduce the problem, could you please test this change?
> Thanks!

This looks sane. I was never actually able to test it, since reproducing the issue is rather tricky.

robert.
2010/5/13 Andriy Gapon <avg@icyb.net.ua>:
>
> It seems that I have been misunderstanding the problem.
> "ZFS: gang block detected" won't even appear if boot code is too old.
>
> Having briefly glanced over the code and comparing it to the code in osol and in
> zio_gang_tree_issue(), I think the following change is needed.
> But I am not sure if it is a real fix for the issue at hand.
>
> If anyone can reproduce the problem, could you please test this change?
> Thanks!

Tested it. Same problem.

1. Rebuild and reinstall on i386. Filling disk up (600M free of 120G, 0.5%).
2. Immediately after starting boot screen bursts into psychic colors. Computer reboots.
3. Booted from ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/201004/FreeBSD-8.0-STABLE-201004-i386-livefs.iso in VirtualBox i386. Boot code updated with dd.
4. Same as p2. In vBox i386 it takes a long time to rotate the dash, then spits "ZFS: gang block detected" and hangs.
5. Booted from amd64 install, updated boot code with dd.
6. Booted on amd64. Immediately after starting boot spits out "ZFS: gang block detected" and hangs.
7. Booted from amd64 install. /boot transferred to/from other disk.
8. Booted on amd64. Immediately after starting boot spits out "ZFS: gang block detected" and hangs.
9. Booted from amd64 install. Some files deleted (800M free, files were written contiguously). /boot transferred to/from other disk.
10. Booted on amd64.

Results:
1. Patch changes something. However zfsloader(?) still can't be read completely.
2. Bug can happen on amd64. More extreme conditions needed(?).
3. I'll post a follow-up on successfully booting on original i386 hardware.

--
Sphinx of black quartz judge my vow.
on 14/05/2010 17:12 Volodymyr Kostyrko said the following:
> 2010/5/13 Andriy Gapon <avg@icyb.net.ua>:
>> It seems that I have been misunderstanding the problem.
>> "ZFS: gang block detected" won't even appear if boot code is too old.
>>
>> Having briefly glanced over the code and comparing it to the code in osol and in
>> zio_gang_tree_issue(), I think the following change is needed.
>> But I am not sure if it is a real fix for the issue at hand.
>>
>> If anyone can reproduce the problem, could you please test this change?
>> Thanks!
>
> Tested it. Same problem.

Sigh. I almost do not see any other obvious differences with other code that is supposed to support gang blocks.

> 1. Rebuild and reinstall on i386. Filling disk up (600M free of 120G, 0.5%).
> 2. Immediately after starting boot screen bursts into psychic colors.
> Computer reboots.

With unpatched boot code I presume?

> 3. Booted from ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/201004/FreeBSD-8.0-STABLE-201004-i386-livefs.iso
> in VirtualBox i386. Boot code updated with dd.

Have you updated both parts of zfsboot and loader? Are you sure that you used patched versions? (asking just in case)

> 4. Same as p2. in vBox i386 takes looong time to rotate dash then
> spits "ZFS: gang block detected" and hangs.

Does nothing else get printed? Asking because of this screenshot:
http://danger.rulez.sk/dockdrop/144214.png

> 5. Booted from amd64 install, updated boot code with dd.
> 6. Booted on amd64. Immediately after starting boot spits out "ZFS:
> gang block detected" and hangs.
> 7. Booted from amd64 install. /boot transferred to/from other disk.
> 8. Booted on amd64. Immediately after starting boot spits out "ZFS:
> gang block detected" and hangs.

amd64 has exactly the same boot code that i386 has. Perhaps some difference could arise during compilation, but even if so, it should not matter much in our case.

> 9. Booted from amd64 install. Some files deleted (800M free, files
> were written contiguously). /boot transferred to/from
> other disk.
> 10. Booted on amd64.

Not interested much in the workarounds: if they work, then OK, but mainly we are trying to fix the boot code. Only the behavior of the installed zfsboot and zfsloader is interesting to us.

> Results:
> 1. Patch changes something. However zfsloader(?) still can't be read completely.
> 2. Bug can happen on amd64. More extreme conditions needed(?).
> 3. I'll post a follow-up on successfully booting on original i386 hardware.

Can you please also share the output of 'zfs get all' for the boot filesystem?
Thank you for your help!

And one last thing that I could think of:

--- a/sys/boot/zfs/zfsimpl.c
+++ b/sys/boot/zfs/zfsimpl.c
@@ -1001,7 +1001,7 @@ zio_read(spa_t *spa, const blkptr_t *bp, void *buf)
 		if (DVA_GET_GANG(dva)) {
 			printf("ZFS: gang block detected!\n");
 			if (zio_read_gang(spa, bp, dva, buf))
-				return (EIO);
+				continue;
 		} else {
 			vdevid = DVA_GET_VDEV(dva);
 			offset = DVA_GET_OFFSET(dva);

This should be applied in addition to the previous patch.
If this still doesn't work, then it would make sense to add printfs in various places of the zio_read_gang() function to try to see what happens there.

--
Andriy Gapon
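For readers following along, the effect of replacing `return (EIO)` with `continue` can be illustrated with a small stand-alone C sketch. This is not the boot code. It assumes only that a block pointer can reference more than one copy (DVA) of the same data, so a failed read of one copy need not be fatal. All names here (`toy_read_any_copy`, `toy_flaky_read`, `toy_dead_read`, `TOY_DVAS_PER_BP`) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_DVAS_PER_BP 3	/* hypothetical: copies referenced by one block pointer */

/* Reader callback: returns 0 on success, non-zero on failure. */
typedef int (*toy_read_fn)(int copy, char *buf, size_t len);

/* Hypothetical reader: copy 0 is unreadable, every other copy is fine. */
int
toy_flaky_read(int copy, char *buf, size_t len)
{
	if (copy == 0)
		return (1);
	snprintf(buf, len, "kernel");
	return (0);
}

/* Hypothetical reader: every copy is unreadable. */
int
toy_dead_read(int copy, char *buf, size_t len)
{
	(void)copy;
	(void)buf;
	(void)len;
	return (1);
}

/* Try each copy in turn; fail only when all copies have failed. */
int
toy_read_any_copy(toy_read_fn read_copy, char *buf, size_t len)
{
	int i;

	assert(read_copy != NULL);
	for (i = 0; i < TOY_DVAS_PER_BP; i++) {
		if (read_copy(i, buf, len))
			continue;	/* an early return here would ignore the other copies */
		return (0);
	}
	return (-1);
}
```

With `toy_flaky_read` the loop skips the bad first copy and succeeds on the second. With `toy_dead_read` it still reports failure, but only after every copy has been tried.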
Here's a new patch that, as I strongly believe, should fix the problem for real. I am sending "production ready" version of the patch, please keep "ZFS: gang block detected!" message in your sources during testing/verification. Thanks! -- Andriy Gapon
State Changed From-To: open->analyzed It seems that I've got interested and involved in this PR.
Responsible Changed From-To: freebsd-fs->avg It seems that I've got interested and involved in this PR.
Author: avg
Date: Fri May 28 07:34:20 2010
New Revision: 208610
URL: http://svn.freebsd.org/changeset/base/208610

Log:
  boot/zfs: fix gang block reading code

  - use correct size (512) while reading a gang block
  - skip holes while reading child blocks
  - advance buffer pointer while reading child blocks

  PR: 144214
  MFC after: 10 days

Modified:
  head/sys/boot/zfs/zfsimpl.c

Modified: head/sys/boot/zfs/zfsimpl.c
==============================================================================
--- head/sys/boot/zfs/zfsimpl.c	Fri May 28 06:49:57 2010	(r208609)
+++ head/sys/boot/zfs/zfsimpl.c	Fri May 28 07:34:20 2010	(r208610)
@@ -958,12 +958,17 @@ zio_read_gang(spa_t *spa, const blkptr_t
 		break;
 	if (!vdev || !vdev->v_read)
 		return (EIO);
-	if (vdev->v_read(vdev, bp, &zio_gb, offset, SPA_GANGBLOCKSIZE))
+	if (vdev->v_read(vdev, NULL, &zio_gb, offset, SPA_GANGBLOCKSIZE))
 		return (EIO);
 
 	for (i = 0; i < SPA_GBH_NBLKPTRS; i++) {
-		if (zio_read(spa, &zio_gb.zg_blkptr[i], buf))
+		blkptr_t *gbp = &zio_gb.zg_blkptr[i];
+
+		if (BP_IS_HOLE(gbp))
+			continue;
+		if (zio_read(spa, gbp, buf))
 			return (EIO);
+		buf = (char*)buf + BP_GET_PSIZE(gbp);
 	}
 
 	return (0);
@@ -994,9 +999,8 @@ zio_read(spa_t *spa, const blkptr_t *bp,
 			continue;
 
 		if (DVA_GET_GANG(dva)) {
-			printf("ZFS: gang block detected!\n");
 			if (zio_read_gang(spa, bp, dva, buf))
-				return (EIO);
+				continue;
 		} else {
 			vdevid = DVA_GET_VDEV(dva);
 			offset = DVA_GET_OFFSET(dva);
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
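The three points of the commit log (skip holes, read each child, advance the buffer pointer) can be sketched in isolation as follows. This is a toy model under stated assumptions, not the code in zfsimpl.c: the `toy_*` types and functions are hypothetical, a child block is modelled as an in-memory string rather than an on-disk block, and the bounds check on the output buffer is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_GBH_NBLKPTRS 3	/* hypothetical: child pointers per gang header */

/* Toy block pointer: a NULL payload stands for a hole. */
struct toy_bp {
	const char *data;	/* payload of the child block, NULL for a hole */
	size_t psize;		/* physical size of the child in bytes */
};

struct toy_gang_header {
	struct toy_bp blkptr[TOY_GBH_NBLKPTRS];
};

static int
toy_bp_is_hole(const struct toy_bp *bp)
{
	return (bp->data == NULL);
}

/* Stand-in for reading one child block into buf. */
static int
toy_read(const struct toy_bp *bp, void *buf)
{
	memcpy(buf, bp->data, bp->psize);
	return (0);
}

/*
 * Reassemble the children of a gang header into one contiguous buffer.
 * Returns 0 and stores the number of bytes written in *lenp, or -1 on error.
 */
int
toy_read_gang(const struct toy_gang_header *gh, void *buf, size_t bufsize,
    size_t *lenp)
{
	size_t total = 0;
	int i;

	assert(gh != NULL && buf != NULL && lenp != NULL);
	for (i = 0; i < TOY_GBH_NBLKPTRS; i++) {
		const struct toy_bp *gbp = &gh->blkptr[i];

		if (toy_bp_is_hole(gbp))
			continue;		/* unused slot: nothing to read */
		if (total + gbp->psize > bufsize)
			return (-1);		/* would overflow the output buffer */
		if (toy_read(gbp, buf))
			return (-1);
		/* Without this step every child would overwrite the previous one. */
		buf = (char *)buf + gbp->psize;
		total += gbp->psize;
	}
	*lenp = total;
	return (0);
}
```

If the pointer advance is left out, each child lands at the start of the buffer and only the last one survives, which would match the symptom of a loader that cannot be read completely.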
State Changed From-To: analyzed->patched The fix is committed to head.
2010/5/26 Andriy Gapon <avg@freebsd.org>:
> Here's a new patch that, as I strongly believe, should fix the problem for real.
> I am sending "production ready" version of the patch, please keep "ZFS: gang
> block detected!" message in your sources during testing/verification.

Yes, this patch works. After reinitializing boot code the message "ZFS: gang block detected!" appears multiple times but system proceeds with the boot sequence.

--
Sphinx of black quartz judge my vow.
Author: avg
Date: Mon Jun 7 13:37:13 2010
New Revision: 208892
URL: http://svn.freebsd.org/changeset/base/208892

Log:
  MFC r208610: boot/zfs: fix gang block reading code

  - use correct size (512) while reading a gang block
  - skip holes while reading child blocks
  - advance buffer pointer while reading child blocks

  PR: 144214
  Approved by: re(kib)

Modified:
  stable/8/sys/boot/zfs/zfsimpl.c
Directory Properties:
  stable/8/sys/ (props changed)
  stable/8/sys/amd64/include/xen/ (props changed)
  stable/8/sys/cddl/contrib/opensolaris/ (props changed)
  stable/8/sys/contrib/dev/acpica/ (props changed)
  stable/8/sys/contrib/pf/ (props changed)
  stable/8/sys/dev/xen/xenpci/ (props changed)
  stable/8/sys/geom/sched/ (props changed)

Modified: stable/8/sys/boot/zfs/zfsimpl.c
==============================================================================
--- stable/8/sys/boot/zfs/zfsimpl.c	Mon Jun 7 11:33:20 2010	(r208891)
+++ stable/8/sys/boot/zfs/zfsimpl.c	Mon Jun 7 13:37:13 2010	(r208892)
@@ -958,12 +958,17 @@ zio_read_gang(spa_t *spa, const blkptr_t
 		break;
 	if (!vdev || !vdev->v_read)
 		return (EIO);
-	if (vdev->v_read(vdev, bp, &zio_gb, offset, SPA_GANGBLOCKSIZE))
+	if (vdev->v_read(vdev, NULL, &zio_gb, offset, SPA_GANGBLOCKSIZE))
 		return (EIO);
 
 	for (i = 0; i < SPA_GBH_NBLKPTRS; i++) {
-		if (zio_read(spa, &zio_gb.zg_blkptr[i], buf))
+		blkptr_t *gbp = &zio_gb.zg_blkptr[i];
+
+		if (BP_IS_HOLE(gbp))
+			continue;
+		if (zio_read(spa, gbp, buf))
 			return (EIO);
+		buf = (char*)buf + BP_GET_PSIZE(gbp);
 	}
 
 	return (0);
@@ -994,9 +999,8 @@ zio_read(spa_t *spa, const blkptr_t *bp,
 			continue;
 
 		if (DVA_GET_GANG(dva)) {
-			printf("ZFS: gang block detected!\n");
 			if (zio_read_gang(spa, bp, dva, buf))
-				return (EIO);
+				continue;
 		} else {
 			vdevid = DVA_GET_VDEV(dva);
 			offset = DVA_GET_OFFSET(dva);
Author: avg
Date: Mon Jun 7 13:44:04 2010
New Revision: 208893
URL: http://svn.freebsd.org/changeset/base/208893

Log:
  MFC r208610: boot/zfs: fix gang block reading code

  - use correct size (512) while reading a gang block
  - skip holes while reading child blocks
  - advance buffer pointer while reading child blocks

  PR: 144214

Modified:
  stable/7/sys/boot/zfs/zfsimpl.c
Directory Properties:
  stable/7/sys/ (props changed)
  stable/7/sys/cddl/contrib/opensolaris/ (props changed)
  stable/7/sys/contrib/dev/acpica/ (props changed)
  stable/7/sys/contrib/pf/ (props changed)

Modified: stable/7/sys/boot/zfs/zfsimpl.c
==============================================================================
--- stable/7/sys/boot/zfs/zfsimpl.c	Mon Jun 7 13:37:13 2010	(r208892)
+++ stable/7/sys/boot/zfs/zfsimpl.c	Mon Jun 7 13:44:04 2010	(r208893)
@@ -914,12 +914,17 @@ zio_read_gang(spa_t *spa, const blkptr_t
 		break;
 	if (!vdev || !vdev->v_read)
 		return (EIO);
-	if (vdev->v_read(vdev, bp, &zio_gb, offset, SPA_GANGBLOCKSIZE))
+	if (vdev->v_read(vdev, NULL, &zio_gb, offset, SPA_GANGBLOCKSIZE))
 		return (EIO);
 
 	for (i = 0; i < SPA_GBH_NBLKPTRS; i++) {
-		if (zio_read(spa, &zio_gb.zg_blkptr[i], buf))
+		blkptr_t *gbp = &zio_gb.zg_blkptr[i];
+
+		if (BP_IS_HOLE(gbp))
+			continue;
+		if (zio_read(spa, gbp, buf))
 			return (EIO);
+		buf = (char*)buf + BP_GET_PSIZE(gbp);
 	}
 
 	return (0);
@@ -950,9 +955,8 @@ zio_read(spa_t *spa, const blkptr_t *bp,
 			continue;
 
 		if (DVA_GET_GANG(dva)) {
-			printf("ZFS: gang block detected!\n");
 			if (zio_read_gang(spa, bp, dva, buf))
-				return (EIO);
+				continue;
 		} else {
 			vdevid = DVA_GET_VDEV(dva);
 			offset = DVA_GET_OFFSET(dva);
State Changed From-To: patched->closed Should be resolved now in all stable branches.