I have a mac pro running 9-STABLE. Two disks are part of a bootable zfs
mirror. They're MBR based.

------------------------------------------------------------------------------------
(delicious)[8:00pm]~>>gpart show ada1
=>        63  1953525105  ada1  MBR  (931G)
          63  1953525105     1  freebsd  [active]  (931G)

(delicious)[8:01pm]~>>gpart show ada1s1
=>          0  1953525105  ada1s1  BSD  (931G)
            0  1941962752       1  freebsd-zfs   (926G)
   1941962752    11562353       2  freebsd-swap  (5.5G)
------------------------------------------------------------------------------------

The mirror is currently resilvering, unrelated to this bug report.

------------------------------------------------------------------------------------
(delicious)[8:01pm]~>>zpool status zroot
  pool: zroot
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Apr  1 19:55:59 2012
        12.9G scanned out of 523G at 34.7M/s, 4h11m to go
        12.9G resilvered, 2.47% done
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            ada3s1a  ONLINE       0     0     0
            ada1s1a  ONLINE       0     0     0  (resilvering)

errors: No known data errors
------------------------------------------------------------------------------------

/boot/loader.conf contains:

        vfs.root.mountfrom="zfs:zroot"

and zroot has its bootfs set to zroot.

This system boots from either disk and runs happily.

I tried a zpool split on it

        zpool split zroot zsplitroot

and it booted until the kernel tried to mount the root filesystem, at
which point it failed.

Fix: I was able to repair the situation by booting from a 9.0 DVD,
loading the zfs kernel module, and running a plain zpool import, which
showed both pools. I then did:

        zpool import -f -o cachefile=/tmp/zpool.cache -o altroot=/mnt zroot
        mount -t zfs zroot /mnt
        cp /tmp/zpool.cache /mnt/boot/zfs/zpool.cache

and finished by destroying the zsplitroot pool and attaching its disk
back to the mirror.

How-To-Repeat: I believe that setting up a bootable zfs mirror and
running zpool split on it should repeat the problem. It does for me.
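The report describes the final repair step (destroying the split-off
pool and re-mirroring) without spelling out the commands; a minimal
sketch, using the pool and device names from above:

        zpool destroy zsplitroot            # discard the split-off pool
        zpool attach zroot ada3s1a ada1s1a  # reattach the disk to the mirror;
                                            # this starts the resilver shown above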
Responsible Changed From-To: freebsd-bugs->freebsd-fs
Over to maintainer(s).
A few things missing from your report:

1. "Doesn't boot" is quite a poor description in comparison with the
   other details that you provided. You should give more detailed
   information about the boot failure.
2. gpart information for ada3.
3. You don't say which disk ended up as zroot and which as zsplitroot
   after the split.
4. You don't say which disk is configured as the boot disk in the BIOS.

--
Andriy Gapon
Thanks for following up on this.

Andriy Gapon writes:
 > A few things missing from your report:
 >
 > 1. "Doesn't boot" is quite a poor description in comparison with the
 >    other details that you provided. You should give more detailed
 >    information about the boot failure.

As the kernel is loading it fails to mount the root partition and
presents one with the minimal mountroot dialog. Attempting to boot
from zfs:zroot or zfs:zsplitroot fails. I remember that a question
mark lists various other devices, but I don't remember the
particulars.

 > 2. gpart information for ada3.

Identical to ada1. Both disks have an MBR with one slice, which has a
BSD label with two partitions: a (926GB, type freebsd-zfs) and b
(5.5GB, type freebsd-swap).

(delicious)[8:45am]~>>gpart show ada1
=>        63  1953525105  ada1  MBR  (931G)
          63  1953525105     1  freebsd  [active]  (931G)

(delicious)[8:45am]~>>gpart show ada1s1
=>          0  1953525105  ada1s1  BSD  (931G)
            0  1941962752       1  freebsd-zfs   (926G)
   1941962752    11562353       2  freebsd-swap  (5.5G)

(delicious)[8:46am]~>>gpart show ada3
=>        63  1953525105  ada3  MBR  (931G)
          63  1953525105     1  freebsd  [active]  (931G)

(delicious)[8:46am]~>>gpart show ada3s1
=>          0  1953525105  ada3s1  BSD  (931G)
            0  1941962752       1  freebsd-zfs   (926G)
   1941962752    11562353       2  freebsd-swap  (5.5G)

Both have boot bits set up like this:

        gpart bootcode -b /boot/boot0 adaX
        dd if=/boot/zfsboot of=/dev/adaXs1 count=1
        dd if=/boot/zfsboot of=/dev/adaXs1a skip=1 seek=1024

 > 3. You don't say which disk ended up as zroot and which as zsplitroot
 >    after the split.

zpool status showed only zroot (ada3s1a), and zpool import showed
zsplitroot (ada1s1a).

 > 4. You don't say which disk is configured as the boot disk in the BIOS.

This is a mac pro (tower), so BIOS is kind of a slippery concept. I
leave the 'startup disk' set to the (other) OS X disks. On power up I
hold down the option key and am presented with a dialog from which I
can select any of the bootable devices in the box.

When things are working correctly I can boot from either of the disks
in the ZFS mirror and things go well. Now that I've upgraded I can
even pull one of the disks before powering up and boot from the other
(older zfs bootstrapping stuff used to have a problem with broken
mirrors). After the zfs split I am unable to boot from either disk.

g.
on 02/04/2012 18:47 George Hartzell said the following:
> [...]
> As the kernel is loading it fails to mount the root partition and
> presents one with the minimal mountroot dialog. Attempting to boot
> from zfs:zroot or zfs:zsplitroot fails. I remember that a question
> mark lists various other devices, but I don't remember the
> particulars.

Thank you for the additional detailed information.

Could you please set vfs.zfs.debug=1 in your loader.conf, reproduce
the problem, and then report the messages that appear just before and
during the mount attempt? Pictures of your screen would do just fine
if you are unable to capture the messages as text. If you are unsure
what to report, please report more rather than less.

--
Andriy Gapon
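For reference, the requested setting as it would appear in
/boot/loader.conf (loader tunables are quoted strings in that file):

        vfs.zfs.debug="1"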
[restoring bug-followup]

on 25/06/2012 19:48 George Hartzell said the following:
> Here are two images (more than one screen full) of the lsdev -v output
> from the loader that *actually loads the system* (when it's
> working...).

I have a theory of what's going on.

I believe that after zpool split the following items get updated with
new information:

 - the vdev label on the disk that remains in the main pool (ada3 + zroot)
 - the vdev label on the disk that goes to the new pool (ada1 + zsplitroot)
 - the zpool.cache file in the main/active/remaining pool (zroot)

The following item still has outdated information:

 - the zpool.cache file in the new pool (zsplitroot)

This happens because the new pool gets the contents of the original
pool as of split start time (before any new ids are generated). The
file cannot be updated automatically because the new pool remains
"un-imported" (exported) after the split. If you want zsplitroot's
zpool.cache to be updated, it has to be done manually - by importing
the pool, etc.

I believe that what you see is a result of always booting in such a
way that the zfs boot code and zfsloader find the zsplitroot pool
before the zroot pool. This is confirmed by the screenshot, which
shows that zsplitroot is listed before zroot. Because of that the
stale zpool.cache file is used, and as a result the ZFS code in the
kernel cannot find disks/pools based on the stale IDs.

I think that you have to change the boot order using the BIOS, so that
you boot from the ada3 disk. You should verify at the loader prompt
that that is indeed the case and that zroot is found first and is used
as the boot pool.

If your BIOS either doesn't allow you to change the boot order, lies
about it, or doesn't change the BIOS disk numbering so that the boot
disk is the first drive (disk0 / "BIOS drive C"), then I recommend
that you set the 'currdev' loader variable to point to the zroot pool.
Depending on your zfsloader version it should be done in one of the
following ways:

        set currdev=zfs:zroot:
        set currdev=zfs1

You can examine the default value of the variable (with the 'show'
command) to see which scheme should be used.

Please test this.

--
Andriy Gapon
Andriy Gapon writes:
 > [...]
 > I have a theory of what's going on.
 >
 > I believe that after zpool split the following items get updated with
 > new information:
 >
 >  - the vdev label on the disk that remains in the main pool (ada3 + zroot)
 >  - the vdev label on the disk that goes to the new pool (ada1 + zsplitroot)
 >  - the zpool.cache file in the main/active/remaining pool (zroot)
 >
 > The following item still has outdated information:
 >
 >  - the zpool.cache file in the new pool (zsplitroot)
 > [...]
 > If your BIOS either doesn't allow you to change the boot order, lies
 > about it, or doesn't change the BIOS disk numbering so that the boot
 > disk is the first drive (disk0 / "BIOS drive C"), then I recommend
 > that you set the 'currdev' loader variable to point to the zroot pool.
 > [...]
 > Please test this.

We're very close.

First thing, I discovered that I was wrong about its being able to
boot from either disk. I don't think I ever misspoke to you (I don't
see it in the bug report and can't find it in our personal emails),
but I certainly had it in my head that I had tried booting from both
disks and neither worked. It turns out that one will boot but the
other will not.

Some background (for the bug trail). This is a mac pro with four
internal SATA disks. When you power on with the option key held down
you're presented with a graphic view of the things that you can boot
from. In my configuration I have two disks labeled "dinky 1" and
"dinky 2" (Mac disks in a Mac OS X software raid) and two labeled
"Windows" (the boot stuff considers anything with an MBR to be
Windows, sigh...). Macs play fast and loose with device
numbering/naming and I have no way to tell the Windows devices apart.
For the rest of this discussion I'll just refer to the four drives as
A, B, C, and D (dinky 1, dinky 2, Windows, Windows) in left to right
order. Out of habit, I tend to boot into BSD on drive C (two hits of
the right arrow key).

While I was playing with setting currdev in the loader I realized
that I could boot from disk D but not from disk C (no matter how I
set currdev).
It turns out that when I boot from drive C and do an lsdev -v at the
loader prompt I get

        ...
        zsplitroot  ada1s1a
        zroot       ada3s1a

but when I boot from drive D and do an lsdev -v at the loader prompt
I get

        ...
        zroot       ada3s1a
        zsplitroot  ada1s1a

Notice that the order is different (confirming your observation).

When I boot off of C using vfs.zfs.debug=1 I get messages about
mismatched GUIDs and a failure to open the device it's looking for.
When I boot off of D things are fine. This is consistent with your
idea that there is incorrect information in the zpool.cache file on
the filesystem in the zsplitroot pool.

currdev does not seem to have any effect; it looks like something
else is being used to find the initial zfs pool(s).

I'm not sure what there is to do to make the situation better. It's
(probably) not the usual use case for zpool split to be run on the
pool that contains the filesystem holding the zpool.cache file, so it
would be an awfully special case to do something special to handle it.
It seems like "the right thing" is for the user (me) to do the zpool
split with the -R option and then copy the correct zpool.cache file
into the split-off pool's root filesystem.

I'll repair my currently broken mirror and give that a try.

Thanks for all the help!

g.
on 27/06/2012 00:38 George Hartzell said the following:
> currdev does not seem to have any effect; it looks like something
> else is being used to find the initial zfs pool(s).

Just a note that currdev would not affect the order of the pools in
the lsdev output. It should affect which pool the zpool.cache is
loaded from.

Ah! You probably need to issue the unload command as well. I keep
forgetting that in the default configuration the loader loads things
up before presenting its menu. I've changed my loader.rc so that
nothing is loaded before the menu.

But, yes, the best course of action seems to be to fix up zsplitroot
right after splitting it off.

Thank you for your persistence in testing and debugging!

--
Andriy Gapon
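Putting the pieces together, a minimal sketch of the loader-prompt
sequence this implies (the currdev syntax depends on the zfsloader
version, as noted earlier; untested in this thread):

        OK unload                    # discard the kernel/modules already preloaded
        OK set currdev=zfs:zroot:    # or: set currdev=zfs1, depending on version
        OK boot                      # load and boot from the chosen pool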
Andriy Gapon writes:
 > [...]
 > Ah! You probably need to issue the unload command as well. I keep
 > forgetting that in the default configuration the loader loads things
 > up before presenting its menu. I've changed my loader.rc so that
 > nothing is loaded before the menu.
 >
 > But, yes, the best course of action seems to be to fix up zsplitroot
 > right after splitting it off.

I thought the following would work, but it does not.

        zpool split -R /zsplitroot zroot zsplitroot
        zpool status    # shows both pools
        mount -t zfs zsplitroot /zsplitroot    # my zfs stuff doesn't auto mount
        cp /boot/zfs/zpool.cache /zsplitroot/boot/zfs
        perl -pi.bak -e 's|zfs:zroot|zfs:zsplitroot|' /zsplitroot/boot/loader.conf
        umount /zsplitroot

It fails to mount zsplitroot. Worse, setting vfs.zfs.debug=1 results
in no additional output, just that the error is number 2.

Any idea what I'm missing?

g.
on 28/06/2012 00:53 George Hartzell said the following:
> I thought the following would work, but it does not.
>
>         zpool split -R /zsplitroot zroot zsplitroot
>         zpool status    # shows both pools
>         mount -t zfs zsplitroot /zsplitroot    # my zfs stuff doesn't auto mount
>         cp /boot/zfs/zpool.cache /zsplitroot/boot/zfs
>         perl -pi.bak -e 's|zfs:zroot|zfs:zsplitroot|' /zsplitroot/boot/loader.conf
>         umount /zsplitroot
>
> It fails to mount zsplitroot. Worse, setting vfs.zfs.debug=1 results
> in no additional output, just that the error is number 2.
>
> Any idea what I'm missing?

/boot/zfs/zpool.cache after the split contains only information about
zroot, so it's kind of useless on zsplitroot. I think that you need to
do zpool import -R ... -c ... zsplitroot and copy the proper cache
file.

--
Andriy Gapon
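A sketch of what the corrected sequence might look like, reusing the
cachefile trick from the original report (the /tmp path and mount
point are illustrative, and this exact sequence is untested in the
thread):

        zpool split zroot zsplitroot    # split without importing the new pool
        zpool import -o cachefile=/tmp/zsplit.cache -o altroot=/zsplitroot zsplitroot
        mount -t zfs zsplitroot /zsplitroot    # if the root dataset doesn't auto-mount
        cp /tmp/zsplit.cache /zsplitroot/boot/zfs/zpool.cache
        perl -pi.bak -e 's|zfs:zroot|zfs:zsplitroot|' /zsplitroot/boot/loader.conf
        umount /zsplitroot
        zpool export zsplitroot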
Andriy Gapon writes:
 > [...]
 > /boot/zfs/zpool.cache after the split contains only information about
 > zroot, so it's kind of useless on zsplitroot. I think that you need to
 > do zpool import -R ... -c ... zsplitroot and copy the proper cache
 > file.

I thought that adding the "-R /zsplitroot" arg to the zpool split, so
that it also did the import, would result in a zpool.cache file that
contained both pools. zpool status after the split shows both pools,
which I didn't think was the case if you don't use -R.

g.
on 30/06/2012 02:33 George Hartzell said the following:
> I thought that adding the "-R /zsplitroot" arg to the zpool split, so
> that it also did the import, would result in a zpool.cache file that
> contained both pools. zpool status after the split shows both pools,
> which I didn't think was the case if you don't use -R.

With -R, zsplitroot is added to the main zpool.cache in /boot/zfs (on
zroot). Nothing is done with the zpool.cache in zsplitroot, as far as
I understand.

--
Andriy Gapon
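One way to confirm which pools a given cache file actually describes
is zdb with an alternate cache file; a minimal sketch, with paths as
used in this thread (assuming the split-off pool's filesystem is
mounted under /zsplitroot):

        zdb -C -U /boot/zfs/zpool.cache               # should list only zroot after the split
        zdb -C -U /zsplitroot/boot/zfs/zpool.cache    # what the split-off disk will boot with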
State Changed From-To: open->closed
Analysis has not revealed any FreeBSD ZFS bug.
Responsible Changed From-To: freebsd-fs->avg
Record interest in further developments on this report.