Created attachment 232513
dmesg

I have a 2TB Seagate Barracuda drive with a SATA/USB adapter that I use for backups and archives. I created a ZFS pool (named Primary) on this drive using the entire drive, not a partition, as the ZFS docs recommend. This drive was previously used with a Linux system and had a GPT partition scheme with one partition containing an ext4 filesystem.

If I have this drive connected to my system when I power it on, the first thing to note is that I get some chatter in /var/log/messages about a corrupted GPT partition table:

Mar 17 07:59:38 pangloss kernel: GEOM: da0: the primary GPT table is corrupt or invalid.
Mar 17 07:59:38 pangloss kernel: GEOM: da0: using the secondary instead -- recovery strongly advised.

da0 is the device address of the disk.

When the system comes up, 'df' indicates that the pool is mounted where it is supposed to be and the space utilization numbers are correct. But 'ls' of the mountpoint returns absolutely nothing. None of the files are accessible. As root, I umount the pool and remount it, and now it mounts correctly.

If the drive is not connected to the system at boot time and I later connect it, I get similar messages about a corrupt primary GPT table. If I try to mount with

zfs mount Primary

the mount fails. I don't have the error messages in front of me as I write this. I will try to duplicate and get the exact sequence and report in a subsequent comment.

zpool status Primary

said that the pool did not exist.

I then rebooted the system with the drive plugged in and got the same behavior described above (blank filesystem at first, everything ok after umount/mount).

dmesg attached.
On the not-plugged-in at boot-time case:

Plugging the disk in results in the GPT primary table corruption messages previously reported.

Attempting

zfs mount Primary

results in "Cannot open 'Primary': data set does not exist"

zpool status Primary:

"cannot open 'Primary': no such pool"

As I said in my earlier post, rebooting with the drive plugged in results in the pool being mounted:

Filesystem            Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default    219G    7.8G    211G     4%    /
devfs                 1.0K    1.0K      0B   100%    /dev
zroot/tmp             211G     17M    211G     0%    /tmp
zroot                 211G     96K    211G     0%    /zroot
zroot/usr/ports       211G     96K    211G     0%    /usr/ports
zroot/var/mail        211G    144K    211G     0%    /var/mail
zroot/var/audit       211G     96K    211G     0%    /var/audit
zroot/var/crash       211G     96K    211G     0%    /var/crash
zroot/usr/home        215G    4.2G    211G     2%    /usr/home
zroot/var/log         211G    296K    211G     0%    /var/log
zroot/var/tmp         211G     96K    211G     0%    /var/tmp
zroot/usr/src         211G     96K    211G     0%    /usr/src
Primary               1.8T    346G    1.4T    19%    /usr/home/dca/Primary

The space utilization numbers are correct but 'ls Primary' or 'ls' of any files I know are there returns nothing.

zfs umount Primary
zfs mount Primary

gets the pool mounted properly.
What does 'gpart show' display about the USB disk?
root@pangloss:/usr/home/dca # ls /dev/da*
/dev/da0
root@pangloss:/usr/home/dca # gpart show /dev/da0
gpart: No such geom: /dev/da0.
root@pangloss:/usr/home/dca #
(In reply to donaldcallen from comment #0)

> … using the entire drive, not a partition, as the ZFS docs recommend. …

Can you recall what was used for partition management?

Maybe see also bug 262241.
(In reply to Graham Perrin from comment #4)

If you mean how the partitions were set up prior to creating a zfs pool on the whole disk, I used gpart to create a GPT partitioning scheme with one partition containing a UFS2 filesystem.

For what it's worth, I am guessing that the messages about the primary GPT being corrupted are a major clue. I am further guessing that the zfs pool creation left the secondary GPT table intact, confusing another part of the system into thinking this device had a GPT setup that had had its primary table corrupted. Note that it says it is using the secondary GPT table, so apparently it thinks that is valid.

If I were trying to fix this, I would investigate this line of reasoning (I'm an old -- literally -- OS internals guy, having run the Tenex project for years at BBN in the ARPANet and early Internet days).
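
The guess above could be checked along these lines. This is only a sketch under stated assumptions: the GPT backup header sits in the last logical block of the disk and begins with the 8-byte signature "EFI PART"; the helper name has_backup_gpt is made up for illustration; sectors are assumed to be 512 bytes; and the argument is assumed to be a regular image file copied from the disk, not the live device node.

```shell
#!/bin/sh
# Hypothetical helper (not part of any FreeBSD or ZFS tool).
# Returns 0 if the last 512-byte sector of the given image file starts
# with the GPT header signature "EFI PART", non-zero otherwise.
# Assumptions: 512-byte logical sectors; "$1" is a regular file.
has_backup_gpt() {
    # Take the final sector, keep its first 8 bytes, and drop NUL bytes
    # so that an all-zero sector compares as an empty string.
    sig=$(tail -c 512 "$1" | head -c 8 | tr -d '\0')
    [ "$sig" = "EFI PART" ]
}
```

For example, 'has_backup_gpt disk.img && echo "backup GPT header still present"' would print the message only if the signature survived at the end of the image. A positive result would be consistent with the theory but would not by itself prove that pool creation is what left the header behind.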
There is a possibly related additional issue here. If I disconnect the backup disk prior to system startup, to avoid the problem described in this PR, and then plug the disk in when the system is up, I cannot mount it. 'zfs mount <filesystem label>' fails. 'zpool status <filesystem label>' says it knows nothing about the filesystem. The only way I can get this filesystem mounted is to have the disk plugged in on startup and then go through the umount-mount sequence to get it mounted properly.
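
One thing that may be worth ruling out here, offered as an assumption and not a diagnosis: 'zfs mount' and 'zpool status' only operate on pools that are currently imported, and a pool on a disk attached after boot generally has to be imported first. The sketch below prints the commands by default so nothing runs by accident; the function name import_pool and the RUN variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper. By default each command is only printed.
# Call it as "RUN= import_pool Primary" (RUN set to the empty string)
# to actually execute the commands as root.
import_pool() {
    pool=$1
    run=${RUN-echo}
    $run zpool import            # list pools that are available for import
    $run zpool import "$pool"    # import the named pool
    $run zpool status "$pool"    # confirm the pool is now known
}
```

Running 'import_pool Primary' prints the three commands in order. If the first command, run for real, does not list the pool at all, the problem lies elsewhere.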
(In reply to donaldcallen from comment #5)

I tried something other than FreeBSD on the machine I have written about here and that experiment did not work out. So I have re-installed FreeBSD. In doing so, I revisited the problem caused by creating a pool on the whole backup device, rather than using a partition. This works, but produces error messages about the primary GPT partition table being corrupt, as I mentioned in an earlier comment here.

I did some googling on this, and I think my speculation in the comment to which I'm replying is correct. Others have seen this issue and one person said that zpool create was not wiping the secondary GPT table, confusing the system at mount time, exactly as I guessed. His recommendation was to use gpart destroy before creating the pool, which wipes both GPT tables.

I think this is a bug in zpool create. If it's going to create a pool, it needs to do so in a way that doesn't cause problems when the pool is mounted. That includes wiping the secondary GPT table.
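
The recommendation relayed above could be sketched as follows. Both steps destroy whatever is on the disk, so the sketch only prints the commands by default. The function name recreate_pool and the RUN variable are made up for illustration, and the claim that 'gpart destroy' clears both GPT copies comes from the report quoted above, not from anything verified here.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper. DESTRUCTIVE if executed for real.
# By default each command is only printed. Call it as
# "RUN= recreate_pool da0 Primary" (RUN set to the empty string)
# to actually execute the commands as root.
recreate_pool() {
    disk=$1
    pool=$2
    run=${RUN-echo}
    $run gpart destroy -F "$disk"      # -F forces removal of a non-empty table
    $run zpool create "$pool" "$disk"  # then create the pool on the whole disk
}
```

Running 'recreate_pool da0 Primary' prints the two commands without touching the disk, which makes it easy to double-check the device name before doing anything irreversible.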
I should also mention that the problem of my backup pool not being mounted correctly on startup is still present in 13.1. df says the pool is mounted but ls of the mountpoint shows no files/directories present. umount and re-mount fixes it, as I reported on 2022-03-17.
This problem is still present in 13.2, over a year and a half after I originally reported it. Again, I emphasize that if you do what the documentation says you should do (right at the beginning of the section on ZFS) in creating a pool, you may well have this problem.

A simple solution would be to change the documentation, suggesting that people create a partition table and a single full-disk partition, and create their pool in the partition, avoiding this problem. I don't understand why, after a year and a half, this simple, temporary documentation fix hasn't been done.

One of the frustrations of using FreeBSD is that simple problems like this get addressed at glacial speed. The system is great. Why damage its reputation by not taking care of the easy problems quickly?
There is an additional wrinkle to this problem. I tried following my own recommendation and created a GPT partition scheme on a mobile disk and one full-disk partition. I then created a ZFS pool in that partition. The problem is that if I have that file-system mounted when I shut the system down and leave the drive plugged in when I reboot, df tells me that the file-system is mounted in the correct place, as I would expect. Except there are no files visible to ls or any other attempts to access files in that file-system. df reports the space utilization I would expect, but no files are visible. If I zfs unmount the file-system and then re-mount it, all is well. I *think* this is an OpenZFS problem, as some searching turns up reports of similar behavior seen by people running ZFS on Linux.
^Triage: I'm sorry that this PR did not get addressed in a timely fashion. By now, the version that it was created against is out of support. Please re-open if it is still a problem on a supported version.