I have a machine "hal", running release/8.2.0, which is equipped with 6 TB disks, labeled disk31..disk36. These are more or less split in half using gpart(8), yielding partitions disk31p3..disk36p3 and disk31p4..disk36p4, each slightly less than 1 TB in size. The first set of these halves, disk31p3..disk36p3, is combined into a raidz2 zpool "hal.1", yielding approximately 4 TB. This pool is used for production. It was created on 2010-10-22, with zpool version 14 and zfs version 3 ("zpool get version hal.1", "zfs get version hal.1"). The second set of these halves, disk31p4..disk36p4, is exported as iSCSI targets using net/istgt. These are used for experimental purposes. On hal, using emulators/virtualbox-ose, I run a virtual machine v903, which I usually keep quite close to head. In v903, using iscontrol(8), I import the iSCSI targets, yielding (in order) da0..da5. I want to see what will happen when I create a zpool under head and try to duplicate the file systems in hal.1 to it. To test this with the file system "hal.1/backup/databases", I start v903 and on it run: [0]# kldload iscsi_initiator.ko iscsi: version 2.3.1 [0]# for i in disk3{1..6}p4 do echo "*** $i ***" iscontrol -n ${i} || break sleep 1 done *** disk31p4 *** iscontrol[1257]: running iscontrol[1257]: (pass2:iscsi0:0:0:0): tagged openings now 0 *** disk32p4 *** iscontrol[1262]: running iscontrol[1262]: (pass3:iscsi1:0:0:0): tagged openings now 0 *** disk33p4 *** iscontrol[1267]: running iscontrol[1267]: (pass4:iscsi2:0:0:0): tagged openings now 0 *** disk34p4 *** iscontrol[1272]: running iscontrol[1272]: (pass5:iscsi3:0:0:0): tagged openings now 0 *** disk35p4 *** iscontrol[1277]: running iscontrol[1277]: (pass6:iscsi4:0:0:0): tagged openings now 0 *** disk36p4 *** iscontrol[1282]: running iscontrol[1282]: (pass7:iscsi5:0:0:0): tagged openings now 0 [0]# zpool create v903.2 raidz2 da0 da1 da2 da3 da4 da5 ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present; to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf. 
ZFS filesystem version: 5 ZFS storage pool version: features support (5000) [0]# zfs create v903.2/backup [0]# rsh -n hal "zfs send -vR hal.1/backup/databases@2012-07-21.14:23:36" | zfs receive -v v903.2/backup/databases sending from @ to hal.1/backup/databases@2011-05-01.08:34:45 receiving full stream of hal.1/backup/databases@2011-05-01.08:34:45 into v903.2/backup/databases@2011-05-01.08:34:45 sending from @2011-05-01.08:34:45 to hal.1/backup/databases@2011-06-02.20:42:45 received 48.2MB stream in 13 seconds (3.71MB/sec) receiving incremental stream of hal.1/backup/databases@2011-06-02.20:42:45 into v903.2/backup/databases@2011-06-02.20:42:45 sending from @2011-06-02.20:42:45 to hal.1/backup/databases@2011-10-15.13:56:24 received 254MB stream in 70 seconds (3.63MB/sec) receiving incremental stream of hal.1/backup/databases@2011-10-15.13:56:24 into v903.2/backup/databases@2011-10-15.13:56:24 sending from @2011-10-15.13:56:24 to hal.1/backup/databases@2011-10-27.19:21:00 received 268MB stream in 75 seconds (3.57MB/sec) receiving incremental stream of hal.1/backup/databases@2011-10-27.19:21:00 into v903.2/backup/databases@2011-10-27.19:21:00 sending from @2011-10-27.19:21:00 to hal.1/backup/databases@2012-03-19.21:39:06 received 305MB stream in 82 seconds (3.72MB/sec) receiving incremental stream of hal.1/backup/databases@2012-03-19.21:39:06 into v903.2/backup/databases@2012-03-19.21:39:06 sending from @2012-03-19.21:39:06 to hal.1/backup/databases@2012-07-21.14:23:36 received 345MB stream in 101 seconds (3.42MB/sec) receiving incremental stream of hal.1/backup/databases@2012-07-21.14:23:36 into v903.2/backup/databases@2012-07-21.14:23:36 cannot receive incremental stream: invalid backup stream rsh -n hal "zfs send -vR hal.1/backup/databases@2012-07-21.14:23:36" 0.22s user 21.94s system 5% cpu 6:33.67 total zfs receive -v v903.2/backup/databases 0.02s user 5.46s system 1% cpu 6:37.24 total [1]# As can be seen, importing the snapshot created on 2012-07-21 fails. On hal, I have the following: [0]# zpool history | grep backup/databases 2010-12-04.10:37:37 zfs create hal.1/backup/databases 2011-05-01.08:34:45 zfs snapshot hal.1/backup/databases@2011-05-01.08:34:45 2011-06-02.20:42:46 zfs snapshot hal.1/backup/databases@2011-06-02.20:42:45 2011-10-15.13:56:24 zfs snapshot hal.1/backup/databases@2011-10-15.13:56:24 2011-10-27.19:21:00 zfs snapshot hal.1/backup/databases@2011-10-27.19:21:00 2012-03-19.21:39:07 zfs snapshot hal.1/backup/databases@2012-03-19.21:39:06 2012-07-21.14:23:37 zfs snapshot hal.1/backup/databases@2012-07-21.14:23:36 [0]# Googling reveals very little information about this problem, one possibility might be http://wesunsolve.net/bugid/id/7002362, but as can be seen above, this file system has never been renamed. I have tried a similar command also with another filesystem in hal.1, which already fails when receiving the first intermediate snapshot.
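One way to narrow down which incremental stream the receiver rejects would be to replay the snapshots one increment at a time instead of sending a single recursive "-R" stream. The following is only a sketch, using the snapshot names listed above, and assumes that any partially received v903.2/backup/databases has been destroyed first:

# Sketch only: send the first snapshot as a full stream, then each following
# snapshot as an individual incremental stream, stopping at the first failure.
fs=hal.1/backup/databases
dst=v903.2/backup/databases
prev=""
for s in 2011-05-01.08:34:45 2011-06-02.20:42:45 2011-10-15.13:56:24 \
         2011-10-27.19:21:00 2012-03-19.21:39:06 2012-07-21.14:23:36
do
  if [ -z "$prev" ]; then
    # full stream of the first snapshot
    rsh -n hal "zfs send ${fs}@${s}" | zfs receive -v "${dst}" || break
  else
    # incremental stream from the previous snapshot
    rsh -n hal "zfs send -i @${prev} ${fs}@${s}" | zfs receive -v "${dst}" || break
  fi
  prev=$s
done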
Responsible Changed From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
There is one more interesting hit with Google:

https://wikis.oracle.com/display/FishWorks/ak-2011.04.24.2.0+Release+Notes

In it, one finds:

---
Issues Addressed

The following CRs have been fixed in this release:
[...]
7056496 cannot receive new filesystem stream: invalid backup stream
---

So it seems Sun/Oracle have already resolved this.

Regards, Martin
Good news... I upgraded the FreeBSD head running in the virtual machine v903 to r241507, with the goal of seeing whether the problem described here has been fixed in the meantime. (The server, hal, is still at the 8.2 release.) In addition, instead of using iSCSI to make disk{31..36}p4 from hal appear in v903, I attached them directly as devices using VirtualBox (all to one and the same virtual SATA controller).

Now the test worked, and I could successfully receive the backup stream. I even made an md5 comparison of all the files in the received file system (> 53000), which went through without a single difference. Very good!

What was left was testing with the original setup again, i.e., using iSCSI. Unfortunately, this still fails in the same way, so it seems that something in iscontrol or istgt is broken. On a hunch, I'd say it is iscontrol, simply because I assume that on the server side (ports/net/istgt) there is not much that can break, it being a userland application, whereas iscontrol is a kernel module hooking into the SCSI subsystem. Therefore, this defect should possibly be reclassified and, instead of [zfs], have some other tag applied.

Regards, Martin

p.s. I am also using istgt with disk images of MS Windows XP installations, exporting them to remotely running VirtualBox instances without any issues - to me a hint that istgt is quite o.k. Of course, the zfs scenario uses 6 iSCSI targets at the same time, whereas the XP emulation uses only one, so maybe either istgt or iscontrol has problems with multiple connections/instances.
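For reference, an md5 comparison like the one mentioned above can be done roughly as follows. This is only a sketch: the mount points /hal.1/backup/databases and /v903.2/backup/databases are assumed ZFS defaults, not verified here.

# Sketch only; mount points are assumptions.
# On hal (the sender):
cd /hal.1/backup/databases && find . -type f -exec md5 -r {} + | sort > /tmp/hal.md5
# In v903 (the receiver), after copying hal.md5 over:
cd /v903.2/backup/databases && find . -type f -exec md5 -r {} + | sort > /tmp/v903.md5
diff /tmp/hal.md5 /tmp/v903.md5 && echo "no differences found"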
One more test, with the goal of determining whether iscontrol or istgt is the culprit...

The test setup is altered as follows: in addition to the 6 partitions used for the zpool on v903, v903's OS partition (GPT, holding /, /usr, and swap) is also exported from hal via istgt, i.e., as an iSCSI target. Then v903 is started as a VirtualBox guest on a Windows 7 host, where the (now 7) partitions are attached via VirtualBox's iSCSI facility (http://www.virtualbox.org/manual/ch05.html#idp17868816) as ada0 and ada1..ada6.

This setup results in a flawless zfs receive of the sent stream, again checked via md5 of all files. In addition, the number (but not the md5) of files in all snapshot directories (.zfs/snapshot/*) was checked. This is a strong indication that iscontrol is indeed the problematic component.

Regards, Martin

p.s. Some observations using this setup: It is a lot faster than running v903 as a VirtualBox guest on hal. I had to do a disk check of v903 because it had crashed, and checking the UFS file systems / and /usr was in fact blazingly fast, much faster than when its containing partition was attached directly with hal as the VirtualBox host. Also, the subsequent zfs receive test ran at a sustained rate of approximately 23 MB/s, whereas with v903 hosted on hal it ran at only 15 MB/s (see my previous test), a speed increase of 50%. One should note that in this test setup there is 1 Gbps Ethernet between hal as the server and the Windows host, whereas otherwise v903 runs directly on hal, which at least in theory should result in much quicker I/O. So it seems that not only does iscontrol need fixing, but VirtualBox on FreeBSD hosts could also use some optimization. But then, FreeBSD is the work of volunteers, and at this point in my tests it is certainly time to thank them for all the great possibilities this system offers!
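For completeness, the per-snapshot file counts can be taken directly from the .zfs/snapshot directory of the received file system; a rough sketch (the mount point is again an assumption):

# Sketch only; the mount point of the received file system is assumed.
for s in /v903.2/backup/databases/.zfs/snapshot/*; do
  printf '%s: %s files\n' "${s##*/}" "$(find "$s" -type f | wc -l | tr -d ' ')"
done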
For bugs matching the following criteria:

  Status: In Progress
  Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
^Triage: I'm sorry that this PR did not get addressed in a timely fashion. By now, the version that it was created against is long out of support. As well, many newer versions of ZFS have been imported. Please re-open if it is still a problem on a supported version.
It is hopefully fixed by now; however, since that time I have not dared to use "-v" with "zfs send" anymore (and I use zfs send/receive heavily nowadays, for backup purposes). Therefore, I cannot even confirm whether this is really fixed. :-)

-- Martin
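For the record, a replication step without "-v" on the sending side might look like the following; pool, file system, and host names here are placeholders, not the actual setup.

# Sketch with placeholder names: incremental replication from @previous to
# @current, with verbose output only on the receiving side.
zfs send -R -i @previous tank/data@current | ssh backuphost zfs receive -v backup/data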