Bug 243421 - Zfs send/recv backup over USB 3.0/eSATA crashes /dev after 466 GB
Status: Closed Not A Bug
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.1-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-01-18 05:18 UTC by Jonathan Vasquez
Modified: 2020-01-19 03:52 UTC
Description Jonathan Vasquez 2020-01-18 05:18:45 UTC
Hello all,

Earlier this week I posted about this in extensive detail on the FreeBSD forums, hoping to find a solution before having to file a bug report (https://forums.freebsd.org/threads/dev-directory-basically-acompletely-empty-after-4-8-hours-after-i-start-an-external-drive-over-usb-backup-zfs-send-recv.73699/). Various tests and new equipment (with FreeBSD-compatible parts) still haven't solved the issue.

TL;DR: For whatever reason, when I do a zfs send | recv from my OS pool (tank) to my backup pool (backup) over USB 3.0 or eSATA, once the stream hits 466 GB (469 GB in this particular dataset, and a few hundred GB in other datasets), the zfs program crashes and the entire /dev directory vanishes. The only thing left in /dev afterwards is "null". This effectively means that even though FreeBSD is still running, most applications stop functioning because they can't find any resources in /dev (for example, zpool status no longer works since it can't communicate with the hardware via /dev/zfs). It's very weird, since I don't see any error messages in 'dmesg'. It's like a "clean crash".

At the moment I will leave FreeBSD running on the box to see if we can resolve this, but if not I will need to switch back to Linux since this is literally preventing me from making my external pool backups.

Thank you and please let me know if there is any other info I can provide.
Comment 1 Jonathan Vasquez 2020-01-18 14:27:18 UTC
Below is a link to an image showing the errors and the nuked /dev dir.

https://imgur.com/ae0mivg
Comment 2 Jonathan Vasquez 2020-01-18 14:52:27 UTC
This person from May 2013 seems to have had a similar issue back in the FreeBSD 9.0 days.

https://lists.freebsd.org/pipermail/freebsd-stable/2013-May/073600.html

To add some more info about my server: 'tank' is my main pool and the OS is running on one of the datasets there. My swap is also on ZFS and is mirrored across the drives. The send|recv sends the data from a tank snapshot to the receiving pool.
Comment 3 Jonathan Vasquez 2020-01-18 16:58:43 UTC
As a temporary workaround, I was able to successfully back up all 654 GB (639G compressed on main pool) of data I have from my main system to the external zfs pool over rsync. I just can't use zfs replication to do it due to this issue.
Comment 4 Jonathan Vasquez 2020-01-19 03:52:46 UTC
So I'm an idiot. There is no bug. On Linux I always used to do 'zpool create -N <>' or 'zpool create -R <>', or both, because I wanted to avoid mountpoint collisions. On BSD I didn't use those flags and just imported the backup pool directly. As a result, when I sent my 'tank' datasets to 'backup', the received datasets (including /dev and everything else) were mounted at the same mountpoints as the running OS and collided with it. That explains why my /dev directory vanished; the OS probably freaked out. Once I used the flags above, I avoided the collision and the replication now succeeds. It also explains why rsync worked: it just copied the files to the other side, with no mountpoint collisions.
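
For anyone who lands here with the same symptom, the collision described above can be sketched in a few lines of Python. This is only an illustrative model, not ZFS code: the dataset names, the mountpoint values, and the helper functions `effective_mountpoint` and `collisions` are all hypothetical. It assumes the received datasets keep the same mountpoint properties as the source (as can happen with a replication stream) and that an altroot, the setting behind the -R flag, is simply prepended to each mountpoint.

```python
import posixpath


def effective_mountpoint(mountpoint, altroot=None):
    """Return where a dataset would be mounted, or None if it is not auto-mounted.

    Hypothetical helper: models an altroot being prepended to the
    dataset's mountpoint property.
    """
    if mountpoint in ("none", "legacy"):
        return None
    if altroot is None:
        return mountpoint
    return posixpath.normpath(posixpath.join(altroot, mountpoint.lstrip("/")))


def collisions(live, received, altroot=None):
    """List (dataset, path) pairs from `received` that would mount over a live path."""
    live_paths = {p for p in live.values() if p not in ("none", "legacy")}
    found = []
    for name, mountpoint in sorted(received.items()):
        target = effective_mountpoint(mountpoint, altroot)
        if target in live_paths:
            found.append((name, target))
    return found


# Hypothetical layout: the running system lives on 'tank'.
live = {"tank/os": "/", "tank/home": "/home"}
# Received copies on 'backup' carrying the same mountpoint properties.
received = {"backup/os": "/", "backup/home": "/home"}

# Without an altroot, the copies land on top of the running system.
print(collisions(live, received))
# With an altroot (assumed here to be /mnt/backup), nothing overlaps.
print(collisions(live, received, altroot="/mnt/backup"))
```

In this model, a copy of the OS dataset mounting at "/" covers the live root, which is consistent with /dev appearing to vanish. With the altroot set, the same datasets resolve to paths under /mnt/backup instead.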