I've had a kernel crash with 3 different RC1 kernels:

kernel: FreeBSD 11.1-RC1 #71 r313908+b08656c3ba28(releng/11.1): Sun Jul 2 23:15:23 PDT 2017
kernel: FreeBSD 11.1-RC1 #72 r313908+64e0abde4c70(releng/11.1): Mon Jul 3 07:43:42 PDT 2017
kernel: FreeBSD 11.1-RC1 #73 r313908+ff3a058c4f83(releng/11.1): Mon Jul 3 11:56:29 PDT 2017

# kldstat
Id Refs Address            Size     Name
 1   42 0xffffffff80200000 1f67128  kernel
 2    1 0xffffffff82169000 3161c0   zfs.ko
 3    2 0xffffffff82480000 c9e8     opensolaris.ko
 4    1 0xffffffff8248d000 21c68    geom_eli.ko
 5    1 0xffffffff824af000 bc68     aesni.ko
 6    1 0xffffffff824bb000 114f58   nvidia-modeset.ko
 7    3 0xffffffff825d0000 ab7f8    linux.ko
 8    3 0xffffffff8267c000 ddc0     linux_common.ko
 9    2 0xffffffff8268a000 1056708  nvidia.ko
10    1 0xffffffff83821000 3602     ums.ko
11    1 0xffffffff83825000 3c35c    linux64.ko
12    1 0xffffffff83862000 eb0d     iscsi.ko

Excuse human-induced typos in the stack trace (transcribed by hand):

panic: avl_find() succeeded inside avl_add()
cpuid = 5
KDB: stack backtrace:
#0 0xffffffff80aad9c7 at kdb_backtrace+0x67
#1 0xffffffff80a6baa6 at vpanic+0x186
#2 0xffffffff80a6b913 at panic+0x43
#3 0xffffffff82169df4 at avl_add+0x134
#4 0xffffffff821a9bed at dmu_recv_end+0x16d
#5 0xffffffff82231e0f at zfs_ioc_recv+0xb1f
#6 0xffffffff82233c36 at zfsdev_ioctl+0x6d6
#7 0xffffffff8093ad38 at devfs_ioctl_f+0x128
#8 0xffffffff80ac9315 at kern_ioctl+0x255
#9 0xffffffff80ac904f at sys_ioctl+0x16f
#10 0xffffffff80ee0394 at amd64_syscall+0x6c4
#11 0xffffffff80ec35cb at Xfast_syscall+0xfb

It is reproducible. I initially saw it with zfs send | zfs recv (kernel #71), then reproduced it outside of X so I could see the panic, and installed kernel #72 to be more current (I had it compiled but not installed, since I was doing lots of I/O). It didn't happen with zfs send to a file, but did happen on zfs recv from that file. I grabbed #73 since it looked like it might be relevant (I have a da* device via iSCSI), and I can still reproduce the problem.
So I've done a bunch of I/O trying to narrow this down. Between 20170526 and 20170528, on a system where I thought some SSD drives were failing (recoverable checksum errors), I zfs send | zfs recv'd the root disk (pool zroot) to a different disk, which I then moved to a different (current) box. Now I'm trying to move it again, and running into problems.

It looks like I probably installed from installation media (11.0-RELEASE-p1 #0 r306420) and upgraded to -p10 afterwards using beadm (possibly with some CMOS clock offset, judging by the timestamps, but that should all just be relative time). I was running FreeBSD 11.0-RELEASE-p10 #0 r317487+8c96ad701987(releng/11.0) during the time that the bad snapshot was created.

I've done similar things before (and similar things after) with no problems. I can send | recv data from that filesystem up to a point (@20170526) without issues, but get a kernel panic after that. Right now, it seems to be isolated to "zroot" itself, since I can receive incrementals below it (ROOT, var, usr, etc.) just fine. I'm trying to narrow it down some more.
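For concreteness, "receiving incrementals below it" means dropping -R and sending each child filesystem on its own, so the first dataset that panics the box identifies the culprit. Roughly like this (pool/dataset/snapshot names are from my setup, and this is an untested sketch, not exactly what I ran):

```shell
# Sweep each descendant of zspin/zroot individually instead of using -R.
# Datasets that lack the older snapshot (e.g. aux) will simply error out,
# which is fine for isolation purposes.
for fs in $(zfs list -rH -o name zspin/zroot); do
    echo "=== ${fs} ==="
    zfs send -I "${fs}@20170527" "${fs}@20170528" | \
        zfs receive -Fuv "zaux/ouroboros/${fs#zspin/}"
done
```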
I've been extremely frustrated trying to get a kernel dump for this panic. Currently:

FreeBSD jormungandr.phouka.net 11.1-RC1 FreeBSD 11.1-RC1 #79 r313908+8df37be70f94(releng/11.1): Thu Jul 6 18:20:07 PDT 2017     warlock@jormungandr.phouka.net:/usr/obj/usr/src/sys/GENERIC  amd64

From rc.conf:

dumpdev="/dev/ada0p1"
dumpdir="/var/crash"
savecore_enable="YES"

# dumpon -l
ada0p1
# savecore -Cv
unable to open bounds file, using 0
checking for kernel dump on device /dev/ada0p1
mediasize = 35433480192 bytes
sectorsize = 512 bytes
magic mismatch on last dump header on /dev/ada0p1
No dump exists

I've got a pair of ZFS-mirrored Samsung SSD 960 EVO 250GB M.2 drives (/dev/nvd[01]) as zroot. It didn't seem like dumpon liked the encrypted swap partitions on there (even with the "late" option), so I threw in an extra drive and created an unencrypted swap partition on it with enough space for all of memory, just in case.

# gpart show /dev/nvd0
=>       40  488397088  nvd0  GPT  (233G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048    4194304     2  freebsd-swap  (2.0G)
    4196352  484200448     3  freebsd-zfs  (231G)
  488396800        328        - free -  (164K)

# gpart show /dev/ada0
=>       40  937703008  ada0  GPT  (447G)
         40       4056        - free -  (2.0M)
       4096   69206016     1  freebsd-swap  (33G)
   69210112  868492936        - free -  (414G)

# grep ' memory ' /var/log/dmesg.today
real memory  = 34359738368 (32768 MB)
avail memory = 33191440384 (31653 MB)

Nothing is saved to /var/crash. It exists, and only contains "minfree" (with the contents "2048"). Booting to single user after the panic and running savecore by hand doesn't find anything, either.
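For the record, this is the sanity checklist I'd expect to confirm the dump device is actually wired up (a sketch using my device names; the last sysctl deliberately panics the machine, so it's only for testing whether a dump gets written at all):

```shell
# Point crash dumps at the unencrypted swap partition and confirm
# the kernel accepted it (ada0p1 is the extra drive's swap partition).
dumpon /dev/ada0p1
dumpon -l                          # should list: ada0p1
sysctl kern.shutdown.dumpdevname   # should also name ada0p1

# Destructive sanity check: force a panic on purpose, then see whether
# savecore(8) finds a dump on the partition at the next boot.
sysctl debug.kdb.panic=1
```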
The basic layout of the problem pool is a stock ZFS layout with some additions and boot environments:

# zfs list -r zspin/zroot
NAME                          USED  AVAIL  REFER  MOUNTPOINT
zspin/zroot                  96.6G   370G    96K  /zspin/zroot
zspin/zroot/ROOT             10.6G   370G    96K  none
zspin/zroot/ROOT/11.0-releng   48K   370G  3.22G  /zspin/zroot
zspin/zroot/ROOT/default     10.6G   370G  8.74G  /zspin/zroot
zspin/zroot/aux               200K   370G    96K  /zspin/zroot/aux
zspin/zroot/aux/aux           104K   370G    96K  /zspin/zroot/aux/aux
zspin/zroot/git               136K   370G    96K  /zspin/zroot/git
zspin/zroot/release           136K   370G    96K  /zspin/zroot/release
zspin/zroot/tmp              1004K   370G   400K  /tmp
zspin/zroot/usr              85.9G   370G    96K  /usr
zspin/zroot/usr/home         73.2G   370G  73.1G  /zspin/zroot/usr/home
zspin/zroot/usr/ports        9.98G   370G  6.73G  /usr/ports
zspin/zroot/usr/src          2.77G   370G  2.77G  /usr/src
zspin/zroot/var              2.20M   370G    96K  /var
zspin/zroot/var/audit         136K   370G    96K  /var/audit
zspin/zroot/var/crash         136K   370G    96K  /var/crash
zspin/zroot/var/log          1.00M   370G   336K  /var/log
zspin/zroot/var/mail          564K   370G   132K  /var/mail
zspin/zroot/var/tmp           248K   370G    96K  /var/tmp

There are some snapshots. Different filesystems have different stamps on them, but my current theory is that "aux" is the problem child:

# zfs list -rtall zspin/zroot | fgrep @ | sed -E 's/^[^@]+(@[^ ]+).*$/\1/' | sort | uniq
@2017-05-18-03:55:20
@20170526
@20170527
@20170528
@20170528-2
@backup
@backup2

# zfs list -rt all zspin/zroot
NAME                            USED  AVAIL  REFER  MOUNTPOINT
zspin/zroot                    96.6G   370G    96K  /zspin/zroot
zspin/zroot@backup                8K      -    96K  -
zspin/zroot@backup2               8K      -    96K  -
zspin/zroot@20170526              8K      -    96K  -
zspin/zroot@20170527              8K      -    96K  -
zspin/zroot@20170528              8K      -    96K  -
zspin/zroot@20170528-2            8K      -    96K  -
...
zspin/zroot/aux                 200K   370G    96K  /zspin/zroot/aux
zspin/zroot/aux@20170528           0      -    96K  -
zspin/zroot/aux/aux             104K   370G    96K  /zspin/zroot/aux/aux
zspin/zroot/aux/aux@20170528       8K      -    96K  -
zspin/zroot/aux/aux@20170528-2     0      -    96K  -
...
This is how I can reliably panic my system:

zfs destroy -rv zaux/ouroboros/zroot
zfs send -RD zspin/zroot@backup | zfs receive -Fu -dv zaux/ouroboros
zfs send -RD -I zspin/zroot@backup zspin/zroot@20170527 | zfs receive -Fu -dv zaux/ouroboros
# panic during this:
zfs send -RD -I zspin/zroot@20170527 zspin/zroot@20170528 | zfs receive -Fu -dv zaux/ouroboros

On the zfs receive, the last status message before the panic is "receiving zspin/zroot/aux/aux@20170528" (#12 below; no summary of received bytes afterwards).

# zfs send -RD -I zspin/zroot@20170527 zspin/zroot@20170528 | zfs receive -Fu -dvn zaux/ouroboros | cat -n
     1  would receive incremental stream of zspin/zroot@20170528 into zaux/ouroboros/zroot@20170528
     2  would receive incremental stream of zspin/zroot/var@20170528 into zaux/ouroboros/zroot/var@20170528
     3  would receive incremental stream of zspin/zroot/var/audit@20170528 into zaux/ouroboros/zroot/var/audit@20170528
     4  would receive incremental stream of zspin/zroot/var/tmp@20170528 into zaux/ouroboros/zroot/var/tmp@20170528
     5  would receive incremental stream of zspin/zroot/var/crash@20170528 into zaux/ouroboros/zroot/var/crash@20170528
     6  would receive incremental stream of zspin/zroot/var/mail@20170528 into zaux/ouroboros/zroot/var/mail@20170528
     7  would receive incremental stream of zspin/zroot/var/log@20170528 into zaux/ouroboros/zroot/var/log@20170528
     8  would receive incremental stream of zspin/zroot/tmp@20170528 into zaux/ouroboros/zroot/tmp@20170528
     9  would receive incremental stream of zspin/zroot/release@20170528 into zaux/ouroboros/zroot/release@20170528
    10  would receive incremental stream of zspin/zroot/git@20170528 into zaux/ouroboros/zroot/git@20170528
    11  would receive full stream of zspin/zroot/aux@20170528 into zaux/ouroboros/zroot/aux@20170528
    12  would receive full stream of zspin/zroot/aux/aux@20170528 into zaux/ouroboros/zroot/aux@20170528
    13  would receive incremental stream of zspin/zroot/ROOT@20170528 into zaux/ouroboros/zroot/ROOT@20170528
    14  would receive incremental stream of zspin/zroot/ROOT/default@20170528 into zaux/ouroboros/zroot/ROOT/default@20170528
    15  would receive incremental stream of zspin/zroot/usr@20170528 into zaux/ouroboros/zroot/usr@20170528
    16  would receive incremental stream of zspin/zroot/usr/home@20170528 into zaux/ouroboros/zroot/usr/home@20170528
    17  would receive incremental stream of zspin/zroot/usr/ports@20170528 into zaux/ouroboros/zroot/usr/ports@20170528
    18  would receive incremental stream of zspin/zroot/usr/src@20170528 into zaux/ouroboros/zroot/usr/src@20170528
    19  would receive incremental stream of zspin/zroot/ROOT/11.0-releng@20170528 into zaux/ouroboros/zroot/ROOT/11.0-releng@20170528
Someone might want to close this: 4+ years later I couldn't possibly reproduce it, and 11.1 is long out of date.