(Copy of the problem report posted to freebsd-fs@.)

I see the following problem on FreeBSD 13.0-CURRENT r345087M amd64, zpool version 28, zfs version 5.

If you create a ZVOL, fill it with specific data, take a snapshot, clone it and resize the clone, the pool ends up blocked (device busy for export -f, destroy -f, ...).

What I found:

a) The problem only appears with clone/snapshot; a plain zvol resize works fine.

b) The nature of the data affects the problem:
 - when the zvol holds no data, the problem does not occur;
 - if you fill the zvol with random data, for example dd if=/dev/random of=<zvol>, there is no problem either;
 - the problem does occur if I fill it with cloud image data ( http://ftp.freebsd.org/pub/FreeBSD/releases/VM-IMAGES/12.0-RELEASE/amd64/Latest/ , http://cloud-images.ubuntu.com/xenial/current/ ). These are in fact raw images of an installed OS.

c) The problem only occurs when the clone is created and then resized in two separate commands. If you do it atomically in a single command:
 /sbin/zfs clone -o volsize=<new_size> ...
there is no problem, but with this:
 /sbin/zfs clone ..
 /sbin/zfs set volsize=
the problem appears.

Maybe this is not a ZFS problem but GEOM-related?

Step-by-step reproduction (where z01 is an active ZFS pool):
---
% wget http://ftp.freebsd.org/pub/FreeBSD/releases/VM-IMAGES/12.0-RELEASE/amd64/Latest/FreeBSD-12.0-RELEASE-amd64.raw.xz
% xz -d FreeBSD-12.0-RELEASE-amd64.raw.xz
> ( /usr/bin/stat -f "%z" FreeBSD-12.0-RELEASE-amd64.raw -- get size)
> 33286062080
% /sbin/zfs create -sV 33286062080 -o volmode=dev z01/test1
% dd bs=1m if=FreeBSD-12.0-RELEASE-amd64.raw of=/dev/zvol/z01/test1
> 31744+1 records in
> 31744+1 records out
> 33286062080 bytes transferred in 127.530635 secs (261004441 bytes/sec)
% /sbin/zfs snapshot z01/test1@snap
% /sbin/zfs clone z01/test1@snap z01/test2
> ( resizing. For example, double up )
> bc -e '33286062080 * 2'
> 66572124160
% /sbin/zfs set volsize=66572124160 z01/test2
% /sbin/zfs destroy z01/test2
> cannot destroy 'z01/test2': dataset is busy
% zpool export -f z01
> cannot export 'z01': pool is busy
---

With truss I can see:
..
ioctl(3,0xc0185a15 { IORW 0x5a('Z'), 21, 24 },0x7fffffffcd78) ERR#3 'No such process'
ioctl(3,0xc0185a15 { IORW 0x5a('Z'), 21, 24 },0x7fffffffcd78) ERR#3 'No such process'
ioctl(6,0xc0185a18 { IORW 0x5a('Z'), 24, 24 },0x7fffffffcdb8) ERR#16 'Device busy'
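
For anyone who wants to retry this, here is a minimal sketch that puts the two variants from point c) side by side. It assumes the pool z01, the filled zvol z01/test1 and the snapshot z01/test1@snap from the steps above already exist. The DRY_RUN variable, the function names and the dataset name z01/test3 are made up for this sketch and are not part of zfs(8). By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: compares the two clone+resize paths described in point c).
# DRY_RUN, run, double_size, clone_then_resize and clone_with_size are
# hypothetical names for this sketch.

POOL="${POOL:-z01}"          # pool name used in the report
SRC="${POOL}/test1"          # zvol already filled with the raw image
SNAP="${SRC}@snap"           # snapshot taken in the steps above
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print commands, 0 = execute them

# Print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Double a size given in bytes using shell arithmetic.
double_size() {
    echo $(( $1 * 2 ))
}

# Two-step path: clone, then set volsize (the case reported as failing).
clone_then_resize() {
    run /sbin/zfs clone "$SNAP" "$1"
    run /sbin/zfs set volsize="$2" "$1"
}

# One-step path: volsize set at clone time (the case reported as working).
clone_with_size() {
    run /sbin/zfs clone -o volsize="$2" "$SNAP" "$1"
}

NEW_SIZE=$(double_size 33286062080)

clone_then_resize "${POOL}/test2" "$NEW_SIZE"
run /sbin/zfs destroy "${POOL}/test2"   # reported: dataset is busy

clone_with_size "${POOL}/test3" "$NEW_SIZE"
run /sbin/zfs destroy "${POOL}/test3"   # reported: no problem on this path
```

If the report still holds, only the destroy of z01/test2 should fail with "dataset is busy" when run with DRY_RUN=0.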
This could be a ZFS<->GEOM interaction indeed. Maybe related to bug 228384, maybe not.
^Triage: to submitter: is this aging PR still valid?