Bug 236710

Summary: zfs: clone + resize = dataset is busy
Product: Base System
Reporter: Oleg Ginzburg <olevole>
Component: bin
Assignee: freebsd-fs (Nobody) <fs>
Status: New ---    
Severity: Affects Many People    
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Oleg Ginzburg 2019-03-22 08:32:18 UTC
(Copy of the problem report posted to freebsd-fs@.)

I see the following problem on FreeBSD 13.0-CURRENT
r345087M amd64, zpool version 28, zfs version 5

If you create a ZVOL, fill it with specific data, take a snapshot, clone
it, and resize the clone, the pool becomes blocked (device busy on
export -f, destroy -f, ...).

What I found:

a) The problem only appears with a clone of a snapshot; a plain zvol
resize works fine.

b) The nature of the data affects the problem:

  - When the zvol contains no data, the problem does not occur.
  - If the zvol is filled with random data, for example with
    dd if=/dev/random of=<zvol>, there is no problem either. But the
    problem does occur if I fill it with cloud image data (
http://ftp.freebsd.org/pub/FreeBSD/releases/VM-IMAGES/12.0-RELEASE/amd64/Latest/
, http://cloud-images.ubuntu.com/xenial/current/
), which are in fact raw images of an installed OS.

c) The problem only occurs when creating the clone and resizing it as
two separate commands. If you do both atomically in a single command:

 /sbin/zfs clone -o volsize=<new_size> ...

there is no problem, but with this:

/sbin/zfs clone ..
/sbin/zfs set volsize=

the problem occurs.

Maybe this is not a ZFS problem but a GEOM-related one?

Steps to reproduce (where z01 is an active ZFS pool):
---
% wget http://ftp.freebsd.org/pub/FreeBSD/releases/VM-IMAGES/12.0-RELEASE/amd64/Latest/FreeBSD-12.0-RELEASE-amd64.raw.xz

% xz -d FreeBSD-12.0-RELEASE-amd64.raw.xz

> ( /usr/bin/stat -f "%z" FreeBSD-12.0-RELEASE-amd64.raw  -- get size)
>  33286062080

% /sbin/zfs create -sV 33286062080 -o volmode=dev z01/test1

% dd bs=1m if=FreeBSD-12.0-RELEASE-amd64.raw of=/dev/zvol/z01/test1
>  31744+1 records in
>  31744+1 records out
>  33286062080 bytes transferred in 127.530635 secs (261004441 bytes/sec)

% /sbin/zfs snapshot z01/test1@snap

% /sbin/zfs clone z01/test1@snap z01/test2

> ( resize the clone, for example to double the size )
>  bc -e '33286062080 * 2'
>  66572124160

% /sbin/zfs set volsize=66572124160 z01/test2

% /sbin/zfs destroy z01/test2
> cannot destroy 'z01/test2': dataset is busy

% zpool export -f z01
> cannot export 'z01': pool is busy
---
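
The steps above can be collected into one script for easier retesting. This is only a sketch: the POOL, IMG and DRY_RUN variables and the run/repro helpers are illustrative names, not part of the original report. By default it only prints the commands, and it assumes the raw image has already been downloaded and unpacked. The "atomic" mode uses the single-command clone from point (c), which reportedly does not trigger the problem.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report.
# POOL, IMG, DRY_RUN, run() and repro() are hypothetical names.
POOL=${POOL:-z01}
IMG=${IMG:-FreeBSD-12.0-RELEASE-amd64.raw}
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, anything else = execute

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# repro <size-in-bytes> [twostep|atomic]
repro() {
    size=$1
    mode=${2:-twostep}
    new_size=$(( size * 2 ))    # double the volume size, as in the report

    run zfs create -sV "$size" -o volmode=dev "$POOL/test1"
    run dd bs=1m if="$IMG" of="/dev/zvol/$POOL/test1"
    run zfs snapshot "$POOL/test1@snap"

    if [ "$mode" = atomic ]; then
        # clone and resize in a single command
        run zfs clone -o volsize="$new_size" "$POOL/test1@snap" "$POOL/test2"
    else
        # clone first, then resize: the sequence that ends in "dataset is busy"
        run zfs clone "$POOL/test1@snap" "$POOL/test2"
        run zfs set volsize="$new_size" "$POOL/test2"
    fi

    run zfs destroy "$POOL/test2"
}

repro 33286062080 twostep
```

Running it with DRY_RUN=0 on a scratch pool executes the commands for real; calling repro with "atomic" as the second argument gives the comparison case.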

With truss I can see:

..
ioctl(3,0xc0185a15 { IORW 0x5a('Z'), 21, 24 },0x7fffffffcd78) ERR#3 'No such process'
ioctl(3,0xc0185a15 { IORW 0x5a('Z'), 21, 24 },0x7fffffffcd78) ERR#3 'No such process'
ioctl(6,0xc0185a18 { IORW 0x5a('Z'), 24, 24 },0x7fffffffcdb8) ERR#16 'Device busy'
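
For reference, the request numbers in the truss output can be decoded by hand. A minimal sketch, assuming the standard BSD ioctl encoding from <sys/ioccom.h> (8-bit command number, 8-bit group character, 13-bit parameter length, 3 direction bits); the decode_ioctl helper name is made up for illustration:

```shell
#!/bin/sh
# Decode a BSD-style ioctl request number into its fields.
# Assumed layout (sys/ioccom.h): bits 0-7 command number, bits 8-15 group,
# bits 16-28 parameter length, bits 29-31 direction flags.
decode_ioctl() {
    req=$(( $1 ))
    num=$(( req & 0xff ))
    grp=$(( (req >> 8) & 0xff ))
    len=$(( (req >> 16) & 0x1fff ))
    dir=$(( (req >> 29) & 0x7 ))
    case $dir in
        1) dir_name=IO ;;      # IOC_VOID: no parameters
        2) dir_name=IOR ;;     # IOC_OUT: kernel copies data out
        4) dir_name=IOW ;;     # IOC_IN: kernel copies data in
        6) dir_name=IORW ;;    # IOC_IN|IOC_OUT, shown by truss as IORW
        *) dir_name=UNKNOWN ;;
    esac
    printf '%s group=0x%x num=%d len=%d\n' "$dir_name" "$grp" "$num" "$len"
}

decode_ioctl 0xc0185a15    # prints: IORW group=0x5a num=21 len=24
decode_ioctl 0xc0185a18    # prints: IORW group=0x5a num=24 len=24
```

So the two calls that fail differently use the same group ('Z', 0x5a) and differ only in the command number: 21 returns ERR#3, 24 returns ERR#16.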

Comment 1 Andriy Gapon 2019-03-22 08:59:24 UTC
This could be a ZFS<->GEOM interaction indeed.
Maybe related to bug 228384, maybe not.