Bug 221060 - zfs: sending/receiving a zvol within same host to the same dataset produces errors and shadow child(!) datasets
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.0-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Bugmeister
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-07-28 04:25 UTC by emz
Modified: 2025-01-19 07:04 UTC

Description emz 2017-07-28 04:25:21 UTC
Sending/receiving a zvol within the same host produces errors and shadow child datasets (children of a zvol!).

For instance, it's absolutely normal to send a zvol between hosts:

(just an example, not real output)
zfs send -Rv foo/bar@snapshot | ssh -l user remote sudo zfs receive -du tank

It's even possible to send a zvol to another pool and/or dataset:

(from here on, real output from FreeBSD 11.1-BETA2)

[root@san1:~]# zfs send -Rv zfsroot/userdata/worker121@candidate | zfs receive -e esx/userdata
full send of zfsroot/userdata/worker121@candidate estimated size is 1,03G
total estimated size is 1,03G
TIME        SENT   SNAPSHOT
06:57:59    523M   zfsroot/userdata/worker121@candidate

But it's not possible to overwrite an existing zvol on any pool using the -d argument:

[root@san1:~]# zfs create -V 8G esx/userdata/workerX
[root@san1:~]# zfs send -Rv zfsroot/userdata/worker121@candidate | zfs receive -d esx/userdata/workerX   
full send of zfsroot/userdata/worker121@candidate estimated size is 1,03G
total estimated size is 1,03G
TIME        SENT   SNAPSHOT
cannot open 'esx/userdata/workerX': operation not applicable to datasets of this type
cannot receive new filesystem stream: unable to restore to destination
warning: cannot send 'zfsroot/userdata/worker121@candidate': signal received


Okay. Maybe it's not technically possible and shouldn't be done anyway.
But wait... it is possible when using the -e receive argument:


[root@san1:~]# zfs send -Rv zfsroot/userdata/worker121@candidate | zfs receive -e esx/userdata/workerX
full send of zfsroot/userdata/worker121@candidate estimated size is 1,03G
total estimated size is 1,03G
TIME        SENT   SNAPSHOT
07:04:14    559M   zfsroot/userdata/worker121@candidate

Okay. Now let's try to remove this newly overwritten zvol:

[root@san1:~]# zfs destroy esx/userdata/workerX
cannot destroy 'esx/userdata/workerX': dataset already exists

Now about the bugs:

1) The error is cryptic and unclear. I found out what it means, and I show it below.
2) The receive -e operation in this case produces errors in dmesg:

g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121, error=17)
g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121@candidate, error=17)
g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121s1, error=17)
g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121@candidates1, error=17)

3) This operation creates a shadow dataset that is not visible to zfs list -t all (it is a child of the zvol, which is weird in itself, isn't it?):

[root@san1:~]# zfs list -t all | more
NAME                                   USED  AVAIL  REFER  MOUNTPOINT
esx                                   4,10T  12,5T   500M  /esx
esx/shared                            3,72T  12,5T  3,72T  -

[ ... loads of esx/shared and esx/userdata children removed so they don't encumber this PR; trust me, the shadow dataset is not among them ... ]

esx/userdata/workerX                  8,84G  12,5T  19,2K  -

But it's visible in zdb -d <pool>:

[root@san1:~]# zdb -d esx | grep workerX
Dataset esx/userdata/workerX/worker121@candidate [ZVOL], ID 472, cr_txg 76762261, 599M, 2 objects
Dataset esx/userdata/workerX/worker121 [ZVOL], ID 462, cr_txg 76762256, 599M, 2 objects
Dataset esx/userdata/workerX [ZVOL], ID 438, cr_txg 76762220, 19.2K, 2 objects
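
The comparison above (zdb -d shows datasets that zfs list -t all does not) can be automated. The following is only a minimal sketch, not part of the original report: the function name hidden_datasets and the two file arguments are hypothetical, and it assumes the zdb output keeps the "Dataset NAME [TYPE], ID ..." line format shown above. It works on saved output files, so it does not touch the pool itself. Pool-internal entries that zdb prints may also show up in the result and would need to be filtered by hand.

```shell
#!/bin/sh
# hidden_datasets ZDB_FILE ZFS_LIST_FILE   (hypothetical helper)
# Prints dataset names found in saved `zdb -d <pool>` output that are
# missing from saved `zfs list -H -o name -t all` output.
hidden_datasets() {
    awk '
        # First file: one dataset name per line (zfs list -H -o name).
        FILENAME == ARGV[1] { listed[$1] = 1; next }
        # Second file: zdb lines of the form "Dataset NAME [TYPE], ID ...".
        $1 == "Dataset" && !($2 in listed) { print $2 }
    ' "$2" "$1"
}

# Possible usage (run as root, file paths are just examples):
#   zfs list -H -o name -t all > /tmp/listed.txt
#   zdb -d esx > /tmp/zdb.txt
#   hidden_datasets /tmp/zdb.txt /tmp/listed.txt
```

With the output shown in this report, such a check would be expected to print the two worker121 entries under esx/userdata/workerX.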

Okay, this is what 'zfs destroy' is really trying to say. Funny thing: this shadow dataset should be clearable with the destroy -r flag, but it isn't:

[root@san1:~]# zfs destroy -r esx/userdata/workerX          
cannot destroy 'esx/userdata/workerX': dataset already exists

Only an explicit destroy kills it (maybe because a child of a zvol is an artifact by design):

[root@san1:~]# zfs destroy -r esx/userdata/workerX/worker121
[root@san1:~]#

Things may become even more complicated: zdb may report an Input/output error on the pool, or even crash (this state can be cleared with zpool export/import).


Here's the backtrace taken after several 'Input/output error' messages from 'zdb -d esx', followed by the crash:


[root@san1:~]# gdb zdb zdb.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `zdb -d esx/userdata/workerX'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libnvpair.so.2...Reading symbols from /usr/lib/debug//lib/libnvpair.so.2.debug...done.
done.
Loaded symbols for /lib/libnvpair.so.2
Reading symbols from /lib/libumem.so.2...Reading symbols from /usr/lib/debug//lib/libumem.so.2.debug...done.
done.
Loaded symbols for /lib/libumem.so.2
Reading symbols from /lib/libuutil.so.2...Reading symbols from /usr/lib/debug//lib/libuutil.so.2.debug...done.
done.
Loaded symbols for /lib/libuutil.so.2
Reading symbols from /lib/libzfs.so.2...Reading symbols from /usr/lib/debug//lib/libzfs.so.2.debug...done.
done.
Loaded symbols for /lib/libzfs.so.2
Reading symbols from /lib/libzpool.so.2...Reading symbols from /usr/lib/debug//lib/libzpool.so.2.debug...done.
done.
Loaded symbols for /lib/libzpool.so.2
Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /lib/libmd.so.6...Reading symbols from /usr/lib/debug//lib/libmd.so.6.debug...done.
done.
Loaded symbols for /lib/libmd.so.6
Reading symbols from /lib/libutil.so.9...Reading symbols from /usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /lib/libm.so.5...Reading symbols from /usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libavl.so.2...Reading symbols from /usr/lib/debug//lib/libavl.so.2.debug...done.
done.
Loaded symbols for /lib/libavl.so.2
Reading symbols from /lib/libbsdxml.so.4...Reading symbols from /usr/lib/debug//lib/libbsdxml.so.4.debug...done.
done.
Loaded symbols for /lib/libbsdxml.so.4
Reading symbols from /lib/libgeom.so.5...Reading symbols from /usr/lib/debug//lib/libgeom.so.5.debug...done.
done.
Loaded symbols for /lib/libgeom.so.5
Reading symbols from /lib/libz.so.6...Reading symbols from /usr/lib/debug//lib/libz.so.6.debug...done.
done.
Loaded symbols for /lib/libz.so.6
Reading symbols from /lib/libzfs_core.so.2...Reading symbols from /usr/lib/debug//lib/libzfs_core.so.2.debug...done.
done.
Loaded symbols for /lib/libzfs_core.so.2
Reading symbols from /lib/libthr.so.3...Reading symbols from /usr/lib/debug//lib/libthr.so.3.debug...done.
done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /lib/libsbuf.so.6...Reading symbols from /usr/lib/debug//lib/libsbuf.so.6.debug...done.
done.
Loaded symbols for /lib/libsbuf.so.6
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x000000080158184a in thr_kill () from /lib/libc.so.7
(gdb) bt
#0  0x000000080158184a in thr_kill () from /lib/libc.so.7
#1  0x0000000801581814 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x0000000801581789 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#3  0x00000008011a1471 in ddt_load (spa=<value optimized out>) at assfail.h:75
#4  0x0000000801124188 in spa_load_impl (spa=0x803052c00, pool_guid=<value optimized out>, config=0x803052ee8, 
    state=<value optimized out>, type=SPA_IMPORT_EXISTING, mosconfig=B_TRUE)
    at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2844
#5  0x000000080111ca68 in spa_load (spa=<value optimized out>, state=<value optimized out>, type=SPA_IMPORT_EXISTING, 
    mosconfig=B_TRUE) at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2230
#6  0x0000000801123b51 in spa_load_impl (spa=0x803052c00, pool_guid=<value optimized out>, config=0x803052ee8, 
    state=<value optimized out>, type=SPA_IMPORT_EXISTING, mosconfig=B_FALSE)
    at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2654
#7  0x000000080111ca68 in spa_load (spa=<value optimized out>, state=<value optimized out>, type=SPA_IMPORT_EXISTING, 
    mosconfig=B_FALSE) at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2230
#8  0x000000080111c32a in spa_load_best (spa=0x803052c00, state=SPA_LOAD_OPEN, mosconfig=0, 
    max_request=<value optimized out>, rewind_flags=<value optimized out>)
    at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:3007
#9  0x0000000801118289 in spa_open_common (pool=<value optimized out>, spapp=<value optimized out>, tag=0x801206642, 
    nvpolicy=<value optimized out>, config=0x0)
    at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:3159
#10 0x0000000801142292 in dsl_pool_hold (name=<value optimized out>, tag=0x801206642, dp=0x7fffffe5a958)
    at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:1111
#11 0x000000080116b29f in dmu_objset_own (name=0x7fffffffed87 "esx/userdata/workerX", type=DMU_OST_ANY, 
    readonly=B_TRUE, tag=0x4107dd, osp=0x7fffffe5aa40)
    at /usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:570
#12 0x00000000004062b5 in main (argc=<value optimized out>, argv=<value optimized out>)
    at /usr/src/cddl/usr.sbin/zdb/../../../cddl/contrib/opensolaris/cmd/zdb/zdb.c:3792
#13 0x00000000004053df in _start ()
#14 0x0000000800638000 in ?? ()
#15 0x0000000000000000 in ?? ()
(gdb)
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2025-01-19 07:04:05 UTC
^Triage: I'm sorry that this PR did not get addressed in a timely fashion.

By now, the version that it was created against is long out of support.
As well, many newer versions of ZFS have been imported.

Please re-open if it is still a problem on a supported version.