On 9-stable & 8-stable, zfs/zpool operations hang when trying to work on a readonly pool. I've tried 'zfs set mountpoint' and 'zpool scrub' (the latter accidentally, during an overnight run of periodic with daily_scrub_zfs_enable=yes).

See also https://forums.freebsd.org/showthread.php?t=35505&highlight=readonly+zpool which mentions a panic. I didn't get a panic (yet).

Use case when this was noticed: replace old pool with new, setting the new pool to have the old pool's mountpoint (to avoid changing all nfs clients).

I think zfs should refuse the operation if readonly is a problem. What I really wanted was for the data to be readonly, but not the zfs metadata (i.e., "_mostly_ readonly"). But I can see how disallowing metadata ops on a readonly pool makes sense.

How-To-Repeat:

cd /tmp
dd if=/dev/zero bs=1m count=100 > ! z0
dd if=/dev/zero bs=1m count=100 > ! z1
sudo mdconfig -f z0
sudo mdconfig -f z1
sudo zpool create -m /tmp/ztmp ztmp mirror /dev/md0 /dev/md1
sudo zpool export ztmp
sudo zpool import -o readonly=on ztmp
sudo zfs set mountpoint=/tmp/ztmpnew ztmp
... hangs here

In another window...

% ps -ww -ax -o pid,ppid,%cpu,%mem,vsz,rss,wchan,stat,lstart,time,command | egrep 'zfs|PID'
  PID  PPID %CPU %MEM   VSZ  RSS WCHAN    STAT STARTED                     TIME COMMAND
45377     0  0.0  0.0     0   32 l2arc_fe DL   Wed Feb  6 12:38:30 2013 0:00.01 [zfskern]
45674     1  0.0  0.3 44460 3256 select   I    Wed Feb  6 12:40:54 2013 0:00.01 sudo zfs set mountpoint=/tmp/ztmpnew ztmp
45687 45674  0.0  0.3 33488 3064 tx->tx_s D    Wed Feb  6 12:40:54 2013 0:00.00 zfs set mountpoint=/tmp/ztmpnew ztmp

% sudo procstat -k 45674 45687
  PID    TID COMM TDNAME KSTACK
45674 100106 sudo -      mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait kern_select sys_select amd64_syscall Xfast_syscall
45687 100098 zfs  -      mi_switch sleepq_wait _cv_wait txg_wait_synced dsl_sync_task_group_wait dsl_sync_task_do dsl_props_set zfs_set_prop_nvlist zfs_ioc_set_prop zfsdev_ioctl devfs_ioctl_f kern_ioctl sys_ioctl amd64_syscall Xfast_syscall
Don't hesitate to get a full procstat -kk -a output. -- Andriy Gapon
Andriy Gapon wrote at 10:51 +0200 on Feb 7, 2013:
 > Don't hesitate to get a full procstat -kk -a output.

sudo procstat -kk 8168 8181
  PID    TID COMM TDNAME KSTACK
 8168 100087 sudo -      mi_switch+0x190 sleepq_catch_signals+0x27f sleepq_wait_sig+0x16 _cv_wait_sig+0x129 seltdwait+0xac kern_select+0x6ef sys_select+0x5d amd64_syscall+0x25b Xfast_syscall+0xf7
 8181 100055 zfs  -      mi_switch+0x190 sleepq_wait+0x44 _cv_wait+0x114 txg_wait_synced+0x85 dsl_sync_task_group_wait+0x128 dsl_sync_task_do+0x54 dsl_props_set+0x147 zfs_set_prop_nvlist+0x3ad zfs_ioc_set_prop+0x75 zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7a kern_ioctl+0x106 sys_ioctl+0xfd amd64_syscall+0x25b Xfast_syscall+0xf7
on 08/02/2013 04:00 John Hein said the following:
> Andriy Gapon wrote at 10:51 +0200 on Feb 7, 2013:
>  > Don't hesitate to get a full procstat -kk -a output.
>
> sudo procstat -kk 8168 8181
>   PID    TID COMM TDNAME KSTACK
>  8168 100087 sudo -      mi_switch+0x190 sleepq_catch_signals+0x27f sleepq_wait_sig+0x16 _cv_wait_sig+0x129 seltdwait+0xac kern_select+0x6ef sys_select+0x5d amd64_syscall+0x25b Xfast_syscall+0xf7
>  8181 100055 zfs  -      mi_switch+0x190 sleepq_wait+0x44 _cv_wait+0x114 txg_wait_synced+0x85 dsl_sync_task_group_wait+0x128 dsl_sync_task_do+0x54 dsl_props_set+0x147 zfs_set_prop_nvlist+0x3ad zfs_ioc_set_prop+0x75 zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7a kern_ioctl+0x106 sys_ioctl+0xfd amd64_syscall+0x25b Xfast_syscall+0xf7
>

There seems to be some mis-communication. This is not procstat *-a* output.

-- 
Andriy Gapon
Here's the full procstat -kk -a output...
on 09/02/2013 17:53 John Hein said the following:
> Here's the full procstat -kk -a output...

John,

thank you very much!

This problem seems to be a weird omission in our ZFS port.
This is how pool_status_check function looks in the last open source version of OpenSolaris:

int
pool_status_check(const char *name, zfs_ioc_namecheck_t type,
    zfs_ioc_poolcheck_t check)
{
	spa_t *spa;
	int error;

	ASSERT(type == POOL_NAME || type == DATASET_NAME);

	if (check & POOL_CHECK_NONE)
		return (0);

	error = spa_open(name, &spa, FTAG);
	if (error == 0) {
		if ((check & POOL_CHECK_SUSPENDED) && spa_suspended(spa))
			error = EAGAIN;
		else if ((check & POOL_CHECK_READONLY) && !spa_writeable(spa))
			error = EROFS;
		spa_close(spa, FTAG);
	}
	return (error);
}

In current Illumos the code seems to be the same:
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/zfs_ioctl.c#pool_status_check

Here is how the code looks in FreeBSD:

int
pool_status_check(const char *name, zfs_ioc_namecheck_t type)
{
	spa_t *spa;
	int error;

	ASSERT(type == POOL_NAME || type == DATASET_NAME);

	error = spa_open(name, &spa, FTAG);
	if (error == 0) {
		if (spa_suspended(spa))
			error = EAGAIN;
		spa_close(spa, FTAG);
	}
	return (error);
}

The code seems to have been introduced in ZFSv15 import (commit r209962) and has not been changed/updated since then.

The spa_writeable() check should have prevented the situation you are seeing.

P.S.
Exact cause of the hang is that txg threads are not started at all but the thread doing the ioctl waits on txg sync thread to do something.

-- 
Andriy Gapon
Could you please try the following patch:
http://people.freebsd.org/~mm/patches/zfs/zfs_ioctl.c.patch

Thank you.

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
Author: mm
Date: Mon Feb 11 21:10:55 2013
New Revision: 246688
URL: http://svnweb.freebsd.org/changeset/base/246688

Log:
  Merge zfs_ioctl.c code that should have been merged together with ZFS v28.
  Fixes several problems if working with read-only pools.

  Changed code originally introduced in onnv-gate 13061:bda0decf867b
  Contains changes up to illumos-gate 13700:4bc0783f6064

  PR:		kern/175897
  Suggested by:	avg
  MFC after:	2 weeks

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	Mon Feb 11 21:02:49 2013	(r246687)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	Mon Feb 11 21:10:55 2013	(r246688)
@@ -106,12 +106,18 @@ typedef enum {
 	DATASET_NAME
 } zfs_ioc_namecheck_t;
 
+typedef enum {
+	POOL_CHECK_NONE		= 1 << 0,
+	POOL_CHECK_SUSPENDED	= 1 << 1,
+	POOL_CHECK_READONLY	= 1 << 2
+} zfs_ioc_poolcheck_t;
+
 typedef struct zfs_ioc_vec {
 	zfs_ioc_func_t		*zvec_func;
 	zfs_secpolicy_func_t	*zvec_secpolicy;
 	zfs_ioc_namecheck_t	zvec_namecheck;
 	boolean_t		zvec_his_log;
-	boolean_t		zvec_pool_check;
+	zfs_ioc_poolcheck_t	zvec_pool_check;
 } zfs_ioc_vec_t;
 
 /* This array is indexed by zfs_userquota_prop_t */
@@ -5052,138 +5058,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc)
 
 static zfs_ioc_vec_t zfs_ioc_vec[] = {
 	{ zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_destroy,	zfs_secpolicy_config, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_configs,	zfs_secpolicy_none, NO_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_stats, zfs_secpolicy_read, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_tryimport, zfs_secpolicy_config, NO_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_scan, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_pool_freeze, zfs_secpolicy_config, NO_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_READONLY },
 	{ zfs_ioc_pool_upgrade,	zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_pool_get_history, zfs_secpolicy_config, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_vdev_add, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_vdev_remove, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_vdev_set_state, zfs_secpolicy_config,	POOL_NAME, B_TRUE,
-	    B_FALSE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_vdev_attach, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_vdev_detach, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_vdev_setpath,	zfs_secpolicy_config, POOL_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_vdev_setfru,	zfs_secpolicy_config, POOL_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_objset_stats,	zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 	{ zfs_ioc_objset_zplprops, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_dataset_list_next, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 	{ zfs_ioc_snapshot_list_next, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_TRUE },
-	{ zfs_ioc_set_prop, zfs_secpolicy_none, DATASET_NAME, B_TRUE, B_TRUE },
-	{ zfs_ioc_create, zfs_secpolicy_create, DATASET_NAME, B_TRUE, B_TRUE },
+	    POOL_CHECK_SUSPENDED },
+	{ zfs_ioc_set_prop, zfs_secpolicy_none, DATASET_NAME, B_TRUE,
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
+	{ zfs_ioc_create, zfs_secpolicy_create, DATASET_NAME, B_TRUE,
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_destroy, zfs_secpolicy_destroy, DATASET_NAME, B_TRUE,
-	    B_TRUE},
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY},
 	{ zfs_ioc_rollback, zfs_secpolicy_rollback, DATASET_NAME, B_TRUE,
-	    B_TRUE },
-	{ zfs_ioc_rename, zfs_secpolicy_rename,	DATASET_NAME, B_TRUE, B_TRUE },
-	{ zfs_ioc_recv, zfs_secpolicy_receive, DATASET_NAME, B_TRUE, B_TRUE },
-	{ zfs_ioc_send, zfs_secpolicy_send, DATASET_NAME, B_FALSE, B_FALSE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
+	{ zfs_ioc_rename, zfs_secpolicy_rename,	DATASET_NAME, B_TRUE,
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
+	{ zfs_ioc_recv, zfs_secpolicy_receive, DATASET_NAME, B_TRUE,
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
+	{ zfs_ioc_send, zfs_secpolicy_send, DATASET_NAME, B_FALSE,
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_inject_fault,	zfs_secpolicy_inject, NO_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_clear_fault, zfs_secpolicy_inject, NO_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_inject_list_next, zfs_secpolicy_inject, NO_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_error_log, zfs_secpolicy_inject, POOL_NAME, B_FALSE,
-	    B_FALSE },
-	{ zfs_ioc_clear, zfs_secpolicy_config, POOL_NAME, B_TRUE, B_FALSE },
+	    POOL_CHECK_NONE },
+	{ zfs_ioc_clear, zfs_secpolicy_config, POOL_NAME, B_TRUE,
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_promote, zfs_secpolicy_promote, DATASET_NAME, B_TRUE,
-	    B_TRUE },
-	{ zfs_ioc_destroy_snaps_nvl, zfs_secpolicy_destroy_recursive, DATASET_NAME,
-	    B_TRUE, B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
+	{ zfs_ioc_destroy_snaps_nvl, zfs_secpolicy_destroy_recursive,
+	    DATASET_NAME, B_TRUE, POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_snapshot, zfs_secpolicy_snapshot, DATASET_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_dsobj_to_dsname, zfs_secpolicy_diff, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_obj_to_path, zfs_secpolicy_diff, DATASET_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 	{ zfs_ioc_pool_set_props, zfs_secpolicy_config,	POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_pool_get_props, zfs_secpolicy_read, POOL_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_set_fsacl, zfs_secpolicy_fsacl, DATASET_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_get_fsacl, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_FALSE },
-	{ zfs_ioc_share, zfs_secpolicy_share, DATASET_NAME, B_FALSE, B_FALSE },
+	    POOL_CHECK_NONE },
+	{ zfs_ioc_share, zfs_secpolicy_share, DATASET_NAME, B_FALSE,
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_inherit_prop, zfs_secpolicy_inherit, DATASET_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_smb_acl, zfs_secpolicy_smb_acl, DATASET_NAME, B_FALSE,
-	    B_FALSE },
-	{ zfs_ioc_userspace_one, zfs_secpolicy_userspace_one,
-	    DATASET_NAME, B_FALSE, B_FALSE },
-	{ zfs_ioc_userspace_many, zfs_secpolicy_userspace_many,
-	    DATASET_NAME, B_FALSE, B_FALSE },
+	    POOL_CHECK_NONE },
+	{ zfs_ioc_userspace_one, zfs_secpolicy_userspace_one, DATASET_NAME,
+	    B_FALSE, POOL_CHECK_NONE },
+	{ zfs_ioc_userspace_many, zfs_secpolicy_userspace_many, DATASET_NAME,
+	    B_FALSE, POOL_CHECK_NONE },
 	{ zfs_ioc_userspace_upgrade, zfs_secpolicy_userspace_upgrade,
-	    DATASET_NAME, B_FALSE, B_TRUE },
-	{ zfs_ioc_hold, zfs_secpolicy_hold, DATASET_NAME, B_TRUE, B_TRUE },
+	    DATASET_NAME, B_FALSE, POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
+	{ zfs_ioc_hold, zfs_secpolicy_hold, DATASET_NAME, B_TRUE,
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_release, zfs_secpolicy_release, DATASET_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_get_holds, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 	{ zfs_ioc_objset_recvd_props, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_vdev_split, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_next_obj, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_FALSE },
-	{ zfs_ioc_diff, zfs_secpolicy_diff, DATASET_NAME, B_FALSE, B_FALSE },
+	    POOL_CHECK_NONE },
+	{ zfs_ioc_diff, zfs_secpolicy_diff, DATASET_NAME, B_FALSE,
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_tmp_snapshot, zfs_secpolicy_tmp_snapshot, DATASET_NAME,
-	    B_FALSE, B_FALSE },
+	    B_FALSE, POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_obj_to_stats, zfs_secpolicy_diff, DATASET_NAME, B_FALSE,
-	    B_TRUE },
-	{ zfs_ioc_jail, zfs_secpolicy_config, DATASET_NAME, B_TRUE, B_FALSE },
-	{ zfs_ioc_unjail, zfs_secpolicy_config, DATASET_NAME, B_TRUE, B_FALSE },
+	    POOL_CHECK_SUSPENDED },
+	{ zfs_ioc_jail, zfs_secpolicy_config, DATASET_NAME, B_TRUE,
+	    POOL_CHECK_NONE },
+	{ zfs_ioc_unjail, zfs_secpolicy_config, DATASET_NAME, B_TRUE,
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_reguid, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY },
 	{ zfs_ioc_space_written, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 	{ zfs_ioc_space_snaps, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 	{ zfs_ioc_send_progress, zfs_secpolicy_read, DATASET_NAME, B_FALSE,
-	    B_FALSE },
+	    POOL_CHECK_NONE },
 	{ zfs_ioc_pool_reopen, zfs_secpolicy_config, POOL_NAME, B_TRUE,
-	    B_TRUE },
+	    POOL_CHECK_SUSPENDED },
 };
 
 int
-pool_status_check(const char *name, zfs_ioc_namecheck_t type)
+pool_status_check(const char *name, zfs_ioc_namecheck_t type,
+    zfs_ioc_poolcheck_t check)
 {
 	spa_t *spa;
 	int error;
 
 	ASSERT(type == POOL_NAME || type == DATASET_NAME);
 
+	if (check & POOL_CHECK_NONE)
+		return (0);
+
 	error = spa_open(name, &spa, FTAG);
 	if (error == 0) {
-		if (spa_suspended(spa))
+		if ((check & POOL_CHECK_SUSPENDED) && spa_suspended(spa))
 			error = EAGAIN;
+		else if ((check & POOL_CHECK_READONLY) && !spa_writeable(spa))
+			error = EROFS;
 		spa_close(spa, FTAG);
 	}
 	return (error);
@@ -5353,17 +5376,19 @@ zfsdev_ioctl(struct cdev *dev, u_long cm
 		case POOL_NAME:
 			if (pool_namecheck(zc->zc_name, NULL, NULL) != 0)
 				error = EINVAL;
-			if (zfs_ioc_vec[vec].zvec_pool_check)
+			else
 				error = pool_status_check(zc->zc_name,
-				    zfs_ioc_vec[vec].zvec_namecheck);
+				    zfs_ioc_vec[vec].zvec_namecheck,
+				    zfs_ioc_vec[vec].zvec_pool_check);
 			break;
 
 		case DATASET_NAME:
 			if (dataset_namecheck(zc->zc_name, NULL, NULL) != 0)
 				error = EINVAL;
-			if (zfs_ioc_vec[vec].zvec_pool_check)
+			else
 				error = pool_status_check(zc->zc_name,
-				    zfs_ioc_vec[vec].zvec_namecheck);
+				    zfs_ioc_vec[vec].zvec_namecheck,
+				    zfs_ioc_vec[vec].zvec_pool_check);
 			break;
 
 		case NO_NAME:
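
The last hunk of r246688 also changes how zfsdev_ioctl() combines the name check with the pool status check. In the removed lines, a failed name check set EINVAL but pool_status_check() still ran afterwards (overwriting error) whenever zvec_pool_check was true; the added else makes the two mutually exclusive, and the per-ioctl mask from the table decides what is actually checked. A minimal userland sketch of that gate follows. The three-entry table, ioctl_gate() and status_check() are hypothetical and heavily trimmed, and the CHECK_* names merely mirror the POOL_CHECK_* flags from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	CHECK_NONE      = 1 << 0,
	CHECK_SUSPENDED = 1 << 1,
	CHECK_READONLY  = 1 << 2
};

/* Hypothetical, trimmed-down table entry: a label plus its check mask. */
struct ioc_entry {
	const char *name;
	int pool_check;
};

/* Three representative rows, with the masks r246688 assigns to them. */
const struct ioc_entry ioc_table[] = {
	{ "pool_stats",   CHECK_NONE },
	{ "objset_stats", CHECK_SUSPENDED },
	{ "set_prop",     CHECK_SUSPENDED | CHECK_READONLY },
};

/* Stand-in for pool_status_check(); pool state is passed in directly. */
int
status_check(bool suspended, bool writeable, int check)
{
	if (check & CHECK_NONE)
		return (0);
	if ((check & CHECK_SUSPENDED) && suspended)
		return (EAGAIN);
	if ((check & CHECK_READONLY) && !writeable)
		return (EROFS);
	return (0);
}

/*
 * Gate in front of the ioctl handler: a bad name fails with EINVAL and
 * skips the status check; otherwise the table's mask is applied.
 */
int
ioctl_gate(size_t vec, bool name_ok, bool suspended, bool writeable)
{
	assert(vec < sizeof (ioc_table) / sizeof (ioc_table[0]));

	if (!name_ok)
		return (EINVAL);
	return (status_check(suspended, writeable, ioc_table[vec].pool_check));
}
```

For a readonly, non-suspended pool, ioctl_gate(2, true, false, false) (the "set_prop" row) returns EROFS, so the caller gets an error instead of a hang, while the read-only rows 0 and 1 still return 0.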
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Author: mm
Date: Wed Feb 27 19:20:50 2013
New Revision: 247406
URL: http://svnweb.freebsd.org/changeset/base/247406

Log:
  MFC r246631,246651,246666,246675,246678,246688:
  Merge various ZFS bugfixes

  MFC r246631:
  Import vendor bugfixes

  Illumos ZFS issues:
    3422 zpool create/syseventd race yield non-importable pool
    3425 first write to a new zvol can fail with EFBIG

  MFC r246651:
  Import minor type change in refcount.h header from vendor (illumos).

  MFC r246666:
  Import vendor ZFS bugfix fixing a problem in arc_read().

  Illumos ZFS issues:
    3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt)

  MFC r246675:
  Add tunable to allow block allocation on degraded vdevs.

  Illumos ZFS issues:
    3507 Tunable to allow block allocation even on degraded vdevs

  MFC r246678:
  Import vendor bugfixes regarding SA rounding, header size and layout.
  This was already partially fixed by avg.

  Illumos ZFS issues:
    3512 rounding discrepancy in sa_find_sizes()
    3513 mismatch between SA header size and layout

  MFC r246688 [1]:
  Merge zfs_ioctl.c code that should have been merged together with ZFS v28.
  Fixes several problems if working with read-only pools.

  Changed code originally introduced in onnv-gate 13061:bda0decf867b
  Contains changes up to illumos-gate 13700:4bc0783f6064

  PR:		kern/175897 [1]
  Suggested by:	avg [1]

Modified:
  stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c
  stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
  stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
Directory Properties:
  stable/9/cddl/contrib/opensolaris/   (props changed)
  stable/9/cddl/contrib/opensolaris/lib/libzfs/   (props changed)
  stable/9/sys/   (props changed)
  stable/9/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c
==============================================================================
--- stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/cddl/contrib/opensolaris/cmd/zdb/zdb.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -983,7 +983,7 @@ visit_indirect(spa_t *spa, const dnode_p
 		arc_buf_t *buf;
 		uint64_t fill = 0;
 
-		err = arc_read_nolock(NULL, spa, bp, arc_getbuf_func, &buf,
+		err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf,
 		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb);
 		if (err)
 			return (err);
@@ -2001,9 +2001,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t *
 	    bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0);
 }
 
-/* ARGSUSED */
 static int
-zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf,
+zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
     const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg)
 {
 	zdb_cb_t *zcb = arg;
@@ -2410,7 +2409,7 @@ typedef struct zdb_ddt_entry {
 /* ARGSUSED */
 static int
 zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
-    arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg)
+    const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg)
 {
 	avl_tree_t *t = arg;
 	avl_index_t where;

Modified: stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
==============================================================================
--- stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -526,13 +526,12 @@ get_configs(libzfs_handle_t *hdl, pool_l
 				 *	version
 				 *	pool guid
 				 *	name
-				 *	pool txg (if available)
 				 *	comment (if available)
 				 *	pool state
 				 *	hostid (if available)
 				 *	hostname (if available)
 				 */
-				uint64_t state, version, pool_txg;
+				uint64_t state, version;
 				char *comment = NULL;
 
 				version = fnvlist_lookup_uint64(tmp,
@@ -548,11 +547,6 @@ get_configs(libzfs_handle_t *hdl, pool_l
 				fnvlist_add_string(config,
 				    ZPOOL_CONFIG_POOL_NAME, name);
 
-				if (nvlist_lookup_uint64(tmp,
-				    ZPOOL_CONFIG_POOL_TXG, &pool_txg) == 0)
-					fnvlist_add_uint64(config,
-					    ZPOOL_CONFIG_POOL_TXG, pool_txg);
-
 				if (nvlist_lookup_string(tmp,
 				    ZPOOL_CONFIG_COMMENT, &comment) == 0)
 					fnvlist_add_string(config,

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -940,7 +940,6 @@ buf_cons(void *vbuf, void *unused, int k
 
 	bzero(buf, sizeof (arc_buf_t));
 	mutex_init(&buf->b_evict_lock, NULL, MUTEX_DEFAULT, NULL);
-	rw_init(&buf->b_data_lock, NULL, RW_DEFAULT, NULL);
 	arc_space_consume(sizeof (arc_buf_t), ARC_SPACE_HDRS);
 
 	return (0);
@@ -970,7 +969,6 @@ buf_dest(void *vbuf, void *unused)
 	arc_buf_t *buf = vbuf;
 
 	mutex_destroy(&buf->b_evict_lock);
-	rw_destroy(&buf->b_data_lock);
 	arc_space_return(sizeof (arc_buf_t), ARC_SPACE_HDRS);
 }
 
@@ -2968,42 +2966,11 @@ arc_read_done(zio_t *zio)
  *
  * arc_read_done() will invoke all the requested "done" functions
  * for readers of this block.
- *
- * Normal callers should use arc_read and pass the arc buffer and offset
- * for the bp. But if you know you don't need locking, you can use
- * arc_read_nolock.
  */
 int
-arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf,
-    arc_done_func_t *done, void *private, int priority, int zio_flags,
-    uint32_t *arc_flags, const zbookmark_t *zb)
-{
-	int err;
-
-	if (pbuf == NULL) {
-		/*
-		 * XXX This happens from traverse callback funcs, for
-		 * the objset_phys_t block.
-		 */
-		return (arc_read_nolock(pio, spa, bp, done, private, priority,
-		    zio_flags, arc_flags, zb));
-	}
-
-	ASSERT(!refcount_is_zero(&pbuf->b_hdr->b_refcnt));
-	ASSERT3U((char *)bp - (char *)pbuf->b_data, <, pbuf->b_hdr->b_size);
-	rw_enter(&pbuf->b_data_lock, RW_READER);
-
-	err = arc_read_nolock(pio, spa, bp, done, private, priority,
-	    zio_flags, arc_flags, zb);
-	rw_exit(&pbuf->b_data_lock);
-
-	return (err);
-}
-
-int
-arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp,
-    arc_done_func_t *done, void *private, int priority, int zio_flags,
-    uint32_t *arc_flags, const zbookmark_t *zb)
+arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done,
+    void *private, int priority, int zio_flags, uint32_t *arc_flags,
+    const zbookmark_t *zb)
 {
 	arc_buf_hdr_t *hdr;
 	arc_buf_t *buf;
@@ -3482,19 +3449,6 @@ arc_release(arc_buf_t *buf, void *tag)
 	}
 }
 
-/*
- * Release this buffer. If it does not match the provided BP, fill it
- * with that block's contents.
- */
-/* ARGSUSED */
-int
-arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa,
-    zbookmark_t *zb)
-{
-	arc_release(buf, tag);
-	return (0);
-}
-
 int
 arc_released(arc_buf_t *buf)
 {

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -135,7 +135,7 @@ bptree_add(objset_t *os, uint64_t obj, b
 
 /* ARGSUSED */
 static int
-bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf,
+bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
     const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg)
 {
 	int err;

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -513,7 +513,6 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t
 	spa_t *spa;
 	zbookmark_t zb;
 	uint32_t aflags = ARC_NOWAIT;
-	arc_buf_t *pbuf;
 
 	DB_DNODE_ENTER(db);
 	dn = DB_DNODE(db);
@@ -575,14 +574,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t
 	    db->db.db_object, db->db_level, db->db_blkid);
 
 	dbuf_add_ref(db, NULL);
-	/* ZIO_FLAG_CANFAIL callers have to check the parent zio's error */
 
-	if (db->db_parent)
-		pbuf = db->db_parent->db_buf;
-	else
-		pbuf = db->db_objset->os_phys_buf;
-
-	(void) dsl_read(zio, spa, db->db_blkptr, pbuf,
+	(void) arc_read(zio, spa, db->db_blkptr,
 	    dbuf_read_done, db, ZIO_PRIORITY_SYNC_READ,
 	    (*flags & DB_RF_CANFAIL) ? ZIO_FLAG_CANFAIL : ZIO_FLAG_MUSTSUCCEED,
 	    &aflags, &zb);
@@ -982,7 +975,6 @@ void
 dbuf_release_bp(dmu_buf_impl_t *db)
 {
 	objset_t *os;
-	zbookmark_t zb;
 
 	DB_GET_OBJSET(&os, db);
 	ASSERT(dsl_pool_sync_context(dmu_objset_pool(os)));
@@ -990,13 +982,7 @@ dbuf_release_bp(dmu_buf_impl_t *db)
 	    list_link_active(&os->os_dsl_dataset->ds_synced_link));
 	ASSERT(db->db_parent == NULL || arc_released(db->db_parent->db_buf));
 
-	zb.zb_objset = os->os_dsl_dataset ?
-	    os->os_dsl_dataset->ds_object : 0;
-	zb.zb_object = db->db.db_object;
-	zb.zb_level = db->db_level;
-	zb.zb_blkid = db->db_blkid;
-	(void) arc_release_bp(db->db_buf, db,
-	    db->db_blkptr, os->os_spa, &zb);
+	(void) arc_release(db->db_buf, db);
 }
 
 dbuf_dirty_record_t *
@@ -1831,7 +1817,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki
 	if (bp && !BP_IS_HOLE(bp)) {
 		int priority = dn->dn_type == DMU_OT_DDT_ZAP ?
 		    ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ;
-		arc_buf_t *pbuf;
 		dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset;
 		uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH;
 		zbookmark_t zb;
@@ -1839,13 +1824,8 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki
 		SET_BOOKMARK(&zb, ds ? ds->ds_object : DMU_META_OBJSET,
 		    dn->dn_object, 0, blkid);
 
-		if (db)
-			pbuf = db->db_buf;
-		else
-			pbuf = dn->dn_objset->os_phys_buf;
-
-		(void) dsl_read(NULL, dn->dn_objset->os_spa,
-		    bp, pbuf, NULL, NULL, priority,
+		(void) arc_read(NULL, dn->dn_objset->os_spa,
+		    bp, NULL, NULL, priority,
 		    ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE,
 		    &aflags, &zb);
 	}

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -128,7 +128,7 @@ report_dnode(struct diffarg *da, uint64_
 
 /* ARGSUSED */
 static int
-diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf,
+diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
     const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg)
 {
 	struct diffarg *da = arg;
@@ -155,9 +155,9 @@ diff_cb(spa_t *spa, zilog_t *zilog, cons
 		int blksz = BP_GET_LSIZE(bp);
 		int i;
 
-		if (dsl_read(NULL, spa, bp, pbuf,
-		    arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ,
-		    ZIO_FLAG_CANFAIL, &aflags, zb) != 0)
+		if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf,
+		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL,
+		    &aflags, zb) != 0)
 			return (EIO);
 
 		blk = abuf->b_data;

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -276,12 +276,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat
 			aflags |= ARC_L2CACHE;
 
 		dprintf_bp(os->os_rootbp, "reading %s", "");
-		/*
-		 * XXX when bprewrite scrub can change the bp,
-		 * and this is called from dmu_objset_open_ds_os, the bp
-		 * could change, and we'll need a lock.
-		 */
-		err = dsl_read_nolock(NULL, spa, os->os_rootbp,
+		err = arc_read(NULL, spa, os->os_rootbp,
 		    arc_getbuf_func, &os->os_phys_buf,
 		    ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &aflags, &zb);
 		if (err) {
@@ -1124,8 +1119,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio
 	SET_BOOKMARK(&zb, os->os_dsl_dataset ?
 	    os->os_dsl_dataset->ds_object : DMU_META_OBJSET,
 	    ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID);
-	VERIFY3U(0, ==, arc_release_bp(os->os_phys_buf, &os->os_phys_buf,
-	    os->os_rootbp, os->os_spa, &zb));
+	arc_release(os->os_phys_buf, &os->os_phys_buf);
 
 	dmu_write_policy(os, NULL, 0, 0, &zp);
 
@@ -1764,7 +1758,7 @@ dmu_objset_prefetch(const char *name, vo
 		SET_BOOKMARK(&zb, ds->ds_object, ZB_ROOT_OBJECT,
 		    ZB_ROOT_LEVEL, ZB_ROOT_BLKID);
 
-		(void) dsl_read_nolock(NULL, dsl_dataset_get_spa(ds),
+		(void) arc_read(NULL, dsl_dataset_get_spa(ds),
 		    &ds->ds_phys->ds_bp, NULL, NULL,
 		    ZIO_PRIORITY_ASYNC_READ,
 		    ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE,

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -317,7 +317,7 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t
 
 /* ARGSUSED */
 static int
-backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf,
+backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
     const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg)
 {
 	dmu_sendarg_t *dsp = arg;
@@ -346,9 +346,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co
 		uint32_t aflags = ARC_WAIT;
 		arc_buf_t *abuf;
 
-		if (dsl_read(NULL, spa, bp, pbuf,
-		    arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ,
-		    ZIO_FLAG_CANFAIL, &aflags, zb) != 0)
+		if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf,
+		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL,
+		    &aflags, zb) != 0)
 			return (EIO);
 
 		blk = abuf->b_data;
@@ -365,9 +365,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co
 		arc_buf_t *abuf;
 		int blksz = BP_GET_LSIZE(bp);
 
-		if (arc_read_nolock(NULL, spa, bp,
-		    arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ,
-		    ZIO_FLAG_CANFAIL, &aflags, zb) != 0)
+		if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf,
+		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL,
+		    &aflags, zb) != 0)
 			return (EIO);
 
 		err = dump_spill(dsp, zb->zb_object, blksz, abuf->b_data);
@@ -377,9 +377,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co
 		arc_buf_t *abuf;
 		int blksz = BP_GET_LSIZE(bp);
 
-		if (dsl_read(NULL, spa, bp, pbuf,
-		    arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ,
-		    ZIO_FLAG_CANFAIL, &aflags, zb) != 0) {
+		if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf,
+		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL,
+		    &aflags, zb) != 0) {
 			if (zfs_send_corrupt_data) {
 				/* Send a block filled with 0x"zfs badd bloc" */
 				abuf = arc_buf_alloc(spa, blksz, &abuf,

Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c
==============================================================================
--- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c	Wed Feb 27 19:03:31 2013	(r247405)
+++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c	Wed Feb 27 19:20:50 2013	(r247406)
@@ -62,9 +62,9 @@ typedef struct traverse_data {
 } traverse_data_t;
 
 static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp,
-    arc_buf_t *buf, uint64_t objset, uint64_t object);
+    uint64_t objset, uint64_t object);
 static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *,
-    arc_buf_t *buf, uint64_t objset, uint64_t object);
+    uint64_t objset, uint64_t object);
 
 static int
 traverse_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg)
@@ -81,7 +81,7 @@ traverse_zil_block(zilog_t *zilog, blkpt
 	SET_BOOKMARK(&zb, td->td_objset, ZB_ZIL_OBJECT, ZB_ZIL_LEVEL,
 	    bp->blk_cksum.zc_word[ZIL_ZC_SEQ]);
 
-	(void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, td->td_arg);
+	(void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg);
 
 	return (0);
 }
@@ -105,7 +105,7 @@ traverse_zil_record(zilog_t *zilog, lr_t
 		SET_BOOKMARK(&zb, td->td_objset, lr->lr_foid,
 		    ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp));
 
-		(void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL,
+		(void) td->td_func(td->td_spa, zilog, bp, &zb, NULL,
 		    td->td_arg);
 	}
 	return (0);
@@ -182,7 +182,7 @@ traverse_pause(traverse_data_t *td, cons
 
 static void
 traverse_prefetch_metadata(traverse_data_t *td,
-    arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb)
+    const blkptr_t *bp, const zbookmark_t *zb)
 {
 	uint32_t flags = ARC_NOWAIT | ARC_PREFETCH;
 
@@ -200,14 +200,13 @@ traverse_prefetch_metadata(traverse_data
 	if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE)
 		return;
 
-	(void) arc_read(NULL, td->td_spa, bp,
-	    pbuf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ,
-	    ZIO_FLAG_CANFAIL, &flags, zb);
+	(void) arc_read(NULL, td->td_spa, bp, NULL, NULL,
+	    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb);
 }
 
 static int
 traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp,
-    arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb)
+    const blkptr_t *bp, const zbookmark_t *zb)
 {
 	zbookmark_t czb;
 	int err = 0, lasterr = 0;
@@ -228,8 +227,7 @@ traverse_visitbp(traverse_data_t *td, co
 	}
 
 	if (BP_IS_HOLE(bp)) {
-		err = td->td_func(td->td_spa, NULL, NULL, pbuf, zb, dnp,
-		    td->td_arg);
+		err = td->td_func(td->td_spa, NULL, NULL, zb, dnp, td->td_arg);
 		return (err);
 	}
 
@@ -249,7 +247,7 @@ traverse_visitbp(traverse_data_t *td, co
 	}
 
 	if (td->td_flags & TRAVERSE_PRE) {
-		err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp,
+		err = td->td_func(td->td_spa, NULL, bp, zb, dnp,
 		    td->td_arg);
 		if (err == TRAVERSE_VISIT_NO_CHILDREN)
 			return (0);
@@ -265,8 +263,7 @@ traverse_visitbp(traverse_data_t *td, co
 		blkptr_t *cbp;
 		int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT;
 
-		err = dsl_read(NULL, td->td_spa, bp, pbuf,
-		    arc_getbuf_func, &buf,
+		err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf,
 		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb);
 		if (err)
 			return (err);
@@ -276,7 +273,7 @@ traverse_visitbp(traverse_data_t *td, co
 			SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object,
 			    zb->zb_level - 1,
 			    zb->zb_blkid * epb + i);
-			traverse_prefetch_metadata(td, buf, &cbp[i], &czb);
+			traverse_prefetch_metadata(td, &cbp[i], &czb);
 		}
 
 		/* recursively visitbp() blocks below this */
@@ -284,7 +281,7 @@ traverse_visitbp(traverse_data_t *td, co
 			SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object,
 			    zb->zb_level - 1,
 			    zb->zb_blkid * epb + i);
-			err = traverse_visitbp(td, dnp, buf, &cbp[i], &czb);
+			err = traverse_visitbp(td, dnp, &cbp[i], &czb);
 			if (err) {
 				if (!hard)
 					break;
@@ -296,21 +293,20 @@ traverse_visitbp(traverse_data_t *td, co
 		int i;
 		int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT;
 
-		err = dsl_read(NULL, td->td_spa, bp, pbuf,
-		    arc_getbuf_func, &buf,
+		err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf,
 		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb);
 		if (err)
 			return (err);
 		dnp = buf->b_data;
 
 		for (i = 0; i < epb; i++) {
-			prefetch_dnode_metadata(td, &dnp[i], buf, zb->zb_objset,
+			prefetch_dnode_metadata(td, &dnp[i], zb->zb_objset,
 			    zb->zb_blkid * epb + i);
 		}
 
 		/* recursively visitbp() blocks below this */
 		for (i = 0; i < epb; i++) {
-			err = traverse_dnode(td, &dnp[i], buf, zb->zb_objset,
+			err = traverse_dnode(td, &dnp[i], zb->zb_objset,
 			    zb->zb_blkid * epb + i);
 			if (err) {
 				if (!hard)
@@ -323,24 +319,23 @@ traverse_visitbp(traverse_data_t *td, co
 		objset_phys_t *osp;
 		dnode_phys_t *dnp;
 
-		err = dsl_read_nolock(NULL, td->td_spa, bp,
-		    arc_getbuf_func, &buf,
+		err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf,
 		    ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb);
 		if (err)
 			return (err);
 
 		osp = buf->b_data;
 		dnp = &osp->os_meta_dnode;
-		prefetch_dnode_metadata(td, dnp, buf, zb->zb_objset,
+		prefetch_dnode_metadata(td, dnp, zb->zb_objset,
DMU_META_DNODE_OBJECT); if (arc_buf_size(buf) >= sizeof (objset_phys_t)) { prefetch_dnode_metadata(td, &osp->os_userused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); prefetch_dnode_metadata(td, &osp->os_groupused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); } - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (err && hard) { lasterr = err; @@ -348,7 +343,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_userused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_USERUSED_OBJECT); } if (err && hard) { @@ -357,7 +352,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_groupused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_GROUPUSED_OBJECT); } } @@ -367,8 +362,7 @@ traverse_visitbp(traverse_data_t *td, co post: if (err == 0 && lasterr == 0 && (td->td_flags & TRAVERSE_POST)) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == ERESTART) pause = B_TRUE; } @@ -384,25 +378,25 @@ post: static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j; zbookmark_t czb; for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - traverse_prefetch_metadata(td, buf, &dnp->dn_blkptr[j], &czb); + traverse_prefetch_metadata(td, &dnp->dn_blkptr[j], &czb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - traverse_prefetch_metadata(td, buf, 
&dnp->dn_spill, &czb); + traverse_prefetch_metadata(td, &dnp->dn_spill, &czb); } } static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j, err = 0, lasterr = 0; zbookmark_t czb; @@ -410,7 +404,7 @@ traverse_dnode(traverse_data_t *td, cons for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_blkptr[j], &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_blkptr[j], &czb); if (err) { if (!hard) break; @@ -420,7 +414,7 @@ traverse_dnode(traverse_data_t *td, cons if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_spill, &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_spill, &czb); if (err) { if (!hard) return (err); @@ -433,8 +427,7 @@ traverse_dnode(traverse_data_t *td, cons /* ARGSUSED */ static int traverse_prefetcher(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, - void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; @@ -455,10 +448,8 @@ traverse_prefetcher(spa_t *spa, zilog_t cv_broadcast(&pfd->pd_cv); mutex_exit(&pfd->pd_mtx); - (void) dsl_read(NULL, spa, bp, pbuf, NULL, NULL, - ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, - &aflags, zb); + (void) arc_read(NULL, spa, bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, zb); return (0); } @@ -476,7 +467,7 @@ traverse_prefetch_thread(void *arg) SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) traverse_visitbp(&td, NULL, NULL, td.td_rootbp, &czb); + (void) traverse_visitbp(&td, NULL, td.td_rootbp, &czb); mutex_enter(&td_main->td_pfd->pd_mtx); 
td_main->td_pfd->pd_exited = B_TRUE; @@ -540,7 +531,7 @@ traverse_impl(spa_t *spa, dsl_dataset_t SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - err = traverse_visitbp(&td, NULL, NULL, rootbp, &czb); + err = traverse_visitbp(&td, NULL, rootbp, &czb); mutex_enter(&pd.pd_mtx); pd.pd_cancel = B_TRUE; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:20:50 2013 (r247406) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include <sys/dmu.h> @@ -284,6 +284,7 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u delta = P2NPHASE(off, dn->dn_datablksz); } + min_ibs = max_ibs = dn->dn_indblkshift; if (dn->dn_maxblkid > 0) { /* * The blocksize can't change, @@ -291,13 +292,6 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u */ ASSERT(dn->dn_datablkshift != 0); min_bs = max_bs = dn->dn_datablkshift; - min_ibs = max_ibs = dn->dn_indblkshift; - } else if (dn->dn_indblkshift > max_ibs) { - /* - * This ensures that if we reduce DN_MAX_INDBLKSHIFT, - * the code will still work correctly on older pools. 
- */ - min_ibs = max_ibs = dn->dn_indblkshift; } /* Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1308,7 +1308,7 @@ struct killarg { /* ARGSUSED */ static int -kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct killarg *ka = arg; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:20:50 2013 (r247406) @@ -396,24 +396,6 @@ dsl_free_sync(zio_t *pio, dsl_pool_t *dp zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags)); } -int -dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read(pio, spa, bpp, pbuf, done, private, - priority, zio_flags, arc_flags, zb)); -} - -int -dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read_nolock(pio, spa, bpp, done, private, - priority, zio_flags, arc_flags, zb)); -} - static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { @@ -584,12 +566,8 @@ dsl_scan_prefetch(dsl_scan_t *scn, arc_b SET_BOOKMARK(&czb, objset, object, BP_GET_LEVEL(bp), blkid); - /* - * XXX need 
to make sure all of these arc_read() prefetches are - * done before setting xlateall (similar to dsl_read()) - */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, bp, - buf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD, &flags, &czb); } @@ -647,8 +625,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -670,8 +647,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da } else if (BP_GET_TYPE(bp) == DMU_OT_USERGROUP_USED) { uint32_t flags = ARC_WAIT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -683,8 +659,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da int i, j; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -706,8 +681,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da uint32_t flags = ARC_WAIT; objset_phys_t *osp; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:03:31 2013 (r247405) +++ 
stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:20:50 2013 (r247406) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include <sys/zfs_context.h> @@ -97,6 +98,15 @@ int metaslab_prefetch_limit = SPA_DVAS_P int metaslab_smo_bonus_pct = 150; /* + * Should we be willing to write data to degraded vdevs? + */ +boolean_t zfs_write_to_degraded = B_FALSE; +SYSCTL_INT(_vfs_zfs, OID_AUTO, write_to_degraded, CTLFLAG_RW, + &zfs_write_to_degraded, 0, + "Allow writing data to degraded vdevs"); +TUNABLE_INT("vfs.zfs.write_to_degraded", &zfs_write_to_degraded); + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -1383,10 +1393,13 @@ top: /* * Avoid writing single-copy data to a failing vdev + * unless the user instructs us that it is okay. */ if ((vd->vdev_stat.vs_write_errors > 0 || vd->vdev_state < VDEV_STATE_HEALTHY) && - d == 0 && dshift == 3) { + d == 0 && dshift == 3 && + !(zfs_write_to_degraded && vd->vdev_state == + VDEV_STATE_DEGRADED)) { all_zero = B_FALSE; goto next; } Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:20:50 2013 (r247406) @@ -553,6 +553,7 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ { int var_size = 0; int i; + int j = -1; int full_space; int hdrsize; boolean_t done = B_FALSE; @@ -574,11 +575,13 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ sizeof (sa_hdr_phys_t); full_space = (buftype == SA_BONUS) ? 
DN_MAX_BONUSLEN : db->db_size; + ASSERT(IS_P2ALIGNED(full_space, 8)); for (i = 0; i != attr_count; i++) { boolean_t is_var_sz; - *total += P2ROUNDUP(attr_desc[i].sa_length, 8); + *total = P2ROUNDUP(*total, 8); + *total += attr_desc[i].sa_length; if (done) goto next; @@ -590,7 +593,14 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ if (is_var_sz && var_size > 1) { if (P2ROUNDUP(hdrsize + sizeof (uint16_t), 8) + *total < full_space) { + /* + * Account for header space used by array of + * optional sizes of variable-length attributes. + * Record the index in case this increase needs + * to be reversed due to spill-over. + */ hdrsize += sizeof (uint16_t); + j = i; } else { done = B_TRUE; *index = i; @@ -619,6 +629,14 @@ next: *will_spill = B_TRUE; } + /* + * j holds the index of the last variable-sized attribute for + * which hdrsize was increased. Reverse the increase if that + * attribute will be relocated to the spill block. + */ + if (*will_spill && j == *index) + hdrsize -= sizeof (uint16_t); + hdrsize = P2ROUNDUP(hdrsize, 8); return (hdrsize); } @@ -709,6 +727,8 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu for (i = 0, len_idx = 0, hash = -1ULL; i != attr_count; i++) { uint16_t length; + ASSERT(IS_P2ALIGNED(data_start, 8)); + ASSERT(IS_P2ALIGNED(buf_space, 8)); attrs[i] = attr_desc[i].sa_attr; length = SA_REGISTERED_LEN(sa, attrs[i]); if (length == 0) @@ -717,6 +737,7 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu VERIFY(length == attr_desc[i].sa_length); if (buf_space < length) { /* switch to spill buffer */ + VERIFY(spilling); VERIFY(bonustype == DMU_OT_SA); if (buftype == SA_BONUS && !sa->sa_force_spill) { sa_find_layout(hdl->sa_os, hash, attrs_start, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed 
Feb 27 19:20:50 2013 (r247406) @@ -1764,7 +1764,7 @@ spa_load_verify_done(zio_t *zio) /*ARGSUSED*/ static int spa_load_verify_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { if (bp != NULL) { zio_t *rio = arg; Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:20:50 2013 (r247406) @@ -49,7 +49,6 @@ struct arc_buf { arc_buf_hdr_t *b_hdr; arc_buf_t *b_next; kmutex_t b_evict_lock; - krwlock_t b_data_lock; void *b_data; arc_evict_func_t *b_efunc; void *b_private; @@ -93,8 +92,6 @@ void arc_buf_add_ref(arc_buf_t *buf, voi int arc_buf_remove_ref(arc_buf_t *buf, void *tag); int arc_buf_size(arc_buf_t *buf); void arc_release(arc_buf_t *buf, void *tag); -int arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb); int arc_released(arc_buf_t *buf); int arc_has_callback(arc_buf_t *buf); void arc_buf_freeze(arc_buf_t *buf); @@ -103,10 +100,7 @@ void arc_buf_thaw(arc_buf_t *buf); int arc_referenced(arc_buf_t *buf); #endif -int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, +int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h 
============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:20:50 2013 (r247406) @@ -40,8 +40,7 @@ struct zilog; struct arc_buf; typedef int (blkptr_cb_t)(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - struct arc_buf *pbuf, const zbookmark_t *zb, const struct dnode_phys *dnp, - void *arg); + const zbookmark_t *zb, const struct dnode_phys *dnp, void *arg); #define TRAVERSE_PRE (1<<0) #define TRAVERSE_POST (1<<1) Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:20:50 2013 (r247406) @@ -134,12 +134,6 @@ void dsl_pool_willuse_space(dsl_pool_t * void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); -int dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); void dsl_pool_create_origin(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_clones(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_dir_clones(dsl_pool_t *dp, dmu_tx_t *tx); Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h ============================================================================== --- 
stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:20:50 2013 (r247406) @@ -20,6 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012 by Delphix. All rights reserved. */ #ifndef _SYS_REFCOUNT_H @@ -54,8 +55,8 @@ typedef struct refcount { kmutex_t rc_mtx; list_t rc_list; list_t rc_removed; - int64_t rc_count; - int64_t rc_removed_count; + uint64_t rc_count; + uint64_t rc_removed_count; } refcount_t; /* Note: refcount_t must be initialized with refcount_create() */ Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:20:50 2013 (r247406) @@ -1328,7 +1328,8 @@ vdev_validate(vdev_t *vd, boolean_t stri if (vd->vdev_ops->vdev_op_leaf && vdev_readable(vd)) { uint64_t aux_guid = 0; nvlist_t *nvl; - uint64_t txg = strict ? spa->spa_config_txg : -1ULL; + uint64_t txg = spa_last_synced_txg(spa) != 0 ? 
+ spa_last_synced_txg(spa) : -1ULL; if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1512,7 +1513,7 @@ vdev_reopen(vdev_t *vd) !l2arc_vdev_present(vd)) l2arc_add_vdev(spa, vd); } else { - (void) vdev_validate(vd, spa_last_synced_txg(spa)); + (void) vdev_validate(vd, B_TRUE); } /* Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:03:31 2013 (r247405) +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:20:50 2013 (r247406) @@ -106,12 +106,18 @@ typedef enum { DATASET_NAME } zfs_ioc_namecheck_t; +typedef enum { + POOL_CHECK_NONE = 1 << 0, + POOL_CHECK_SUSPENDED = 1 << 1, + POOL_CHECK_READONLY = 1 << 2 +} zfs_ioc_poolcheck_t; + typedef struct zfs_ioc_vec { zfs_ioc_func_t *zvec_func; zfs_secpolicy_func_t *zvec_secpolicy; zfs_ioc_namecheck_t zvec_namecheck; boolean_t zvec_his_log; - boolean_t zvec_pool_check; + zfs_ioc_poolcheck_t zvec_pool_check; } zfs_ioc_vec_t; /* This array is indexed by zfs_userquota_prop_t */ @@ -5033,138 +5039,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc) static zfs_ioc_vec_t zfs_ioc_vec[] = { { zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_destroy, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
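A note on the zfs_ioctl.c hunk above, which is cut off before the dispatch code: the new zvec_pool_check field lets each ioctl declare which pool states should make it fail up front. The sketch below is a userland illustration only, not the committed kernel code. The enum values are taken from the diff; demo_pool_t, demo_ioc_vec_t, demo_pool_status_check(), demo_dispatch() and the two handlers are hypothetical names, and returning EAGAIN for a suspended pool and EROFS for a read-only one is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Same flag values as the zfs_ioc_poolcheck_t enum added in the diff. */
typedef enum {
	POOL_CHECK_NONE = 1 << 0,
	POOL_CHECK_SUSPENDED = 1 << 1,
	POOL_CHECK_READONLY = 1 << 2
} zfs_ioc_poolcheck_t;

/* Hypothetical stand-in for the pool state the kernel would look up. */
typedef struct {
	bool suspended;
	bool readonly;
} demo_pool_t;

/* Hypothetical table entry: a handler plus the checks it asks for. */
typedef struct {
	int (*handler)(demo_pool_t *);
	zfs_ioc_poolcheck_t pool_check;
} demo_ioc_vec_t;

/*
 * Hypothetical check: return 0 if the request may proceed, or an
 * errno value if the pool state rules it out (assumed values).
 */
static int
demo_pool_status_check(const demo_pool_t *pool, zfs_ioc_poolcheck_t check)
{
	if (check & POOL_CHECK_NONE)
		return (0);
	if ((check & POOL_CHECK_SUSPENDED) && pool->suspended)
		return (EAGAIN);
	if ((check & POOL_CHECK_READONLY) && pool->readonly)
		return (EROFS);
	return (0);
}

/* Run the declared checks first, then call the handler. */
static int
demo_dispatch(const demo_ioc_vec_t *vec, demo_pool_t *pool)
{
	int error;

	assert(vec != NULL && pool != NULL);
	error = demo_pool_status_check(pool, vec->pool_check);
	if (error != 0)
		return (error);	/* refuse early; the handler never runs */
	return (vec->handler(pool));
}

/* Hypothetical handler that would need to write to the pool. */
static int
demo_set_prop(demo_pool_t *pool)
{
	(void)pool;
	return (0);
}

/* Hypothetical handler that only reads pool state. */
static int
demo_get_stats(demo_pool_t *pool)
{
	(void)pool;
	return (0);
}
```

With an entry flagged POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY, a property-setting request against a read-only pool would come back with an error at once instead of blocking in txg_wait_synced as in the procstat trace from the original report. A read-only query flagged with POOL_CHECK_SUSPENDED alone would still be allowed through.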
Author: mm
Date: Wed Feb 27 19:22:27 2013
New Revision: 247407
URL: http://svnweb.freebsd.org/changeset/base/247407

Log:
  MFC r246631,246651,246666,246675,246678,246688:
  Merge various ZFS bugfixes

  MFC r246631:
  Import vendor bugfixes

  Illumos ZFS issues:
    3422 zpool create/syseventd race yield non-importable pool
    3425 first write to a new zvol can fail with EFBIG

  MFC r246651:
  Import minor type change in refcount.h header from vendor (illumos).

  MFC r246666:
  Import vendor ZFS bugfix fixing a problem in arc_read().

  Illumos ZFS issues:
    3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt)

  MFC r246675:
  Add tunable to allow block allocation on degraded vdevs.

  Illumos ZFS issues:
    3507 Tunable to allow block allocation even on degraded vdevs

  MFC r246678:
  Import vendor bugfixes regarding SA rounding, header size and layout.
  This was already partially fixed by avg.

  Illumos ZFS issues:
    3512 rounding discrepancy in sa_find_sizes()
    3513 mismatch between SA header size and layout

  MFC r246688 [1]:
  Merge zfs_ioctl.c code that should have been merged together with ZFS v28.
  Fixes several problems if working with read-only pools.
  Changed code originally introduced in onnv-gate 13061:bda0decf867b
  Contains changes up to illumos-gate 13700:4bc0783f6064

  PR: kern/175897 [1]
  Suggested by: avg [1]

Modified:
  stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c
  stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c

Directory Properties:
  stable/8/cddl/contrib/opensolaris/ (props changed)
  stable/8/cddl/contrib/opensolaris/lib/libzfs/ (props changed)
  stable/8/sys/ (props changed)
  stable/8/sys/cddl/ (props changed)
  stable/8/sys/cddl/contrib/opensolaris/ (props
changed) Modified: stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Feb 27 19:22:27 2013 (r247407) @@ -983,7 +983,7 @@ visit_indirect(spa_t *spa, const dnode_p arc_buf_t *buf; uint64_t fill = 0; - err = arc_read_nolock(NULL, spa, bp, arc_getbuf_func, &buf, + err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -2001,9 +2001,8 @@ zdb_count_block(zdb_cb_t *zcb, zilog_t * bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0); } -/* ARGSUSED */ static int -zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { zdb_cb_t *zcb = arg; @@ -2410,7 +2409,7 @@ typedef struct zdb_ddt_entry { /* ARGSUSED */ static int zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { avl_tree_t *t = arg; avl_index_t where; Modified: stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Feb 27 19:22:27 2013 (r247407) @@ -526,13 +526,12 @@ get_configs(libzfs_handle_t *hdl, pool_l * version * pool guid * name - * pool txg (if available) * comment (if available) * pool state * hostid (if available) * hostname (if available) */ - uint64_t state, version, pool_txg; + uint64_t state, version; char *comment = NULL; version = 
fnvlist_lookup_uint64(tmp, @@ -548,11 +547,6 @@ get_configs(libzfs_handle_t *hdl, pool_l fnvlist_add_string(config, ZPOOL_CONFIG_POOL_NAME, name); - if (nvlist_lookup_uint64(tmp, - ZPOOL_CONFIG_POOL_TXG, &pool_txg) == 0) - fnvlist_add_uint64(config, - ZPOOL_CONFIG_POOL_TXG, pool_txg); - if (nvlist_lookup_string(tmp, ZPOOL_CONFIG_COMMENT, &comment) == 0) fnvlist_add_string(config, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Feb 27 19:22:27 2013 (r247407) @@ -940,7 +940,6 @@ buf_cons(void *vbuf, void *unused, int k bzero(buf, sizeof (arc_buf_t)); mutex_init(&buf->b_evict_lock, NULL, MUTEX_DEFAULT, NULL); - rw_init(&buf->b_data_lock, NULL, RW_DEFAULT, NULL); arc_space_consume(sizeof (arc_buf_t), ARC_SPACE_HDRS); return (0); @@ -970,7 +969,6 @@ buf_dest(void *vbuf, void *unused) arc_buf_t *buf = vbuf; mutex_destroy(&buf->b_evict_lock); - rw_destroy(&buf->b_data_lock); arc_space_return(sizeof (arc_buf_t), ARC_SPACE_HDRS); } @@ -2968,42 +2966,11 @@ arc_read_done(zio_t *zio) * * arc_read_done() will invoke all the requested "done" functions * for readers of this block. - * - * Normal callers should use arc_read and pass the arc buffer and offset - * for the bp. But if you know you don't need locking, you can use - * arc_read_nolock. */ int -arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - int err; - - if (pbuf == NULL) { - /* - * XXX This happens from traverse callback funcs, for - * the objset_phys_t block. 
- */ - return (arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb)); - } - - ASSERT(!refcount_is_zero(&pbuf->b_hdr->b_refcnt)); - ASSERT3U((char *)bp - (char *)pbuf->b_data, <, pbuf->b_hdr->b_size); - rw_enter(&pbuf->b_data_lock, RW_READER); - - err = arc_read_nolock(pio, spa, bp, done, private, priority, - zio_flags, arc_flags, zb); - rw_exit(&pbuf->b_data_lock); - - return (err); -} - -int -arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) +arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, + void *private, int priority, int zio_flags, uint32_t *arc_flags, + const zbookmark_t *zb) { arc_buf_hdr_t *hdr; arc_buf_t *buf; @@ -3482,19 +3449,6 @@ arc_release(arc_buf_t *buf, void *tag) } } -/* - * Release this buffer. If it does not match the provided BP, fill it - * with that block's contents. - */ -/* ARGSUSED */ -int -arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb) -{ - arc_release(buf, tag); - return (0); -} - int arc_released(arc_buf_t *buf) { Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c Wed Feb 27 19:22:27 2013 (r247407) @@ -135,7 +135,7 @@ bptree_add(objset_t *os, uint64_t obj, b /* ARGSUSED */ static int -bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +bptree_visit_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { int err; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- 
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Feb 27 19:22:27 2013 (r247407) @@ -513,7 +513,6 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t spa_t *spa; zbookmark_t zb; uint32_t aflags = ARC_NOWAIT; - arc_buf_t *pbuf; DB_DNODE_ENTER(db); dn = DB_DNODE(db); @@ -575,14 +574,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t db->db.db_object, db->db_level, db->db_blkid); dbuf_add_ref(db, NULL); - /* ZIO_FLAG_CANFAIL callers have to check the parent zio's error */ - if (db->db_parent) - pbuf = db->db_parent->db_buf; - else - pbuf = db->db_objset->os_phys_buf; - - (void) dsl_read(zio, spa, db->db_blkptr, pbuf, + (void) arc_read(zio, spa, db->db_blkptr, dbuf_read_done, db, ZIO_PRIORITY_SYNC_READ, (*flags & DB_RF_CANFAIL) ? ZIO_FLAG_CANFAIL : ZIO_FLAG_MUSTSUCCEED, &aflags, &zb); @@ -982,7 +975,6 @@ void dbuf_release_bp(dmu_buf_impl_t *db) { objset_t *os; - zbookmark_t zb; DB_GET_OBJSET(&os, db); ASSERT(dsl_pool_sync_context(dmu_objset_pool(os))); @@ -990,13 +982,7 @@ dbuf_release_bp(dmu_buf_impl_t *db) list_link_active(&os->os_dsl_dataset->ds_synced_link)); ASSERT(db->db_parent == NULL || arc_released(db->db_parent->db_buf)); - zb.zb_objset = os->os_dsl_dataset ? - os->os_dsl_dataset->ds_object : 0; - zb.zb_object = db->db.db_object; - zb.zb_level = db->db_level; - zb.zb_blkid = db->db_blkid; - (void) arc_release_bp(db->db_buf, db, - db->db_blkptr, os->os_spa, &zb); + (void) arc_release(db->db_buf, db); } dbuf_dirty_record_t * @@ -1831,7 +1817,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki if (bp && !BP_IS_HOLE(bp)) { int priority = dn->dn_type == DMU_OT_DDT_ZAP ? ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ; - arc_buf_t *pbuf; dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; zbookmark_t zb; @@ -1839,13 +1824,8 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki SET_BOOKMARK(&zb, ds ? 
ds->ds_object : DMU_META_OBJSET, dn->dn_object, 0, blkid); - if (db) - pbuf = db->db_buf; - else - pbuf = dn->dn_objset->os_phys_buf; - - (void) dsl_read(NULL, dn->dn_objset->os_spa, - bp, pbuf, NULL, NULL, priority, + (void) arc_read(NULL, dn->dn_objset->os_spa, + bp, NULL, NULL, priority, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, &zb); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c Wed Feb 27 19:22:27 2013 (r247407) @@ -128,7 +128,7 @@ report_dnode(struct diffarg *da, uint64_ /* ARGSUSED */ static int -diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +diff_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct diffarg *da = arg; @@ -155,9 +155,9 @@ diff_cb(spa_t *spa, zilog_t *zilog, cons int blksz = BP_GET_LSIZE(bp); int i; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Feb 27 19:22:27 2013 (r247407) @@ -276,12 +276,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat aflags |= ARC_L2CACHE; dprintf_bp(os->os_rootbp, "reading %s", ""); - /* - * XXX when bprewrite scrub can change the bp, - * and this is called from 
dmu_objset_open_ds_os, the bp - * could change, and we'll need a lock. - */ - err = dsl_read_nolock(NULL, spa, os->os_rootbp, + err = arc_read(NULL, spa, os->os_rootbp, arc_getbuf_func, &os->os_phys_buf, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &aflags, &zb); if (err) { @@ -1124,8 +1119,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio SET_BOOKMARK(&zb, os->os_dsl_dataset ? os->os_dsl_dataset->ds_object : DMU_META_OBJSET, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - VERIFY3U(0, ==, arc_release_bp(os->os_phys_buf, &os->os_phys_buf, - os->os_rootbp, os->os_spa, &zb)); + arc_release(os->os_phys_buf, &os->os_phys_buf); dmu_write_policy(os, NULL, 0, 0, &zp); @@ -1764,7 +1758,7 @@ dmu_objset_prefetch(const char *name, vo SET_BOOKMARK(&zb, ds->ds_object, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) dsl_read_nolock(NULL, dsl_dataset_get_spa(ds), + (void) arc_read(NULL, dsl_dataset_get_spa(ds), &ds->ds_phys->ds_bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Feb 27 19:22:27 2013 (r247407) @@ -317,7 +317,7 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t /* ARGSUSED */ static int -backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +backup_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { dmu_sendarg_t *dsp = arg; @@ -346,9 +346,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co uint32_t aflags = ARC_WAIT; arc_buf_t *abuf; - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, 
ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); blk = abuf->b_data; @@ -365,9 +365,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (arc_read_nolock(NULL, spa, bp, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) return (EIO); err = dump_spill(dsp, zb->zb_object, blksz, abuf->b_data); @@ -377,9 +377,9 @@ backup_cb(spa_t *spa, zilog_t *zilog, co arc_buf_t *abuf; int blksz = BP_GET_LSIZE(bp); - if (dsl_read(NULL, spa, bp, pbuf, - arc_getbuf_func, &abuf, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &aflags, zb) != 0) { + if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, + &aflags, zb) != 0) { if (zfs_send_corrupt_data) { /* Send a block filled with 0x"zfs badd bloc" */ abuf = arc_buf_alloc(spa, blksz, &abuf, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Wed Feb 27 19:22:27 2013 (r247407) @@ -62,9 +62,9 @@ typedef struct traverse_data { } traverse_data_t; static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *, - arc_buf_t *buf, uint64_t objset, uint64_t object); + uint64_t objset, uint64_t object); static int traverse_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg) @@ -81,7 +81,7 @@ traverse_zil_block(zilog_t *zilog, blkpt SET_BOOKMARK(&zb, td->td_objset, ZB_ZIL_OBJECT, ZB_ZIL_LEVEL, 
bp->blk_cksum.zc_word[ZIL_ZC_SEQ]); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, td->td_arg); + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); return (0); } @@ -105,7 +105,7 @@ traverse_zil_record(zilog_t *zilog, lr_t SET_BOOKMARK(&zb, td->td_objset, lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); - (void) td->td_func(td->td_spa, zilog, bp, NULL, &zb, NULL, + (void) td->td_func(td->td_spa, zilog, bp, &zb, NULL, td->td_arg); } return (0); @@ -182,7 +182,7 @@ traverse_pause(traverse_data_t *td, cons static void traverse_prefetch_metadata(traverse_data_t *td, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { uint32_t flags = ARC_NOWAIT | ARC_PREFETCH; @@ -200,14 +200,13 @@ traverse_prefetch_metadata(traverse_data if (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE) return; - (void) arc_read(NULL, td->td_spa, bp, - pbuf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL, &flags, zb); + (void) arc_read(NULL, td->td_spa, bp, NULL, NULL, + ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); } static int traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *pbuf, const blkptr_t *bp, const zbookmark_t *zb) + const blkptr_t *bp, const zbookmark_t *zb) { zbookmark_t czb; int err = 0, lasterr = 0; @@ -228,8 +227,7 @@ traverse_visitbp(traverse_data_t *td, co } if (BP_IS_HOLE(bp)) { - err = td->td_func(td->td_spa, NULL, NULL, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, NULL, zb, dnp, td->td_arg); return (err); } @@ -249,7 +247,7 @@ traverse_visitbp(traverse_data_t *td, co } if (td->td_flags & TRAVERSE_PRE) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == TRAVERSE_VISIT_NO_CHILDREN) return (0); @@ -265,8 +263,7 @@ traverse_visitbp(traverse_data_t *td, co blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = 
dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); @@ -276,7 +273,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - traverse_prefetch_metadata(td, buf, &cbp[i], &czb); + traverse_prefetch_metadata(td, &cbp[i], &czb); } /* recursively visitbp() blocks below this */ @@ -284,7 +281,7 @@ traverse_visitbp(traverse_data_t *td, co SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); - err = traverse_visitbp(td, dnp, buf, &cbp[i], &czb); + err = traverse_visitbp(td, dnp, &cbp[i], &czb); if (err) { if (!hard) break; @@ -296,21 +293,20 @@ traverse_visitbp(traverse_data_t *td, co int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = dsl_read(NULL, td->td_spa, bp, pbuf, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); dnp = buf->b_data; for (i = 0; i < epb; i++) { - prefetch_dnode_metadata(td, &dnp[i], buf, zb->zb_objset, + prefetch_dnode_metadata(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); } /* recursively visitbp() blocks below this */ for (i = 0; i < epb; i++) { - err = traverse_dnode(td, &dnp[i], buf, zb->zb_objset, + err = traverse_dnode(td, &dnp[i], zb->zb_objset, zb->zb_blkid * epb + i); if (err) { if (!hard) @@ -323,24 +319,23 @@ traverse_visitbp(traverse_data_t *td, co objset_phys_t *osp; dnode_phys_t *dnp; - err = dsl_read_nolock(NULL, td->td_spa, bp, - arc_getbuf_func, &buf, + err = arc_read(NULL, td->td_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); osp = buf->b_data; dnp = &osp->os_meta_dnode; - prefetch_dnode_metadata(td, dnp, buf, zb->zb_objset, + prefetch_dnode_metadata(td, dnp, zb->zb_objset, 
DMU_META_DNODE_OBJECT); if (arc_buf_size(buf) >= sizeof (objset_phys_t)) { prefetch_dnode_metadata(td, &osp->os_userused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); prefetch_dnode_metadata(td, &osp->os_groupused_dnode, - buf, zb->zb_objset, DMU_USERUSED_OBJECT); + zb->zb_objset, DMU_USERUSED_OBJECT); } - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_META_DNODE_OBJECT); if (err && hard) { lasterr = err; @@ -348,7 +343,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_userused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_USERUSED_OBJECT); } if (err && hard) { @@ -357,7 +352,7 @@ traverse_visitbp(traverse_data_t *td, co } if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) { dnp = &osp->os_groupused_dnode; - err = traverse_dnode(td, dnp, buf, zb->zb_objset, + err = traverse_dnode(td, dnp, zb->zb_objset, DMU_GROUPUSED_OBJECT); } } @@ -367,8 +362,7 @@ traverse_visitbp(traverse_data_t *td, co post: if (err == 0 && lasterr == 0 && (td->td_flags & TRAVERSE_POST)) { - err = td->td_func(td->td_spa, NULL, bp, pbuf, zb, dnp, - td->td_arg); + err = td->td_func(td->td_spa, NULL, bp, zb, dnp, td->td_arg); if (err == ERESTART) pause = B_TRUE; } @@ -384,25 +378,25 @@ post: static void prefetch_dnode_metadata(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j; zbookmark_t czb; for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - traverse_prefetch_metadata(td, buf, &dnp->dn_blkptr[j], &czb); + traverse_prefetch_metadata(td, &dnp->dn_blkptr[j], &czb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - traverse_prefetch_metadata(td, buf, 
&dnp->dn_spill, &czb); + traverse_prefetch_metadata(td, &dnp->dn_spill, &czb); } } static int traverse_dnode(traverse_data_t *td, const dnode_phys_t *dnp, - arc_buf_t *buf, uint64_t objset, uint64_t object) + uint64_t objset, uint64_t object) { int j, err = 0, lasterr = 0; zbookmark_t czb; @@ -410,7 +404,7 @@ traverse_dnode(traverse_data_t *td, cons for (j = 0; j < dnp->dn_nblkptr; j++) { SET_BOOKMARK(&czb, objset, object, dnp->dn_nlevels - 1, j); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_blkptr[j], &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_blkptr[j], &czb); if (err) { if (!hard) break; @@ -420,7 +414,7 @@ traverse_dnode(traverse_data_t *td, cons if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { SET_BOOKMARK(&czb, objset, object, 0, DMU_SPILL_BLKID); - err = traverse_visitbp(td, dnp, buf, &dnp->dn_spill, &czb); + err = traverse_visitbp(td, dnp, &dnp->dn_spill, &czb); if (err) { if (!hard) return (err); @@ -433,8 +427,7 @@ traverse_dnode(traverse_data_t *td, cons /* ARGSUSED */ static int traverse_prefetcher(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, - void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; @@ -455,10 +448,8 @@ traverse_prefetcher(spa_t *spa, zilog_t cv_broadcast(&pfd->pd_cv); mutex_exit(&pfd->pd_mtx); - (void) dsl_read(NULL, spa, bp, pbuf, NULL, NULL, - ZIO_PRIORITY_ASYNC_READ, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, - &aflags, zb); + (void) arc_read(NULL, spa, bp, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, zb); return (0); } @@ -476,7 +467,7 @@ traverse_prefetch_thread(void *arg) SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - (void) traverse_visitbp(&td, NULL, NULL, td.td_rootbp, &czb); + (void) traverse_visitbp(&td, NULL, td.td_rootbp, &czb); mutex_enter(&td_main->td_pfd->pd_mtx); 
td_main->td_pfd->pd_exited = B_TRUE; @@ -540,7 +531,7 @@ traverse_impl(spa_t *spa, dsl_dataset_t SET_BOOKMARK(&czb, td.td_objset, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); - err = traverse_visitbp(&td, NULL, NULL, rootbp, &czb); + err = traverse_visitbp(&td, NULL, rootbp, &czb); mutex_enter(&pd.pd_mtx); pd.pd_cancel = B_TRUE; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Feb 27 19:22:27 2013 (r247407) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include <sys/dmu.h> @@ -284,6 +284,7 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u delta = P2NPHASE(off, dn->dn_datablksz); } + min_ibs = max_ibs = dn->dn_indblkshift; if (dn->dn_maxblkid > 0) { /* * The blocksize can't change, @@ -291,13 +292,6 @@ dmu_tx_count_write(dmu_tx_hold_t *txh, u */ ASSERT(dn->dn_datablkshift != 0); min_bs = max_bs = dn->dn_datablkshift; - min_ibs = max_ibs = dn->dn_indblkshift; - } else if (dn->dn_indblkshift > max_ibs) { - /* - * This ensures that if we reduce DN_MAX_INDBLKSHIFT, - * the code will still work correctly on older pools. 
- */ - min_ibs = max_ibs = dn->dn_indblkshift; } /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1308,7 +1308,7 @@ struct killarg { /* ARGSUSED */ static int -kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, arc_buf_t *pbuf, +kill_blkptr(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { struct killarg *ka = arg; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Feb 27 19:22:27 2013 (r247407) @@ -396,24 +396,6 @@ dsl_free_sync(zio_t *pio, dsl_pool_t *dp zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags)); } -int -dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read(pio, spa, bpp, pbuf, done, private, - priority, zio_flags, arc_flags, zb)); -} - -int -dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *private, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb) -{ - return (arc_read_nolock(pio, spa, bpp, done, private, - priority, zio_flags, arc_flags, zb)); -} - static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { @@ -584,12 +566,8 @@ dsl_scan_prefetch(dsl_scan_t *scn, arc_b SET_BOOKMARK(&czb, objset, object, BP_GET_LEVEL(bp), blkid); - /* - * XXX need 
to make sure all of these arc_read() prefetches are - * done before setting xlateall (similar to dsl_read()) - */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, bp, - buf, NULL, NULL, ZIO_PRIORITY_ASYNC_READ, + NULL, NULL, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD, &flags, &czb); } @@ -647,8 +625,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -670,8 +647,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da } else if (BP_GET_TYPE(bp) == DMU_OT_USERGROUP_USED) { uint32_t flags = ARC_WAIT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -683,8 +659,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da int i, j; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; @@ -706,8 +681,7 @@ dsl_scan_recurse(dsl_scan_t *scn, dsl_da uint32_t flags = ARC_WAIT; objset_phys_t *osp; - err = arc_read_nolock(NULL, dp->dp_spa, bp, - arc_getbuf_func, bufp, + err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, bufp, ZIO_PRIORITY_ASYNC_READ, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:20:50 2013 (r247406) +++ 
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Feb 27 19:22:27 2013 (r247407) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include <sys/zfs_context.h> @@ -97,6 +98,15 @@ int metaslab_prefetch_limit = SPA_DVAS_P int metaslab_smo_bonus_pct = 150; /* + * Should we be willing to write data to degraded vdevs? + */ +boolean_t zfs_write_to_degraded = B_FALSE; +SYSCTL_INT(_vfs_zfs, OID_AUTO, write_to_degraded, CTLFLAG_RW, + &zfs_write_to_degraded, 0, + "Allow writing data to degraded vdevs"); +TUNABLE_INT("vfs.zfs.write_to_degraded", &zfs_write_to_degraded); + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -1383,10 +1393,13 @@ top: /* * Avoid writing single-copy data to a failing vdev + * unless the user instructs us that it is okay. */ if ((vd->vdev_stat.vs_write_errors > 0 || vd->vdev_state < VDEV_STATE_HEALTHY) && - d == 0 && dshift == 3) { + d == 0 && dshift == 3 && + !(zfs_write_to_degraded && vd->vdev_state == + VDEV_STATE_DEGRADED)) { all_zero = B_FALSE; goto next; } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Feb 27 19:22:27 2013 (r247407) @@ -553,6 +553,7 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ { int var_size = 0; int i; + int j = -1; int full_space; int hdrsize; boolean_t done = B_FALSE; @@ -574,11 +575,13 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ sizeof (sa_hdr_phys_t); full_space = (buftype == SA_BONUS) ? 
DN_MAX_BONUSLEN : db->db_size; + ASSERT(IS_P2ALIGNED(full_space, 8)); for (i = 0; i != attr_count; i++) { boolean_t is_var_sz; - *total += P2ROUNDUP(attr_desc[i].sa_length, 8); + *total = P2ROUNDUP(*total, 8); + *total += attr_desc[i].sa_length; if (done) goto next; @@ -590,7 +593,14 @@ sa_find_sizes(sa_os_t *sa, sa_bulk_attr_ if (is_var_sz && var_size > 1) { if (P2ROUNDUP(hdrsize + sizeof (uint16_t), 8) + *total < full_space) { + /* + * Account for header space used by array of + * optional sizes of variable-length attributes. + * Record the index in case this increase needs + * to be reversed due to spill-over. + */ hdrsize += sizeof (uint16_t); + j = i; } else { done = B_TRUE; *index = i; @@ -619,6 +629,14 @@ next: *will_spill = B_TRUE; } + /* + * j holds the index of the last variable-sized attribute for + * which hdrsize was increased. Reverse the increase if that + * attribute will be relocated to the spill block. + */ + if (*will_spill && j == *index) + hdrsize -= sizeof (uint16_t); + hdrsize = P2ROUNDUP(hdrsize, 8); return (hdrsize); } @@ -709,6 +727,8 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu for (i = 0, len_idx = 0, hash = -1ULL; i != attr_count; i++) { uint16_t length; + ASSERT(IS_P2ALIGNED(data_start, 8)); + ASSERT(IS_P2ALIGNED(buf_space, 8)); attrs[i] = attr_desc[i].sa_attr; length = SA_REGISTERED_LEN(sa, attrs[i]); if (length == 0) @@ -717,6 +737,7 @@ sa_build_layouts(sa_handle_t *hdl, sa_bu VERIFY(length == attr_desc[i].sa_length); if (buf_space < length) { /* switch to spill buffer */ + VERIFY(spilling); VERIFY(bonustype == DMU_OT_SA); if (buftype == SA_BONUS && !sa->sa_force_spill) { sa_find_layout(hdl->sa_os, hash, attrs_start, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed 
Feb 27 19:22:27 2013 (r247407) @@ -1764,7 +1764,7 @@ spa_load_verify_done(zio_t *zio) /*ARGSUSED*/ static int spa_load_verify_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - arc_buf_t *pbuf, const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) + const zbookmark_t *zb, const dnode_phys_t *dnp, void *arg) { if (bp != NULL) { zio_t *rio = arg; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Feb 27 19:22:27 2013 (r247407) @@ -49,7 +49,6 @@ struct arc_buf { arc_buf_hdr_t *b_hdr; arc_buf_t *b_next; kmutex_t b_evict_lock; - krwlock_t b_data_lock; void *b_data; arc_evict_func_t *b_efunc; void *b_private; @@ -93,8 +92,6 @@ void arc_buf_add_ref(arc_buf_t *buf, voi int arc_buf_remove_ref(arc_buf_t *buf, void *tag); int arc_buf_size(arc_buf_t *buf); void arc_release(arc_buf_t *buf, void *tag); -int arc_release_bp(arc_buf_t *buf, void *tag, blkptr_t *bp, spa_t *spa, - zbookmark_t *zb); int arc_released(arc_buf_t *buf); int arc_has_callback(arc_buf_t *buf); void arc_buf_freeze(arc_buf_t *buf); @@ -103,10 +100,7 @@ void arc_buf_thaw(arc_buf_t *buf); int arc_referenced(arc_buf_t *buf); #endif -int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int arc_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bp, +int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h 
============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_traverse.h Wed Feb 27 19:22:27 2013 (r247407) @@ -40,8 +40,7 @@ struct zilog; struct arc_buf; typedef int (blkptr_cb_t)(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, - struct arc_buf *pbuf, const zbookmark_t *zb, const struct dnode_phys *dnp, - void *arg); + const zbookmark_t *zb, const struct dnode_phys *dnp, void *arg); #define TRAVERSE_PRE (1<<0) #define TRAVERSE_POST (1<<1) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Feb 27 19:22:27 2013 (r247407) @@ -134,12 +134,6 @@ void dsl_pool_willuse_space(dsl_pool_t * void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp); -int dsl_read(zio_t *pio, spa_t *spa, const blkptr_t *bpp, arc_buf_t *pbuf, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); -int dsl_read_nolock(zio_t *pio, spa_t *spa, const blkptr_t *bpp, - arc_done_func_t *done, void *priv, int priority, int zio_flags, - uint32_t *arc_flags, const zbookmark_t *zb); void dsl_pool_create_origin(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_clones(dsl_pool_t *dp, dmu_tx_t *tx); void dsl_pool_upgrade_dir_clones(dsl_pool_t *dp, dmu_tx_t *tx); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h ============================================================================== --- 
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h Wed Feb 27 19:22:27 2013 (r247407) @@ -20,6 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012 by Delphix. All rights reserved. */ #ifndef _SYS_REFCOUNT_H @@ -54,8 +55,8 @@ typedef struct refcount { kmutex_t rc_mtx; list_t rc_list; list_t rc_removed; - int64_t rc_count; - int64_t rc_removed_count; + uint64_t rc_count; + uint64_t rc_removed_count; } refcount_t; /* Note: refcount_t must be initialized with refcount_create() */ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Feb 27 19:22:27 2013 (r247407) @@ -1328,7 +1328,8 @@ vdev_validate(vdev_t *vd, boolean_t stri if (vd->vdev_ops->vdev_op_leaf && vdev_readable(vd)) { uint64_t aux_guid = 0; nvlist_t *nvl; - uint64_t txg = strict ? spa->spa_config_txg : -1ULL; + uint64_t txg = spa_last_synced_txg(spa) != 0 ? 
+ spa_last_synced_txg(spa) : -1ULL; if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1512,7 +1513,7 @@ vdev_reopen(vdev_t *vd) !l2arc_vdev_present(vd)) l2arc_add_vdev(spa, vd); } else { - (void) vdev_validate(vd, spa_last_synced_txg(spa)); + (void) vdev_validate(vd, B_TRUE); } /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:20:50 2013 (r247406) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Feb 27 19:22:27 2013 (r247407) @@ -106,12 +106,18 @@ typedef enum { DATASET_NAME } zfs_ioc_namecheck_t; +typedef enum { + POOL_CHECK_NONE = 1 << 0, + POOL_CHECK_SUSPENDED = 1 << 1, + POOL_CHECK_READONLY = 1 << 2 +} zfs_ioc_poolcheck_t; + typedef struct zfs_ioc_vec { zfs_ioc_func_t *zvec_func; zfs_secpolicy_func_t *zvec_secpolicy; zfs_ioc_namecheck_t zvec_namecheck; boolean_t zvec_his_log; - boolean_t zvec_pool_check; + zfs_ioc_poolcheck_t zvec_pool_check; } zfs_ioc_vec_t; /* This array is indexed by zfs_userquota_prop_t */ @@ -5033,138 +5039,155 @@ zfs_ioc_unjail(zfs_cmd_t *zc) static zfs_ioc_vec_t zfs_ioc_vec[] = { { zfs_ioc_pool_create, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_destroy, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_import, zfs_secpolicy_config, POOL_NAME, B_TRUE, - B_FALSE }, + POOL_CHECK_NONE }, { zfs_ioc_pool_export, zfs_secpolicy_config, POOL_NAME, B_FALSE, - B_FALSE }, *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
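For readers following the original report: the commit replaces the boolean zvec_pool_check with the zfs_ioc_poolcheck_t bitmask, which is presumably what lets an ioctl such as 'zfs set mountpoint' be refused on a readonly pool instead of sleeping in txg_wait_synced. The code that consumes the mask is cut off by the 1000-line truncation, so the following is only a sketch of how such a check might look. The enum is copied from the diff; struct sketch_pool, sketch_pool_check() and the choice of EAGAIN/EROFS are assumptions for illustration, not the committed code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Copied from the zfs_ioctl.c hunk in this commit. */
typedef enum {
	POOL_CHECK_NONE		= 1 << 0,
	POOL_CHECK_SUSPENDED	= 1 << 1,
	POOL_CHECK_READONLY	= 1 << 2
} zfs_ioc_poolcheck_t;

/* Hypothetical stand-in for the pool state the kernel would consult. */
struct sketch_pool {
	bool suspended;		/* I/O to the pool is suspended */
	bool writeable;		/* false when imported with readonly=on */
};

/*
 * Hypothetical helper (not from the commit): decide whether an ioctl
 * whose vector entry carries `check` may proceed on `pool`.
 * Returns 0 to proceed or an errno value to fail the ioctl early.
 * The specific errno values are an assumption.
 */
static int
sketch_pool_check(const struct sketch_pool *pool, zfs_ioc_poolcheck_t check)
{
	assert(pool != NULL);

	/* Entries such as pool create/import/export skip all checks. */
	if (check & POOL_CHECK_NONE)
		return (0);
	if ((check & POOL_CHECK_SUSPENDED) && pool->suspended)
		return (EAGAIN);
	/* Refuse up front rather than wait for a txg that never syncs. */
	if ((check & POOL_CHECK_READONLY) && !pool->writeable)
		return (EROFS);
	return (0);
}
```

Under these assumptions, a set-property entry tagged POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY would fail immediately with EROFS on a pool imported with readonly=on, while an entry tagged POOL_CHECK_NONE would still go through.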
For bugs matching the following criteria: Status: In Progress; Changed: (is less than) 2014-06-01. Reset to default assignee and clear in-progress tags. Mail being skipped.
There was a commit referencing this bug, but it's still not closed and has been inactive for some time. Closing as fixed. Please re-open it if the issue hasn't been completely resolved.