FreeBSD 12.1-RELEASE-p2

One of our servers seems to have developed some kind of problem in one of the ZFS pools. It claims to be resilvering data, but I can't see any progress at all in the output from "zpool status". "zpool iostat" and "iostat -x" do indicate a lot of reads being done, though.

# zpool status -v DATA2
  pool: DATA2
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Mar  7 13:12:06 2020
        10.4M scanned at 0/s, 56.1M issued at 0/s, 390T total
        0 resilvered, 0.00% done, no estimated completion time
config:

        NAME                      STATE     READ WRITE CKSUM
        DATA2                     ONLINE       0     0     0
          raidz2-0                ONLINE       0     0     0
            diskid/DISK-7PH8LBKG  ONLINE       0     0     0
            diskid/DISK-7PH8N1EG  ONLINE       0     0     0
            diskid/DISK-7PH3EZDG  ONLINE       0     0     0
            diskid/DISK-7PGN0VJG  ONLINE       0     0     0
            diskid/DISK-7PH8LVMG  ONLINE       0     0     0
            diskid/DISK-7PH8MZVG  ONLINE       0     0     0
            diskid/DISK-7PH5RASG  ONLINE       0     0     0
            diskid/DISK-7PH3EVJG  ONLINE       0     0     0
          raidz2-1                ONLINE       0     0     0
            diskid/DISK-7PH3EXRG  ONLINE       0     0     0
            diskid/DISK-7PH8LV5G  ONLINE       0     0     0
            diskid/DISK-7PH3EX2G  ONLINE       0     0     0
            diskid/DISK-7PH77Z6G  ONLINE       0     0     0
            diskid/DISK-7PH8LWXG  ONLINE       0     0     0
            diskid/DISK-7PH81L1G  ONLINE       0     0     0
            diskid/DISK-7PH8N0BG  ONLINE       0     0     0
            diskid/DISK-7PH8MNMG  ONLINE       0     0     0
          raidz2-2                ONLINE       0     0     0
            diskid/DISK-7PH3EWHG  ONLINE       0     0     0
            diskid/DISK-7PH8LTWG  ONLINE       0     0     0
            diskid/DISK-7PH5R02G  ONLINE       0     0     0
            diskid/DISK-7PH8L88G  ONLINE       0     0     0
            diskid/DISK-7PH81MUG  ONLINE       0     0     0
            diskid/DISK-7PH8LXMG  ONLINE       0     0     0
            diskid/DISK-7PH8N26G  ONLINE       0     0     0
            diskid/DISK-7PH93ZWG  ONLINE       0     0     0
          raidz2-3                ONLINE       0     0     0
            diskid/DISK-7PH8LEVG  ONLINE       0     0     0
            diskid/DISK-7PH8MYKG  ONLINE       0     0     0
            diskid/DISK-7PH8MXRG  ONLINE       0     0     0
            diskid/DISK-7PH4TB1G  ONLINE       0     0     0
            diskid/DISK-7PH8G8XG  ONLINE       0     0     0
            diskid/DISK-7PH9SP9G  ONLINE       0     0     0
            diskid/DISK-7PH8LARG  ONLINE       0     0     0
            diskid/DISK-7PH8KRYG  ONLINE       0     0     0
          raidz2-4                ONLINE       0     0     0
            diskid/DISK-7PH8LWMG  ONLINE       0     0     0
            diskid/DISK-7PH8LVXG  ONLINE       0     0     0
            diskid/DISK-7PH8LU1G  ONLINE       0     0     0
            diskid/DISK-7PH9451G  ONLINE       0     0     0
            diskid/DISK-7PH8ESMG  ONLINE       0     0     0
            diskid/DISK-7PH3EWVG  ONLINE       0     0     0
            diskid/DISK-7PH8EMKG  ONLINE       0     0     0
            diskid/DISK-7PH8LG9G  ONLINE       0     0     0
          raidz2-5                ONLINE       0     0     0
            diskid/DISK-7PH93X6G  ONLINE       0     0     0
            diskid/DISK-7PH8MXJG  ONLINE       0     0     0
            diskid/DISK-7PH3EY8G  ONLINE       0     0     0
            diskid/DISK-7PHAHG3G  ONLINE       0     0     0
            diskid/DISK-7PH8LVYG  ONLINE       0     0     0
            diskid/DISK-7PH8L8KG  ONLINE       0     0     0
            diskid/DISK-7PH8LVNG  ONLINE       0     0     0
            diskid/DISK-7PH8L9SG  ONLINE       0     0     0
        spares
          diskid/DISK-7PH8G10G    AVAIL
          diskid/DISK-7PH8GJ2G    AVAIL
          diskid/DISK-7PH8MZ8G    AVAIL
          diskid/DISK-7PGRM7TG    AVAIL
          diskid/DISK-7PH7UUAG    AVAIL
          diskid/DISK-7PH8GJTG    AVAIL

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0xc49c12>
        <metadata>:<0x19>
        <metadata>:<0x1d>
        <metadata>:<0x20>
        <metadata>:<0x65714b>
        <metadata>:<0x657157>
        <metadata>:<0x65717b>
        <metadata>:<0x657193>
        <metadata>:<0xc530fe>
        DATA2/filur06.it.liu.se/DATA/old/isy/nobackup-server@auto-2020-01-21.23:00:00:/cvl/GARNICS_DATA/20120123sorted/plant__69683/pos1/cam9181310_TS1327623708156387.tif

# procstat -kka|egrep zfs
    0 100212 kernel   zfsvfs              mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 taskqueue_thread_loop+0xf1 fork_exit+0x83 fork_trampoline+0xe
    0 100519 kernel   zfs_vn_rele_taskq   mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 taskqueue_thread_loop+0xf1 fork_exit+0x83 fork_trampoline+0xe
    0 100891 kernel   zfs_vn_rele_taskq   mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 taskqueue_thread_loop+0xf1 fork_exit+0x83 fork_trampoline+0xe
    0 102051 kernel   zfs_vn_rele_taskq   mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 taskqueue_thread_loop+0xf1 fork_exit+0x83 fork_trampoline+0xe
   32 100190 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a zthr_procedure+0xff fork_exit+0x83 fork_trampoline+0xe
   32 100191 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a zthr_procedure+0xff fork_exit+0x83 fork_trampoline+0xe
   32 100192 zfskern  arc_dnlc_evicts_thr mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 arc_dnlc_evicts_thread+0x14f fork_exit+0x83 fork_trampoline+0xe
   32 100194 zfskern  dbuf_evict_thread   mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a dbuf_evict_thread+0x1c8 fork_exit+0x83 fork_trampoline+0xe
   32 100211 zfskern  l2arc_feed_thread   mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a l2arc_feed_thread+0x239 fork_exit+0x83 fork_trampoline+0xe
   32 100482 zfskern  trim zroot          mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a trim_thread+0x120 fork_exit+0x83 fork_trampoline+0xe
   32 100552 zfskern  txg_thread_enter    mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 txg_quiesce_thread+0xbb fork_exit+0x83 fork_trampoline+0xe
   32 100553 zfskern  txg_thread_enter    mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a txg_sync_thread+0x47f fork_exit+0x83 fork_trampoline+0xe
   32 100554 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0x117 fork_exit+0x83 fork_trampoline+0xe
   32 100555 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0x117 fork_exit+0x83 fork_trampoline+0xe
   32 100808 zfskern  trim DATA2          mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a trim_thread+0x120 fork_exit+0x83 fork_trampoline+0xe
   32 101517 zfskern  txg_thread_enter    mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 txg_quiesce_thread+0xbb fork_exit+0x83 fork_trampoline+0xe
   32 101518 zfskern  txg_thread_enter    mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zio_wait+0x9b dbuf_read+0x669 dnode_hold_impl+0x1af dmu_bonus_hold+0x1d dsl_deadlist_open+0x49 dsl_dataset_hold_obj+0x3d5 dsl_scan_sync+0xf65 spa_sync+0xb57 txg_sync_thread+0x238 fork_exit+0x83 fork_trampoline+0xe
   32 101519 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0x117 fork_exit+0x83 fork_trampoline+0xe
   32 101520 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0x117 fork_exit+0x83 fork_trampoline+0xe
   32 102005 zfskern  trim DATA3          mi_switch+0xe2 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a trim_thread+0x120 fork_exit+0x83 fork_trampoline+0xe
   32 102261 zfskern  txg_thread_enter    mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 txg_quiesce_thread+0xbb fork_exit+0x83 fork_trampoline+0xe
   32 102262 zfskern  txg_thread_enter    mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 taskqueue_quiesce+0x114 dsl_scan_sync+0xd2a spa_sync+0xb57 txg_sync_thread+0x238 fork_exit+0x83 fork_trampoline+0xe
   32 102263 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0x117 fork_exit+0x83 fork_trampoline+0xe
   32 102264 zfskern  solthread 0xfffffff mi_switch+0xe2 sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0x117 fork_exit+0x83 fork_trampoline+0xe
   32 102265 zfskern  zvol DATA3/sekur-is mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 zvol_geom_worker+0x16d fork_exit+0x83 fork_trampoline+0xe
   32 102266 zfskern  zvol DATA3/sekur-is mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 zvol_geom_worker+0x16d fork_exit+0x83 fork_trampoline+0xe
   32 102267 zfskern  zvol DATA3/sekur-is mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 zvol_geom_worker+0x16d fork_exit+0x83 fork_trampoline+0xe
...
   32 102415 zfskern  zvol DATA3/sekur-is mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 zvol_geom_worker+0x16d fork_exit+0x83 fork_trampoline+0xe
   32 102416 zfskern  zvol DATA3/sekur-is mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 zvol_geom_worker+0x16d fork_exit+0x83 fork_trampoline+0xe
 2640 100568 zfsd     -                   mi_switch+0xe2 sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _cv_wait_sig+0x154 seltdwait+0xbf kern_poll+0x43d sys_poll+0x50 amd64_syscall+0x364 fast_syscall_common+0x101
20980 100567 zfs.orig -                   zap_cursor_retrieve+0x70 zap_value_search+0x8f dsl_dataset_get_snapname+0x90 dsl_dataset_name+0x34 dsl_dataset_stats+0xf4 dmu_objset_stats+0x1d zfs_ioc_objset_stats_impl+0x50 zfs_ioc_dataset_list_next+0x143 zfsdev_ioctl+0x72e devfs_ioctl+0xad VOP_IOCTL_APV+0x7c vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x2be sys_ioctl+0x15d amd64_syscall+0x364 fast_syscall_common+0x101

# zpool iostat -v DATA2 10
                             capacity     operations    bandwidth
pool                      alloc   free   read  write   read  write
------------------------  -----  -----  -----  -----  -----  -----
DATA2                      390T  44.8T    871      0  3.68M  5.56K
  raidz2                  66.4T  6.10T    167      0   726K    883
    diskid/DISK-7PH8LBKG      -      -     22      0  92.6K    287
    diskid/DISK-7PH8N1EG      -      -     22      0  92.4K    296
    diskid/DISK-7PH3EZDG      -      -     22      0  93.6K    284
    diskid/DISK-7PGN0VJG      -      -     22      0  92.6K    248
    diskid/DISK-7PH8LVMG      -      -     22      0  92.9K    251
    diskid/DISK-7PH8MZVG      -      -     22      0  93.2K    269
    diskid/DISK-7PH5RASG      -      -     22      0  92.5K    284
    diskid/DISK-7PH3EVJG      -      -     22      0  92.3K    287
  raidz2                  66.3T  6.16T    163      0   714K    883
    diskid/DISK-7PH3EXRG      -      -     21      0  91.7K    248
    diskid/DISK-7PH8LV5G      -      -     21      0  91.9K    251
    diskid/DISK-7PH3EX2G      -      -     21      0  91.0K    275
    diskid/DISK-7PH77Z6G      -      -     21      0  92.1K    287
    diskid/DISK-7PH8LWXG      -      -     21      0  90.8K    320
    diskid/DISK-7PH81L1G      -      -     21      0  91.0K    284
    diskid/DISK-7PH8N0BG      -      -     21      0  91.7K    263
    diskid/DISK-7PH8MNMG      -      -     21      0  91.5K    266
  raidz2                  66.3T  6.22T    166      0   719K   1013
    diskid/DISK-7PH3EWHG      -      -     22      0  92.3K    246
    diskid/DISK-7PH8LTWG      -      -     21      0  91.0K    231
    diskid/DISK-7PH5R02G      -      -     21      0  91.8K    231
    diskid/DISK-7PH8L88G      -      -     21      0  93.1K    219
    diskid/DISK-7PH81MUG      -      -     21      0  92.1K    225
    diskid/DISK-7PH8LXMG      -      -     21      0  92.1K    237
    diskid/DISK-7PH8N26G      -      -     21      0  92.5K    237
    diskid/DISK-7PH93ZWG      -      -     21      0  91.0K    240
  raidz2                  66.4T  6.14T    132      0   581K   1013
    diskid/DISK-7PH8LEVG      -      -     17      0  73.0K    237
    diskid/DISK-7PH8MYKG      -      -     17      0  73.3K    219
    diskid/DISK-7PH8MXRG      -      -     17      0  73.4K    192
    diskid/DISK-7PH4TB1G      -      -     17      0  74.5K    222
    diskid/DISK-7PH8G8XG      -      -     17      0  73.9K    231
    diskid/DISK-7PH9SP9G      -      -     17      0  73.4K    219
    diskid/DISK-7PH8LARG      -      -     17      0  73.8K    201
    diskid/DISK-7PH8KRYG      -      -     17      0  73.6K    231
  raidz2                  62.4T  10.1T     69      0   298K   1013
    diskid/DISK-7PH8LWMG      -      -      9      0  38.4K    210
    diskid/DISK-7PH8LVXG      -      -      9      0  38.3K    225
    diskid/DISK-7PH8LU1G      -      -      9      0  37.7K    213
    diskid/DISK-7PH9451G      -      -      9      0  38.2K    210
    diskid/DISK-7PH8ESMG      -      -      9      0  38.2K    237
    diskid/DISK-7PH3EWVG      -      -      9      0  37.8K    240
    diskid/DISK-7PH8EMKG      -      -      9      0  38.1K    228
    diskid/DISK-7PH8LG9G      -      -      9      0  38.2K    213
  raidz2                  62.7T  10.1T    172      0   733K    883
    diskid/DISK-7PH93X6G      -      -     22      0  94.1K    246
    diskid/DISK-7PH8MXJG      -      -     22      0  93.7K    266
    diskid/DISK-7PH3EY8G      -      -     22      0  93.2K    257
    diskid/DISK-7PHAHG3G      -      -     22      0  93.9K    281
    diskid/DISK-7PH8LVYG      -      -     22      0  93.9K    281
    diskid/DISK-7PH8L8KG      -      -     22      0  94.4K    237
    diskid/DISK-7PH8LVNG      -      -     22      0  93.1K    243
    diskid/DISK-7PH8L9SG      -      -     22      0  93.4K    266
------------------------  -----  -----  -----  -----  -----  -----
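(A quick way to tell a stalled scan from a merely slow one is to sample the "scan:" block repeatedly and check whether the scanned/issued byte counts move at all between samples. A minimal sketch; the `scan_progress` helper name is my own:)

```shell
# Print the pool's "scan:" lines (the ones carrying the scanned/issued
# byte counts). Run it a few minutes apart: if the numbers never change,
# the scan is stalled rather than just slow.
scan_progress() {
    pool="$1"
    zpool status "$pool" | grep -A 2 'scan:'
}
```
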
Oh, after many hours it seems to have finished:

# zpool status DATA2
  pool: DATA2
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not
        support the features. See zpool-features(7) for details.
  scan: resilvered 152K in 0 days 06:07:55 with 0 errors on Sat Mar  7 19:20:01 2020
config:
        ..
          diskid/DISK-7PH8GJTG    AVAIL

errors: No known data errors

(The reason it didn't have all features enabled is that I upgraded this server from 11.3-p6 to 12.1-p2 as one way of trying to fix the problem.)

# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
DATA2   435T   390T  44.8T        -         -    23%    89%  1.00x  ONLINE  -
DATA3   145T  75.5T  69.5T        -         -    11%    52%  1.00x  ONLINE  -
zroot   113G  35.5G  77.5G        -         -    56%    31%  1.00x  ONLINE  -

...and now zfs commands don't hang any more.

Now, a wild guess is that it was busy destroying deferred-destroyed snapshots, since before I restarted this server we had many thousands of those "deleted" with "zfs destroy -d FS@SNAPSHOT".

Hmm... I wonder if it would be possible to make this more visible somehow - perhaps a "destroyed-snapshots" counter via a sysctl or something? And/or have some way to pause the snapshot-deletion thread (similar to "zpool scrub -p").
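(For what it's worth, some of this is visible today via the pool-level "freeing" property, which reports how many bytes remain to be reclaimed from destroyed datasets and snapshots. A rough sketch of a watcher built on that; `watch_freeing` is a hypothetical helper name, not an existing tool:)

```shell
# Poll the pool's "freeing" property every 10 s until it reaches zero.
# "zpool get -Hp -o value freeing <pool>" prints the raw byte count
# (-H suppresses headers, -p gives parsable/unhumanized numbers).
# If the value is nonzero and shrinking, the pool is still working
# through deferred destroys.
watch_freeing() {
    pool="$1"
    while :; do
        left="$(zpool get -Hp -o value freeing "$pool")"
        printf '%s: %s bytes left to free\n' "$pool" "$left"
        [ "$left" = "0" ] && break
        sleep 10
    done
}
```
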
This isn't FreeBSD-specific. I think you should open a feature request upstream at https://github.com/openzfs/zfs and close this issue.
(In reply to Alan Somers from comment #2)

Agreed, especially given the transition to OpenZFS: an upstream report may be appropriate. Thank you.