Bug 229694 - [zfs] unkillable "zpool scrub" in [tx->tx_sync_done_cv] state for damaged data
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.2-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs mailing list
Reported: 2018-07-11 11:10 UTC by Eugene Grosbein
Modified: 2019-02-13 10:10 UTC

procstat -kk -a output (4.96 KB, application/x-xz)
2018-07-11 13:58 UTC, Eugene Grosbein

Description Eugene Grosbein freebsd_committer 2018-07-11 11:10:29 UTC

"zpool scrub" may hang in an uninterruptible disk I/O state when pool data is damaged, on 11.2-STABLE/amd64 r335757. This is easily reproducible using a file-backed ZFS pool whose backing files reside on another ("real") pool:

cd dir # resides on ZFS
rm -f vdev1 vdev2
size=128 # MiB per vdev; example value, the report does not give one
truncate -s ${size}m vdev1 vdev2
zpool create ztest $(realpath vdev1)
zpool add ztest $(realpath vdev2)
# simulate data corruption
dd if=/dev/urandom of=vdev2 bs=1m count=${size}
zpool scrub ztest

The last command "zpool scrub" always hangs here:

load: 0.53  cmd: zpool 2130 [tx->tx_sync_done_cv] 34.59r 0.00u 0.00s 0% 3692k

"kill -9" cannot kill it.
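
One way to confirm that a process is stuck like this is to look at its state flag and wait channel in ps. The following is a minimal sketch, assuming input in the format produced by `ps -ax -o pid,state,wchan,comm` (header line first); the function name `uninterruptible_procs` is made up for illustration and is not part of any tool.

```sh
# Read `ps -ax -o pid,state,wchan,comm` output on stdin and print the PID,
# wait channel and command of every process whose state starts with "D"
# (disk or other short-term uninterruptible wait).
uninterruptible_procs() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3, $4 }'
}
```

Possible usage: `ps -ax -o pid,state,wchan,comm | uninterruptible_procs`. The wait channel printed by ps may be truncated, so it will not necessarily show the full `tx->tx_sync_done_cv` name.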
Comment 1 Andriy Gapon freebsd_committer 2018-07-11 12:08:48 UTC
I am not too surprised.  The pool configuration is not redundant and the whole top-level vdev is corrupted.  I suspect that the scrub command needs to write something to the pool to record the initial scrub state, and quite likely that write requires a read-modify-write.  The read fails and the pool gets suspended.  The zpool scrub command is then stuck waiting for confirmation that the scrub has actually started.

procstat -kk -a would paint a fuller picture.
Maybe there is something reported in dmesg too, but not sure.
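
The wait suspected in comment 1 can be pictured with a toy model: one process waits for a "sync done" marker that another process never produces because it is itself blocked. This is only a sketch in plain shell, not ZFS code; `model_scrub`, the `ok`/`suspended` labels and the deadline are all invented for the illustration (the real command has no deadline, which is why it hangs).

```sh
# Toy model only: ordinary shell processes stand in for kernel threads.
# Returns 0 if the simulated request completed, 1 if it was still waiting
# when the demo deadline expired (the real zpool command has no deadline).
model_scrub() {
    pool_state=$1   # "ok" or "suspended" (labels invented for this sketch)
    deadline=$2     # seconds to wait before giving up; demo only
    dir=$(mktemp -d "${TMPDIR:-/tmp}/scrubmodel.XXXXXX") || return 2

    (
        # Stand-in for the sync thread: while the pool is suspended it
        # never gets as far as announcing that the sync is done.
        [ "$pool_state" = suspended ] && exec sleep 3600
        : > "$dir/synced"
    ) &
    syncer=$!

    # Stand-in for the zpool command waiting for "sync done".
    waited=0
    while [ ! -e "$dir/synced" ] && [ "$waited" -lt "$deadline" ]; do
        sleep 1
        waited=$((waited + 1))
    done

    if [ -e "$dir/synced" ]; then rc=0; else rc=1; fi
    kill "$syncer" 2>/dev/null || true
    wait "$syncer" 2>/dev/null || true
    rm -rf "$dir"
    return "$rc"
}
```

With these definitions, `model_scrub ok 3` returns 0, while `model_scrub suspended 3` gives up and returns 1 once the deadline passes.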
Comment 2 Eugene Grosbein freebsd_committer 2018-07-11 13:58:22 UTC
(In reply to Andriy Gapon from comment #1)

Nothing in the dmesg output. The procstat output is huge, so I compressed it; see the attachment.
Comment 3 Eugene Grosbein freebsd_committer 2018-07-11 13:58:55 UTC
Created attachment 195052
procstat -kk -a output
Comment 4 Rodney W. Grimes freebsd_committer 2019-02-13 02:00:51 UTC
Please do not put bugs on stable@, current@, hackers@, etc.
Comment 5 Andriy Gapon freebsd_committer 2019-02-13 10:10:45 UTC
(In reply to Eugene Grosbein from comment #3)
    5 101937 zfskern             txg_thread_enter    mi_switch+0xc5 sleepq_wait+0x2c _cv_wait+0x160 zio_resume_wait+0x4b spa_sync+0xd46 txg_sync_thread+0x25e fork_exit+0x75 fork_trampoline+0xe 

 3249 101681 zpool               -                   mi_switch+0xc5 sleepq_wait+0x2c _cv_wait+0x160 txg_wait_synced+0xa5 dsl_sync_task_common+0x219 dsl_sync_task+0x14 dsl_scan+0x9e zfs_ioc_pool_scan+0x5a zfsdev_ioctl+0x6c2 devfs_ioctl_f+0x12d kern_ioctl+0x212 sys_ioctl+0x15c amd64_syscall+0x25c fast_syscall_common+0x101

So, unfortunately, this is how ZFS works now.
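
In the two stacks quoted above, the sync thread is parked in zio_resume_wait and the zpool process is parked in txg_wait_synced. To pull such threads out of a large procstat dump, something like the following sketch may help. It assumes `procstat -kk` style input on stdin (PID, TID, command, thread name, stack); the function name `zfs_waiters` and the labels it prints are invented here.

```sh
# Read `procstat -kk` output on stdin and print PID, TID and command for
# every thread whose kernel stack contains one of the two wait functions
# seen in this report, followed by a short label (labels are made up).
zfs_waiters() {
    awk '
        /zio_resume_wait/ { print $1, $2, $3, "sync-thread-waiting-for-pool-resume" }
        /txg_wait_synced/ { print $1, $2, $3, "waiting-for-txg-sync" }
    '
}
```

Possible usage: `procstat -kk -a | zfs_waiters`, or `xz -dc` on the attached dump piped into `zfs_waiters`.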