| Summary: | [zfs] unkillable "zpool scrub" in [tx->tx_sync_done_cv] state for damaged data | | |
| --- | --- | --- | --- |
| Product: | Base System | Reporter: | Eugene Grosbein <eugen> |
| Component: | kern | Assignee: | freebsd-fs (Nobody) <fs> |
| Status: | New --- | | |
| Severity: | Affects Some People | CC: | pi, stable |
| Priority: | --- | | |
| Version: | 11.2-STABLE | | |
| Hardware: | Any | | |
| OS: | Any | | |
| Attachments: | procstat -kk -a output (attachment 195052) | | |
Description

Eugene Grosbein, 2018-07-11 11:10:29 UTC

Andriy Gapon (comment #1):

I am not too surprised. The pool configuration is not redundant and the whole top-level vdev is corrupted. I suspect that the scrub command needs to write something to the pool to record the initial scrub state, and it is quite likely that it needs to perform a read-modify-write. The read fails and the pool gets suspended, so the zpool scrub command is stuck waiting for confirmation that the scrub has actually started. procstat -kk -a would paint a fuller picture. Maybe there is something reported in dmesg too, but I am not sure.

Eugene Grosbein (in reply to Andriy Gapon from comment #1):

Nothing in the dmesg output. The procstat output is huge, so I compressed it; see the attachment.

Created attachment 195052 [details]: procstat -kk -a output
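For reference, a minimal sketch of how the diagnostics requested above could be gathered on FreeBSD; the pool name `tank` and the output path are placeholders, not quoted from this PR.

```sh
# Kernel stacks of every thread; this is what was attached as attachment 195052.
procstat -kk -a > /var/tmp/procstat-kk.txt

# Pool health, suspension state, and any recorded data errors.
zpool status -v tank

# Recent kernel messages, in case the failing reads were logged.
dmesg | tail -n 50
```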
Please do not put bugs on stable@, current@, hackers@, etc.

(In reply to Eugene Grosbein from comment #3) The relevant thread stacks from that output are:

    5    101937 zfskern  txg_thread_enter
         mi_switch+0xc5 sleepq_wait+0x2c _cv_wait+0x160 zio_resume_wait+0x4b
         spa_sync+0xd46 txg_sync_thread+0x25e fork_exit+0x75 fork_trampoline+0xe

    3249 101681 zpool    -
         mi_switch+0xc5 sleepq_wait+0x2c _cv_wait+0x160 txg_wait_synced+0xa5
         dsl_sync_task_common+0x219 dsl_sync_task+0x14 dsl_scan+0x9e
         zfs_ioc_pool_scan+0x5a zfsdev_ioctl+0x6c2 devfs_ioctl_f+0x12d
         kern_ioctl+0x212 sys_ioctl+0x15c amd64_syscall+0x25c fast_syscall_common+0x101

The txg sync thread is blocked in zio_resume_wait() because the pool is suspended, and the zpool process is blocked in txg_wait_synced() waiting for that same sync to complete, which matches the unkillable state in the summary.

So, unfortunately, this is how ZFS works now. It is reproducible in exactly the same way under 13.2-PRERELEASE/amd64 with stock ZFS.
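A hedged reproduction sketch, not taken from this PR: the scenario described above (a non-redundant pool whose only top-level vdev is damaged, then scrubbed) can be approximated with a file-backed vdev. All names and sizes are placeholders, and this may or may not hit the exact tx->tx_sync_done_cv hang.

```sh
# Build a single-vdev, non-redundant test pool backed by a plain file.
truncate -s 256m /var/tmp/vdev0
zpool create testpool /var/tmp/vdev0
cp /COPYRIGHT /testpool/
zpool export testpool

# Damage nearly all of the vdev while leaving the front and back ZFS labels
# intact, so the pool can still be imported while much of it is unreadable.
dd if=/dev/random of=/var/tmp/vdev0 bs=1m seek=1 count=250 conv=notrunc

zpool import -d /var/tmp testpool
zpool scrub testpool    # in the failure mode reported here, this never returns
```

On a suspended pool, zpool clear is the documented way to resume I/O once the failed reads can be satisfied again; in the non-redundant case described here there is no second copy to read from, so the scrub and the txg sync stay blocked.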