Bug 253063 - Hanging zfs processes after upgrade from 12.1 to 12.2-stable
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2021-01-28 13:35 UTC by Markus Wild
Modified: 2021-01-29 07:29 UTC

See Also:


Description Markus Wild 2021-01-28 13:35:03 UTC
After a recent upgrade from 12.1 to 12.2-stable (stable/12-c1-ge82353f84),
several zfs processes hang in different wait channels. 

This is a system with just short of 1000 zfs filesystems, making frequent
snapshot-based send/receive backups from a primary data pool to a backup pool
using znapzend.

ps axl | awk '/zfs / { print $9}' | sort | uniq -c
  20 rrl->rr_
  28 tq_qdrai
   4 tx->tx_s

One sample process for each wait channel:

   0 95220 94050 23  28  0   13000    3372 rrl->rr_ D     -      0:00.74 zfs recv -F backup1/servi
   0 87914 85482 22  25  0   13000    3308 tq_qdrai D     -      0:00.55 zfs recv -F backup1/servi
   0 77834 77117  3  27  0   13000    2716 tx->tx_s D     -      0:00.74 zfs recv -F backup1/servi
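
For anyone who wants to repeat the tally on saved ps output, here is a minimal Python sketch. It assumes the FreeBSD `ps axl` column layout (UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND), so the wait channel is the ninth field. The names `tally_wait_channels` and `SAMPLE_PS` are illustrative only. Unlike the awk pattern `/zfs /`, which matches anywhere in the line, this sketch matches on the command name itself.

```python
from collections import Counter

# Sample rows copied from the report. Assumed columns (FreeBSD `ps axl`):
# UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
SAMPLE_PS = """\
   0 95220 94050 23  28  0   13000    3372 rrl->rr_ D     -      0:00.74 zfs recv -F backup1/servi
   0 87914 85482 22  25  0   13000    3308 tq_qdrai D     -      0:00.55 zfs recv -F backup1/servi
   0 77834 77117  3  27  0   13000    2716 tx->tx_s D     -      0:00.74 zfs recv -F backup1/servi
"""


def tally_wait_channels(ps_text, command="zfs"):
    """Count processes per wait channel (MWCHAN) for one command name."""
    counts = Counter()
    for line in ps_text.splitlines():
        # Split into at most 13 fields so COMMAND keeps its arguments.
        fields = line.split(None, 12)
        if len(fields) < 13:
            continue  # blank or truncated line
        argv = fields[12].split()
        if argv and argv[0] == command:
            counts[fields[8]] += 1  # field 9 (index 8) is MWCHAN
    return counts


if __name__ == "__main__":
    for wchan, n in tally_wait_channels(SAMPLE_PS).most_common():
        print(f"{n:4d} {wchan}")
```

Run against the three sample rows, each wait channel is counted once; on the full ps output it should give the same kind of per-channel totals as the pipeline above.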

  PID    TID COMM                TDNAME              KSTACK                       
95220 104268 zfs                 -                   mi_switch+0xd4 sleepq_wait+0x2c _cv_wait+0xf2 rrw_enter_read_impl+0x8b zfs_register_callbacks+0x1c6 zfsvfs_setup+0x18 zfs_resume_fs+0xc0 zfs_ioc_recv+0xb53 zfsdev_ioctl+0x62d devfs_ioctl+0xb0 VOP_IOCTL_APV+0x7b vn_ioctl+0x16a devfs_ioctl_f+0x1e kern_ioctl+0x2b7 sys_ioctl+0xfa amd64_syscall+0x387 fast_syscall_common+0xf8 

87914 103712 zfs                 -                   mi_switch+0xd4 sleepq_wait+0x2c _sleep+0x253 taskqueue_drain_all+0xe1 zfsdev_ioctl+0x7e3 devfs_ioctl+0xb0 VOP_IOCTL_APV+0x7b vn_ioctl+0x16a devfs_ioctl_f+0x1e kern_ioctl+0x2b7 sys_ioctl+0xfa amd64_syscall+0x387 fast_syscall_common+0xf8 

77834 104829 zfs                 -                   mi_switch+0xd4 sleepq_wait+0x2c _cv_wait+0xf2 txg_wait_synced_impl+0xa9 txg_wait_synced+0xb dsl_sync_task_common+0x230 dsl_sync_task+0x1a dmu_recv_end+0x67 zfs_ioc_recv+0xb3d zfsdev_ioctl+0x62d devfs_ioctl+0xb0 VOP_IOCTL_APV+0x7b vn_ioctl+0x16a devfs_ioctl_f+0x1e kern_ioctl+0x2b7 sys_ioctl+0xfa amd64_syscall+0x387 fast_syscall_common+0xf8 
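
The truncated wait-channel names can be matched to the kernel stacks by picking the first frame above the generic sleep machinery. The sketch below is a hedged illustration, not a FreeBSD tool. It assumes the PID/TID/COMM/TDNAME/KSTACK layout shown above (which looks like `procstat -kk` output, although the report does not name the command) and a thread name without spaces. `SLEEP_FRAMES`, `SAMPLE_KSTACKS` and `blocking_frame` are hypothetical names, and the sample stacks are shortened copies of the ones above.

```python
# Frames that belong to the generic sleep path rather than to the ZFS caller.
# This set is an assumption based on the three stacks in the report.
SLEEP_FRAMES = {"mi_switch", "sleepq_wait", "_cv_wait", "_sleep"}

# Shortened copies of the stacks from the report (deeper frames omitted).
SAMPLE_KSTACKS = """\
95220 104268 zfs - mi_switch+0xd4 sleepq_wait+0x2c _cv_wait+0xf2 rrw_enter_read_impl+0x8b zfs_register_callbacks+0x1c6
87914 103712 zfs - mi_switch+0xd4 sleepq_wait+0x2c _sleep+0x253 taskqueue_drain_all+0xe1 zfsdev_ioctl+0x7e3
77834 104829 zfs - mi_switch+0xd4 sleepq_wait+0x2c _cv_wait+0xf2 txg_wait_synced_impl+0xa9 txg_wait_synced+0xb
"""


def blocking_frame(line):
    """Return (pid, function) for the first frame that is not sleep machinery.

    Returns None for header or blank lines. Assumes TDNAME is one token.
    """
    fields = line.split()
    if len(fields) < 5 or not fields[0].isdigit():
        return None
    pid = int(fields[0])
    for frame in fields[4:]:
        name = frame.split("+", 1)[0]  # strip the "+0x..." offset
        if name not in SLEEP_FRAMES:
            return pid, name
    return pid, None


if __name__ == "__main__":
    for row in SAMPLE_KSTACKS.splitlines():
        print(blocking_frame(row))
```

On the sample data this pairs PID 95220 with `rrw_enter_read_impl`, 87914 with `taskqueue_drain_all`, and 77834 with `txg_wait_synced_impl`, which lines up with the `rrl->rr_`, `tq_qdrai` and `tx->tx_s` wait channels respectively.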


This starts to happen after a couple of hours of uptime, not immediately. I
wanted to check my previous 12.1 version, but bectl hangs as well.

These processes are unkillable, and I'll be forced to hard-reboot the system,
because it won't shut down properly (at least not within a reasonable amount
of time).
Comment 1 Peter Eriksson 2021-01-28 18:59:34 UTC
(In reply to Markus Wild from comment #0)

How much free space is there in the filesystems - any filesystems near quota limits?
Comment 2 Markus Wild 2021-01-28 19:41:08 UTC
(In reply to Peter Eriksson from comment #1)
Both data pools are at single-digit capacity percentages, and no filesystem
with a quota is anywhere near its limit. I've now rebooted the system, so I can
query some more zfs-related info if that helps.
Comment 3 Markus Wild 2021-01-29 07:29:37 UTC
I saw the recent FreeBSD-EN-21:04.zfs advisory about changes to zfs receive.
Could those changes have altered the locking behavior in a way that causes
deadlocks on a busy system? My 12.2-STABLE system was built after this patch.