I ran out of swap state, so I added more swap (md98 none swap sw,file=/thedisk/swap,late 0 0), swapon -aL. At this time these swap_pager messages appeared: > Feb 2 09:00:00 yv newsyslog[23791]: logfile turned over due to size>100K > Feb 2 09:03:09 yv kernel: pid 23931 (chrome), uid 1001: exited on signal 10 > Feb 2 09:03:34 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1049391, size: 8192 > Feb 2 09:03:34 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1049444, size: 8192 > Feb 2 09:04:03 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1048721, size: 4096 > Feb 2 09:04:03 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1063141, size: 8192 > Feb 2 09:04:03 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1049652, size: 4096 > Feb 2 09:04:38 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1048880, size: 16384 > Feb 2 09:04:38 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1049638, size: 4096 > Feb 2 09:04:38 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1062692, size: 12288 > Feb 2 09:04:38 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1068865, size: 8192 > Feb 2 09:04:41 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1078142, size: 45056 > Feb 2 09:04:46 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1049896, size: 4096 > Feb 2 09:04:51 yv kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1079764, size: 61440 > Feb 2 09:28:20 yv kernel: Limiting closed port RST response from 295 to 200 packets/sec > Feb 2 10:31:41 yv kernel: Limiting closed port RST response from 234 to 200 packets/sec Soon after, I manually ran sync(8) and sync froze in D+ state: > $ ps ax | grep sync > 22 - DL 28:57.73 [syncer] > 40832 7 D+ 0:00.08 sync > 88531 9 S+ 0:00.00 grep sync Please feel free to close it if you think this isn't actionable. But I thought I should report this, because now this system needs to be rebooted to get rid of that process with pid=40832.
What was a filesystem for that swap file? Also, procstat -k[k] could be of use to get more insight about where the thread is stuck.
(In reply to Andriy Gapon from comment #1) > What was a filesystem for that swap file? The newly added swap file is on zfs. The previously existed swap file is on ufs. > $ procstat -kk 40832 > PID TID COMM TDNAME KSTACK > 40832 101728 sync - mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_synced+0xa5 dmu_tx_assign+0x48 zil_commit_impl+0x8e zfs_sync+0xa6 sys_sync+0x116 amd64_syscall+0xa38 fast_syscall_common+0x101 > $ procstat -kk 88582 > PID TID COMM TDNAME KSTACK > 88582 103868 sync - mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_synced+0xa5 dmu_tx_assign+0x48 zil_commit_impl+0x8e zfs_sync+0xa6 sys_sync+0x116 amd64_syscall+0xa38 fast_syscall_common+0x101
That's probably the reason. Swap-on-file-on-ZFS and swap-on-ZFS as well are known problematic configurations for FreeBSD. Most likely the txg_sync thread is stuck in memory allocation. You can (could) check that with procstat as well.
(In reply to Andriy Gapon from comment #3) > $ procstat -t 40832 > PID TID COMM TDNAME CPU PRI STATE WCHAN > 40832 101728 sync - -1 120 sleep tx->tx_s > $ procstat -t 88582 > PID TID COMM TDNAME CPU PRI STATE WCHAN > 88582 103868 sync - -1 120 sleep tx->tx_s
(In reply to Yuri Victorovich from comment #4) procstat -k -a | fgrep txg_sync
(In reply to Andriy Gapon from comment #5) This procstat command didn't return anything: > $ procstat -k -a | fgrep txg_sync
(In reply to Yuri Victorovich from comment #6) As root?
(In reply to Andriy Gapon from comment #7) > $ sudo procstat -k -a | fgrep txg_sync > 9 100439 zfskern txg_thread_enter mi_switch sleepq_timedwait _cv_timedwait_sbt txg_sync_thread fork_exit fork_trampoline > 9 100946 zfskern txg_thread_enter mi_switch sleepq_timedwait _cv_timedwait_sbt txg_sync_thread fork_exit fork_trampoline > 9 101226 zfskern txg_thread_enter mi_switch sleepq_wait _cv_wait zio_wait dsl_pool_sync spa_sync txg_sync_thread fork_exit fork_trampoline
(In reply to Yuri Victorovich from comment #8) Try to save the complete procstat -kk -a output, if you haven't already, and search for arc_ calls. Or/and just attach it here.
P.S. https://wiki.freebsd.org/AndriyGapon/AvgZfsDeadlockDebug
(In reply to Andriy Gapon from comment #9) > $ sudo procstat -kk -a | grep arc_ > 9 100125 zfskern arc_reclaim_thread mi_switch+0xe6 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x18c arc_reclaim_thread+0x286 fork_exit+0x83 fork_trampoline+0xe > 9 100126 zfskern arc_dnlc_evicts_thr mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e arc_dnlc_evicts_thread+0xff fork_exit+0x83 fork_trampoline+0xe > 9 100145 zfskern l2arc_feed_thread mi_switch+0xe6 sleepq_timedwait+0x2f _cv_timedwait_sbt+0x18c l2arc_feed_thread+0x2e9 fork_exit+0x83 fork_trampoline+0xe