Bug 250576 - PHP-FPM processes hang in UFS-state
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-RELEASE
Hardware: amd64
OS: Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-24 12:30 UTC by Christos Chatzaras
Modified: 2020-11-23 01:06 UTC
CC List: 7 users

Description Christos Chatzaras 2020-10-24 12:30:30 UTC
This issue has happened on 3 different servers over the last few months.

Some PHP-FPM processes hang in UFS-state.

The websites that cause the hang all run Prestashop, and the hang is related to their var/cache files (I believe it happens when the cache is cleared and the files are regenerated).


This is a php-fpm process that hangs:

procstat -kk 73548

  PID    TID COMM                TDNAME              KSTACK
73548 100630 php-fpm             -                   mi_switch+0xe2 sleepq_wait+0x2c sleeplk+0x185 lockmgr_xlock_hard+0x1f5 __lockmgr_args+0x2bf ffs_lock+0xa5 VOP_LOCK1_APV+0x7c _vn_lock+0x44 vget+0x4a vfs_hash_get+0xb0 ffs_vgetf+0x3f softdep_sync_buf+0xbee ffs_syncvnode+0x26f ffs_truncate+0x71b ufs_direnter+0x5f8 ufs_makeinode+0x5fb ufs_create+0x34 VOP_CREATE_APV+0x76 


procstat -kk -a :

https://www.coderz.gr/ufs/procstat.txt


mount | grep home :

https://www.coderz.gr/ufs/mount.txt


dumpfs /dev/mirror/gm0p7 :

https://www.coderz.gr/ufs/dumpfs.txt


I have to force-reboot the server because /home cannot be cleanly unmounted. I then run fsck, which fixes some issues caused by the unclean reboot, and run fsck a second time, which reports no errors. Finally I mount /home, rename the cache folders, delete var/cache, and let Prestashop regenerate its files. But the same issue happens again after a few days.


Any idea how to troubleshoot it?
Comment 1 Christos Chatzaras 2020-10-25 09:55:26 UTC
After discussion on the freebsd-fs mailing list, it may be caused by a quota file lock in the syncer:

30 100097 syncer              -                   mi_switch+0xe2 sleepq_wait+0x2c sleeplk+0x185 lockmgr_xlock_hard+0x1f5 __lockmgr_args+0x2bf ffs_lock+0xa5 VOP_LOCK1_APV+0x7c _vn_lock+0x44 vget+0x4a qsync+0x99 ffs_sync+0x7d5 sync_fsync+0x10f VOP_FSYNC_APV+0x7c sched_sync+0x379 fork_exit+0x83 fork_trampoline+0xe
Comment 2 Li-Wen Hsu freebsd_committer 2020-10-29 02:13:08 UTC
Can you try to check if this is still happening on 12.2 and even -CURRENT?
Comment 3 Christos Chatzaras 2020-10-29 06:27:32 UTC
I already upgraded to 12.2, but I can't easily reproduce the issue. I have to wait for my clients to do Prestashop string translations, which regenerate the Prestashop cache every time they save the translations.
Comment 4 Christos Chatzaras 2020-11-17 09:32:17 UTC
It happened again with 12.2-RELEASE.
Comment 5 Christos Chatzaras 2020-11-17 14:18:02 UTC
I set up a test server and successfully reproduced the issue.

I will compile a new kernel with these options:

makeoptions 	DEBUG=-g
options 	INVARIANTS
options 	INVARIANT_SUPPORT
options 	WITNESS
options 	WITNESS_SKIPSPIN
options 	DEBUG_LOCKS
options 	DEBUG_VFS_LOCKS
options 	DIAGNOSTIC

and try again to reproduce the issue.
Comment 6 Christos Chatzaras 2020-11-18 11:41:54 UTC
With "quotaoff -a" the issue still exists.

The issue on 12.2 does NOT occur if I disable Soft Updates.

I then upgraded from 12.2 to CURRENT, and the issue does NOT occur at all, even with Soft Updates enabled.
Comment 7 Christos Chatzaras 2020-11-18 16:45:01 UTC
With 12-STABLE the issue exists.
Comment 8 andrew 2020-11-19 21:46:37 UTC
I have been helping to investigate this, and while I don't yet understand it completely, the key factors of the deadlock seem to be these:

Given the directory structure a/b/ where files are being created and removed as a/b/XXX, while at the same time another process is doing fstat("a/b") or possibly fstat("a/b/XXX"), we get these two backtraces (note that this can only happen if softupdates is enabled, softupdates journalling is NOT enabled, and "slowdown" mode is in effect):

Process A:

#11 0xffffffff80c8d31a in vget (vp=0xfffff802cae1c3c0, flags=540672, td=0xfffff80026ea3740) at /usr/src/sys/kern/vfs_subr.c:2710
#12 0xffffffff80c7f0d0 in vfs_hash_get (mp=0xfffff8000402c000, hash=31059293, flags=524288, td=0xfffff80026ea3740, 
    vpp=0xfffffe00c59661d0, fn=0x0, arg=0x0) at /usr/src/sys/kern/vfs_hash.c:92
#13 0xffffffff80ecc62f in ffs_vgetf (mp=0xfffff8000402c000, ino=31059293, flags=524288, vpp=0xfffffe00c59661d0, ffs_flags=1)
    at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1662
#14 0xffffffff80ec1dbf in flush_pagedep_deps (pvp=0xfffff8028a1f6d20, mp=<optimized out>, diraddhdp=<optimized out>)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:13206
#15 softdep_sync_buf (vp=<optimized out>, bp=<optimized out>, waitfor=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:12835
#16 0xffffffff80ed22a0 in ffs_syncvnode (vp=0xfffff8028a1f6d20, waitfor=1, flags=<optimized out>)
    at /usr/src/sys/ufs/ffs/ffs_vnops.c:300
#17 0xffffffff80ea521d in ffs_truncate (vp=<optimized out>, length=83456, flags=<optimized out>, cred=0xfffff80218f71500)
    at /usr/src/sys/ufs/ffs/ffs_inode.c:410
#18 0xffffffff80ed9d3e in ufs_direnter (dvp=0xfffff8028a1f6d20, tvp=0xfffff8018dbf2780, dirp=<optimized out>, cnp=<optimized out>, 
    newdirbp=0x0, isrename=<optimized out>) at /usr/src/sys/ufs/ufs/ufs_lookup.c:1134

Process B:

#11 0xffffffff80c8d31a in vget (vp=0xfffff8028a1f6d20, flags=2121728, td=0xfffff802180d8000) at /usr/src/sys/kern/vfs_subr.c:2710
#12 0xffffffff80c72ce7 in cache_lookup (dvp=0xfffff802cae1c3c0, vpp=<optimized out>, cnp=<optimized out>, tsp=<optimized out>, 
    ticksp=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:1404
#13 0xffffffff80c77651 in vfs_cache_lookup (ap=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:2162
#14 0xffffffff81247975 in VOP_LOOKUP_APV (vop=0xffffffff81b61c80 <ufs_vnodeops>, a=0xfffffe00c5bc3568) at vnode_if.c:127
#15 0xffffffff80c81331 in VOP_LOOKUP (dvp=0xfffff802cae1c3c0, vpp=0xfffffe00c5bc3810, cnp=0xfffffe00c5bc3840) at ./vnode_if.h:54
#16 lookup (ndp=0xfffffe00c5bc37b0) at /usr/src/sys/kern/vfs_lookup.c:989

Here, vnode 0x...6d20 is the directory "b", while 0x...c3c0 is directory "a".

So, somehow (and this part I haven't found out yet), directory "b" has acquired a pagedep entry pointing to directory "a". I suppose this could be for the ".." entry but I haven't managed to verify that yet. Process A has directory "b" exclusively locked, and is trying to do a vget(LK_EXCLUSIVE) on "a"; process B has directory "a" share-locked, and is trying to vget(LK_SHARED) directory "b", which of course gives us a classic deadlock scenario.
Comment 9 Christos Chatzaras 2020-11-20 07:48:45 UTC
I enabled journaling and the issue still exists.
Comment 10 andrew 2020-11-20 09:56:39 UTC
(In reply to Christos Chatzaras from comment #9)

Can you show the procstat -kk output for this case? Need to see if it's the same.
Comment 11 Christos Chatzaras 2020-11-20 10:13:14 UTC
With SU+J:


procstat -kk -a | grep syncer

   27 100093 syncer              -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_xlock_hard+0x1a4 __lockmgr_args+0x61c ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a qsync+0x98 ffs_sync+0x77f sync_fsync+0x10d VOP_FSYNC_APV+0x7b sched_sync+0x3bd fork_exit+0x7e fork_trampoline+0xe 

-----

procstat -kk -a | grep php-fpm

 8939 100373 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8

16925 100271 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8

17481 100276 php-fpm             -                   mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_timedwait_sig+0x14 _sleep+0x1b3 kqueue_kevent+0xaee kern_kevent+0xb5 kern_kevent_generic+0x70 sys_kevent+0x61 amd64_syscall+0x387 fast_syscall_common+0xf8

22299 100301 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8

23872 100332 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8

25070 100513 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8

27452 100328 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8

40915 100315 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_accessat+0x13c amd64_syscall+0x387 fast_syscall_common+0xf8

41912 100218 php-fpm             -                   mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_timedwait_sig+0x14 _sleep+0x1b3 kqueue_kevent+0xaee kern_kevent+0xb5 kern_kevent_generic+0x70 sys_kevent+0x61 amd64_syscall+0x387 fast_syscall_common+0xf8

45217 100372 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_xlock_hard+0x1a4 __lockmgr_args+0x61c ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a vfs_hash_get+0xb0 ffs_vgetf+0x3f softdep_sync_buf+0xaaf ffs_syncvnode+0x270 ffs_truncate+0x6ad ufs_direnter+0x5fe ufs_makeinode+0x633 ufs_create+0x34 VOP_CREATE_APV+0x75

51905 100228 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_statat+0x97 sys_fstatat+0x2f amd64_syscall+0x387 fast_syscall_common+0xf8
64907 100233 php-fpm             -                   mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_timedwait_sig+0x14 _sleep+0x1b3 kqueue_kevent+0xaee kern_kevent+0xb5 kern_kevent_generic+0x70 sys_kevent+0x61 amd64_syscall+0x387 fast_syscall_common+0xf8

77697 100232 php-fpm             -                   mi_switch+0xd4 sleepq_wait+0x2c sleeplk+0x11d lockmgr_slock_hard+0x19b __lockmgr_args+0xe6 ffs_lock+0x7c VOP_LOCK1_APV+0x7b _vn_lock+0x44 vget+0x4a cache_lookup+0x4c7 vfs_cache_lookup+0x81 VOP_LOOKUP_APV+0x75 lookup+0x7a1 namei+0x544 kern_fchmodat+0xb9 amd64_syscall+0x387 fast_syscall_common+0xf8

93356 100282 php-fpm             -                   mi_switch+0xd4 sleepq_catch_signals+0x403 sleepq_timedwait_sig+0x14 _sleep+0x1b3 kqueue_kevent+0xaee kern_kevent+0xb5 kern_kevent_generic+0x70 freebsd11_kevent+0x61 amd64_syscall+0x387 fast_syscall_common+0xf8
Comment 12 andrew 2020-11-20 11:32:20 UTC
(In reply to Christos Chatzaras from comment #11)

> ffs_vgetf+0x3f softdep_sync_buf+0xaaf ffs_syncvnode+0x270 ffs_truncate+0x6ad ufs_direnter+0x5fe ufs_makeinode+0x633 ufs_create+0x34 VOP_CREATE_APV+0x75

Ok, so that's the same sequence of calls, but it's getting there via a different path; rather than the ffs_syncvnode call on line 410 of ffs_inode.c, which is what I found in the previous stack trace and which isn't called when journalling is on, it must have hit the one at line 458, which is called even when journalling is on.

Can you get the crash dump for this one like before, and I'll see about getting more info out of it? It seems to be the same basic problem (of a subdir having a softupdates dependency on its parent dir), just a slightly different code path.
Comment 13 andrew 2020-11-20 14:07:23 UTC
(In reply to andrew from comment #12)

I was wrong about it taking a different path there, and I've uncovered a bug, but I don't think the bug I found accounts for the original problem.

These lines of code (ffs_truncate, lines 216 and 259 of ffs_inode.c in my 12-stable snapshot):

>     int softdeptrunc, journaltrunc;

>     journaltrunc = DOINGSUJ(vp);

and this definition of DOINGSUJ(vp) from line 198 of ufs_inode.h:

> #define	DOINGSUJ(vp)	   ((vp)->v_mount->mnt_flag & MNT_SUJ)

and this definition of MNT_SUJ (from mount.h line 302):

> #define	MNT_SUJ		0x0000000100000000ULL /* using journaled soft updates */

combine to cause the value of journaltrunc to always be 0, because the result of DOINGSUJ() (0x100000000 when the flag is set) is out of range for an int, so the assignment truncates it to zero.

So DOINGSUJ needs to be defined as  (((vp)->v_mount->mnt_flag & MNT_SUJ) != 0)  or something of that ilk.

That explains why enabling journalling did not prevent the ffs_syncvnode call from line 410, but there's still the original problem to track down.
Comment 14 andrew 2020-11-20 14:48:24 UTC
(In reply to andrew from comment #13)

Also, the above seems to have been fixed in -current, then MFC'd, and then the MFC was reverted? What's with that?
Comment 15 Christos Chatzaras 2020-11-22 15:38:28 UTC
The last revision where the issue exists:

https://svnweb.freebsd.org/changeset/base/367667 - kernel version 1300129

The first revision where the issue does NOT exist:

https://svnweb.freebsd.org/changeset/base/367672 - kernel version 1300129

Between these revisions I see these commits:

svn commit: r367668 - in head/sys/ufs: ffs ufs   Konstantin Belousov
svn commit: r367669 - head/sys/ufs/ffs   Konstantin Belousov
svn commit: r367670 - head/sys/ufs/ffs   Konstantin Belousov
svn commit: r367671 - head/sys/ufs/ffs   Konstantin Belousov
svn commit: r367672 - in head/sys/ufs: ffs ufs   Konstantin Belousov

Also please check this:

https://reviews.freebsd.org/D26136

Could this be related?
Comment 16 Mark Johnston freebsd_committer 2020-11-23 00:54:22 UTC
(In reply to Christos Chatzaras from comment #15)
Indeed, based on the backtraces you showed earlier in the PR I suspect that you are hitting the VFS lock order reversal that was fixed by kib's patch series.  Currently they are not tagged for MFC to stable/12, but I'm not sure exactly what might be blocking that.
Comment 17 Mark Johnston freebsd_committer 2020-11-23 01:06:20 UTC
(In reply to andrew from comment #13)
A fix for this bug was attempted in r359612 but had to be reverted in r359760.  I'm not familiar with the details.