Created attachment 215581
example mmap usage

I'm the author of mergerfs, a FUSE-based filesystem. A user recently reported that after updating to 12.1, some client software running on top of mergerfs was locking up, rtorrent in particular. rtorrent uses mmap.

Linux's FUSE implementation is unable to handle mmap when direct_io is enabled, so I recommend that users who need any software leveraging mmap not enable direct_io. The user hadn't read the docs and was using direct_io on FreeBSD. It appears that mmap does work on FreeBSD's FUSE implementation when direct_io is used, so he had no issues. Once he updated, his setup started blocking and the apps became unkillable (waiting on IO).

I wrote a simple mmap example that reads from and writes to a shared mapped file, and I could reproduce the issue. When direct_io is off, everything works fine. If I enable direct_io, I see read requests come into the mergerfs server, and as long as I only read from the mapped memory it works fine. When writing, however, it seems to lock up as soon as it needs to flush. I don't see any write commands come in, the client app blocks and is unkillable, and other calls into the filesystem seem to work briefly and then block.

A stack trace of mergerfs seems to indicate that it is working normally and could take requests. If mergerfs is killed, none of the clients receive an error from the syscalls they are blocked on.

I've been able to reproduce this with sshfs by simply adding `direct_io` to the mount options. Attached is an example that triggers it.
Thank you for the report and reproducer. If you are able to obtain a kernel syscall trace exhibiting the lockup, that might also prove handy.
Thanks for the bug report! Just to be clear, how should we use the attached program? If it's meant to be used with sshfs, could you please provide the exact sshfs command line you used?
(In reply to Alan Somers from comment #2)
Yes, it will work with sshfs.

$ sshfs -o direct_io <src> <dst>
$ gcc -o /tmp/mmap-write mmap-write.c
$ cd <dst>
$ /tmp/mmap-write

It'll print the address offset and a value written into the location. It just blocks at the end.
FYI, `procstat -kk <pid>` can be used to show the kernel stack of a stuck process.
(In reply to Conrad Meyer from comment #4)
Thanks.

PID TID COMM TDNAME KSTACK
783 100068 mmap-write - mi_switch+0xe2 sleepq_wait+0x2c _sleep+0x247 vm_page_busy_sleep+0x8f vm_object_page_remove+0x203 vn_pages_remove+0x52 fuse_io_dispatch+0xebd VOP_WRITE_APV+0xec vnode_pager_generic_putpages+0x6ba VOP_PUTPAGES_APV+0x7c vnode_pager_putpages+0x84 vm_pageout_flush+0xed vm_object_page_collect_flush+0x1f2 vm_object_page_clean+0x146 vinactive+0xae vputx+0x2c3 vn_close1+0x181 vn_closefile+0x4c
Isn't this somehow related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246886 ? I thought there was something wrong with 12's sendfile but if fusefs also has a problem, the above problem gets more complicated. In my case, the problem occurs even if direct_io is disabled, and sendfile deadlocks with vm_page_busy_sleep, not a fusefs program. However, everything seems OK with CURRENT.
Although I already posted to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246886, I'd like to add a comment here as well. I back-ported 12.0-R's fusefs to 12-STABLE and tested my program. No error occurred, so I suspect that something is wrong with 12.1's fusefs. I can test a new fusefs if it is modified.
No, Hiroshi, I don't think this is related to 246886. The stacks are completely different, and this bug is definitely particular to the use of direct_io.
(In reply to Alan Somers from comment #8)
OK. However, I'll check the differences between the 12.0 and 12.1 fusefs code.
Reproduced on head with a minimal test case.
Code review in progress
A commit references this bug:

Author: asomers
Date: Thu Sep 24 16:27:53 UTC 2020
New revision: 366121
URL: https://svnweb.freebsd.org/changeset/base/366121

Log:
  fusefs: fix mmap'd writes in direct_io mode

  If a FUSE server returns FOPEN_DIRECT_IO in response to FUSE_OPEN, that instructs the kernel to bypass the page cache for that file. This feature is also known by libfuse's name: "direct_io".

  However, when accessing a file via mmap, there is no possible way to bypass the cache completely. This change fixes a deadlock that would happen when an mmap'd write tried to invalidate a portion of the cache, wrongly assuming that a write couldn't possibly come from cache if direct_io were set.

  Arguably, we could instead disable mmap for files with FOPEN_DIRECT_IO set. But allowing it is less likely to cause user complaints, and is more in keeping with the spirit of open(2), where O_DIRECT instructs the kernel to "reduce", not "eliminate" cache effects.

  PR: 247276
  Reported by: trapexit@spawn.link
  Reviewed by: cem
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D26485

Changes:
  head/sys/fs/fuse/fuse_io.c
  head/tests/sys/fs/fusefs/write.cc
A commit references this bug:

Author: asomers
Date: Sun Sep 27 02:59:29 UTC 2020
New revision: 366190
URL: https://svnweb.freebsd.org/changeset/base/366190

Log:
  MFC r366121:

  fusefs: fix mmap'd writes in direct_io mode

  If a FUSE server returns FOPEN_DIRECT_IO in response to FUSE_OPEN, that instructs the kernel to bypass the page cache for that file. This feature is also known by libfuse's name: "direct_io".

  However, when accessing a file via mmap, there is no possible way to bypass the cache completely. This change fixes a deadlock that would happen when an mmap'd write tried to invalidate a portion of the cache, wrongly assuming that a write couldn't possibly come from cache if direct_io were set.

  Arguably, we could instead disable mmap for files with FOPEN_DIRECT_IO set. But allowing it is less likely to cause user complaints, and is more in keeping with the spirit of open(2), where O_DIRECT instructs the kernel to "reduce", not "eliminate" cache effects.

  PR: 247276
  Reported by: trapexit@spawn.link
  Reviewed by: cem
  Differential Revision: https://reviews.freebsd.org/D26485

Changes:
_U  stable/12/
  stable/12/sys/fs/fuse/fuse_io.c
  stable/12/tests/sys/fs/fusefs/write.cc
A commit references this bug:

Author: asomers
Date: Mon Sep 28 00:24:00 UTC 2020
New revision: 366211
URL: https://svnweb.freebsd.org/changeset/base/366211

Log:
  MF stable/12 r366190:

  fusefs: fix mmap'd writes in direct_io mode

  If a FUSE server returns FOPEN_DIRECT_IO in response to FUSE_OPEN, that instructs the kernel to bypass the page cache for that file. This feature is also known by libfuse's name: "direct_io".

  However, when accessing a file via mmap, there is no possible way to bypass the cache completely. This change fixes a deadlock that would happen when an mmap'd write tried to invalidate a portion of the cache, wrongly assuming that a write couldn't possibly come from cache if direct_io were set.

  Arguably, we could instead disable mmap for files with FOPEN_DIRECT_IO set. But allowing it is less likely to cause user complaints, and is more in keeping with the spirit of open(2), where O_DIRECT instructs the kernel to "reduce", not "eliminate" cache effects.

  PR: 247276
  Approved by: re (gjb)
  Reported by: trapexit@spawn.link
  Reviewed by: cem
  Differential Revision: https://reviews.freebsd.org/D26485

Changes:
_U  releng/12.2/
  releng/12.2/sys/fs/fuse/fuse_io.c
  releng/12.2/tests/sys/fs/fusefs/write.cc