fuse(4) ignores errors from FUSE_RELEASE, which means that close(2) of a fuse file always succeeds (except for stuff like EBADF, of course). This is a problem for fuse filesystems that have their own write caches and may need to return errors like EIO on close.
Is FUSE_RELEASE synchronous or asynchronous anyway? There seems to be some confusion in its documentation / behavior over time. It looks like we *could* detect and check some errors at FUSE_RELEASE time, assuming FUSE_RELEASE is synchronous, in fuse_filehandle_close via fdi.answ_stat. However, fuse_vnop_close (= VOP_CLOSE, = close(2)) doesn't actually close ordinary file filehandles; therefore, it doesn't issue a FUSE_RELEASE at all. We leave filehandles open until VOP_INACTIVE, at which point it is impossible to return an error to users anyway. The vnode has no references and therefore userspace has no fds associated with it. This pattern makes it impossible for close() to fail. If users want to ensure consistency it seems like they must manually fsync() before close(). UFS has the same behavior.
(In reply to Conrad Meyer from comment #1) FUSE doesn't distinguish between synchronous and asynchronous opcodes. It only distinguishes between "response expected" and "fire and forget" opcodes. I think FUSE_FORGET is the only one in the latter category as of protocol 7.8. Maybe we need to move the FUSE_RELEASE from fuse_vnop_inactive to fuse_vnop_close, because the libfuse documentation is pretty clear that filesystems can choose to return errors at that time.
Notably, the close(2) manual page does not document that all cached data associated with the file must be synced by the time it returns, nor does it document an EIO error return. The main function of close() is to deallocate the filehandle associated with a given fd number. On FreeBSD, IIRC it always succeeds at that, even if EINTR is returned. POSIX 2008-2017 (latest, I think) has a longer description of the behavior POSIX systems are allowed or required to implement for close(2): http://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html It notes that EIO /may/ be returned, but if it is, the state of fildes is unspecified. It notes that "The close() operation itself need not block awaiting such [pending asynchronous] I/O completion." There are some special semantics required for sockets, STREAMS, ttys, pipes and FIFOs, but happily we only implement regular files and directories. I suspect this may be "Not a Bug."
(In reply to Alan Somers from comment #2)
> FUSE doesn't distinguish between synchronous and asynchronous opcodes. It
> only distinguishes between "response expected" and "fire and forget"
> opcodes. I think FUSE_FORGET is the only one in the latter category as of
> protocol 7.8.
Sure. Initially I think I was confusing FORGET with RELEASE. But I also found some mailing list discussion that didn't make it clear whether a libfuse implementation is expected to synchronously RELEASE, or immediately return success and sync/close the file later.
> Maybe we need to move the FUSE_RELEASE from fuse_vnop_inactive to
> fuse_vnop_close, because the libfuse documentation is pretty clear that
> filesystems can choose to return errors at that time.
That violates FreeBSD VFS semantics, AFAIK. VOP_INACTIVE() is called when there are no more open handles. VOP_CLOSE() is called when a single user handle is closed. Yes, that means any error in INACTIVE cannot be percolated to userspace. The same is true of UFS. Applications in FreeBSD that want to guarantee a file is committed without IO error should issue fsync before close; that's true of UFS and FUSE.
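For illustration, the fsync-before-close advice could look roughly like the sketch below in an application. This is not code from the fusefs driver; the helper name commit_and_close is made up here. It reports the first error from fsync(2) or close(2) and never retries close(2), on the assumption (stated earlier in this thread) that the descriptor is deallocated even when close(2) reports an error.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any FreeBSD API): flush a file's
 * data with fsync(2), then close the descriptor.
 * Returns 0 on success, or the errno value of the first failure.
 * close(2) is always attempted and never retried.
 */
static int
commit_and_close(int fd)
{
	int error = 0;

	if (fsync(fd) == -1)
		error = errno;		/* e.g. EIO from a failed write-back */
	if (close(fd) == -1 && error == 0)
		error = errno;		/* only report if fsync succeeded */
	return (error);
}
```

A caller would treat any nonzero return as "the data may not have reached stable storage", regardless of which of the two calls failed.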
(In reply to Alan Somers from comment #0)
> fuse(4) ignores errors from FUSE_RELEASE, which means that close(2) of a
> fuse file always succeeds (except for stuff like EBADF, of course).
E.g., the same is true for UFS close().
> This is
> a problem for fuse filesystems that have their own write caches and may need
> to return errors like EIO on close.
The same applies to e.g. dirty bufcache contents, or underlying media with dirty WB caching. Any flush error that occurs after close() simply cannot be reported to userspace. I think maybe this just isn't a bug.
So I got confused and wrote this bug wrong. The problem isn't with FUSE_RELEASE, it's with FUSE_FLUSH. FUSE_FLUSH is supposed to be called on every close of a file descriptor, unlike FUSE_RELEASE which is for the last close. fuse(4) could choose to implement FUSE_FLUSH and return errors during close(2) since VOP_CLOSE is synchronous with close(2). It seems that most VOP_CLOSE implementations currently throw away errors and return 0. Not all, however. nfs_close returns errors including EDQUOT and ENOSPC.
If there's an operation for flushing on each close, and it can report an error status (sounds like yes and yes), I think we should invoke it and propagate the error in our FUSE VOP_CLOSE(). Regardless of other VOP_CLOSE implementations. (OTOH, many VOP_CLOSE *callers* ignore the return value of VOP_CLOSE, but importantly vn_close1 <- vn_close <- vn_closefile <- fo_close <- _fdrop <- closef <- closefp <- kern_close propagates errors all the way from VOP_CLOSE to close(2).)
A commit references this bug: Author: asomers Date: Wed Apr 3 19:59:49 UTC 2019 New revision: 345852 URL: https://svnweb.freebsd.org/changeset/base/345852 Log: fusefs: send FUSE_FLUSH during VOP_CLOSE The FUSE protocol says that FUSE_FLUSH should be sent every time a file descriptor is closed. That's not quite possible in FreeBSD because multiple file descriptors can share a single struct file, and closef doesn't call fo_close until the last close. However, we can still send FUSE_FLUSH on every VOP_CLOSE, which is probably good enough. There are two purposes for FUSE_FLUSH. One is to allow file systems to return EIO if they have an error when writing data that's cached server-side. The other is to release POSIX file locks (which fusefs(5) does not yet support). PR: 236405, 236327 Sponsored by: The FreeBSD Foundation Changes: projects/fuse2/sys/fs/fuse/fuse_file.c projects/fuse2/sys/fs/fuse/fuse_file.h projects/fuse2/sys/fs/fuse/fuse_io.c projects/fuse2/sys/fs/fuse/fuse_node.c projects/fuse2/sys/fs/fuse/fuse_vnops.c projects/fuse2/tests/sys/fs/fusefs/allow_other.cc projects/fuse2/tests/sys/fs/fusefs/flush.cc projects/fuse2/tests/sys/fs/fusefs/fsync.cc projects/fuse2/tests/sys/fs/fusefs/mockfs.cc projects/fuse2/tests/sys/fs/fusefs/open.cc projects/fuse2/tests/sys/fs/fusefs/release.cc projects/fuse2/tests/sys/fs/fusefs/utils.cc projects/fuse2/tests/sys/fs/fusefs/utils.hh projects/fuse2/tests/sys/fs/fusefs/write.cc
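The remark in the commit log that several file descriptors can share one struct file can be observed from userspace: descriptors created with dup(2) refer to the same open file, so they share a single file offset. The sketch below only demonstrates that sharing; it says nothing about when fusefs actually sends FUSE_FLUSH. The helper name dup_shares_offset is made up for this illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a dup(2) of fd observes an offset
 * change made through fd, 0 if it does not, and -1 on error.
 * On success the offset of fd is left one byte past where it started.
 */
static int
dup_shares_offset(int fd)
{
	off_t before, after;
	int fd2, shared;

	fd2 = dup(fd);
	if (fd2 == -1)
		return (-1);
	before = lseek(fd2, 0, SEEK_CUR);
	/* Move the offset through the original descriptor only. */
	if (before == -1 || lseek(fd, before + 1, SEEK_SET) == -1) {
		close(fd2);
		return (-1);
	}
	/* Read the offset back through the duplicate. */
	after = lseek(fd2, 0, SEEK_CUR);
	shared = (after == before + 1);
	/* Closing the duplicate is not the last close; fd stays usable. */
	close(fd2);
	return (shared);
}
```

By contrast, two separate open(2) calls on the same path produce two open files with independent offsets.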
This is complete on the fuse2 branch.
A commit references this bug: Author: asomers Date: Wed Aug 7 00:38:30 UTC 2019 New revision: 350665 URL: https://svnweb.freebsd.org/changeset/base/350665 Log: fusefs: merge from projects/fuse2 This commit imports the new fusefs driver. It raises the protocol level from 7.8 to 7.23, fixes many bugs, adds a test suite for the driver, and adds many new features. New features include: * Optional kernel-side permissions checks (-o default_permissions) * Implement VOP_MKNOD, VOP_BMAP, and VOP_ADVLOCK * Allow interrupting FUSE operations * Support named pipes and unix-domain sockets in fusefs file systems * Forward UTIME_NOW during utimensat(2) to the daemon * kqueue support for /dev/fuse * Allow updating mounts with "mount -u" * Allow exporting fusefs file systems over NFS * Server-initiated invalidation of the name cache or data cache * Respect RLIMIT_FSIZE * Try to support servers as old as protocol 7.4 Performance enhancements include: * Implement FUSE's FOPEN_KEEP_CACHE and FUSE_ASYNC_READ flags * Cache file attributes * Cache lookup entries, both positive and negative * Server-selectable cache modes: writethrough, writeback, or uncached * Write clustering * Readahead * Use counter(9) for statistical reporting PR: 199934 216391 233783 234581 235773 235774 235775 PR: 236226 236231 236236 236291 236329 236381 236405 PR: 236327 236466 236472 236473 236474 236530 236557 PR: 236560 236844 237052 237181 237588 238565 Reviewed by: bcr (man pages) Reviewed by: cem, ngie, rpokala, glebius, kib, bde, emaste (post-commit review on project branch) MFC after: 3 weeks Relnotes: yes Sponsored by: The FreeBSD Foundation Pull Request: https://reviews.freebsd.org/D21110 Changes: _U head/ head/MAINTAINERS head/UPDATING head/etc/mtree/BSD.tests.dist head/sbin/mount_fusefs/mount_fusefs.8 head/sbin/mount_fusefs/mount_fusefs.c head/share/man/man5/fusefs.5 head/sys/fs/fuse/fuse.h head/sys/fs/fuse/fuse_device.c head/sys/fs/fuse/fuse_file.c head/sys/fs/fuse/fuse_file.h 
head/sys/fs/fuse/fuse_internal.c head/sys/fs/fuse/fuse_internal.h head/sys/fs/fuse/fuse_io.c head/sys/fs/fuse/fuse_io.h head/sys/fs/fuse/fuse_ipc.c head/sys/fs/fuse/fuse_ipc.h head/sys/fs/fuse/fuse_kernel.h head/sys/fs/fuse/fuse_main.c head/sys/fs/fuse/fuse_node.c head/sys/fs/fuse/fuse_node.h head/sys/fs/fuse/fuse_param.h head/sys/fs/fuse/fuse_vfsops.c head/sys/fs/fuse/fuse_vnops.c head/sys/sys/param.h head/tests/sys/fs/Makefile head/tests/sys/fs/fusefs/
A commit references this bug: Author: asomers Date: Sun Sep 15 04:14:35 UTC 2019 New revision: 352351 URL: https://svnweb.freebsd.org/changeset/base/352351 Log: MFC the new fusefs driver MFC r350665, r350990, r350992, r351039, r351042, r351061, r351066, r351113, r351560, r351961, r351963, r352021, r352025, r352230 r350665: fusefs: merge from projects/fuse2 This commit imports the new fusefs driver. It raises the protocol level from 7.8 to 7.23, fixes many bugs, adds a test suite for the driver, and adds many new features. New features include: * Optional kernel-side permissions checks (-o default_permissions) * Implement VOP_MKNOD, VOP_BMAP, and VOP_ADVLOCK * Allow interrupting FUSE operations * Support named pipes and unix-domain sockets in fusefs file systems * Forward UTIME_NOW during utimensat(2) to the daemon * kqueue support for /dev/fuse * Allow updating mounts with "mount -u" * Allow exporting fusefs file systems over NFS * Server-initiated invalidation of the name cache or data cache * Respect RLIMIT_FSIZE * Try to support servers as old as protocol 7.4 Performance enhancements include: * Implement FUSE's FOPEN_KEEP_CACHE and FUSE_ASYNC_READ flags * Cache file attributes * Cache lookup entries, both positive and negative * Server-selectable cache modes: writethrough, writeback, or uncached * Write clustering * Readahead * Use counter(9) for statistical reporting PR: 199934 216391 233783 234581 235773 235774 235775 PR: 236226 236231 236236 236291 236329 236381 236405 PR: 236327 236466 236472 236473 236474 236530 236557 PR: 236560 236844 237052 237181 237588 238565 Reviewed by: bcr (man pages) Reviewed by: cem, ngie, rpokala, glebius, kib, bde, emaste (post-commit review on project branch) Relnotes: yes Sponsored by: The FreeBSD Foundation Pull Request: https://reviews.freebsd.org/D21110 r350990: fusefs: add SVN Keywords to the test files Reported by: SVN pre-commit hooks MFC-With: r350665 Sponsored by: The FreeBSD Foundation r350992: fusefs: skip some tests 
when unsafe aio is disabled MFC-With: r350665 Sponsored by: The FreeBSD Foundation r351039: fusefs: fix intermittency in the default_permissions.Unlink.ok test The test needs to expect a FUSE_FORGET operation. Most of the time the test would pass anyway, because by chance FUSE_FORGET would arrive after the unmount. MFC-With: 350665 Sponsored by: The FreeBSD Foundation r351042: fusefs: Fix the size of fuse_getattr_in In FUSE protocol 7.9, the size of the FUSE_GETATTR request has increased. However, the fusefs driver is currently not sending the additional fields. In our implementation, the additional fields are always zero, so there haven't been any test failures until now. But fusefs-lkl requires the request's length to be correct. Fix this bug, and also enhance the test suite to catch similar bugs. PR: 239830 MFC-With: 350665 Sponsored by: The FreeBSD Foundation r351061: fusefs: fix the 32-bit build after 351042 Reported by: jhb MFC-With: 351042 Sponsored by: The FreeBSD Foundation r351066: fusefs: fix conditional from r351061 The entirety of r351061 was a copy/paste error. I'm sorry I've been committing so hastily. Reported by: rpokala Reviewed by: rpokala MFC-With: 351061 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21265 r351113: fusefs: don't send the namespace during listextattr The FUSE_LISTXATTR operation always returns the full list of a file's extended attributes, in all namespaces. There's no way to filter the list server-side. However, currently FreeBSD's fusefs driver sends a namespace string with the FUSE_LISTXATTR request. That behavior was probably copied from fuse_vnop_getextattr, which has an attribute name argument. It's been there ever since extended attribute support was added in r324620. This commit removes it.
Reviewed by: cem Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21280 r351560: fusefs: Fix some bugs regarding the size of the LISTXATTR list * A small error in r338152 led to the returned size always being exactly eight bytes too large. * The FUSE_LISTXATTR operation works like Linux's listxattr(2): if the caller does not provide enough space, then the server should return ERANGE rather than return a truncated list. That's true even though in FUSE's case the kernel doesn't provide space to the client at all; it simply requests a maximum size for the list. We previously weren't handling the case where the server returns ERANGE even though the kernel requested as much size as the server had told us it needs; that can happen due to a race. * We also need to ensure that a pathological server that always returns ERANGE no matter what size we request in FUSE_LISTXATTR won't cause an infinite loop in the kernel. As of this commit, it will instead cause an infinite loop that exits and enters the kernel on each iteration, allowing signals to be processed. Reviewed by: cem Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21287 r351961: Coverity fixes in fusefs(5) CID 1404532 fixes a signed vs unsigned comparison error in fuse_vnop_bmap. It could potentially have resulted in VOP_BMAP reporting too many consecutive blocks. CID 1404364 is much worse. It was an array access by an untrusted, user-provided variable. It could potentially have resulted in a malicious file system crashing the kernel or worse.
Reported by: Coverity Reviewed by: emaste Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21466 r351963: fusefs: coverity cleanup in the tests Address the following defects reported by Coverity: * Structurally dead code (CID 1404366): set m_quit before FAIL, not after * Unchecked return value of sysctlbyname (CID 1404321) * Unchecked return value of stat(2) (CID 1404471) * Unchecked return value of open(2) (CID 1404402, 1404529) * Unchecked return value of dup(2) (CID 1404478) * Buffer overflows. These are all false positives caused by the fact that Coverity thinks I'm using a buffer to store strings, when in fact I'm really just using it to store a byte array that happens to be initialized with a string. I'm changing the type from char to uint8_t in the hopes that it will placate Coverity. (CID 1404338, 1404350, 1404367, 1404376, 1404379, 1404381, 1404388, 1404403, 1404425, 1404433, 1404434, 1404474, 1404480, 1404484, 1404503, 1404505) * False positive file descriptor leak. I'm going to try to fix this with Coverity modeling, but I'll also change an EXPECT to ASSERT so we don't perform meaningless assertions after the failure. (CID 1404320, 1404324, 1404440, 1404445). * Unannotated file descriptor leak. This will be followed up by a Coverity modeling change. (CID 1404326, 1404334, 1404336, 1404357, 1404361, 1404372, 1404391, 1404395, 1404409, 1404430, 1404448, 1404451, 1404455, 1404457, 1404458, 1404460) * Uninitialized variables in C++ constructors (CID 1404327, 1404346). In the case of m_maxphys, this actually led to part of the FUSE_INIT's response being set to stack garbage during the WriteCluster::clustering test. * Uninitialized sun_len field in struct sockaddr_un (CID 1404330, 1404371, 1404429). 
Reported by: Coverity Reviewed by: emaste Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21457 r352021: fusefs: suppress some Coverity resource leak CIDs in the tests The fusefs tests deliberately leak file descriptors. To do otherwise would add extra complications to the tests' mock FUSE server. This annotation should hopefully convince Coverity to shut up about the leaks. Reviewed by: uqs Sponsored by: The FreeBSD Foundation r352025: mount_fusefs: fix a segfault on memory allocation failure Reported by: Coverity Coverity CID: 1354188 Sponsored by: The FreeBSD Foundation r352230: fusefs: Fix iosize for FUSE_WRITE in 7.8 compat mode When communicating with a FUSE server that implements version 7.8 (or older) of the FUSE protocol, the FUSE_WRITE request structure is 16 bytes shorter than normal. The protocol version check wasn't applied universally, leading to an extra 16 bytes being sent to such servers. The extra bytes were allocated and bzero()d, so there was no information disclosure. 
Reviewed by: emaste MFC-With: r350665 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21557 Changes: _U stable/12/ stable/12/MAINTAINERS stable/12/UPDATING stable/12/etc/mtree/BSD.tests.dist stable/12/sbin/mount_fusefs/mount_fusefs.8 stable/12/sbin/mount_fusefs/mount_fusefs.c stable/12/share/man/man5/fusefs.5 stable/12/sys/fs/fuse/fuse.h stable/12/sys/fs/fuse/fuse_device.c stable/12/sys/fs/fuse/fuse_file.c stable/12/sys/fs/fuse/fuse_file.h stable/12/sys/fs/fuse/fuse_internal.c stable/12/sys/fs/fuse/fuse_internal.h stable/12/sys/fs/fuse/fuse_io.c stable/12/sys/fs/fuse/fuse_io.h stable/12/sys/fs/fuse/fuse_ipc.c stable/12/sys/fs/fuse/fuse_ipc.h stable/12/sys/fs/fuse/fuse_kernel.h stable/12/sys/fs/fuse/fuse_main.c stable/12/sys/fs/fuse/fuse_node.c stable/12/sys/fs/fuse/fuse_node.h stable/12/sys/fs/fuse/fuse_param.h stable/12/sys/fs/fuse/fuse_vfsops.c stable/12/sys/fs/fuse/fuse_vnops.c stable/12/sys/sys/param.h stable/12/tests/sys/fs/Makefile stable/12/tests/sys/fs/fusefs/ stable/12/tests/sys/fs/fusefs/access.cc stable/12/tests/sys/fs/fusefs/allow_other.cc stable/12/tests/sys/fs/fusefs/bmap.cc stable/12/tests/sys/fs/fusefs/create.cc stable/12/tests/sys/fs/fusefs/default_permissions.cc stable/12/tests/sys/fs/fusefs/default_permissions_privileged.cc stable/12/tests/sys/fs/fusefs/destroy.cc stable/12/tests/sys/fs/fusefs/dev_fuse_poll.cc stable/12/tests/sys/fs/fusefs/fifo.cc stable/12/tests/sys/fs/fusefs/flush.cc stable/12/tests/sys/fs/fusefs/forget.cc stable/12/tests/sys/fs/fusefs/fsync.cc stable/12/tests/sys/fs/fusefs/fsyncdir.cc stable/12/tests/sys/fs/fusefs/getattr.cc stable/12/tests/sys/fs/fusefs/interrupt.cc stable/12/tests/sys/fs/fusefs/io.cc stable/12/tests/sys/fs/fusefs/link.cc stable/12/tests/sys/fs/fusefs/locks.cc stable/12/tests/sys/fs/fusefs/lookup.cc stable/12/tests/sys/fs/fusefs/mkdir.cc stable/12/tests/sys/fs/fusefs/mknod.cc stable/12/tests/sys/fs/fusefs/mockfs.cc stable/12/tests/sys/fs/fusefs/mockfs.hh 
stable/12/tests/sys/fs/fusefs/mount.cc stable/12/tests/sys/fs/fusefs/nfs.cc stable/12/tests/sys/fs/fusefs/notify.cc stable/12/tests/sys/fs/fusefs/open.cc stable/12/tests/sys/fs/fusefs/opendir.cc stable/12/tests/sys/fs/fusefs/read.cc stable/12/tests/sys/fs/fusefs/readdir.cc stable/12/tests/sys/fs/fusefs/readlink.cc stable/12/tests/sys/fs/fusefs/release.cc stable/12/tests/sys/fs/fusefs/releasedir.cc stable/12/tests/sys/fs/fusefs/rename.cc stable/12/tests/sys/fs/fusefs/rmdir.cc stable/12/tests/sys/fs/fusefs/setattr.cc stable/12/tests/sys/fs/fusefs/statfs.cc stable/12/tests/sys/fs/fusefs/symlink.cc stable/12/tests/sys/fs/fusefs/unlink.cc stable/12/tests/sys/fs/fusefs/utils.cc stable/12/tests/sys/fs/fusefs/utils.hh stable/12/tests/sys/fs/fusefs/write.cc stable/12/tests/sys/fs/fusefs/xattr.cc