Bug 256936 - Buggy filesystem detected - message wrongly triggered by FUSE filesystems
Summary: Buggy filesystem detected - message wrongly triggered by FUSE filesystems
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-02 11:55 UTC by Agata
Modified: 2021-10-12 05:13 UTC (History)
4 users (show)

See Also:


Attachments
Buggy fuse server message (178.84 KB, image/png)
2021-07-02 20:10 UTC, jSML4ThWwBID69YC
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Agata 2021-07-02 11:55:27 UTC
Since FreeBSD 12.2(? or 12.1?) and still on FreeBSD 13.0 machines running MooseFS occasionally show a message:

"fuse_internal_do_getattr: cache incoherent on [PATH]! Buggy FUSE server detected"

I tried to narrow down the issue. This script:

**************
#! /usr/bin/env bash
set -e

while true
do
	size=$(((2097152 + $RANDOM) % 1024))
	dd if=/dev/urandom of=somefile bs=1k count=$size
done
**************

ran on 2 different mount points of the same MooseFS directory (so 2 processes in 2 different mount points modify the same file) triggers it constantly.

You can get the same result if you mount 2 instances of SSHfs and run this script on both.

I suspect this has to do with file length changing during the write operation. I don't know what SSHfs does, but MooseFS sends invalidate messages every time an open file's length changes. I checked the operation log and they are sent almost constantly along the write operations when the above scripts are run on 2 mount points. However invalidates are not obligatory and even when sent, a race between invalidate and getattr can still occur. And this is not filesystem's fault. Kernel should trust the information it gets from getattr, especially on a network filesystem. This is not "buggy" behaviour :)
Comment 1 jSML4ThWwBID69YC 2021-07-02 13:52:50 UTC
The 'Buggy FUSE server' messages started with FreeBSD 12.2. I think it was the commit for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244178 that started sending the messages.
Comment 2 Alan Somers freebsd_committer 2021-07-02 19:36:25 UTC
If the FUSE daemon tells the kernel that it may cache dirty data, but then the daemon tells the kernel that the file has changed while it was dirty, then data has been lost.  Data corruption is a bug.

MooseFS's hands are somewhat tied as it still uses fusefs-libs2.  At that protocol level there really aren't sufficient operations for the server to manage the kernel's cache.  But sshfs uses fusefs-libs3, so it ought to be able to fix the problem.

If it's possible to prove that no data corruption occurred and yet the kernel still prints this error message, then that would constitute a bug in the kernel, and I would fix it.  The most obvious way for that to happen is if you set vfs.fusefs.data_cache_mode=0, completely disabling the cache.  But even if data_cache_mode=1, it wouldn't surprise me if there are cases where the warning is printed inappropriately; it's complicated.  Especially because file attributes can be cached independently of their data.
Comment 3 jSML4ThWwBID69YC 2021-07-02 20:10:19 UTC
Created attachment 226182 [details]
Buggy fuse server message

Attached screenshot showing the date, sysctl setting, and /var/log/messages displaying the error again after the sysctl changed.
Comment 4 jSML4ThWwBID69YC 2021-07-02 20:10:50 UTC
Comment on attachment 226182 [details]
Buggy fuse server message

Hello, 

MooseFS is built against fusefs-libs3-3.10.4 on my system, not version two. I tested setting vfs.fusefs.data_cache_mode=0 and the error message still happens. 

See the attached screenshot showing the date, sysctl setting, and /var/log/messages displaying the error again after the sysctl changed.
Comment 5 jSML4ThWwBID69YC 2021-07-02 20:24:48 UTC
To add to it, I've been seeing these messages since upgrading to 12.2 back in October. I've not experienced any data corruption in that time.
Comment 6 Alan Somers freebsd_committer 2021-07-02 20:45:40 UTC
Ahh, well if moosefs is indeed using fusefs-libs3, then vfs.fusefs.data_cache_mode is completely ignored.  Instead, the cache behavior is controlled by moosefs itself.  You can tell what it's requesting by looking at the FUSE_INIT response, either by dialing up the log verbosity (maybe, depending on if moosefs displays it), or by running this (untested) dtrace script at mount time

fusefs:fusefs:internal:init_done
{
    printf("flags=%s", args[1]->flags);
}

Another important thing that the server tells the kernel is the entry and attr valid timeouts, which are returned by FUSE_LOOKUP.

Finally, there is the AUTO_INVAL_DATA feature (known as FUSE_CAP_AUTO_INVAL_DATA in libfuse).  This would probably be very useful for MooseFS.  In fact, MooseFS may well be written assuming that the feature is available and working.  However, it is not currently implemented in FreeBSD.
Comment 7 Alan Somers freebsd_committer 2021-07-02 20:54:07 UTC
Also, if you see these warnings in FreeBSD 12.2 but not 12.1, then it's due to this commit: https://github.com/freebsd/freebsd-src/commit/f0ebf80364c85feaea24b4aa247e5efe6e1b7733 , which addressed bug https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244178 .
Comment 8 Graham Perrin 2021-07-03 08:11:22 UTC
Also: 

Buggy fuse server detected · Issue #390 · moosefs/moosefs
<https://github.com/moosefs/moosefs/issues/390>
Comment 9 Agata 2021-07-05 11:45:18 UTC
First of all: MooseFS uses FUSE 3 whenever possible (it checks what is available on the system it's compiled on and uses that). And we do try to use all the new options and mechanisms available to us.

But there is no mechanism that would prevent a network file system that uses cache from overwriting some data if two processes on two machines write to the same cached block at the same time. I mean, once both kernels have modified the same frame of cache and want to push the dirty frame to the filesystem, there is no way to automagically update/correct one of them. The only solution is to not use cache at all (BTW, MooseFS client mounted with -o mfscachemode=DIRECT, which translates to each open always setting direct_io to 1 and keep_cache to 0, still triggers the "buggy" message). But not using cache at all is not a solution, because of performance. The commit you mentioned invalidates all cache, even the dirty frames. IMHO dirty frames should never be invalidated. It should be left to the users/programmers to make sure they don't modify the same area of the file (use locks etc.) Or we can say goodbye to any caching...

The script I posted as an example is not "correct", in that it doesn't use locks, but it's my (programmer's) "fault" that data gets written to a file without control, not the _network_ filesystem's.
Comment 10 jSML4ThWwBID69YC 2021-07-06 16:36:01 UTC
I've retested using MooseFS mounted with -o mfscachemode=DIRECT. I can confirm the buggy fuse server messages still happen repeatedly.
Comment 11 commit-hook freebsd_committer 2021-10-05 13:19:14 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=2636e7c310b77163e96da1bce324da930e693e6d

commit 2636e7c310b77163e96da1bce324da930e693e6d
Author:     Sergey A. Osokin <osa@FreeBSD.org>
AuthorDate: 2021-10-05 13:11:51 +0000
Commit:     Sergey A. Osokin <osa@FreeBSD.org>
CommitDate: 2021-10-05 13:17:58 +0000

    databases/redis{,5,6}: security update to latest releases

    Multiple security issues have been fixed.

    Security: CVE-2021-41099
    Security: CVE-2021-32762
    Security: CVE-2021-32687
    Security: CVE-2021-32675
    Security: CVE-2021-32672
    Security: CVE-2021-32628
    Security: CVE-2021-32627
    Security: CVE-2021-32626

    PR:     256936

 databases/redis/Makefile  | 3 +--
 databases/redis/distinfo  | 6 +++---
 databases/redis5/Makefile | 2 +-
 databases/redis5/distinfo | 6 +++---
 databases/redis6/Makefile | 3 +--
 databases/redis6/distinfo | 6 +++---
 6 files changed, 12 insertions(+), 14 deletions(-)
Comment 12 commit-hook freebsd_committer 2021-10-05 13:36:18 UTC
A commit in branch 2021Q4 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=d87c3f0fa85c9e3e5aa4d089439abc61007a59d1

commit d87c3f0fa85c9e3e5aa4d089439abc61007a59d1
Author:     Sergey A. Osokin <osa@FreeBSD.org>
AuthorDate: 2021-10-05 13:11:51 +0000
Commit:     Sergey A. Osokin <osa@FreeBSD.org>
CommitDate: 2021-10-05 13:35:15 +0000

    databases/redis{,5,6}: security update to latest releases

    Multiple security issues have been fixed.

    Security: CVE-2021-41099
    Security: CVE-2021-32762
    Security: CVE-2021-32687
    Security: CVE-2021-32675
    Security: CVE-2021-32672
    Security: CVE-2021-32628
    Security: CVE-2021-32627
    Security: CVE-2021-32626

    PR:     256936
    (cherry picked from commit 2636e7c310b77163e96da1bce324da930e693e6d)

 databases/redis/Makefile  | 3 +--
 databases/redis/distinfo  | 6 +++---
 databases/redis5/Makefile | 2 +-
 databases/redis5/distinfo | 6 +++---
 databases/redis6/Makefile | 3 +--
 databases/redis6/distinfo | 6 +++---
 6 files changed, 12 insertions(+), 14 deletions(-)
Comment 13 Piotr Robert Konopelko (MooseFS) 2021-10-05 13:38:56 UTC
(In reply to commit-hook from comment #12)

@osa@FreeBSD.org, are you sure the commits mentioned above reference this bug?
Comment 14 Sergey A. Osokin freebsd_committer 2021-10-05 13:40:37 UTC
(In reply to Piotr Robert Konopelko (MooseFS) from comment #13)

I do apologies, I've just made a commit with wrong PR number.
Comment 15 commit-hook freebsd_committer 2021-10-06 20:09:02 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5d94aaacb5180798b2f698e33937f068386004eb

commit 5d94aaacb5180798b2f698e33937f068386004eb
Author:     Alan Somers <asomers@FreeBSD.org>
AuthorDate: 2021-10-03 16:59:04 +0000
Commit:     Alan Somers <asomers@FreeBSD.org>
CommitDate: 2021-10-06 20:07:33 +0000

    fusefs: quiet some cache-related warnings

    If the FUSE server does something that would make our cache incoherent,
    we should print a warning to the user.  However, we previously warned in
    some situations when we shouldn't, such as if the file's size changed on
    the server _after_ our own attribute cache had expired.  This change
    suppresses the warning in cases like that.  It also moves the warning
    logic to a single place within the code.

    PR:             256936
    Reported by:    Agata <chogata@moosefs.pro>
    Tested by:      Agata <chogata@moosefs.pro>, jSML4ThWwBID69YC@protonmail.com
    MFC after:      2 weeks

 sys/fs/fuse/fuse_internal.c | 74 +++++++++++++++++++++++++++++++--------------
 sys/fs/fuse/fuse_internal.h |  3 +-
 sys/fs/fuse/fuse_io.c       |  4 +--
 sys/fs/fuse/fuse_node.c     | 19 ++++++++++--
 sys/fs/fuse/fuse_node.h     |  2 +-
 sys/fs/fuse/fuse_vfsops.c   | 40 +-----------------------
 sys/fs/fuse/fuse_vnops.c    | 41 ++-----------------------
 7 files changed, 76 insertions(+), 107 deletions(-)
Comment 16 Alan Somers freebsd_committer 2021-10-06 20:18:30 UTC
I found that sometimes this warning was emitted unwarrantedly.  The last commit should reduce that.  However, there's still one situation that will trigger the warning but shouldn't: after the server sends a FUSE_NOTIFY_INVAL_INODE message.  Does MooseFS use those?
Comment 17 Agata 2021-10-07 10:53:04 UTC
Hi Alan,

We call fuse_lowlevel_notify_inval_inode - I assume this is what triggers FUSE_NOTIFY_INVAL_INODE message.
Comment 18 commit-hook freebsd_committer 2021-10-12 05:13:24 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=076040aacb5d58dd924853350136f08d7f37b969

commit 076040aacb5d58dd924853350136f08d7f37b969
Author:     Alan Somers <asomers@FreeBSD.org>
AuthorDate: 2021-10-03 16:59:04 +0000
Commit:     Alan Somers <asomers@FreeBSD.org>
CommitDate: 2021-10-12 00:29:45 +0000

    fusefs: quiet some cache-related warnings

    If the FUSE server does something that would make our cache incoherent,
    we should print a warning to the user.  However, we previously warned in
    some situations when we shouldn't, such as if the file's size changed on
    the server _after_ our own attribute cache had expired.  This change
    suppresses the warning in cases like that.  It also moves the warning
    logic to a single place within the code.

    PR:             256936
    Reported by:    Agata <chogata@moosefs.pro>
    Tested by:      Agata <chogata@moosefs.pro>, jSML4ThWwBID69YC@protonmail.com

    (cherry picked from commit 5d94aaacb5180798b2f698e33937f068386004eb)

 sys/fs/fuse/fuse_internal.c | 74 +++++++++++++++++++++++++++++++--------------
 sys/fs/fuse/fuse_internal.h |  3 +-
 sys/fs/fuse/fuse_io.c       |  4 +--
 sys/fs/fuse/fuse_node.c     | 19 ++++++++++--
 sys/fs/fuse/fuse_node.h     |  2 +-
 sys/fs/fuse/fuse_vfsops.c   | 40 +-----------------------
 sys/fs/fuse/fuse_vnops.c    | 41 ++-----------------------
 7 files changed, 76 insertions(+), 107 deletions(-)