Bug 245689

Summary: Chmod performs unnecessary access calls (FUSE fs)
Product: Base System Reporter: MooseFS FreeBSD Team <freebsd>
Component: kernAssignee: Alan Somers <asomers>
Status: Closed FIXED    
Severity: Affects Many People CC: asomers, piotr.konopelko
Priority: Normal Keywords: performance
Version: 12.1-RELEASEFlags: koobs: maintainer-feedback+
koobs: mfc-stable12+
koobs: mfc-stable11-
Hardware: amd64   
OS: Any   
URL: https://reviews.freebsd.org/D24777
Attachments:
Description Flags
How to install a minimal test instance of MooseFS on FreeBSD
none
Fix issues with FUSE_ACCESS when default_permissions is disabled none

Description MooseFS FreeBSD Team 2020-04-17 10:28:05 UTC
FreeBSD 12.1-p2, MooseFS mounted via FUSE ver 3.

We perform the following operations:

$ touch test.txt ; chmod u-w test.txt ; chmod u+w test.txt
chmod: test.txt: Permission denied

What happens on the fs side:

# touch test.txt
04.08 09:58:05.000715: uid:1001 gid:0 pid:48262 cmd:access (504599,0x1): OK
04.08 09:58:05.001324: uid:1001 gid:0 pid:48262 cmd:lookup (504599,test.txt): ENOENT (No such file or directory)
04.08 09:58:05.001731: uid:1001 gid:0 pid:48262 cmd:access (504599,0x1): OK
04.08 09:58:05.002157: uid:1001 gid:0 pid:48262 cmd:lookup (504599,test.txt): ENOENT (No such file or directory)
04.08 09:58:05.002546: uid:1001 gid:0 pid:48262 cmd:access (504599,0x2): OK
04.08 09:58:05.002946: uid:1001 gid:0 pid:48262 cmd:create (504599,test.txt,-rw-r--r--:0100644): OK (0.0,1238695,1.0,[-rw-r--r--:0100644,1,1001,0,1586332685,1586332685,1586332685,0]) (direct_io:0,keep_cache:0) [handle:03000001]
04.08 09:58:05.003013: uid:1001 gid:0 pid:48262 cmd:flush (1238695) [handle:03000001,uselocks:0,lock_owner:000000000000BC86]: OK
04.08 09:58:05.003249: uid:1001 gid:0 pid:48262 cmd:release (1238695) [handle:03000001,uselocks:0,lock_owner:000000000000BC86]: OK
# chmod u-w test.txt
04.08 09:58:05.005207: uid:1001 gid:0 pid:48263 cmd:access (504599,0x1): OK
04.08 09:58:05.005668: uid:1001 gid:0 pid:48263 cmd:lookup (504599,test.txt): OK (0.0,1238695,1.0,[-rw-r--r--:0100644,1,1001,0,1586332685,1586332685,1586332685,0])
04.08 09:58:05.006095: uid:1001 gid:0 pid:48263 cmd:access (504599,0x1): OK
04.08 09:58:05.006490: uid:1001 gid:0 pid:48263 cmd:lookup (504599,test.txt): OK (0.0,1238695,1.0,[-rw-r--r--:0100644,1,1001,0,1586332685,1586332685,1586332685,0])
04.08 09:58:05.006888: uid:1001 gid:0 pid:48263 cmd:access (504599,0x1): OK
04.08 09:58:05.007260: uid:1001 gid:0 pid:48263 cmd:lookup (504599,test.txt): OK (0.0,1238695,1.0,[-rw-r--r--:0100644,1,1001,0,1586332685,1586332685,1586332685,0])
04.08 09:58:05.007292: uid:1001 gid:0 pid:48263 cmd:access (1238695,0x2): OK
04.08 09:58:05.007903: uid:1001 gid:0 pid:48263 cmd:setattr (1238695,0x1,[mode=r--r--r--:00444]) [no handle]: OK (1.0,[-r--r--r--:0100444,1,1001,0,1586332685,1586332685,1586332685,0])
# chmod u+w test.txt
04.08 09:58:05.009607: uid:1001 gid:0 pid:48264 cmd:access (504599,0x1): OK
04.08 09:58:05.009944: uid:1001 gid:0 pid:48264 cmd:lookup (504599,test.txt): OK (0.0,1238695,1.0,[-r--r--r--:0100444,1,1001,0,1586332685,1586332685,1586332685,0])
04.08 09:58:05.010879: uid:1001 gid:0 pid:48264 cmd:access (504599,0x1): OK
04.08 09:58:05.011309: uid:1001 gid:0 pid:48264 cmd:lookup (504599,test.txt): OK (0.0,1238695,1.0,[-r--r--r--:0100444,1,1001,0,1586332685,1586332685,1586332685,0])
04.08 09:58:05.011680: uid:1001 gid:0 pid:48264 cmd:access (504599,0x1): OK
04.08 09:58:05.012068: uid:1001 gid:0 pid:48264 cmd:lookup (504599,test.txt): OK (0.0,1238695,1.0,[-r--r--r--:0100444,1,1001,0,1586332685,1586332685,1586332685,0])
04.08 09:58:05.012102: uid:1001 gid:0 pid:48264 cmd:access (1238695,0x2): EACCES (Permission denied)
04.08 09:58:05.596937: uid:0 gid:0 pid:48228 cmd:getattr (504599) [no handle]: OK (1.0,[drwxr-xr-x:0040755,3,1001,0,1586332657,1586332685,1586332685,1002246])

So, we have an "access" call to a file that has no "w" permission for the user calling, so we get an EACCES...

There are 2 problems here:
1) It is not necessary to have "w" permission to a file to change its metadata - this only requires the uid doing the changes to be either 0 or owner of the file.
2) MooseFS does NOT send "default_permissions" when initializing FUSE, so the whole checking of permissions should happen on the filesystem side. Yet, practically all the calls are preceded with access calls.

Here is how the above chmod operations look on Linux:

# chmod u-w test.txt
04.08 10:10:26.794571: uid:1000 gid:1000 pid:30550 cmd:lookup (164872,test.txt): OK (0.0,210456,1.0,[-rw-r--r--:0100644,1,1000,1000,1586326757,1586326757,1586326769,0])
04.08 10:10:26.794717: uid:1000 gid:1000 pid:30550 cmd:lookup (164872,test.txt): OK (0.0,210456,1.0,[-rw-r--r--:0100644,1,1000,1000,1586326757,1586326757,1586326769,0])
04.08 10:10:26.802672: uid:1000 gid:1000 pid:30550 cmd:setattr (210456,0x1,[mode=r--r--r--:00444]) [no handle]: OK (1.0,[-r--r--r--:0100444,1,1000,1000,1586326757,1586326757,1586333426,0])
# chmod u+w test.txt
04.08 10:10:26.803478: uid:1000 gid:1000 pid:30551 cmd:lookup (164872,test.txt): OK (0.0,210456,1.0,[-r--r--r--:0100444,1,1000,1000,1586326757,1586326757,1586333426,0])
04.08 10:10:26.803608: uid:1000 gid:1000 pid:30551 cmd:lookup (164872,test.txt): OK (0.0,210456,1.0,[-r--r--r--:0100444,1,1000,1000,1586326757,1586326757,1586333426,0])
04.08 10:10:26.803790: uid:1000 gid:1000 pid:30551 cmd:setattr (210456,0x1,[mode=rw-r--r--:00644]) [no handle]: OK (1.0,[-rw-r--r--:0100644,1,1000,1000,1586326757,1586326757,1586333426,0])

There are no "access" calls at all.

While problem 1) is a simple bug, problem 2) is a serious performance issue. MooseFS is a network filesystem and performance is very important in this case - it has to receive and handle twice as many calls sent via network.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2020-04-17 10:38:08 UTC
Thank you for the report Mooses

Alan landed our most recent (base 350665) FuseFS infrastructure improvements and may have insight into the best way to address this
Comment 2 Alan Somers freebsd_committer freebsd_triage 2020-04-17 13:03:48 UTC
Please post full steps to reproduce.
Comment 3 Alan Somers freebsd_committer freebsd_triage 2020-04-17 13:06:42 UTC
Also, please show me the output of kldstat.
Comment 4 Conrad Meyer freebsd_committer freebsd_triage 2020-04-17 17:09:36 UTC
The bulk of the 'access'es are from VOP_LOOKUP checking that the directories of the path to the file has +x permission(s) for the current user.  FUSE does this a little weirdly (it basically has an inlined copy of vn_dir_check_exec() embedded, instead of calling the helper) but regardless, this is canonical for FreeBSD lookup.

If FUSE(5) can rely on filesystems to do their own checking as part of lookup, then that is probably an easy optimization.  (Apparently Linux does so.)

chmod => VOP_SETATTR => fuse_vnop_setattr.  At the bottom (success case) we send fuse_internal_access(), then fuse_internal_setattr().  UFS does something similar in ufs_chmod().

Basically: FreeBSD's VOP model has ACCESS as a separate vop, and this is the common pattern.

Probably we can bypass these ACCESS checks in the ~FSESS_DEFAULT_PERMISSIONS case.
Comment 5 MooseFS FreeBSD Team 2020-04-20 10:53:35 UTC
Created attachment 213598 [details]
How to install a minimal test instance of MooseFS on FreeBSD
Comment 6 MooseFS FreeBSD Team 2020-04-20 10:59:48 UTC
I've attached a how to for creating a test environment. After you've installed it as per instructions, do this:

1) On one console open the debug file (as root):
# cat /mnt/mfs/.oplog

2) Then on the second console, run the test commands (as a non-root user!!!):
# cd /mnt/mfs/
# touch test.txt ; chmod u-w test.txt ; chmod u+w test.txt

You will get the "chmod: test1.txt: Permission denied" error after the last chmod and you can observe the calls that were sent to MooseFS on the first console.
Comment 7 MooseFS FreeBSD Team 2020-04-20 11:01:27 UTC
root@test:/ # kldstat
Id Refs Address                Size Name
 1   10 0xffffffff80200000  2448f20 kernel
 2    1 0xffffffff82819000     2668 intpm.ko
 3    1 0xffffffff8281c000      b50 smbus.ko
 4    1 0xffffffff8281d000      acf mac_ntpd.ko
 5    1 0xffffffff8281e000     fcf0 fusefs.ko
Comment 8 Alan Somers freebsd_committer freebsd_triage 2020-05-04 03:13:53 UTC
I've reproduced the issue on FreeBSD head (Thanks for the great instructions!).  Here is what I found.

Issue 1 is essentially a FUSE protocol bug.  The access mode the kernel wants is not W_OK, it's VADMIN.  But the FUSE protocol has no concept of that mode.  Instead, we ask the daemon for W_OK instead, assuming it's close enough.  It's not obvious what we should be doing instead.  Maybe we should always evaluate such requests in-kernel, as if default_permissions were set.

Issue 2 has a simple solution on MooseFS's side.  If a FUSE daemon ever returns ENOSYS to FUSE_ACCESS, then the client will never send FUSE_ACCESS again.  Instead it will assume that access will always be granted.  That would be appropriate for MooseFS if it checks permissions during FUSE_WRITE, FUSE_SETATTR, etc.  Is that an option for you?
Comment 9 MooseFS FreeBSD Team 2020-05-04 12:16:56 UTC
The answer to both questions is - please, let the filesystem that does not send "default_permissions" worry itself about stuff. Now it looks like even when the fs does not send it, you still check. Is there a reason for it?

Of course, issue 1 should be solved in both cases: with "default_permissions" and without. With it yes, I think the solution might be evaluating in kernel. But without "default_permissions" the best solution is just not to check - if the operation is not permitted, the fs will not allow it. It's what is written explicitly in FUSE documentation.

The solution with always assuming access is granted is not great either, because there are two instances where access should be called - when you cd into a directory to check if the x permission is set (X_OK) and when a user explicitly uses access system call in a program.
Comment 10 Alan Somers freebsd_committer freebsd_triage 2020-05-08 03:15:14 UTC
(In reply to MooseFS FreeBSD Team from comment #9)
You've got it backwards.  We only send FUSE_ACCESS when the file system _doesn't_ set default_permissions.  When it does, we perform all access checks in the kernel.

I can certainly fix issue #1: it looks like a one-liner.

But issue #2 is much more complicated.  The root cause of the problem is simply that FreeBSD's and Linux's VFSes work differently.  FreeBSD uses VOP_ACCESS(9) copiously, delegating to file systems most aspects of authorization checking.  But Linux performs much more of that at a higher level.  The FUSE protocol was naively designed as a serialization of Linux's in-kernel file system API, and the result is several warts like this.  From the perspective of FreeBSD's FUSE driver, I can't tell which VOP_ACCESS()s are the result of access(2) and which are the result of any other operation.  So without some major hacks up-stack, there's little I can do to reduce the frequency of FUSE_ACCESS operations.  The only sure way to cut down the traffic is to mount a file system with default_permissions, or to return ENOSYS for FUSE_ACCESS.
Comment 11 MooseFS FreeBSD Team 2020-05-08 11:00:28 UTC
It's great that you can fix issue nr 1. This IS the issue here, after all.

As for nr 2 - we were hoping that we could get rid of the accesses, but if we can't, we can't. We want MooseFS to be fast, but first of all we want it to be reliable. We can't use any of the proposed solutions. We can't use "default_permissions" because of ACLs and MooseFS specific permissions that we need to handle ourselves. And we can't return ENOSYS because we want those "real" accesses to work. So the answer has to stay "nothing can be done" for nr 2. We may consider adding the ENOSYS solution as a switchable option to the system (off by default) if we get user requests about that.
Comment 12 Alan Somers freebsd_committer freebsd_triage 2020-05-08 22:46:53 UTC
Created attachment 214294 [details]
Fix issues with FUSE_ACCESS when default_permissions is disabled

This patch fixes two issues relating to FUSE_ACCESS when the default_permissions mount option is disabled:

* VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server in the form of FUSE_ACCESS operations.  The FUSE protocol has no equivalent of VADMIN, so we must evaluate such things kernel-side, regardless of the default_permissions setting.

* The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes: for the access(2) syscall and to check directory permissions for searchability during lookup.  FreeBSD sends it much more frequently, due to differences between our VFS and Linux's, for which FUSE was designed.  But this patch does eliminate several cases not required by the FUSE protocol:
  - for any FUSE_*XATTR operation
  - when creating a new file
  - when deleting a file
  - when setting timestamps, such as by utimensat(2).

* Additionally, when default_permissions is disabled, this patch removes one FUSE_GETATTR operation when deleting a file.
Comment 13 Alan Somers freebsd_committer freebsd_triage 2020-05-08 22:48:01 UTC
Could you please run your tests again, with this patch applied?  Bonus points if you run the following command while the tests are ongoing.  If it prints anything at all, that indicates a kernel bug.

sudo dtrace -i 'fusefs:fusefs:internal:access_vadmin {stack();}'
Comment 14 Alan Somers freebsd_committer freebsd_triage 2020-05-08 23:04:50 UTC
A slightly updated patch is under review at
https://reviews.freebsd.org/D24777
Comment 15 MooseFS FreeBSD Team 2020-05-18 13:02:58 UTC
I tried to test it, but when I dry run the patch, I get an error:

Hunk #2 failed at 247.
1 out of 6 hunks failed while patching sys/fs/fuse/fuse_vnops.c

I checked it and it looks like in "my" sources function "priv_check_cred", called in line 250 of sys/fs/fuse/fuse_vnops.c has 3 parameters and in the patch it has 2 (and it's not a changed line, so they should match).

I tried both on fresh 12.1 and on 12.1 updated to the last patch level (p5), both have the same problem.

What should I do to test it properly?
Comment 16 Alan Somers freebsd_committer freebsd_triage 2020-05-18 13:13:08 UTC
(In reply to MooseFS FreeBSD Team from comment #15)

The patch is intended for head, not stable/12.  Do you have a FreeBSD head VM?  If not, I can port the patch to stable/12.
Comment 17 MooseFS FreeBSD Team 2020-05-22 09:58:03 UTC
Now we have a VM with head :)

I tested, it works. Thank you very much!

Question: how can we know which version of kernel will have this patch applied? Will it be available already in first release of 13.0? The reason I'm asking: we currently have a workaround for this problem in our code, but we would like to get rid of it sooner rather than later. We can check kernel version during startup (mounting) and not apply the workaround if kernel/system version is greater than X, we just need to know X.
Comment 18 Alan Somers freebsd_committer freebsd_triage 2020-05-22 14:25:57 UTC
(In reply to MooseFS FreeBSD Team from comment #17)
Glad it works.  I'll commit it shortly.  Regarding availability, yes it will be available in 13.0-RELEASE.  Barring complications, it will be available in 12.2-RELEASE, too.  For more fine-grained detection, here's what you should do:

* For 13.0-CURRENT, assume the fix is in place.  There is no need to support versions of head older than the latest.

* For 12-STABLE, you can look at the SVN revision number in the kern.version sysctl.  For compile-time detection, you can check __FreeBSD_version in /usr/include/sys/param.h .  I'm not going to bump that version as part of this commit, but something usually increases it once a week or so.

One more question: did you mother, Mrs. Team, name you "MooseFS FreeBSD", or is that an alias?  If so, shall I credit the alias in the commit message, or is there a different name you would like for me to use?
Comment 19 commit-hook freebsd_committer freebsd_triage 2020-05-22 18:12:12 UTC
A commit references this bug:

Author: asomers
Date: Fri May 22 18:11:20 UTC 2020
New revision: 361401
URL: https://svnweb.freebsd.org/changeset/base/361401

Log:
  Fix issues with FUSE_ACCESS when default_permissions is disabled

  This patch fixes two issues relating to FUSE_ACCESS when the
  default_permissions mount option is disabled:

  * VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
    in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
    of VADMIN, so we must evaluate such things kernel-side, regardless of the
    default_permissions setting.

  * The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
    for the access(2) syscall and to check directory permissions for
    searchability during lookup. FreeBSD sends it much more frequently, due to
    differences between our VFS and Linux's, for which FUSE was designed. But
    this patch does eliminate several cases not required by the FUSE protocol:

    * for any FUSE_*XATTR operation
    * when creating a new file
    * when deleting a file
    * when setting timestamps, such as by utimensat(2).

  * Additionally, when default_permissions is disabled, this patch removes one
    FUSE_GETATTR operation when deleting a file.

  PR:		245689
  Reported by:	MooseFS FreeBSD Team <freebsd@moosefs.pro>
  Reviewed by:	cem
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D24777

Changes:
  head/sys/fs/fuse/fuse_internal.c
  head/sys/fs/fuse/fuse_vnops.c
  head/tests/sys/fs/fusefs/access.cc
  head/tests/sys/fs/fusefs/rename.cc
  head/tests/sys/fs/fusefs/rmdir.cc
  head/tests/sys/fs/fusefs/unlink.cc
  head/tests/sys/fs/fusefs/utils.cc
  head/tests/sys/fs/fusefs/utils.hh
  head/tests/sys/fs/fusefs/xattr.cc
Comment 20 Piotr Robert Konopelko (MooseFS) 2020-05-22 18:21:22 UTC
(In reply to Alan Somers from comment #18)

Alan,
Indeed, this is an alias (mainly for maintaining MooseFS ports for FreeBSD and freebsd@moosefs.pro is set as a maintainer). It also allows a few people at our team be notified about updates. Both this issue and 245688 was raised by Agata Kruszona-Zawadzka <chogata@moosefs.pro>.

We will discuss it internally and we may limit usage of freebsd@moosefs.pro account to raising bug related to MooseFS's ports updates. Other issues could be raised by specific people and freebsd@moosefs.pro alias could be CC'd in such case.

Thanks for pointing this out,
Piotr
Comment 21 commit-hook freebsd_committer freebsd_triage 2020-06-12 20:32:56 UTC
A commit references this bug:

Author: asomers
Date: Fri Jun 12 20:32:29 UTC 2020
New revision: 362116
URL: https://svnweb.freebsd.org/changeset/base/362116

Log:
  MFC r361401:

  Fix issues with FUSE_ACCESS when default_permissions is disabled

  This patch fixes two issues relating to FUSE_ACCESS when the
  default_permissions mount option is disabled:

  * VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
    in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
    of VADMIN, so we must evaluate such things kernel-side, regardless of the
    default_permissions setting.

  * The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
    for the access(2) syscall and to check directory permissions for
    searchability during lookup. FreeBSD sends it much more frequently, due to
    differences between our VFS and Linux's, for which FUSE was designed. But
    this patch does eliminate several cases not required by the FUSE protocol:

    * for any FUSE_*XATTR operation
    * when creating a new file
    * when deleting a file
    * when setting timestamps, such as by utimensat(2).

  * Additionally, when default_permissions is disabled, this patch removes one
    FUSE_GETATTR operation when deleting a file.

  PR:		245689
  Reported by:	MooseFS FreeBSD Team <freebsd@moosefs.pro>
  Reviewed by:	cem
  Differential Revision:	https://reviews.freebsd.org/D24777

Changes:
_U  stable/12/
  stable/12/sys/fs/fuse/fuse_internal.c
  stable/12/sys/fs/fuse/fuse_vnops.c
  stable/12/tests/sys/fs/fusefs/access.cc
  stable/12/tests/sys/fs/fusefs/rename.cc
  stable/12/tests/sys/fs/fusefs/rmdir.cc
  stable/12/tests/sys/fs/fusefs/unlink.cc
  stable/12/tests/sys/fs/fusefs/utils.cc
  stable/12/tests/sys/fs/fusefs/utils.hh
  stable/12/tests/sys/fs/fusefs/xattr.cc
Comment 22 Kubilay Kocak freebsd_committer freebsd_triage 2020-07-14 05:38:43 UTC
^Triage: Track merge