Bug 235774 - [FUSEFS]: Need to evict invalidated cache contents on fuse_write_directbackend()
Summary: [FUSEFS]: Need to evict invalidated cache contents on fuse_write_directbackend()
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Alan Somers
URL:
Keywords:
Depends on: 230258
Blocks: 235773
  Show dependency treegraph
 
Reported: 2019-02-16 00:39 UTC by Conrad Meyer
Modified: 2019-04-26 19:48 UTC (History)
8 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Conrad Meyer freebsd_committer 2019-02-16 00:39:58 UTC
We're allowed to cache clean data, but overwrites should invalidate or replace cached ranges.

+++ This bug was initially created as a clone of Bug #230258 +++
Comment 1 Conrad Meyer freebsd_committer 2019-02-16 00:40:29 UTC
(Again, I'll take if no one else beats me to it.)
Comment 2 Ben RUBSON 2019-03-06 12:18:22 UTC
As stated in bug 230258, there are 2 workarounds to this :
- use FUSE direct_io mount option
- sysctl vfs.fuse.data_cache_mode = 0
Comment 3 Conrad Meyer freebsd_committer 2019-03-06 16:27:06 UTC
(In reply to Ben RUBSON from comment #2)
> As stated in bug 230258, there are 2 workarounds to this :
> - use FUSE direct_io mount option
> - sysctl vfs.fuse.data_cache_mode = 0

Makes sense to me — both have the effect of preventing data caching :-).  There will be a performance impact depending on your workload.
Comment 4 Conrad Meyer freebsd_committer 2019-03-06 17:18:33 UTC
I think fuse's IO_DIRECT path is a mess.  Really all IO should go through the buffer cache, and B_DIRECT and ~B_CACHE are just flags that control the buffer's lifetime once the operation is complete.  Removing the "direct" backends entirely (except as implementation details of strategy()) would simplify and correct the caching logic.

Looking at UFS; it really only has a non-bufcache "rawread" path that uses pbufs (and flushes all dirty bufs on the vnode first!).  There is no equivalent for O_DIRECT writes.  And ffs_rawread basically duplicates the ordinary read path for extremely limited cases (single iov, must be sector sized/aligned, etc) — it's unclear to me why it exists.

ffs_write() just uses the ordinary buf cache, paying attention to ioflag & IO_DIRECT and using vfs_bio_set_flags(, ioflag) to propagate it to b_flags & B_DIRECT.  (B_DIRECT causes the buffer to be released immediately when it is freed, instead of being cached.)

I think we should probably learn from UFS for FUSE's IO modes:

1. Keep and enable the direct_io option, for users who truly want to bypass the buf cache entirely.  Preferably this is a per-mountpoint option rather than a global, but that is an orthogonal enhancement.  Confusingly, this is distinct from opening a file O_DIRECT.  Maybe the sysctl/option can be renamed.  "raw io?"

2. Do not actually use the "direct" paths in FUSE outside of global direct_io mode (or a future MP-specific always-direct mode).

3. A caveat here is: FUSE filesystems (?)don't have a native sector/block size, but the buf cache is in block units.  And, we translate O_WRONLY opens into FUSE FUFH_WRONLY opens.  So there will be some trickiness in partial block writes with a O_WRONLY handle when the block is not in cache.  Today that is sidestepped by invoking direct mode, but shouldn't be.

Anyway, this is all future cleanup ideas for this area.  For the more limited scope of fixing just this PR, we can probably draw inspiration from ffs_rawread_sync().
Comment 5 commit-hook freebsd_committer 2019-04-12 19:05:31 UTC
A commit references this bug:

Author: asomers
Date: Fri Apr 12 19:05:08 UTC 2019
New revision: 346162
URL: https://svnweb.freebsd.org/changeset/base/346162

Log:
  fusefs: evict invalidated cache contents during write-through

  fusefs's default cache mode is "writethrough", although it currently works
  more like "write-around"; writes bypass the cache completely.  Since writes
  bypass the cache, they were leaving stale previously-read data in the cache.
  This commit invalidates that stale data.  It also adds a new global
  v_inval_buf_range method, like vtruncbuf but for a range of a file.

  PR:		235774
  Reported by:	cem
  Sponsored by:	The FreeBSD Foundation

Changes:
  projects/fuse2/sys/fs/fuse/fuse_io.c
  projects/fuse2/sys/kern/vfs_subr.c
  projects/fuse2/sys/sys/vnode.h
  projects/fuse2/tests/sys/fs/fusefs/write.cc
Comment 6 Conrad Meyer freebsd_committer 2019-04-12 19:11:04 UTC
Awesome, thanks Alan!
Comment 7 Ben RUBSON 2019-04-14 11:27:16 UTC
Perfect, many thanks Alan !
Upcoming fusefs version really is promising :)
Comment 8 Alan Somers freebsd_committer 2019-04-19 22:34:05 UTC
This is complete on the fuse2 branch.
Comment 9 commit-hook freebsd_committer 2019-04-26 17:10:02 UTC
A commit references this bug:

Author: asomers
Date: Fri Apr 26 17:09:27 UTC 2019
New revision: 346756
URL: https://svnweb.freebsd.org/changeset/base/346756

Log:
  fusefs: fix cache invalidation error from r346162

  An off-by-one error led to the last page of a write not being removed from
  its object, even though that page's buffer was marked as invalid.

  PR:		235774
  Sponsored by:	The FreeBSD Foundation

Changes:
  projects/fuse2/sys/kern/vfs_subr.c
  projects/fuse2/tests/sys/fs/fusefs/write.cc
Comment 10 commit-hook freebsd_committer 2019-04-26 19:48:20 UTC
A commit references this bug:

Author: asomers
Date: Fri Apr 26 19:47:43 UTC 2019
New revision: 346763
URL: https://svnweb.freebsd.org/changeset/base/346763

Log:
  fusefs: fix a deadlock in VOP_PUTPAGES

  As of r346162 fuse now invalidates the cache during writes.  But it can't do
  that when writing from VOP_PUTPAGES, because the write is coming _from_ the
  cache.  Trying to invalidate the cache in that situation causes a deadlock
  in vm_object_page_remove, because the pages in question have already been
  busied by the same thread.

  PR:		235774
  Sponsored by:	The FreeBSD Foundation

Changes:
  projects/fuse2/sys/fs/fuse/fuse_io.c
  projects/fuse2/sys/fs/fuse/fuse_io.h
  projects/fuse2/sys/fs/fuse/fuse_vnops.c