Bug 233783

Summary: [fusefs] returns cached truncated data
Product: Base System Reporter: Alan Somers <asomers>
Component: kernAssignee: Alan Somers <asomers>
Status: In Progress ---    
Severity: Affects Many People CC: cem, emaste, fsu, sigsys
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236977

Description Alan Somers freebsd_committer 2018-12-04 17:50:45 UTC
fuse(4) caches data in the kernel, even when mounted with "-o direct_io".  After a truncation, it may return cached data from past the truncation point.  This data is invalid and should've been dropped during the truncation.  I've reproduced this behavior with libfuse's "passthrough" example program using both CURRENT @r340987 and 12.0-BETA4.  Steps to reproduce:

$ cd /path/to/freebsd/sources/tools/regression/fsx
$ make
$ sudo pkg install pkgconf fusefs-libs3
$ git clone git@github.com:libfuse/libfuse.git
$ cd libfuse/example
$ cc -Wall -I/usr/local/include/fuse3 -L/usr/local/lib -lfuse3 -lpthread passthrough.c -o passthrough
$ sudo kldload fuse
$ mkdir /tmp/mnt
$ sudo ./passthrough -f -sd -o allow_other  /tmp/mnt

Then, run fsx in a separate session:
$ cd /tmp/mnt/tmp
$ /usr/obj/whatever/amd64.amd64/tools/regression/fsx/fsx -WR -P /tmp -S10 fsx.bin

fsx will terminate with this output.  It's reproducible since we specified the seed:
mapped writes DISABLED
Seed set to 10
skipping zero size read
skipping zero size read
skipping zero size read
skipping zero size read
skipping zero size read
skipping zero size read
skipping zero size read
truncating to largest ever: 0x10016
READ BAD DATA: offset = 0xe1e6, size = 0xe229
OFFSET	GOOD	BAD	RANGE
0x1b8df	0x0000	0xb708	0x  b2b
operation# (mod 256) for the bad data may be 8
LOG DUMP (11 total operations):
1(1 mod 256): SKIPPED (no operation)
2(2 mod 256): SKIPPED (no operation)
3(3 mod 256): SKIPPED (no operation)
4(4 mod 256): SKIPPED (no operation)
5(5 mod 256): SKIPPED (no operation)
6(6 mod 256): SKIPPED (no operation)
7(7 mod 256): SKIPPED (no operation)
8(8 mod 256): WRITE	0x1b8df thru 0x21ac6	(0x61e8 bytes) HOLE	***WWWW
9(9 mod 256): TRUNCATE DOWN	from 0x21ac7 to 0x10016	******WWWW
10(10 mod 256): WRITE	0x3e90a thru 0x3ffff	(0x16f6 bytes) HOLE	***WWWW
11(11 mod 256): READ	0xe1e6 thru 0x1c40e	(0xe229 bytes)	***RRRR***
Correct content saved for comparison
(maybe hexdump "fsx.bin" vs "fsx.bin.fsxgood")

By comparing FSX's output to passthrough's, we can deduce where fuse.ko serviced a read from its cache:

fsx's log    			passthrough's log
---------    			-----------------
write	0x1b8df	0x61e8		write	112863	18209
				write	131072	6855
				trunc	137927
trunc	0x10016			trunc	65558
write	0x3d90a	0x16f6		read	196608	65536
				write	256266	5878
				trunc	262144
read	0xe1e6	0x1c40e		read	0	65536 (services 57830 - 65536)
				<read from cache>     (services 65536 - 173556)

The miscompare happened at address 0x1b8df, which was in the part of the read serviced from the cache, and it's the first part of that cached read that ever contained non-zero data in the lifetime of this test.
Comment 1 Conrad Meyer freebsd_committer 2018-12-04 18:23:06 UTC
Quite possibly related to bug 230258.
Comment 2 Conrad Meyer freebsd_committer 2018-12-04 18:24:37 UTC
Everything about our fuse driver is kinda crappy.  The attribute caching isn't correct, the data caching isn't correct, yada yada.  I'm not sure if we're supposed to be caching data at all or if it's just a (broken) performance optimization.
Comment 3 Alan Somers freebsd_committer 2019-04-02 19:35:45 UTC
This bug is still reproducible on 13, though it now requires "sysctl vfs.fusefs.data_cache_mode=2".  Presumably that's due to cem's change r344186.
Comment 4 commit-hook freebsd_committer 2019-04-03 02:30:15 UTC
A commit references this bug:

Author: asomers
Date: Wed Apr  3 02:29:56 UTC 2019
New revision: 345823
URL: https://svnweb.freebsd.org/changeset/base/345823

Log:
  fusefs: during ftruncate, discard cached data past truncation point

  During truncate, fusefs was discarding entire cached blocks, but it wasn't
  zeroing out the unused portion of a final partial block.  This resulted in
  reads returning stale data.

  PR:		233783
  Reported by:	fsx
  Sponsored by:	The FreeBSD Foundation

Changes:
  projects/fuse2/sys/fs/fuse/fuse_node.c
  projects/fuse2/tests/sys/fs/fusefs/setattr.cc
Comment 5 Alan Somers freebsd_committer 2019-04-19 22:32:50 UTC
This is complete on the fuse2 branch.
Comment 6 commit-hook freebsd_committer 2019-04-23 22:26:33 UTC
A commit references this bug:

Author: asomers
Date: Tue Apr 23 22:25:50 UTC 2019
New revision: 346607
URL: https://svnweb.freebsd.org/changeset/base/346607

Log:
  fusefs: use vfs_bio_clrbuf in fuse_vnode_setsize

  Reuse fuse_vnode_setsize instead of reinventing the wheel.  This is what
  ext2_ind_truncate does.

  PR:		233783
  Sponsored by:	The FreeBSD Foundation

Changes:
  projects/fuse2/sys/fs/fuse/fuse_node.c