Bug 236977

Summary: [msdosfs] returns cached truncated data
Product: Base System Reporter: Alan Somers <asomers>
Component: kernAssignee: Konstantin Belousov <kib>
Status: Closed FIXED    
Severity: Affects Many People CC: emaste, kib, pho
Priority: --- Keywords: needs-patch, needs-qa
Version: CURRENTFlags: koobs: mfc-stable12?
koobs: mfc-stable11?
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233783
Description Flags
msdosfs: zero tail of the last block on truncation for VREG vnodes as well. none

Description Alan Somers freebsd_committer 2019-04-03 02:38:12 UTC
msdosfs does not correctly discard data when a file's size is reduced during truncate.  The pattern is this:
1) write some data
2) truncate down.  cached data past truncation point should be discarded.
3) truncate back up.
4) read the area between the two truncation points.  It will return the data written in step 1.

This likely has the same root cause as bug 233783, which affected fusefs.  The solution is likely something similar to r345823.

Steps to Reproduce
$ cd /path/to/freebsd/sources/tools/regression/fsx
$ make
$ sudo newfs_msdos /dev/vtbd2
$ sudo mount -t msdosfs /dev/vtbd2 /mnt
$ cd /mnt
$ sudo /usr/obj/whatever/amd64.amd64/tools/regression/fsx/fsx -WR -P /tmp -S18 -n fsx.bin
mapped writes DISABLED
Seed set to 18
truncating to largest ever: 0xb023
truncating to largest ever: 0x192cd
truncating to largest ever: 0x3b023
truncating to largest ever: 0x3f621
READ BAD DATA: offset = 0x7d2a, size = 0x9f4
0x 7d2a	0x0000	0x0745	0x  2d6
operation# (mod 256) for the bad data may be 7
LOG DUMP (23 total operations):
1(1 mod 256): TRUNCATE UP	from 0x0 to 0xb023	******WWWW
2(2 mod 256): TRUNCATE UP	from 0xb023 to 0x192cd
3(3 mod 256): WRITE	0x21da9 thru 0x2efe8	(0xd240 bytes) HOLE
4(4 mod 256): READ	0xa81a thru 0x15033	(0xa81a bytes)
5(5 mod 256): WRITE	0xefe2 thru 0x18cf0	(0x9d0f bytes)
6(6 mod 256): READ	0xc53e thru 0x10ac7	(0x458a bytes)
7(7 mod 256): WRITE	0x228f thru 0xdb4a	(0xb8bc bytes)	***WWWW
8(8 mod 256): WRITE	0x1f676 thru 0x23936	(0x42c1 bytes)
9(9 mod 256): TRUNCATE DOWN	from 0x2efe9 to 0x5bb5	******WWWW
10(10 mod 256): TRUNCATE UP	from 0x5bb5 to 0x3b023	******WWWW
11(11 mod 256): READ	0xd500 thru 0x1443e	(0x6f3f bytes)
12(12 mod 256): TRUNCATE DOWN	from 0x3b023 to 0x32522
13(13 mod 256): WRITE	0x2d8f4 thru 0x32039	(0x4746 bytes)
14(14 mod 256): TRUNCATE DOWN	from 0x32522 to 0x1f921
15(15 mod 256): READ	0xa52a thru 0x1801e	(0xdaf5 bytes)
16(16 mod 256): WRITE	0x37534 thru 0x3be24	(0x48f1 bytes) HOLE
17(17 mod 256): READ	0x9724 thru 0xed9e	(0x567b bytes)
18(18 mod 256): TRUNCATE UP	from 0x3be25 to 0x3f621
19(19 mod 256): READ	0xe5e3 thru 0x15a5c	(0x747a bytes)
20(20 mod 256): TRUNCATE DOWN	from 0x3f621 to 0x23d10
21(21 mod 256): READ	0x15ea7 thru 0x16f2c	(0x1086 bytes)
22(22 mod 256): TRUNCATE DOWN	from 0x23d10 to 0x1a382
23(23 mod 256): READ	0x7d2a thru 0x871d	(0x9f4 bytes)	***RRRR***
Correct content saved for comparison
(maybe hexdump "fsx.bin" vs "fsx.bin.fsxgood")
Comment 1 Konstantin Belousov freebsd_committer 2019-04-03 12:37:55 UTC
(In reply to Alan Somers from comment #0)
This is somewhat strange.  What is the block size on your testing file system ?

Call sequence is msdosfs_setattr()->detrunc()->vtruncbuf()->vnode_pager_setsize(), and the last call zeroes the invalid portion of the partially valid last page.  I suspect that in your test case block size is greater than the page size, and lingering pages from the last block after the new EOF are not cleared.

If I am right, then the attached patch might fix it to you:
Comment 2 Konstantin Belousov freebsd_committer 2019-04-03 12:38:56 UTC
Created attachment 203344 [details]
msdosfs: zero tail of the last block on truncation for VREG vnodes as well.
Comment 3 Alan Somers freebsd_committer 2019-04-03 13:48:06 UTC
I'm creating my filesystem with the default settings on a 1GB device; I'm not sure what block size is the default.  Your patch fixes the problem.  Does it suggest a bug in vnode_pager_setsize?
Comment 4 Konstantin Belousov freebsd_committer 2019-04-03 14:04:56 UTC
(In reply to Alan Somers from comment #3)
newfs should report its type (fat32 most likely) and block size.

No, vnode_pager_setsize() operates as intended, the pages after EOF are wired by the buffer and not released when we invalidate buffers that we can.

I will commit the patch after Peter does some sanity checking for it, hopefully.
Comment 5 Peter Holm freebsd_committer 2019-04-03 14:07:43 UTC
(In reply to Konstantin Belousov from comment #4)

Sure, I'll test your patch.
Comment 6 Peter Holm freebsd_committer 2019-04-03 16:43:47 UTC
I reproduced the problem and verified that the patch fixed this.
I ran the few msdosfs tests I have on both i386 and amd64.
No problems seen.
Comment 7 commit-hook freebsd_committer 2019-04-03 17:03:00 UTC
A commit references this bug:

Author: kib
Date: Wed Apr  3 17:02:19 UTC 2019
New revision: 345847
URL: https://svnweb.freebsd.org/changeset/base/345847

  msdosfs: zero tail of the last block on truncation for VREG vnodes as well.

  Despite the call to vtruncbuf() from detrunc(), which results in
  zeroing part of the partial page after EOF, there still is a
  possibility to retain the stale data which is revived on file
  enlargement.  If the filesystem block size is greater than the page
  size, partial block might keep other after-EOF pages wired and they
  get reused then.  Fix it by zeroing whole part of the partial buffer
  after EOF, not relying on vnode_pager_setsize().

  PR:	236977
  Reported by:	asomers
  Tested by:	pho
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week

Comment 8 commit-hook freebsd_committer 2019-04-09 07:36:58 UTC
A commit references this bug:

Author: pho
Date: Tue Apr  9 07:36:40 UTC 2019
New revision: 346047
URL: https://svnweb.freebsd.org/changeset/base/346047

  Added a msdosfs regression test.

  PR:		236977
  Sponsored by:	Dell EMC Isilon

Comment 9 Alan Somers freebsd_committer 2020-09-19 18:41:16 UTC
Already MFCed by r346074.  I don't know why the bugzilla didn't notice.