Created attachment 228584 [details] Simple file ops program In connection with bug #256937, bug #256936 and reports from MooseFS users (that complain, that with large source trees "git clone" operation often fails on MooseFS, when client/mount is on FreeBSD) we developed a program, that shows a basic problem with FreeBSD kernel. It's a simple program, that creates 2 processes. The main process continuously performs 3 operations on a file, at random: read, write or ftruncate. The child process performs stat operation on the same file. If the program is run on a local filesystem on FreeBSD, it works. If it is run on any fusefs (SSHfs, MooseFS) or even on NFS, at some point (usually under 1 minute) the read operation will return wrong number of bytes. No "networking" necessary - all modifications and reads from file happen on one mountpoint. The same program will run on Linux on any kind of fs (local, nfs, fusefs) with no problems. I'm attaching the program. The steps to repeat are simple: install FreeBSD (13 or 14, but with patch from bug #256937, otherwise you will experience kernel panics), compile the program and run it on selected filesystems, wait for results.
@Alan please could you have a look at this one too? Many thanks
Rick, have you every seen a bug like this? To summarize: * One thread alternately truncates, extends, and reads a file * Another thread stats the same file After a few seconds, one of the read operations returns more data than the file ought to contain. It usually rounds up to a 64kB boundary, but not always. I can reproduce it with the NFS client, but the OP says it happens in fuse, too.
Well, I think I've figured out what the breakage is for the NFS client. I'll leave it to asomers@ to decide if fuse suffers from the same problem. To be honest, I think this bug has existed in the BSD NFS client *forever* (as in, since I first wrote NFS client code for 4.3BSD in 1986, believe it or not;-). Here's what seems to happen: Child Parent - does stat() - does VOP_LOOKUP(), which does the Lookup RPC with the directory vnode locked - does ftruncate(), acquiring --> acquires file handle and exclusive vnode lock on file vnode attributes, including Size, valid at this point in time - blocks waiting for locked file vnode - does VOP_SETATTR() of Size, changing the file's size - releases the file vnode - acquires file vnode and fills in now stale attributes including the old wrong Size - does a read() which returns wrong data size I am working on a patch, which I plan to discuss with asomers@ off list. I will put a patch here as an attachment if/when I have one.
Nice sleuthing, Rick! It sounds simple when you put it that way.
Created attachment 228935 [details] mark local file modification and avoid loading Lookup attributes when marked This patch seems to fix the problem for NFS. It defines a new flag called NMODLOCALLY for the NFS vnode, which is set when a file is Written/Setattr-of-Size. Then Lookup throws away the attributes in the RPC reply when the flag is set, since they may be stale and no longer correct. This patch needs more regression testing and determination if the RPC counts increase significantly. A possible alternative is to use a timestamp of when Written/Setattr-of-Size happens, as suggested by asomers@.
Created attachment 228973 [details] timestamp file modifications so Lookup can discard stale attributes Patch (attachment 228935 [details]) resulted in a sifnificant increase in RPC counts during regression testing. This patch uses timestamps as suggested by asomers@. The NFS client VOP_SETATTR-of-Size/VOP_ALLOCATE/VOP_DEALLOCATE save the time when the RPC has been completed in the NFS vnode. Then Lookup takes the time when the RPC is started and compares that time with the saved timestamp in the NFS node to decide if the file attributes in the Lookup reply might be stale and need to be discarded. This patch does not result in an increase of RPC counts and passes the test program (attachment #228584 [details]). It does not handle any possible race between Lookup and Write RPCs, since many of the Write RPCs happen after VOP_WRITE() returns and the vnode is unlocked. I am working on a separate patch for this case.
Created attachment 229053 [details] timestamp file mods so Lookup can discard stale attributes against Writing Similar to attachment #228973 [details], I believe there could be a race between the NFS client VOP_LOOKUP() and file Writing that could result in stale file attributes being loaded into the NFS vnode by VOP_LOOKUP(). I have not been able to reproduce a failure due to this race, but I believe that there are two possibilities: The Lookup RPC happens while VOP_WRITE() is being executed and loads stale file attributes after VOP_WRITE() returns when it has already completed the Write/Commit RPC(s). --> For this case, setting the local modify timestamp at the end of VOP_WRITE() should ensure that stale file attributes are not loaded. The Lookup RPC occurs after VOP_WRITE() has returned, while asynchronous Write/Commit RPCs are in progress and then is blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP() then acquires the NFS vnode lock and fills in stale file attributes. --> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf() when they clear NMODIFIED should ensure that stale file attributes are not loaded. This patch does the above.
Just fyi, I just ran the demo/test program on a stable/12 kernel and it did not fail. - The bug fixed by 2nd attachment has been in the NFS client code for decades. --> I suspect a post-stable/12 change to the name cache exposed the failure. (If the Lookups hit the name cache, there won't be Lookup RPCs to race against the Setattr-of-size.) The fix probably should be MFC'd to stable/12, but won't make it in 12.3.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=2be417843a04f25e631e99d5188eb2652b13d80d commit 2be417843a04f25e631e99d5188eb2652b13d80d Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-30 23:35:02 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-10-30 23:35:02 +0000 PR#259071 provides a test program that fails for the NFS client. Testing with it, there appears to be a race between Lookup and VOPs like Setattr-of-size, where Lookup ends up loading stale attributes (including what might be the wrong file size) into the NFS vnode's attribute cache. The race occurs when the modifying VOP (which holds a lock on the vnode), blocks the acquisition of the vnode in Lookup, after the RPC (with now potentially stale attributes). Here's what seems to happen: Child Parent does stat(), which does VOP_LOOKUP(), doing the Lookup RPC with the directory vnode locked, acquiring file attributes valid at this point in time blocks waiting for locked file does ftruncate(), which vnode does VOP_SETATTR() of Size, changing the file's size while holding an exclusive lock on the file's vnode releases the vnode lock acquires file vnode and fills in now stale attributes including the old wrong Size does a read() which returns wrong data size This patch fixes the problem by saving a timestamp in the NFS vnode in the VOPs that modify the file (Setattr-of-size, Allocate). Then lookup/readdirplus compares that timestamp with the time just before starting the RPC after it has acquired the file's vnode. If the modifying RPC occurred during the Lookup, the attributes in the RPC reply are discarded, since they might be stale. With this patch the test program works as expected. Note that the test program does not fail on a July stable/12, although this race is in the NFS client code. I suspect a fairly recent change to the name caching code exposed this bug. PR: 259071 Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D32635 sys/fs/nfsclient/nfs_clrpcops.c | 29 +++++++++++++-- sys/fs/nfsclient/nfs_clvnops.c | 78 +++++++++++++++++++++++++++++++++++++---- sys/fs/nfsclient/nfsnode.h | 1 + 3 files changed, 99 insertions(+), 9 deletions(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=ab87c39c257e7130677867f8e5c11a3ec53fa1bc commit ab87c39c257e7130677867f8e5c11a3ec53fa1bc Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-30 23:46:14 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-10-30 23:46:14 +0000 nfscl: Set n_localmodtime in Deallocate Commit 2be417843a04 added n_localmodtime, which is used by Lookup and ReaddirPlus to check to see if the file attributes in an RPC reply might be stale. This patch sets n_localmodtime in Deallocate. Done as a separate commit, since Deallocate is not in stable/13. PR: 259071 Reviewed by: asomers Differential Revision: https://reviews.freebsd.org/D32635 sys/fs/nfsclient/nfs_clvnops.c | 5 +++++ 1 file changed, 5 insertions(+)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=50dcff0816e5e4aa94f51ce27da5121e49902996 commit 50dcff0816e5e4aa94f51ce27da5121e49902996 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-31 00:08:28 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-10-31 00:08:28 +0000 nfscl: Add setting n_localmodtime to the Write RPC code Similar to commit 2be417843a, I believe there could be a race between the NFS client VOP_LOOKUP() and file Writing that could result in stale file attributes being loaded into the NFS vnode by VOP_LOOKUP(). I have not been able to reproduce a failure due to this race, but I believe that there are two possibilities: The Lookup RPC happens while VOP_WRITE() is being executed and loads stale file attributes after VOP_WRITE() returns when it has already completed the Write/Commit RPC(s). --> For this case, setting the local modify timestamp at the end of VOP_WRITE() should ensure that stale file attributes are not loaded. The Lookup RPC occurs after VOP_WRITE() has returned, while asynchronous Write/Commit RPCs are in progress and then is blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP() then acquires the NFS vnode lock and fills in stale file attributes. --> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf() when they clear NMODIFIED should ensure that stale file attributes are not loaded. This patch does the above. PR: 259071 Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D32677 sys/fs/nfsclient/nfs_clbio.c | 18 +++++++++++++++--- sys/fs/nfsclient/nfs_clvnops.c | 7 +++++++ 2 files changed, 22 insertions(+), 3 deletions(-)
@Alan Any luck with the Fuse side of this?
I haven't tested it on fuse yet.
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=d4c5187bb17925578532c98da55310250c4715d5 commit d4c5187bb17925578532c98da55310250c4715d5 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-31 00:08:28 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-11-14 01:21:50 +0000 nfscl: Add setting n_localmodtime to the Write RPC code Similar to commit 2be417843a, I believe there could be a race between the NFS client VOP_LOOKUP() and file Writing that could result in stale file attributes being loaded into the NFS vnode by VOP_LOOKUP(). I have not been able to reproduce a failure due to this race, but I believe that there are two possibilities: The Lookup RPC happens while VOP_WRITE() is being executed and loads stale file attributes after VOP_WRITE() returns when it has already completed the Write/Commit RPC(s). --> For this case, setting the local modify timestamp at the end of VOP_WRITE() should ensure that stale file attributes are not loaded. The Lookup RPC occurs after VOP_WRITE() has returned, while asynchronous Write/Commit RPCs are in progress and then is blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP() then acquires the NFS vnode lock and fills in stale file attributes. --> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf() when they clear NMODIFIED should ensure that stale file attributes are not loaded. This patch does the above. PR: 259071 (cherry picked from commit 50dcff0816e5e4aa94f51ce27da5121e49902996) sys/fs/nfsclient/nfs_clbio.c | 18 +++++++++++++++--- sys/fs/nfsclient/nfs_clvnops.c | 7 +++++++ 2 files changed, 22 insertions(+), 3 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=00ec7c6ab526e0a8b10f57c578f423d647c4d93b commit 00ec7c6ab526e0a8b10f57c578f423d647c4d93b Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-30 23:35:02 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-11-14 02:52:26 +0000 nfscl: Fix race between Lookup and Setattr of size PR#259071 provides a test program that fails for the NFS client. Testing with it, there appears to be a race between Lookup and VOPs like Setattr-of-size, where Lookup ends up loading stale attributes (including what might be the wrong file size) into the NFS vnode's attribute cache. The race occurs when the modifying VOP (which holds a lock on the vnode), blocks the acquisition of the vnode in Lookup, after the RPC (with now potentially stale attributes). Here's what seems to happen: Child Parent does stat(), which does VOP_LOOKUP(), doing the Lookup RPC with the directory vnode locked, acquiring file attributes valid at this point in time blocks waiting for locked file does ftruncate(), which vnode does VOP_SETATTR() of Size, changing the file's size while holding an exclusive lock on the file's vnode releases the vnode lock acquires file vnode and fills in now stale attributes including the old wrong Size does a read() which returns wrong data size This patch fixes the problem by saving a timestamp in the NFS vnode in the VOPs that modify the file (Setattr-of-size, Allocate). Then lookup/readdirplus compares that timestamp with the time just before starting the RPC after it has acquired the file's vnode. If the modifying RPC occurred during the Lookup, the attributes in the RPC reply are discarded, since they might be stale. With this patch the test program works as expected. Note that the test program does not fail on a July stable/12, although this race is in the NFS client code. I suspect a fairly recent change to the name caching code exposed this bug. PR: 259071 (cherry picked from commit 2be417843a04f25e631e99d5188eb2652b13d80d) sys/fs/nfsclient/nfs_clrpcops.c | 29 +++++++++++++-- sys/fs/nfsclient/nfs_clvnops.c | 78 +++++++++++++++++++++++++++++++++++++---- sys/fs/nfsclient/nfsnode.h | 1 + 3 files changed, 99 insertions(+), 9 deletions(-)
Update: I just reproduced the problem with sshfs, and the cause looks to be exactly the same as on NFS. It's a race between fuse_vnop_lookup and fuse_vnop_setattr. fuse_vnop_getattr, fuse_vnop_getpages, and fuse_vnop_bmap aren't vulnerable, because they have the vnode locked for the entire time. fuse_vfsop_vget, used by the NFS server, does look vulnerable.
Righto, Since I have not been able to reproduce the problem on stable/12 (although the race is obviously in the code), I suspect that a recent change to name caching exposed it. Looking at the name cache for this test is now at the top of my "to do" list, so I'll post if/when I know more. I am MFC'ng the fixes to stable/12, but won't bother with trying to get them into 12.3, since the problem is not exposed for it.
Just fyi, I have now reproduced the problem on FreeBSD 12.3. For some reason (timing?), it did not recur easily for NFSv4, but does for NFSv3. Then I left the NFSv4 test running for a long time and it eventually failed. The behaviour for name caching seems to be the same for 12.3 as newer 13/14 kernels. I doubt the patches can make it in the 12.3 release, but I will get them into stable/12.
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=ca826694e3b08133b9e16744e1178863cfc4c16b commit ca826694e3b08133b9e16744e1178863cfc4c16b Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-30 23:35:02 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-11-17 23:07:11 +0000 nfscl: Fix a race between Setattr of size and Lookup PR#259071 provides a test program that fails for the NFS client. Testing with it, there appears to be a race between Lookup and VOPs like Setattr-of-size, where Lookup ends up loading stale attributes (including what might be the wrong file size) into the NFS vnode's attribute cache. The race occurs when the modifying VOP (which holds a lock on the vnode), blocks the acquisition of the vnode in Lookup, after the RPC (with now potentially stale attributes). Here's what seems to happen: Child Parent does stat(), which does VOP_LOOKUP(), doing the Lookup RPC with the directory vnode locked, acquiring file attributes valid at this point in time blocks waiting for locked file does ftruncate(), which vnode does VOP_SETATTR() of Size, changing the file's size while holding an exclusive lock on the file's vnode releases the vnode lock acquires file vnode and fills in now stale attributes including the old wrong Size does a read() which returns wrong data size This patch fixes the problem by saving a timestamp in the NFS vnode in the VOPs that modify the file (Setattr-of-size, Allocate). Then lookup/readdirplus compares that timestamp with the time just before starting the RPC after it has acquired the file's vnode. If the modifying RPC occurred during the Lookup, the attributes in the RPC reply are discarded, since they might be stale. With this patch the test program works as expected. Note that the test program does not fail on a July stable/12, although this race is in the NFS client code. I suspect a fairly recent change to the name caching code exposed this bug. PR: 259071 (cherry picked from commit 2be417843a04f25e631e99d5188eb2652b13d80d) sys/fs/nfsclient/nfs_clrpcops.c | 29 ++++++++++++++-- sys/fs/nfsclient/nfs_clvnops.c | 73 +++++++++++++++++++++++++++++++++++++---- sys/fs/nfsclient/nfsnode.h | 1 + 3 files changed, 93 insertions(+), 10 deletions(-)
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=d472d0358e8fe27c1b1050564cdbd9082bd06a05 commit d472d0358e8fe27c1b1050564cdbd9082bd06a05 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-10-31 00:08:28 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-11-18 00:19:50 +0000 nfscl: Add setting n_localmodtime to the Write RPC code Similar to commit 2be417843a, I believe there could be a race between the NFS client VOP_LOOKUP() and file Writing that could result in stale file attributes being loaded into the NFS vnode by VOP_LOOKUP(). I have not been able to reproduce a failure due to this race, but I believe that there are two possibilities: The Lookup RPC happens while VOP_WRITE() is being executed and loads stale file attributes after VOP_WRITE() returns when it has already completed the Write/Commit RPC(s). --> For this case, setting the local modify timestamp at the end of VOP_WRITE() should ensure that stale file attributes are not loaded. The Lookup RPC occurs after VOP_WRITE() has returned, while asynchronous Write/Commit RPCs are in progress and then is blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP() then acquires the NFS vnode lock and fills in stale file attributes. --> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf() when they clear NMODIFIED should ensure that stale file attributes are not loaded. This patch does the above. PR: 259071 (cherry picked from commit 50dcff0816e5e4aa94f51ce27da5121e49902996) sys/fs/nfsclient/nfs_clbio.c | 18 +++++++++++++++--- sys/fs/nfsclient/nfs_clvnops.c | 7 +++++++ 2 files changed, 22 insertions(+), 3 deletions(-)
A fix for fusefs is under review here: https://reviews.freebsd.org/D33158
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=13d593a5b060cf7be40acfa2ca9dc9e0e2339a31 commit 13d593a5b060cf7be40acfa2ca9dc9e0e2339a31 Author: Alan Somers <asomers@FreeBSD.org> AuthorDate: 2021-11-29 02:17:34 +0000 Commit: Alan Somers <asomers@FreeBSD.org> CommitDate: 2022-01-01 00:38:42 +0000 Fix a race in fusefs that can corrupt a file's size. VOPs like VOP_SETATTR can change a file's size, with the vnode exclusively locked. But VOPs like VOP_LOOKUP look up the file size from the server without the vnode locked. So a race is possible. For example: 1) One thread calls VOP_SETATTR to truncate a file. It locks the vnode and sends FUSE_SETATTR to the server. 2) A second thread calls VOP_LOOKUP and fetches the file's attributes from the server. Then it blocks trying to acquire the vnode lock. 3) FUSE_SETATTR returns and the first thread releases the vnode lock. 4) The second thread acquires the vnode lock and caches the file's attributes, which are now out-of-date. Fix this race by recording a timestamp in the vnode of the last time that its filesize was modified. Check that timestamp during VOP_LOOKUP and VFS_VGET. If it's newer than the time at which FUSE_LOOKUP was issued to the server, ignore the attributes returned by FUSE_LOOKUP. PR: 259071 Reported by: Agata <chogata@moosefs.pro> Reviewed by: pfg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D33158 sys/fs/fuse/fuse_internal.c | 5 + sys/fs/fuse/fuse_io.c | 5 +- sys/fs/fuse/fuse_node.c | 5 +- sys/fs/fuse/fuse_node.h | 6 + sys/fs/fuse/fuse_vfsops.c | 13 +- sys/fs/fuse/fuse_vnops.c | 21 +- tests/sys/fs/fusefs/Makefile | 1 + tests/sys/fs/fusefs/last_local_modify.cc (new) | 469 +++++++++++++++++++++++++ 8 files changed, 516 insertions(+), 9 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=36ba360558a752dfa3832ec19bb182e5cf1876d7 commit 36ba360558a752dfa3832ec19bb182e5cf1876d7 Author: Alan Somers <asomers@FreeBSD.org> AuthorDate: 2021-11-29 02:17:34 +0000 Commit: Alan Somers <asomers@FreeBSD.org> CommitDate: 2022-01-18 00:59:03 +0000 Fix a race in fusefs that can corrupt a file's size. VOPs like VOP_SETATTR can change a file's size, with the vnode exclusively locked. But VOPs like VOP_LOOKUP look up the file size from the server without the vnode locked. So a race is possible. For example: 1) One thread calls VOP_SETATTR to truncate a file. It locks the vnode and sends FUSE_SETATTR to the server. 2) A second thread calls VOP_LOOKUP and fetches the file's attributes from the server. Then it blocks trying to acquire the vnode lock. 3) FUSE_SETATTR returns and the first thread releases the vnode lock. 4) The second thread acquires the vnode lock and caches the file's attributes, which are now out-of-date. Fix this race by recording a timestamp in the vnode of the last time that its filesize was modified. Check that timestamp during VOP_LOOKUP and VFS_VGET. If it's newer than the time at which FUSE_LOOKUP was issued to the server, ignore the attributes returned by FUSE_LOOKUP. PR: 259071 Reported by: Agata <chogata@moosefs.pro> Reviewed by: pfg Differential Revision: https://reviews.freebsd.org/D33158 (cherry picked from commit 13d593a5b060cf7be40acfa2ca9dc9e0e2339a31) sys/fs/fuse/fuse_internal.c | 5 + sys/fs/fuse/fuse_io.c | 5 +- sys/fs/fuse/fuse_node.c | 5 +- sys/fs/fuse/fuse_node.h | 6 + sys/fs/fuse/fuse_vfsops.c | 13 +- sys/fs/fuse/fuse_vnops.c | 21 +- tests/sys/fs/fusefs/Makefile | 1 + tests/sys/fs/fusefs/last_local_modify.cc (new) | 469 +++++++++++++++++++++++++ 8 files changed, 516 insertions(+), 9 deletions(-)
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=c85846ea3ea531affb80edb2c982017d35b5a40f commit c85846ea3ea531affb80edb2c982017d35b5a40f Author: Alan Somers <asomers@FreeBSD.org> AuthorDate: 2021-11-29 02:17:34 +0000 Commit: Alan Somers <asomers@FreeBSD.org> CommitDate: 2022-01-18 01:23:34 +0000 Fix a race in fusefs that can corrupt a file's size. VOPs like VOP_SETATTR can change a file's size, with the vnode exclusively locked. But VOPs like VOP_LOOKUP look up the file size from the server without the vnode locked. So a race is possible. For example: 1) One thread calls VOP_SETATTR to truncate a file. It locks the vnode and sends FUSE_SETATTR to the server. 2) A second thread calls VOP_LOOKUP and fetches the file's attributes from the server. Then it blocks trying to acquire the vnode lock. 3) FUSE_SETATTR returns and the first thread releases the vnode lock. 4) The second thread acquires the vnode lock and caches the file's attributes, which are now out-of-date. Fix this race by recording a timestamp in the vnode of the last time that its filesize was modified. Check that timestamp during VOP_LOOKUP and VFS_VGET. If it's newer than the time at which FUSE_LOOKUP was issued to the server, ignore the attributes returned by FUSE_LOOKUP. PR: 259071 Reported by: Agata <chogata@moosefs.pro> Reviewed by: pfg Differential Revision: https://reviews.freebsd.org/D33158 (cherry picked from commit 13d593a5b060cf7be40acfa2ca9dc9e0e2339a31) sys/fs/fuse/fuse_internal.c | 5 + sys/fs/fuse/fuse_io.c | 5 +- sys/fs/fuse/fuse_node.c | 5 +- sys/fs/fuse/fuse_node.h | 6 + sys/fs/fuse/fuse_vfsops.c | 13 +- sys/fs/fuse/fuse_vnops.c | 15 +- tests/sys/fs/fusefs/Makefile | 1 + tests/sys/fs/fusefs/last_local_modify.cc (new) | 423 +++++++++++++++++++++++++ 8 files changed, 466 insertions(+), 7 deletions(-)