PROBLEM ======= nfs page faults in response to a bad readdir response by the underlying file system. First it prints to dmesg: Readdir reply file name had imbedded / or nul by Then it page panics: panic: page fault cpuid = 5 time = 1736452691 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc52c200 vpanic() at vpanic+0x136/frame 0xfffffe00dc52c330 panic() at panic+0x43/frame 0xfffffe00dc52c390 trap_pfault() at trap_pfault+0x466/frame 0xfffffe00dc52c400 calltrap() at calltrap+0x8/frame 0xfffffe00dc52c400 --- trap 0xc, rip = 0xffffffff80a19014, rsp = 0xfffffe00dc52c4d0, rbp = 0xfffffe00dc52c7f0 --- nfsrpc_readdir() at nfsrpc_readdir+0xbd4/frame 0xfffffe00dc52c7f0 ncl_readdirrpc() at ncl_readdirrpc+0xf0/frame 0xfffffe00dc52c940 ncl_doio() at ncl_doio+0x4bb/frame 0xfffffe00dc52c9e0 ncl_bioread() at ncl_bioread+0x5f4/frame 0xfffffe00dc52cb70 nfs_readdir() at nfs_readdir+0x1d8/frame 0xfffffe00dc52cc80 vop_sigdefer() at vop_sigdefer+0x30/frame 0xfffffe00dc52ccb0 VOP_READDIR_APV() at VOP_READDIR_APV+0x32/frame 0xfffffe00dc52ccd0 kern_getdirentries() at kern_getdirentries+0x1d7/frame 0xfffffe00dc52cdd0 sys_getdirentries() at sys_getdirentries+0x29/frame 0xfffffe00dc52ce00 amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00dc52cf30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00dc52cf30 Examining the core file shows that the value of tl2 is 0x4 in nfs_clrpcops.c:3768, at this line: *tl2++ = cookiep->nfsuquad[0] = cookie.lval[0] = ncookie.lval[0]; STEPS TO REPRODUCE ================== I have a fusefs test case in development that can reliably trigger this panic in nfsd. The test creates a fuse file system and exports it over NFS, then performs a readdir on the NFS mount. ENVIRONMENT =========== FreeBSD 15.0-CURRENT amd64 based on f415b2ef30f7bf0db753f09fbba7b0910475b0d2 (Jan 6 2025), with one small change to fusefs and with rmacklem's WIP patch that adds vfs.nfsd.nfsd_disable_grace. ANALYSIS ======== This looks like a straightforward uninitialized data access. tl2 is initialized to NULL. Then it's set to cp, but only if nfscl_invalidfname reports that the fname is valid. Then it gets dereferenced at line 3768 regardless of whether it was ever initialized.
I've determined that the cause of the failure in nfscl_invalidfname is because my test fuse file system includes the NUL character in its readdir entries. That is, it replies something like this: ents[0].d_namlen = 3; ents[0].d_name = ".."; ents[1].d_namlen = 2; ents[1].d_name = "."; I can and will fix that in the tests. But I can't guarantee that other fuse file systems won't do the same. Even if I have fusefs sanitize these strings in-kernel, other file systems might do the same. For example, I see that ext2 trusts the on-disk data, and copies it into a dirent without sanitizing it. So NFS must be able to handle that.
Created attachment 256592 [details] Fix handling of a nul in a file name received from an NFS server I think this patch should fix the crash. Hopefully Alan can test it?
(In reply to Rick Macklem from comment #2) The patch works for me!
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=f9f0a1d61c7b97c705246c747baec385e0592966 commit f9f0a1d61c7b97c705246c747baec385e0592966 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2025-01-10 03:54:41 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2025-01-10 03:54:41 +0000 nfscl: Fix a crash when a readdir entry has nul in it Commit 026cdaa3b3a9 added a check for a nul or "/" in a file name in a readdir reply. Unfortunately, the minimal testing done on it did not detect a bug that can cause the client to crash. This patch fixes the code so that it does not crash. Note that a NFS server will not normally return a file name in a readdir reply that has a nul or "/" in it, so the crash is unlikely. PR: 283965 Reported by: asomers Tested by: asomers MFC after: 2 weeks sys/fs/nfsclient/nfs_clrpcops.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
The patch has been committed and will be MFC'd.
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=d7b69bb536f7dd4857f1aec86536737a2e94c68a commit d7b69bb536f7dd4857f1aec86536737a2e94c68a Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2025-01-10 03:54:41 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2025-01-24 02:25:10 +0000 nfscl: Fix a crash when a readdir entry has nul in it Commit 026cdaa3b3a9 added a check for a nul or "/" in a file name in a readdir reply. Unfortunately, the minimal testing done on it did not detect a bug that can cause the client to crash. This patch fixes the code so that it does not crash. Note that a NFS server will not normally return a file name in a readdir reply that has a nul or "/" in it, so the crash is unlikely. PR: 283965 (cherry picked from commit f9f0a1d61c7b97c705246c747baec385e0592966) sys/fs/nfsclient/nfs_clrpcops.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=968abc191ea145040f4c2105755d11b63ecb2427 commit 968abc191ea145040f4c2105755d11b63ecb2427 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2025-01-10 03:54:41 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2025-01-24 02:30:22 +0000 nfscl: Fix a crash when a readdir entry has nul in it Commit 026cdaa3b3a9 added a check for a nul or "/" in a file name in a readdir reply. Unfortunately, the minimal testing done on it did not detect a bug that can cause the client to crash. This patch fixes the code so that it does not crash. Note that a NFS server will not normally return a file name in a readdir reply that has a nul or "/" in it, so the crash is unlikely. PR: 283965 (cherry picked from commit f9f0a1d61c7b97c705246c747baec385e0592966) sys/fs/nfsclient/nfs_clrpcops.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
Patch has been committed and MFC'd.