I mount /usr/home over NFS. When I boot with the GENERIC-KMSAN kernel, it panics as soon as I run the first command like this: panic: MSan: Uninitialized stack memory in copyout():arg1, offset 28/224, addr 0xfffffe00751d6c54, from nfscl_cberrmap+0xb86 cpuid = 7 time = 1725498654 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x99/frame 0xfffffe00751d67d0 vpanic() at vpanic+0x56e/frame 0xfffffe00751d6960 panic() at panic+0x1dd/frame 0xfffffe00751d6a70 kmsan_report_hook() at kmsan_report_hook+0x28f/frame 0xfffffe00751d6bf0 kmsan_copyout() at kmsan_copyout+0x1f1/frame 0xfffffe00751d6c20 sys_fstat() at sys_fstat+0x12a/frame 0xfffffe00751d6d60 amd64_syscall() at amd64_syscall+0x706/frame 0xfffffe00751d6f30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00751d6f30 --- syscall (551, FreeBSD ELF64, fstat), rip = 0x82794501a, rsp = 0x8206ec718, rbp = 0x8206ec810 --- KDB: enter: panic
This might be a false alarm. From inspection, I can't find any code paths that would result in the "struct stat" buffer being uninitialized. Plus, nfscl_cberrmap+0xb86 does not lie within any known function, according to kgdb. Unless I can produce a build where KMSAN can correctly determine the location of the offending function, this will be impossible to fix.
(In reply to Alan Somers from comment #1) It's probably a real bug (I've fixed several issues like this in the past), but the origin info is not helpful. In that case, I usually sprinkle kmsan_check() calls in areas that seem to be relevant, to try and iteratively narrow the problem down to its root cause. I've hit KMSAN issues in NFS before and fixed all that I could find. Offset 28 corresponds to st_uid, which is a bit surprising. Can you describe your NFS setup a bit? Are you using v3 or v4? What exports flags do you have set?
(In reply to Mark Johnston from comment #2) My mount flags are nfsv4,rw,minorversion=2 My server's export flags are -maproot=root -network 192.168.0.0/23 The offending file is /usr/home/somers/.local/share/fish/fish_history , and the KMSAN alert is triggered when fish calls fstat on that file whenever I run any command. But curiously, simply "ls"ing the file is insufficient to trigger the alert.
I think I've found the source of the problem: Firstly, my environment is a 15.0 client and a 14.1 server. The client uses nfs 4.2 for /usr/home and I run nfsuserd on both client and server. I have no NFS tunables set on the client. What happens is that: 1) When I run "whoami", my shell does an append write to /usr/home/somers/.local/share/fish/fish_history 2) ncl_writerpc allocate "struct nfsvattr nfsva" on the stack, on its first line, but does not initialize it. 3) The AppendWrite RPC returns from the server. Wireshark shows that the compound RPC includes SEQUENCE, PUTFH, GETATTR, VERIFY, WRITE, and GETATTR. But the attr masks for neither GETATTR contain Owner. 4) So nfsv4_loadattr returns without initializing nap->na_uid . Note that if I insert logic to set na_uid to 666 in step 2, it is still equal to 666 here. 5) control returns two levels up the stack back to ncl_writerpc 6) which calls nfscl_loadattrcache. At line 522 that function calls NFSBCOPY to actually copy the attributes into NFS's attribute cache, and at that point na_uid is still uninitialized. Later, the shell stat()s the fish_history file. That reads the uninitialized na_uid attribute from NFS's attribute cache, triggering the MSAN panic above. But what can we do about it? I have a patch that sets na_uid=na_gid=VNOVAL in nfsv4_loadattr. That makes the MSAN warnings go away. And yet, I'm not convinced that it's the correct solution. After all, I've never actually seen the wrong uid or gid displayed by "ls", which suggests that it's getting set somehow, even if MSAN doesn't know it.
(In reply to Alan Somers from comment #4) > After all, I've never actually seen the wrong uid or gid displayed by "ls", which suggests that it's getting set somehow, even if MSAN doesn't know it. What if you stat(1) just the one file? ls is using fts(3), which perhaps stats something like "." before the file in question, so the struct stat is already pre-populated with the "right" value.
(In reply to Mark Johnston from comment #5) If I stat the one file, then everything is fine. MSAN doesn't even trigger. It's the append write that introduces the uninitialized data.
Created attachment 253436 [details] Initialize previously uninitialized fields in nfsv4_loadattr Rick, does this patch look ok to you?
(In reply to Alan Somers from comment #7) I'd be tempted to use: UID_NOBODY and GID_NOGROUP although I do not think it much matters. You could also add a VATTR_NULL(&nap->na_vattr); right after "if (nap != NULL) {" so that they are all initialized as invalid. Btw, in case you might be interested, the WRITE doesn't GETATTR for owner/owner_group because there is a race between the write-backs and doing upcalls to the nfsuserd to resolve the names. (I've forgotten the details, but it is an ugly one, specific to the WRITE case.)
(In reply to Rick Macklem from comment #8) Good point with VATTR_NULL. In fact, I think we should probably stop default initializing any attribute that might be initialized below, like mtime, mode, etc. That would leave the default initialization section looking like this. Does that sound good to you? if (nap != NULL) { VATTR_NULL(&nap->na_vattr); nap->na_gen = 0; nap->na_flags = 0; nap->na_blocksize = NFS_FABLKSIZE; }
(In reply to Alan Somers from comment #9) I think you'll find that results in some pretty weird attributes for files. Some NFSv4 servers do not return all the attributes, although it is hard to be sure. (Remember that not all servers are FreeBSD or even POSIX based. There was a Windows based one, but I do not know if it still exists.) All setting the defaults does is make the files look ok (to "ls" etc). It has no effect on permission to access them (that is done by the server). I'd leave the initializations alone (adding the stuff you proposed and I suggested).
(In reply to Rick Macklem from comment #10) Ok, I'll do it that way.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=44328abfb7aca8150b07b83ff502c9185677e3fb commit 44328abfb7aca8150b07b83ff502c9185677e3fb Author: Alan Somers <asomers@FreeBSD.org> AuthorDate: 2024-09-08 20:42:38 +0000 Commit: Alan Somers <asomers@FreeBSD.org> CommitDate: 2024-09-08 23:28:33 +0000 nfscl: fix uninitialized memory in nfsv4_loadattr When processing an RPC response that did not include any Owner attribute, nfsv4_loadattr would return na_uid and na_gid uninitialized. The uninitialized values could then make their way into the NFS attribute cache via nfscl_loadattrcache. PR: 281279 Reported by: KMSAN MFC after: 2 weeks Reviewed by: rmacklem Sponsored by: Axcient sys/fs/nfs/nfs_commonsubs.c | 1 + 1 file changed, 1 insertion(+)
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=4421ce1328dff275ae522222010b6b3b791730e7 commit 4421ce1328dff275ae522222010b6b3b791730e7 Author: Alan Somers <asomers@FreeBSD.org> AuthorDate: 2024-09-08 20:42:38 +0000 Commit: Alan Somers <asomers@FreeBSD.org> CommitDate: 2024-10-10 19:08:12 +0000 nfscl: fix uninitialized memory in nfsv4_loadattr When processing an RPC response that did not include any Owner attribute, nfsv4_loadattr would return na_uid and na_gid uninitialized. The uninitialized values could then make their way into the NFS attribute cache via nfscl_loadattrcache. PR: 281279 Reported by: KMSAN Reviewed by: rmacklem Sponsored by: Axcient (cherry picked from commit 44328abfb7aca8150b07b83ff502c9185677e3fb) sys/fs/nfs/nfs_commonsubs.c | 1 + 1 file changed, 1 insertion(+)