Bug 281279 - nfscl: panic: MSan: Uninitialized stack memory in nfscl_cberrmap
Summary: nfscl: panic: MSan: Uninitialized stack memory in nfscl_cberrmap
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: Alan Somers
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2024-09-05 01:15 UTC by Alan Somers
Modified: 2024-10-10 19:11 UTC (History)
2 users (show)

See Also:
asomers: mfc-stable14?
asomers: mfc-stable13?


Attachments
Initialize previously uninitialized fields in nfsv4_loadattr (1.04 KB, patch)
2024-09-08 20:50 UTC, Alan Somers
asomers: maintainer-approval? (rmacklem)
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alan Somers freebsd_committer freebsd_triage 2024-09-05 01:15:26 UTC
I mount /usr/home over NFS.  When I boot with the GENERIC-KMSAN kernel, it panics as soon as I run the first command like this:

panic: MSan: Uninitialized stack memory in copyout():arg1, offset 28/224, addr 0xfffffe00751d6c54, from nfscl_cberrmap+0xb86
cpuid = 7
time = 1725498654
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x99/frame 0xfffffe00751d67d0
vpanic() at vpanic+0x56e/frame 0xfffffe00751d6960
panic() at panic+0x1dd/frame 0xfffffe00751d6a70
kmsan_report_hook() at kmsan_report_hook+0x28f/frame 0xfffffe00751d6bf0
kmsan_copyout() at kmsan_copyout+0x1f1/frame 0xfffffe00751d6c20
sys_fstat() at sys_fstat+0x12a/frame 0xfffffe00751d6d60
amd64_syscall() at amd64_syscall+0x706/frame 0xfffffe00751d6f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00751d6f30
--- syscall (551, FreeBSD ELF64, fstat), rip = 0x82794501a, rsp = 0x8206ec718, rbp = 0x8206ec810 ---
KDB: enter: panic
Comment 1 Alan Somers freebsd_committer freebsd_triage 2024-09-05 21:48:08 UTC
This might be a false alarm.  From inspection, I can't find any code paths that would result in the "struct stat" buffer being uninitialized.  Plus, nfscl_cberrmap+0xb86 does not lie within any known function, according to kgdb.  Unless I can produce a build where KMSAN can correctly determine the location of the offending function, this will be impossible to fix.
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2024-09-05 21:53:53 UTC
(In reply to Alan Somers from comment #1)
It's probably a real bug (I've fixed several issues like this in the past), but the origin info is not helpful.  In that case, I usually sprinkle kmsan_check() calls in areas that seem to be relevant, to try and iteratively narrow the problem down to its root cause.  I've hit KMSAN issues in NFS before and fixed all that I could find.

Offset 28 corresponds to st_uid, which is a bit surprising.  Can you describe your NFS setup a bit?  Are you using v3 or v4?  What exports flags do you have set?
Comment 3 Alan Somers freebsd_committer freebsd_triage 2024-09-05 22:43:35 UTC
(In reply to Mark Johnston from comment #2)
My mount flags are nfsv4,rw,minorversion=2
My server's export flags are -maproot=root -network 192.168.0.0/23

The offending file is /usr/home/somers/.local/share/fish/fish_history , and the KMSAN alert is triggered when fish calls fstat on that file whenever I run any command.  But curiously, simply "ls"ing the file is insufficient to trigger the alert.
Comment 4 Alan Somers freebsd_committer freebsd_triage 2024-09-08 19:09:41 UTC
I think I've found the source of the problem:

Firstly, my environment is a 15.0 client and a 14.1 server.  The client uses nfs 4.2 for /usr/home and I run nfsuserd on both client and server.  I have no NFS tunables set on the client.

What happens is that:
1) When I run "whoami", my shell does an append write to /usr/home/somers/.local/share/fish/fish_history
2) ncl_writerpc allocate "struct nfsvattr nfsva" on the stack, on its first line, but does not initialize it.
3) The AppendWrite RPC returns from the server.  Wireshark shows that the compound RPC includes SEQUENCE, PUTFH, GETATTR, VERIFY, WRITE, and GETATTR.  But the attr masks for neither GETATTR contain Owner.  
4) So nfsv4_loadattr returns without initializing nap->na_uid .  Note that if I insert logic to set na_uid to 666 in step 2, it is still equal to 666 here.
5) control returns two levels up the stack back to ncl_writerpc
6) which calls nfscl_loadattrcache.  At line 522 that function calls NFSBCOPY to actually copy the attributes into NFS's attribute cache, and at that point na_uid is still uninitialized.

Later, the shell stat()s the fish_history file.  That reads the uninitialized na_uid attribute from NFS's attribute cache, triggering the MSAN panic above.

But what can we do about it? I have a patch that sets na_uid=na_gid=VNOVAL in nfsv4_loadattr.  That makes the MSAN warnings go away.  And yet, I'm not convinced that it's the correct solution.  After all, I've never actually seen the wrong uid or gid displayed by "ls", which suggests that it's getting set somehow, even if MSAN doesn't know it.
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2024-09-08 19:19:23 UTC
(In reply to Alan Somers from comment #4)
> After all, I've never actually seen the wrong uid or gid displayed by "ls", which suggests that it's getting set somehow, even if MSAN doesn't know it.

What if you stat(1) just the one file?  ls is using fts(3), which perhaps stats something like "." before the file in question, so the struct stat is already pre-populated with the "right" value.
Comment 6 Alan Somers freebsd_committer freebsd_triage 2024-09-08 20:38:33 UTC
(In reply to Mark Johnston from comment #5)
If I stat the one file, then everything is fine.  MSAN doesn't even trigger.  It's the append write that introduces the uninitialized data.
Comment 7 Alan Somers freebsd_committer freebsd_triage 2024-09-08 20:50:03 UTC
Created attachment 253436 [details]
Initialize previously uninitialized fields in nfsv4_loadattr

Rick, does this patch look ok to you?
Comment 8 Rick Macklem freebsd_committer freebsd_triage 2024-09-08 22:23:43 UTC
(In reply to Alan Somers from comment #7)
I'd be tempted to use:
    UID_NOBODY and GID_NOGROUP
although I do not think it much matters.

You could also add a
    VATTR_NULL(&nap->na_vattr);
right after "if (nap != NULL) {"
so that they are all initialized as invalid.

Btw, in case you might be interested, the WRITE doesn't
GETATTR for owner/owner_group because there is a race
between the write-backs and doing upcalls to the nfsuserd
to resolve the names. (I've forgotten the details, but it
is an ugly one, specific to the WRITE case.)
Comment 9 Alan Somers freebsd_committer freebsd_triage 2024-09-08 23:08:11 UTC
(In reply to Rick Macklem from comment #8)
Good point with VATTR_NULL.  In fact, I think we should probably stop default initializing any attribute that might be initialized below, like mtime, mode, etc.  That would leave the default initialization section looking like this.  Does that sound good to you?

		if (nap != NULL) {
			VATTR_NULL(&nap->na_vattr);
			nap->na_gen = 0;
			nap->na_flags = 0;
			nap->na_blocksize = NFS_FABLKSIZE;
		}
Comment 10 Rick Macklem freebsd_committer freebsd_triage 2024-09-08 23:14:29 UTC
(In reply to Alan Somers from comment #9)
I think you'll find that results in some pretty weird
attributes for files. Some NFSv4 servers do not return
all the attributes, although it is hard to be sure.
(Remember that not all servers are FreeBSD or even POSIX
based. There was a Windows based one, but I do not know
if it still exists.)

All setting the defaults does is make the files look
ok (to "ls" etc). It has no effect on permission to
access them (that is done by the server).

I'd leave the initializations alone (adding the stuff
you proposed and I suggested).
Comment 11 Alan Somers freebsd_committer freebsd_triage 2024-09-08 23:27:11 UTC
(In reply to Rick Macklem from comment #10)
Ok, I'll do it that way.
Comment 12 commit-hook freebsd_committer freebsd_triage 2024-09-08 23:29:24 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=44328abfb7aca8150b07b83ff502c9185677e3fb

commit 44328abfb7aca8150b07b83ff502c9185677e3fb
Author:     Alan Somers <asomers@FreeBSD.org>
AuthorDate: 2024-09-08 20:42:38 +0000
Commit:     Alan Somers <asomers@FreeBSD.org>
CommitDate: 2024-09-08 23:28:33 +0000

    nfscl: fix uninitialized memory in nfsv4_loadattr

    When processing an RPC response that did not include any Owner
    attribute, nfsv4_loadattr would return na_uid and na_gid uninitialized.
    The uninitialized values could then make their way into the NFS
    attribute cache via nfscl_loadattrcache.

    PR:             281279
    Reported by:    KMSAN
    MFC after:      2 weeks
    Reviewed by:    rmacklem
    Sponsored by:   Axcient

 sys/fs/nfs/nfs_commonsubs.c | 1 +
 1 file changed, 1 insertion(+)
Comment 13 commit-hook freebsd_committer freebsd_triage 2024-10-10 19:11:55 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4421ce1328dff275ae522222010b6b3b791730e7

commit 4421ce1328dff275ae522222010b6b3b791730e7
Author:     Alan Somers <asomers@FreeBSD.org>
AuthorDate: 2024-09-08 20:42:38 +0000
Commit:     Alan Somers <asomers@FreeBSD.org>
CommitDate: 2024-10-10 19:08:12 +0000

    nfscl: fix uninitialized memory in nfsv4_loadattr

    When processing an RPC response that did not include any Owner
    attribute, nfsv4_loadattr would return na_uid and na_gid uninitialized.
    The uninitialized values could then make their way into the NFS
    attribute cache via nfscl_loadattrcache.

    PR:             281279
    Reported by:    KMSAN
    Reviewed by:    rmacklem
    Sponsored by:   Axcient

    (cherry picked from commit 44328abfb7aca8150b07b83ff502c9185677e3fb)

 sys/fs/nfs/nfs_commonsubs.c | 1 +
 1 file changed, 1 insertion(+)