Bug 257043 - NFS nfsrpc_readdirplus() panic
Summary: NFS nfsrpc_readdirplus() panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Rick Macklem
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2021-07-07 17:52 UTC by dgilbert
Modified: 2021-07-25 21:51 UTC (History)
2 users (show)

See Also:
rmacklem: mfc-stable13+


Attachments
cache_purge() when starting to read an NFS directory mounted "rdirplus" (648 bytes, patch)
2021-07-07 23:58 UTC, Rick Macklem
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description dgilbert 2021-07-07 17:52:33 UTC
Server: Threadripper 1900 with 40T ZFS array running 13.0.

Client: HiFive Unmatched RISC-V running 14-CURRENT.  I believe the most recent commit was ed0a582d88f7cb8deea9b6fbe6dade43911c0dd7 on Sat Jun 19 17:50:11 2021 +0200.

I was on the client, and wanted to fetch an img to put onto a usb stick.  I cd'd into NFS to where the image was stored.  As I discovered it was bz2'd (ll to read dir), I flipped back to the server's shell and pbunzip2'd (much faster --- .bz2 disappears, .img appears).  I flipped back to the client and did "ll" again --- at which point, client panic'd.

I'm going to paste part of the bt here.  Links below to the kernel, kernel.debug, vmcore.

#11 0xffffffc000343466 in panic (fmt=0xffffffc000829fb0 <kdb_why> "\372Rg")
    at /home/dgilbert/FreeBSD/src/sys/kern/kern_shutdown.c:843
#12 0xffffffc0003f1916 in cache_enter_time (dvp=<optimized out>, vp=<optimized out>, cnp=<optimized out>,
    tsp=<optimized out>, dtsp=<optimized out>) at /home/dgilbert/FreeBSD/src/sys/kern/vfs_cache.c:2471
#13 0xffffffc000237834 in nfsrpc_readdirplus (vp=<optimized out>, uiop=0xffffffc2204d56c8, cookiep=0xffffffc2204d5580,
    cred=<optimized out>, p=<optimized out>, nap=0xffffffc2204d5588, attrflagp=0xffffffc2204d557c,
    eofp=<optimized out>, stuff=0x0) at /home/dgilbert/FreeBSD/src/sys/fs/nfsclient/nfs_clrpcops.c:3765
#14 0xffffffc0002423a0 in ncl_readdirplusrpc (vp=0xffffffd2591fa1c0, uiop=0xffffffc2204d56c8, cred=0xffffffd033b29700,
    td=0xffffffc221787b00) at /home/dgilbert/FreeBSD/src/sys/fs/nfsclient/nfs_clvnops.c:2506
#15 0xffffffc000251648 in ncl_doio (vp=0xffffffd2591fa1c0, bp=0xffffffc001876b90, cr=0xffffffd033b29700,
    td=0xffffffc221787b00, called_from_strategy=<optimized out>)
    at /home/dgilbert/FreeBSD/src/sys/fs/nfsclient/nfs_clbio.c:1696
#16 0xffffffc000250984 in ncl_bioread (vp=0xffffffd2591fa1c0, uio=0xffffffc2204d5a88, ioflag=<optimized out>,
    cred=<optimized out>) at /home/dgilbert/FreeBSD/src/sys/fs/nfsclient/nfs_clbio.c:607
#17 0xffffffc0002463e8 in nfs_readdir (ap=0xffffffc2204d5ac0)
    at /home/dgilbert/FreeBSD/src/sys/fs/nfsclient/nfs_clvnops.c:2394
#18 0xffffffc0003fef14 in vop_sigdefer (vop=<optimized out>, a=0xffffffc2204d5ac0)
    at /home/dgilbert/FreeBSD/src/sys/kern/vfs_default.c:1499
#19 0xffffffc000241b86 in nfs_vnodeops_bypass (a=0xf2cf08918fe95d7a)
    at /home/dgilbert/FreeBSD/src/sys/fs/nfsclient/nfs_clvnops.c:209
#20 0xffffffc0005dcd84 in VOP_READDIR_APV (vop=0xffffffc000765110 <newnfs_vnodeops>, a=0xffffffc2204d5ac0)
    at vnode_if.c:1939
#21 0xffffffc000421212 in VOP_READDIR (vp=<optimized out>, uio=<optimized out>, cred=<optimized out>,
    eofflag=<optimized out>, ncookies=<optimized out>, cookies=<optimized out>) at ./vnode_if.h:985
#22 kern_getdirentries (td=<optimized out>, fd=<optimized out>, buf=<optimized out>, count=<optimized out>,
    basep=<optimized out>, residp=<optimized out>, bufseg=<optimized out>)
    at /home/dgilbert/FreeBSD/src/sys/kern/vfs_syscalls.c:4169
#23 0xffffffc000421076 in sys_getdirentries (td=0xffffffc000829fb0 <kdb_why>, uap=0xffffffc221787ee8)
    at /home/dgilbert/FreeBSD/src/sys/kern/vfs_syscalls.c:4116
#24 0xffffffc0005d5e14 in syscallenter (td=0xffffffc221787b00)
    at /home/dgilbert/FreeBSD/src/sys/riscv/riscv/../../kern/subr_syscall.c:189
#25 ecall_handler () at /home/dgilbert/FreeBSD/src/sys/riscv/riscv/trap.c:165
#26 do_trap_user (frame=<optimized out>) at /home/dgilbert/FreeBSD/src/sys/riscv/riscv/trap.c:375
#27 <signal handler called>
#28 0x0000000040310e08 in ?? ()

Client:

[1:16:17]root@ump:/var/crash> uname -a
FreeBSD ump.daveg.ca 14.0-CURRENT FreeBSD 14.0-CURRENT #2 unmatched-n247472-2c2ed1f58a18: Wed Jul  7 01:02:27 EDT 2021     dgilbert@vr.home.dclg.ca:/home/dgilbert/FreeBSD/obj/home/dgilbert/FreeBSD/src/riscv.riscv64/sys/GENERIC  riscv

Server:

[2:124:424]root@vr:/vr1/tmp> uname -a
FreeBSD vr.home.dclg.ca 13.0-RELEASE FreeBSD 13.0-RELEASE #21 releng/13.0-n244733-ea31abc261f-dirty: Tue Apr 20 12:19:38 EDT 2021     root@vr.home.dclg.ca:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

files:
vmcore.3.bz2 https://nextcloud.towernet.ca/s/kzfKrAmqWd7xZYp
core.txt.3.bz2 https://nextcloud.towernet.ca/s/Atemjg9AsJY6MP9
info.3.bz2 https://nextcloud.towernet.ca/s/KjGrqfgRidszNJb
kernel.3.bz2 https://nextcloud.towernet.ca/s/RGdgMM3KRozWQ6t
kernel.3.debug.bz2 https://nextcloud.towernet.ca/s/e6DES6GeCNJeKka
Comment 1 Rick Macklem freebsd_committer 2021-07-07 22:56:31 UTC
I believe that the KASSERT() at line# 2471 in sys/kern/vfs_cache.c
is not feasible for the NFS client.
In general, the NFS client cannot know that a directory has been
modified locally on the server (or by another NFS client) such
that the same file name now refers to a different file.
(NFS does not have a cache coherency protocol.)

For NFS, what needs to happen for this case is that the new
entry needs to replace the old (now bogus/stale) entry.

I will look to see if a change in directory attributes can
be used to invalidate all entries in the cache for that directory
for the case of re-reading the directory, but this will, at best,
reduce the likelyhood of having a new file for the same name and
not stop it from happening.

Not using the "rdirplus" mount option should avoid the problem in
the meantime.

mjg@, can you change cache_enter_time() so that it replaces the
entry instead of panics?

I reproduced this by doing the following:
- NFS mount and "ll".
- go to server and do two "mv"s, where nfs_clstate.c and xxx.c were files
  in the directory
# mv nfs_clstate.c xxxx
# mv xxx.c nfs_clstate.c
- then "ll" again on the NFS client
Comment 2 Rick Macklem freebsd_committer 2021-07-07 23:58:33 UTC
Created attachment 226290 [details]
cache_purge() when starting to read an NFS directory mounted "rdirplus"

This patch is probably a good idea and I think it avoids the
panic() for this specific case.

However, as I noted in the previous comment, an NFS client cannot,
in general, know when a file on the server has been replaced by
another file of the same name, locally on the NFS server.
Comment 3 Mateusz Guzik freebsd_committer 2021-07-08 02:50:27 UTC
I'm definitely not going to remove the assert since it is too useful.

For one, the cache has been dropping the new entry for years now and the discrepancy here was only found thanks to said assert.

What I can do is add a flag that the caller expects there may be a conflicting entry and it should be removed in favor of the new one.
Comment 4 Mateusz Guzik freebsd_committer 2021-07-08 03:29:00 UTC
I'm going to have to sleep on the patch, there are some hairy parts.
Comment 5 Mateusz Guzik freebsd_committer 2021-07-09 20:10:51 UTC
Not the exact patch I intend to commit, but does this solve the problem?

https://people.freebsd.org/~mjg/nfs-cache.diff
Comment 6 Rick Macklem freebsd_committer 2021-07-10 00:16:39 UTC
Yes, nfs-cache.diff stops the crash the way I reproduced it.

I suspect that other cache_enter_time() call(s) in the NFS
client will need to be changed as well, but I'll look at
that after you commit your final version of this to "main".

Thanks, rick
Comment 7 Rick Macklem freebsd_committer 2021-07-10 00:43:24 UTC
Dave, would it be possible for you to patch the kernel
on your NFS client with:
https://people.freebsd.org/~mjg/nfs-cache.diff
and then see if you can reproduce the panic?

If you get the chance to do this, please report
back here.
Comment 8 Mateusz Guzik freebsd_committer 2021-07-12 05:06:39 UTC
I committed the routine, did not touch nfs.

The use would be: cache_enter_time_flags(..., VFS_CACHE_DROPOLD) where appropriate.
Comment 9 commit-hook freebsd_committer 2021-07-14 20:37:21 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7f5508fe78d17af968fe67e00ffa7c975aa2c67d

commit 7f5508fe78d17af968fe67e00ffa7c975aa2c67d
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-07-14 20:33:37 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-07-14 20:33:37 +0000

    nfscl: Avoid KASSERT() panic in cache_enter_time()

    Commit 844aa31c6d87 added cache_enter_time_flags(), specifically
    so that the NFS client could specify that cache enter replace
    any stale entry for the same name.  Doing so avoids a KASSERT()
    panic() in cache_enter_time(), as reported by the PR.

    This patch uses cache_enter_time_flags() for Readdirplus, to
    avoid the panic(), since it is impossible for the NFS client
    to know if another client (or a local process on the NFS server)
    has replaced a file with another file of the same name.

    This patch only affects NFS mounts that use the "rdirplus"
    mount option.

    There may be other places in the NFS client where this needs
    to be done, but no panic() has been observed during testing.

    PR:     257043
    MFC after:      2 weeks

 sys/fs/nfsclient/nfs_clrpcops.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 10 Rick Macklem freebsd_committer 2021-07-14 20:40:39 UTC
The patches to fix this KASSERT() panic are now
in "main".

They need to be MFC'd, since the same KASSERT()
is in stable/13.

mjg@, I didn't notice an MFC on your commit.
Are you planning on doing so?
Comment 11 Mateusz Guzik freebsd_committer 2021-07-14 20:42:37 UTC
A week or so from now.
Comment 12 Mateusz Guzik freebsd_committer 2021-07-25 07:03:42 UTC
I just merged the change to stable/12.
Comment 13 Mateusz Guzik freebsd_committer 2021-07-25 07:04:41 UTC
Sorry, meant stable/13. stable/12 does not have the assert and is not getting the patch as is -- it is missing some prerequisites.
Comment 14 commit-hook freebsd_committer 2021-07-25 21:48:51 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=8e24b25b67bb664168e3daf41d7b2262f5b136d5

commit 8e24b25b67bb664168e3daf41d7b2262f5b136d5
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-07-14 20:33:37 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-07-25 21:45:05 +0000

    nfscl: Avoid KASSERT() panic in cache_enter_time()

    Commit 844aa31c6d87 added cache_enter_time_flags(), specifically
    so that the NFS client could specify that cache enter replace
    any stale entry for the same name.  Doing so avoids a KASSERT()
    panic() in cache_enter_time(), as reported by the PR.

    This patch uses cache_enter_time_flags() for Readdirplus, to
    avoid the panic(), since it is impossible for the NFS client
    to know if another client (or a local process on the NFS server)
    has replaced a file with another file of the same name.

    This patch only affects NFS mounts that use the "rdirplus"
    mount option.

    There may be other places in the NFS client where this needs
    to be done, but no panic() has been observed during testing.

    PR:     257043

    (cherry picked from commit 7f5508fe78d17af968fe67e00ffa7c975aa2c67d)

 sys/fs/nfsclient/nfs_clrpcops.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 15 Rick Macklem freebsd_committer 2021-07-25 21:51:36 UTC
Patch to fix this has been MFC'd.