Bug 228497 - [nfs] kernel panic in nfsrv_checksequence
Summary: [nfs] kernel panic in nfsrv_checksequence
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.1-STABLE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: Rick Macklem
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2018-05-25 23:47 UTC by cryopie
Modified: 2018-06-06 22:25 UTC (History)
2 users (show)

See Also:
rmacklem: mfc-stable10+
rmacklem: mfc-stable11+


Attachments
Free up sessions when an ExchangeID with new co_verifier is done (1.94 KB, patch)
2018-05-29 21:19 UTC, Rick Macklem
no flags Details | Diff
make destroysession mark it defunct instead of freeing session (2.90 KB, patch)
2018-05-29 23:04 UTC, Rick Macklem
no flags Details | Diff
fix locking for destroysession in the NFSv4.1 server (1.08 KB, patch)
2018-05-30 13:13 UTC, Rick Macklem
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description cryopie 2018-05-25 23:47:00 UTC
Two FreeBSD 11.1 systems crashed within fifteen minutes of each other. 

Host1: (CPU: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz (3500.06-MHz K8-class CPU)):

db:0:kdb.enter.default>  bt
Tracing pid 2291 tid 101254 td 0xfffff800526b5000
nfsrv_checksequence() at nfsrv_checksequence+0x208/frame 0xfffffe0860a85e20
nfsrvd_sequence() at nfsrvd_sequence+0x12a/frame 0xfffffe0860a85e70
nfsrvd_dorpc() at nfsrvd_dorpc+0xeed/frame 0xfffffe0860a86050
nfssvc_program() at nfssvc_program+0x5c0/frame 0xfffffe0860a86200
svc_run_internal() at svc_run_internal+0xcc9/frame 0xfffffe0860a86340
svc_run() at svc_run+0x161/frame 0xfffffe0860a86390
nfsrvd_nfsd() at nfsrvd_nfsd+0x236/frame 0xfffffe0860a86500
nfssvc_nfsd() at nfssvc_nfsd+0x1d9/frame 0xfffffe0860a86960
sys_nfssvc() at sys_nfssvc+0x9c/frame 0xfffffe0860a86980
amd64_syscall() at amd64_syscall+0xa4a/frame 0xfffffe0860a86ab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0860a86ab0
--- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x800871c9a, rsp = 0x7fffffffe518, rbp = 0x7fffffffe940 ---

May 25 17:53:00 Host1 current process          = 2291 (nfsd: master)
May 25 17:53:00 Host1 processor eflags = interrupt enabled, resume, IOPL = 0
May 25 17:53:00 Host1                  = DPL 0, pres 1, long 1, def32 0, gran 1
May 25 17:53:00 Host1 code segment             = base rx0, limit 0xfffff, type 0x1b

-----

Host2: (CPU: Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz (2700.00-MHz K8-class CPU)):

db:0:kdb.enter.default>  bt
Tracing pid 2001 tid 102447 td 0xfffff800a716d000
nfsrv_checksequence() at nfsrv_checksequence+0x208/frame 0xfffffe08629ca540
nfsrvd_sequence() at nfsrvd_sequence+0x12a/frame 0xfffffe08629ca590
nfsrvd_dorpc() at nfsrvd_dorpc+0xeed/frame 0xfffffe08629ca770
nfssvc_program() at nfssvc_program+0x5c0/frame 0xfffffe08629ca920
svc_run_internal() at svc_run_internal+0xcc9/frame 0xfffffe08629caa60
svc_thread_start() at svc_thread_start+0xb/frame 0xfffffe08629caa70
fork_exit() at fork_exit+0x85/frame 0xfffffe08629caab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe08629caab0
--- trap 0xc, rip = 0x800871c9a, rsp = 0x7fffffffe518, rbp = 0x7fffffffe940 ---

May 25 17:39:19 Host2 current process           = 2001 (nfsd: service)
May 25 17:39:19 Host2 processor eflags  = interrupt enabled, resume, IOPL = 0
May 25 17:39:19 Host2                   = DPL 0, pres 1, long 1, def32 0, gran 1
May 25 17:39:19 Host2 code segment              = base rx0, limit 0xfffff, type 0x1b
May 25 17:39:19 Host2 frame pointer             = 0x28:0xfffffe08629ca540
May 25 17:39:19 Host2 stack pointer             = 0x28:0xfffffe08629ca4f0
May 25 17:39:19 Host2 instruction pointer       = 0x20:0xffffffff80980668
May 25 17:39:19 Host2 fault code                = supervisor read data, page not present
May 25 17:39:19 Host2 fault virtual address     = 0x2f0
May 25 17:39:19 Host2 cpuid = 0; apic id = 00
May 25 17:39:19 Host2 Fatal trap 12: page fault while in kernel mode


Build: 
FreeBSD 11.1-STABLE #2 r321665+366f54a78b2(freenas/11.1-stable): Wed Mar 21 23:04:13 UTC 2018
    root@gauntlet:/freenas-11-releng/freenas/_BE/objs/freenas-11-releng/freenas/_BE/os/sys/FreeNAS.amd64 amd64
FreeBSD clang version 5.0.0 (tags/RELEASE_500/final 312559) (based on LLVM 5.0.0svn)

--- 

Both machines run FreeNAS 11.1-U4. I don't know whether FreeNAS patches FreeBSD kernel, but the function seemed like an unusual place to need a patch, and so I decided to file a bug report here instead of at FreeNAS bug tracker. 

I was doing an 'mv foo bar/' via NFS on Host3 when the crash occurred. Both 'foo' and 'bar' are directories exported by Host2 and mounted as /mnt/foo and /mnt/bar on Host3. Nothing explicit was being done on Host1. Both serve NFS to half a dozen clients, all with default options (configured from FreeNAS GUI) and neither server is under heavy load. All hosts are on the same subnet and promiscuous mode is disabled on the switch.

The NFS client, Host3, is a Linux box (Linux 4.16.9-1-ARCH #1 SMP PREEMPT Thu May 17 02:10:09 UTC 2018 x86_64 GNU/Linux). The NFS mount options are the following: 

rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.7.7,local_lock=none,addr=10.0.0.24,_netdev

I am unable to reproduce this but if it happens again I'll reboot with a debug kernel. I don't see how this occurred in two different hosts more or less simultaneously, given that they've been running without issue for weeks.
Comment 1 Rick Macklem freebsd_committer 2018-05-29 21:19:16 UTC
Created attachment 193829 [details]
Free up sessions when an ExchangeID with new co_verifier is done

This patch might fix the crashes. However, I don't know how the
Linux client might have gotten into this situation?
(Normally a Linux client only changes the co_verifier when it is
 hard rebooted via the power cord or similar.)

This patch was committed to FreeBSD-current as r333580, but has not been
MFC'd.

If you could apply this patch to your server(s) and report of you get
another crash with the patch applied, that would tell us that something
else is broken.

Basically a crash like this occurs when the server somehow is using
a session when it is free'd.

Maybe some sort of network partitioning/client reboot or similar triggered
the behaviour that caused the crashes?
Comment 2 Rick Macklem freebsd_committer 2018-05-29 21:20:00 UTC
Add MFC flags for patch.
Comment 3 Rick Macklem freebsd_committer 2018-05-29 23:04:13 UTC
Created attachment 193832 [details]
make destroysession mark it defunct instead of freeing session

I think this patch (just did it and I haven't even tested it yet) is
more likely to stop the crashes than the other one (although you
should use the other patch too).
My guess is that the Linux client did a DestroySession when the session
was still in use. I'd call this a client bug, but it shouldn't cause
a crash.

This is the only case I can see where the session might have been free'd
when still being used.
If you can test both these patches and let us know how it goes, that would
be appreciated.
Comment 4 cryopie 2018-05-30 00:22:21 UTC
(In reply to Rick Macklem from comment #3)

The network is entirely on Layer 2 with nothing fancy at all; no partitioning, link aggregation, routing etc. Also, the Linux client did not crash or die when the two FreeBSD systems crashed. 

I recently changed the NFS entry in fstab to use only two options "auto,_netdev", which causes systemd to automount NFS shares when the directory is first read. I just noticed that my syslog contains several messages such as:

systemd[1]: mnt-foo-bar.mount: Directory /mnt/foo/bar to mount over is not empty, mounting anyway.

(foo and foo/bar are ZFS datasets on FreeNAS) 

I haven't manually unmounted any mount point; so, in the not unlikely event that systemd is mounting over its own mount, the mount point may have several overlapping mounts on it (is this possible?). Could both inner- and outer mounts have had open file handles?

I'm not even sure whether the Linux NFS client handles this sort of behaviour, i.e., ignore a second mount if the same share is already mounted on the same path. When I try to do this by hand it doesn't work--a single "umount /mnt/foo/bar" unmounts the share.

I'm unable to reproduce this by hand. If I get a persistent reproducer, I'll test your patches.
Comment 5 Rick Macklem freebsd_committer 2018-05-30 01:18:17 UTC
Well, I'd suggest you apply both patches to your servers.
I think they are both "safe to do" and if it does crash again
with the patches in place, we'll know they didn't fix the problem.

I realize that "no crashes" doesn't imply "fixed".
Comment 6 Rick Macklem freebsd_committer 2018-05-30 13:13:47 UTC
Created attachment 193840 [details]
fix locking for destroysession in the NFSv4.1 server

I decided that it was better to fix the DestroySession issue by
strengthening the locking than deferring the nfsrv_freesession().
This patch strengthens the locking and should be used instead of
attachment #193832 [details].
Comment 7 commit-hook freebsd_committer 2018-05-30 20:17:14 UTC
A commit references this bug:

Author: rmacklem
Date: Wed May 30 20:16:18 UTC 2018
New revision: 334396
URL: https://svnweb.freebsd.org/changeset/base/334396

Log:
  Strengthen locking for the NFSv4.1 server DestroySession operation.

  If a client did a DestroySession on a session while it was still in use,
  the server might try to use the session structure after it is free'd.
  I think the client has violated RFC5661 if it does this, but this patch
  makes DestroySession block all other nfsd threads so no thread could
  be using the session when it is free'd. After the DestroySession, nfsd
  threads will not be able to find the session. The patch also adds a check
  for nd_sessionid being set, although if that was not the case it would have
  been all 0s and unlikely to have a false match.
  This might fix the crashes described in PR#228497 for the FreeNAS server.

  PR:		228497
  MFC after:	1 week

Changes:
  head/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 8 commit-hook freebsd_committer 2018-06-06 01:21:40 UTC
A commit references this bug:

Author: rmacklem
Date: Wed Jun  6 01:21:34 UTC 2018
New revision: 334698
URL: https://svnweb.freebsd.org/changeset/base/334698

Log:
  MFC: r334396
  Strengthen locking for the NFSv4.1 server DestroySession operation.

  If a client did a DestroySession on a session while it was still in use,
  the server might try to use the session structure after it is free'd.
  I think the client has violated RFC5661 if it does this, but this patch
  makes DestroySession block all other nfsd threads so no thread could
  be using the session when it is free'd. After the DestroySession, nfsd
  threads will not be able to find the session. The patch also adds a check
  for nd_sessionid being set, although if that was not the case it would have
  been all 0s and unlikely to have a false match.
  This might fix the crashes described in PR#228497 for the FreeNAS server.

  PR:		228497

Changes:
_U  stable/11/
  stable/11/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 9 commit-hook freebsd_committer 2018-06-06 01:31:50 UTC
A commit references this bug:

Author: rmacklem
Date: Wed Jun  6 01:30:49 UTC 2018
New revision: 334699
URL: https://svnweb.freebsd.org/changeset/base/334699

Log:
  MFC: r334396
  Strengthen locking for the NFSv4.1 server DestroySession operation.

  If a client did a DestroySession on a session while it was still in use,
  the server might try to use the session structure after it is free'd.
  I think the client has violated RFC5661 if it does this, but this patch
  makes DestroySession block all other nfsd threads so no thread could
  be using the session when it is free'd. After the DestroySession, nfsd
  threads will not be able to find the session. The patch also adds a check
  for nd_sessionid being set, although if that was not the case it would have
  been all 0s and unlikely to have a false match.
  This might fix the crashes described in PR#228497 for the FreeNAS server.

  PR:		228497

Changes:
_U  stable/10/
  stable/10/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 10 commit-hook freebsd_committer 2018-06-06 22:02:48 UTC
A commit references this bug:

Author: rmacklem
Date: Wed Jun  6 22:02:21 UTC 2018
New revision: 334739
URL: https://svnweb.freebsd.org/changeset/base/334739

Log:
  MFC: r333580
  Fix a slow leak of session structures in the NFSv4.1 server.

  For a fairly rare case of a client doing an ExchangeID after a hard reboot,
  the old confirmed clientid still exists, but some clients use a new
  co_verifier. For this case, the server was not freeing up the sessions on
  the old confirmed clientid.
  This patch fixes this case. It also adds two LIST_INIT() macros, which are
  actually no-ops, since the structure is malloc()d with M_ZERO so the pointer
  is already set to NULL.
  It should have minimal impact, since the only way I could exercise this
  code path was by doing a hard power cycle (pulling the plus) on a machine
  running Linux with a NFSv4.1 mount on the server.
  Originally spotted during testing of the ESXi 6.5 client.

  PR:		228497

Changes:
_U  stable/11/
  stable/11/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 11 commit-hook freebsd_committer 2018-06-06 22:19:03 UTC
A commit references this bug:

Author: rmacklem
Date: Wed Jun  6 22:18:24 UTC 2018
New revision: 334741
URL: https://svnweb.freebsd.org/changeset/base/334741

Log:
  MFC: r333580
  Fix a slow leak of session structures in the NFSv4.1 server.

  For a fairly rare case of a client doing an ExchangeID after a hard reboot,
  the old confirmed clientid still exists, but some clients use a new
  co_verifier. For this case, the server was not freeing up the sessions on
  the old confirmed clientid.
  This patch fixes this case. It also adds two LIST_INIT() macros, which are
  actually no-ops, since the structure is malloc()d with M_ZERO so the pointer
  is already set to NULL.
  It should have minimal impact, since the only way I could exercise this
  code path was by doing a hard power cycle (pulling the plus) on a machine
  running Linux with a NFSv4.1 mount on the server.
  Originally spotted during testing of the ESXi 6.5 client.

  PR:		228497

Changes:
_U  stable/10/
  stable/10/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 12 Rick Macklem freebsd_committer 2018-06-06 22:25:26 UTC
The two patches have been committed and MFC'd. Since I think
they will stop these crashes, I'll close this PR.

If you experience a similar crash with these patches applied,
please reopen this.