Bug 249127

Summary: panic!("docallb") in nfsrv_docallback
Product: Base System Reporter: Alan Somers <asomers>
Component: kernAssignee: Rick Macklem <rmacklem>
Status: Closed FIXED    
Severity: Affects Some People Flags: rmacklem: mfc-stable12+
rmacklem: mfc-stable11+
Priority: ---    
Version: 12.1-STABLE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
Add a check for LCL_NEEDSCONFIRM to nfsrv_checkgetattr() none

Description Alan Somers freebsd_committer freebsd_triage 2020-09-05 13:31:24 UTC
I just saw this panic on a 12-stable machine.  Unfortunately, I don't have a core dump, just a stack trace.  It was serving NFS v4.0, with delegations enabled.  The clients were all Debian, with Linux 3.16.0.

The proximal cause of the panic seems to be that the file had a write delegation issued to an unconfirmed client.  Root cause is harder to determine.  Did the kernel previously issue a delegation to an unconfirmed client?  Or did the client somehow change to an unconfirmed state after the delegation was issued, perhaps due to a race?

Should there be a check for LCL_NEEDSCONFIRM somewhere around line 3166 in nfs_nfsdstate.c?

kdb_backtrace
vpanic
panic
nfsrv_docallback
nfsrv_checkgetattr
nfsrvd_getattr
nfsrvd_dorpc
nfssvc_program
svc_run_internal
svc_thread_start
fork_exit
fork_trampoline
Comment 1 Rick Macklem freebsd_committer freebsd_triage 2020-09-05 23:16:34 UTC
Created attachment 217776 [details]
Add a check for LCL_NEEDSCONFIRM to nfsrv_checkgetattr()

I think this patch will fix the panic.

It makes nfsrv_checkgetattr() return
as if there is no delegation when the
ClientID for the delegation is unconfirmed.
Comment 2 Rick Macklem freebsd_committer freebsd_triage 2020-09-05 23:21:40 UTC
I believe that nfsrv_opencheck() will have checked for
an unconfirmed ClientID.  As such, I think the ClientID
would be confirmed when the Open and Delegation was issued.

However, a reboot of the client (or a broken client that
for some reason did a SetClientID after acquiring a ClientID)
would set the ClientID back to unconfirmed.
--> This implies any delegation on the ClientID is no longer
    valid (it is still there, since old state remains until
    the SetClientIDConfirm is done to complete the handshake
    that (re)creates the ClientID).

I think the bug was nfsrv_checkgetattr() assuming that the
delegation it found was valid, because it did not check
for an unconfirmed ClientID.

The patch in the attachment adds that check and should avoid
the panic.
Comment 3 commit-hook freebsd_committer freebsd_triage 2020-09-14 00:45:31 UTC
A commit references this bug:

Author: rmacklem
Date: Mon Sep 14 00:44:50 UTC 2020
New revision: 365703
URL: https://svnweb.freebsd.org/changeset/base/365703

Log:
  Fix a case where the NFSv4.0 server might crash if delegations are enabled.

  asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
  kdb_backtrace
  vpanic
  panic
  nfsrv_docallback
  nfsrv_checkgetattr
  nfsrvd_getattr
  nfsrvd_dorpc
  nfssvc_program
  svc_run_internal
  svc_thread_start
  fork_exit
  fork_trampoline
  where the panic message was "docallb", which indicates that a callback
  was attempted when the ClientID is unconfirmed.
  This would not normally occur, but it is possible to have an unconfirmed
  ClientID structure with delegation structure(s) chained off it if the
  client were to issue a SetClientID with the same "id" but different
  "verifier" after acquiring delegations on the previously confirmed ClientID.

  The bug appears to be that nfsrv_checkgetattr() failed to check for
  this uncommon case of an unconfirmed ClientID with a delegation structure
  that no longer refers to a delegation the client knows about.

  This patch adds a check for this case, handling it as if no delegation
  exists, which is the case when the above occurs.
  Although difficult to reproduce, this change should avoid the panic().

  PR:		249127
  Reported by:	asomers
  Reviewed by:	asomers
  MFC after:	1 week
  Differential Revision:	https://reviews.freebbsd.org/D26342

Changes:
  head/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 4 Rick Macklem freebsd_committer freebsd_triage 2020-09-14 00:47:11 UTC
Patch has been committed to head and will be MFC'd.
Comment 5 commit-hook freebsd_committer freebsd_triage 2020-09-21 00:50:34 UTC
A commit references this bug:

Author: rmacklem
Date: Mon Sep 21 00:50:32 UTC 2020
New revision: 365934
URL: https://svnweb.freebsd.org/changeset/base/365934

Log:
  MFC: r365703
  Fix a case where the NFSv4.0 server might crash if delegations are enabled.

  asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
  kdb_backtrace
  vpanic
  panic
  nfsrv_docallback
  nfsrv_checkgetattr
  nfsrvd_getattr
  nfsrvd_dorpc
  nfssvc_program
  svc_run_internal
  svc_thread_start
  fork_exit
  fork_trampoline
  where the panic message was "docallb", which indicates that a callback
  was attempted when the ClientID is unconfirmed.
  This would not normally occur, but it is possible to have an unconfirmed
  ClientID structure with delegation structure(s) chained off it if the
  client were to issue a SetClientID with the same "id" but different
  "verifier" after acquiring delegations on the previously confirmed ClientID.

  The bug appears to be that nfsrv_checkgetattr() failed to check for
  this uncommon case of an unconfirmed ClientID with a delegation structure
  that no longer refers to a delegation the client knows about.

  This patch adds a check for this case, handling it as if no delegation
  exists, which is the case when the above occurs.
  Although difficult to reproduce, this change should avoid the panic().

  PR:		249127

Changes:
_U  stable/12/
  stable/12/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 6 commit-hook freebsd_committer freebsd_triage 2020-09-21 01:39:44 UTC
A commit references this bug:

Author: rmacklem
Date: Mon Sep 21 01:39:01 UTC 2020
New revision: 365935
URL: https://svnweb.freebsd.org/changeset/base/365935

Log:
  MFC: r365703
  Fix a case where the NFSv4.0 server might crash if delegations are enabled.

  asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
  kdb_backtrace
  vpanic
  panic
  nfsrv_docallback
  nfsrv_checkgetattr
  nfsrvd_getattr
  nfsrvd_dorpc
  nfssvc_program
  svc_run_internal
  svc_thread_start
  fork_exit
  fork_trampoline
  where the panic message was "docallb", which indicates that a callback
  was attempted when the ClientID is unconfirmed.
  This would not normally occur, but it is possible to have an unconfirmed
  ClientID structure with delegation structure(s) chained off it if the
  client were to issue a SetClientID with the same "id" but different
  "verifier" after acquiring delegations on the previously confirmed ClientID.

  The bug appears to be that nfsrv_checkgetattr() failed to check for
  this uncommon case of an unconfirmed ClientID with a delegation structure
  that no longer refers to a delegation the client knows about.

  This patch adds a check for this case, handling it as if no delegation
  exists, which is the case when the above occurs.
  Although difficult to reproduce, this change should avoid the panic().

  PR:		249127

Changes:
_U  stable/11/
  stable/11/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 7 commit-hook freebsd_committer freebsd_triage 2020-09-24 14:59:21 UTC
A commit references this bug:

Author: rmacklem
Date: Thu Sep 24 14:59:10 UTC 2020
New revision: 366116
URL: https://svnweb.freebsd.org/changeset/base/366116

Log:
  MFS: r365703
  Fix a case where the NFSv4.0 server might crash if delegations are enabled.

  asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
  kdb_backtrace
  vpanic
  panic
  nfsrv_docallback
  nfsrv_checkgetattr
  nfsrvd_getattr
  nfsrvd_dorpc
  nfssvc_program
  svc_run_internal
  svc_thread_start
  fork_exit
  fork_trampoline
  where the panic message was "docallb", which indicates that a callback
  was attempted when the ClientID is unconfirmed.
  This would not normally occur, but it is possible to have an unconfirmed
  ClientID structure with delegation structure(s) chained off it if the
  client were to issue a SetClientID with the same "id" but different
  "verifier" after acquiring delegations on the previously confirmed ClientID.

  The bug appears to be that nfsrv_checkgetattr() failed to check for
  this uncommon case of an unconfirmed ClientID with a delegation structure
  that no longer refers to a delegation the client knows about.

  This patch adds a check for this case, handling it as if no delegation
  exists, which is the case when the above occurs.
  Although difficult to reproduce, this change should avoid the panic().

  PR:		249127
  Approved by:	re (gjb)

Changes:
_U  releng/12.2/
  releng/12.2/sys/fs/nfsserver/nfs_nfsdstate.c
Comment 8 Rick Macklem freebsd_committer freebsd_triage 2020-09-24 15:03:24 UTC
Patch has been MFC'd and MFS's to releng/12.2.