Bug 181834 - [nfs] amd mounting NFS directories can drive a dead-lock [regression]
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-05 10:50 UTC by jcharbon
Modified: 2017-12-31 22:27 UTC (History)
0 users

See Also:


Description jcharbon 2013-09-05 10:50:00 UTC
Short Summary:

On FreeBSD 8.4, the amd auto-mounter daemon can deadlock the machine when mounting NFS directories.

Long Summary:

As soon as the amd daemon starts, the machine appears deadlocked:

- ssh connections are stalled
- virtual consoles are stalled
- the serial console is stalled

However, the machine still replies to ping, so the kernel is still alive.  Entering the DDB kernel debugger via the hardware NMI button during the deadlock gave us this status:

syslogd and devd are waiting for the Giant kernel lock:

db> show allchains
chain 1:
   thread 100171 (pid 885, syslogd) blocked on lock 0xffffffff80e2e100 (sleep mutex) "Giant"
   thread 100148 (pid 1120, amd) running on CPU 9
chain 2:
   thread 100205 (pid 742, devd) blocked on lock 0xffffffff80e2e100 (sleep mutex) "Giant"
   thread 100148 (pid 1120, amd) running on CPU 9

which is owned by the amd daemon:

db> show lock Giant
   class: sleep mutex
   name: Giant
   flags: {DEF, RECURSE}
   state: {OWNED, CONTESTED, RECURSED}
   owner: 0xffffff003ef0f8e0 (tid 100148, pid 1120, "amd")
   recursed: 1
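
For readers who want to check this by script rather than by eye, here is a minimal sketch of how the `show allchains` text above can be summarized. It assumes that, within one chain, each thread line names the owner of the lock that the previous line is blocked on; the function name `summarize_chains` and the regular expressions are illustrative and are not part of DDB.

```python
import re
from collections import defaultdict

# Text copied from the `show allchains` output above, with wrapped lines rejoined.
ALLCHAINS = """\
chain 1:
   thread 100171 (pid 885, syslogd) blocked on lock 0xffffffff80e2e100 (sleep mutex) "Giant"
   thread 100148 (pid 1120, amd) running on CPU 9
chain 2:
   thread 100205 (pid 742, devd) blocked on lock 0xffffffff80e2e100 (sleep mutex) "Giant"
   thread 100148 (pid 1120, amd) running on CPU 9
"""

THREAD = re.compile(r'thread (\d+) \(pid (\d+), ([^)]+)\)')
BLOCKED_ON = re.compile(r'blocked on lock 0x[0-9a-f]+ \([^)]*\) "([^"]+)"')


def summarize_chains(text):
    """Return (waiters, owners): lock name -> set of process names.

    Assumption: inside a chain, the thread on the next line owns the
    lock that the thread on the previous line is blocked on.
    """
    waiters = defaultdict(set)
    owners = defaultdict(set)
    prev_lock = None  # lock the previous thread line was blocked on
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("chain"):
            prev_lock = None
            continue
        thread = THREAD.search(line)
        if not thread:
            continue
        name = thread.group(3)
        if prev_lock is not None:
            owners[prev_lock].add(name)
        blocked = BLOCKED_ON.search(line)
        if blocked:
            waiters[blocked.group(1)].add(name)
            prev_lock = blocked.group(1)
        else:
            prev_lock = None
    return dict(waiters), dict(owners)


if __name__ == "__main__":
    waiters, owners = summarize_chains(ALLCHAINS)
    for lock in waiters:
        print(lock, "waiters:", sorted(waiters[lock]), "owner:", sorted(owners[lock]))
```

On the sample above this reports syslogd and devd as waiters on Giant and amd as its owner, matching the `show lock Giant` output.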
Another trace, taken with the witness kernel (kernel-witness), shows that this amd thread also holds other kernel locks:

db> show alllocks
Process 1120 (amd) thread 0xffffff003ef0f8e0 (100148)
exclusive rw udpinp (udpinp) r = 0 (0xffffff007b39fa60) locked @ /app/jcharbon/git/freebsd-vrsn/sys/netinet/in_pcb.c:237
exclusive rw udp (udp) r = 0 (0xffffffff80ff4d28) locked @ /app/jcharbon/git/freebsd-vrsn/sys/netinet/udp_usrreq.c:1464
exclusive lockmgr nfs (nfs) r = 0 (0xffffff007b1bd7e8) locked @ /app/jcharbon/git/freebsd-vrsn/sys/nfsclient/nfs_node.c:166
exclusive sleep mutex Giant (Giant) r = 1 (0xffffffff80e2e100) locked @ /app/jcharbon/git/freebsd-vrsn/sys/kern/vfs_mount.c:730

Next, we entered DDB directly from the kernel NFS code, which gave us this backtrace:

Tracing pid 1142 tid 100301 td 0xffffff00379098e0
kvprintf() at kvprintf+0x17a
nfs_msg() at nfs_msg+0x52
nfs_feedback() at nfs_feedback+0x105
clnt_reconnect_call() at clnt_reconnect_call+0x19b
nfs_request() at nfs_request+0x1e5
nfs_getattr() at nfs_getattr+0x2bc
mountnfs() at mountnfs+0x330
nfs_mount() at nfs_mount+0xe3f
vfs_donmount() at vfs_donmount+0xcde
kernel_mount() at kernel_mount+0xa1
nfs_cmount() at nfs_cmount+0x5a
mount() at mount+0x1ea
amd64_syscall() at amd64_syscall+0xf9
Xfast_syscall() at Xfast_syscall+0xfc
--- syscall (21, FreeBSD ELF64, mount), rip = 0x8007cea4c, rsp = 0x7fffffffdd88, rbp = 0x2 ---
db> next
db> next
db> next
..

Many 'next' debugger commands later, we are back at the same place in nfs_request().

At that point, the mountnfs() call loops forever in nfs_request() and never releases the Giant and (nfs) kernel locks.
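
To illustrate why one looping thread is enough to stall everything else, here is a small userland analogy, not the kernel code. A `threading.Lock` named `giant` stands in for the Giant mutex, and `mount_with_retries` is a hypothetical stand-in for a mount path that retries a request while holding it. The `stop` event and the acquire timeout exist only so that the demo terminates; threads blocked on the real Giant wait indefinitely.

```python
import threading
import time

giant = threading.Lock()  # stand-in for the kernel's Giant mutex


def mount_with_retries(request, stop):
    """Hold `giant` while retrying `request` until it succeeds.

    `stop` is a demo-only escape hatch; the loop described in this
    report has no equivalent exit.
    """
    attempts = 0
    with giant:
        while not stop.is_set():
            attempts += 1
            if request():
                break
            time.sleep(0.01)  # retry; the lock stays held throughout
    return attempts


def waiter_gets_lock(timeout):
    """Return True if another thread can take `giant` within `timeout` seconds."""
    got = giant.acquire(timeout=timeout)
    if got:
        giant.release()
    return got


def demo():
    """Return (blocked_while_looping, free_after_stop)."""
    stop = threading.Event()
    started = threading.Event()

    def never_succeeds():
        started.set()  # signals that the lock is held and the loop is running
        return False

    looper = threading.Thread(target=mount_with_retries, args=(never_succeeds, stop))
    looper.start()
    started.wait()
    blocked = not waiter_gets_lock(0.2)  # plays the role of syslogd or devd
    stop.set()
    looper.join()
    free_again = waiter_gets_lock(0.2)
    return blocked, free_again


if __name__ == "__main__":
    print(demo())
```

While the loop runs, the second thread cannot take the lock; once the loop exits, it can. This mirrors how syslogd and devd stay blocked on Giant for as long as amd keeps retrying.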

How-To-Repeat: Use the following amd configuration files. /etc/amd.conf:

$ cat /etc/amd.conf
[global]
browsable_dirs = no
map_type = file
mount_type = nfs
search_path = /etc
auto_dir = /.amd
cache_duration = 30
log_file = syslog:daemon
log_options = fatal,error
print_pid = yes
pid_file = /var/run/amd.pid
restart_mounts = yes
selectors_in_defaults = no

[/nfs/home]
map_name = /etc/home.map
$

and /etc/home.map:

$ cat /etc/home.map
/defaults type:=nfs;opts:=tcp,intr,nosuid;rhost:=1.2.3.4
* rfs:=/dev/${key};fs:=${autodir}/nfs/home/${key}
$
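
As a rough guide to what this map asks amd to mount, the sketch below expands the wildcard entry for one lookup key. It is a simplified illustration and not amd's actual parser: it only merges the `/defaults` line with the `*` entry and substitutes `${key}` and `${autodir}`. The key `alice` is a hypothetical example, and `/.amd` is the auto_dir value from the amd.conf above.

```python
from string import Template

DEFAULTS = "type:=nfs;opts:=tcp,intr,nosuid;rhost:=1.2.3.4"
WILDCARD = "rfs:=/dev/${key};fs:=${autodir}/nfs/home/${key}"


def parse_opts(entry):
    """Split 'a:=b;c:=d' into {'a': 'b', 'c': 'd'}."""
    return dict(item.split(":=", 1) for item in entry.split(";") if item)


def expand(key, autodir, defaults=DEFAULTS, wildcard=WILDCARD):
    """Merge defaults with the wildcard entry and fill in ${key} and ${autodir}."""
    merged = {**parse_opts(defaults), **parse_opts(wildcard)}
    variables = {"key": key, "autodir": autodir}
    return {name: Template(value).substitute(variables)
            for name, value in merged.items()}


if __name__ == "__main__":
    # "alice" is a made-up key; any name looked up under /nfs/home would do.
    for name, value in expand("alice", "/.amd").items():
        print(f"{name} = {value}")
```

Under these assumptions, a lookup of `/nfs/home/alice` would become an NFS mount of `/dev/alice` from host 1.2.3.4 onto `/.amd/nfs/home/alice`, using the tcp, intr and nosuid options.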

Then simply running:

# /etc/rc.d/amd onestart

triggers the deadlock.
Comment 1 jcharbon 2013-09-05 13:13:01 UTC
This issue is reproducible on a fresh install of FreeBSD 8.4-RELEASE. However, it is _not_ reproducible on a fresh install of FreeBSD 8.3-RELEASE.

--
Julien Charbon
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2013-09-06 01:57:34 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 3 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:00:47 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped