Host is 10.2-BETA2 running r285810. Host has four NFS mounts from two separate NetApp filers. One of those filers was rebooted, which caused the mount to stop responding (and produced a LOR which I believe is unrelated, see bug 203133). When the filer became available again, a client got this:

Sep 15 07:14:48 client kernel: lock order reversal:
Sep 15 07:14:48 client kernel: 1st 0xfffff8000d704d50 newnfs (newnfs) @ /space/freebsd/stable/10/sys/nlm/nlm_advlock.c:500
Sep 15 07:14:48 client kernel: 2nd 0xffffffff81c694d8 allproc (allproc) @ /space/freebsd/stable/10/sys/kern/kern_proc.c:309
Sep 15 07:14:48 client kernel: KDB: stack backtrace:
Sep 15 07:14:48 client kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0469a137a0
Sep 15 07:14:48 client kernel: kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0469a13850
Sep 15 07:14:48 client kernel: witness_checkorder() at witness_checkorder+0xe24/frame 0xfffffe0469a138e0
Sep 15 07:14:48 client kernel: _sx_slock() at _sx_slock+0x76/frame 0xfffffe0469a13920
Sep 15 07:14:48 client kernel: pfind() at pfind+0x22/frame 0xfffffe0469a13940
Sep 15 07:14:48 client kernel: nlm_set_creds_for_lock() at nlm_set_creds_for_lock+0xb4/frame 0xfffffe0469a13970
Sep 15 07:14:48 client kernel: nlm_client_recover_lock() at nlm_client_recover_lock+0x61/frame 0xfffffe0469a139b0
Sep 15 07:14:48 client kernel: lf_iteratelocks_sysid() at lf_iteratelocks_sysid+0x194/frame 0xfffffe0469a13a10
Sep 15 07:14:48 client kernel: nlm_client_recovery() at nlm_client_recovery+0x51/frame 0xfffffe0469a13a50
Sep 15 07:14:48 client kernel: nlm_client_recovery_start() at nlm_client_recovery_start+0x35/frame 0xfffffe0469a13a70
Sep 15 07:14:48 client kernel: fork_exit() at fork_exit+0x84/frame 0xfffffe0469a13ab0
Sep 15 07:14:48 client kernel: fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0469a13ab0
Sep 15 07:14:48 client kernel: --- trap 0xc, rip = 0x800cb747a, rsp = 0x7fffffffeb78, rbp = 0x7fffffffed00 ---
Sep 15 07:14:49 client rpc.statd: Unsolicited notification from host netapp-14
Sep 15 07:14:51 client rpc.statd: Unsolicited notification from host netapp-126
Sep 15 07:14:57 client kernel: newnfs server netapp:/vol/filestore: is alive again

Unfortunately, I'm unlikely to be able to recreate this one at will.

Gavin

The info in the report doesn't seem to show where the locks are acquired in the other order.

The NFS client (for NFSv4) and the NLM first lock the NFS vnode and then the "proc" related lock(s). Since I do not believe that the NFS subsystem or the NLM (not really a part of NFS, but a separate protocol) ever locks the proc structure first and then a vnode, I don't think any deadlock can occur.

If someone knows of a way that the generic kernel code could lock an NFS client vnode after acquiring a proc lock, please let me know.

I do know that it isn't practical to "fix" these LORs, but I do not believe that they can cause deadlocks. If I find out where harmless LORs are listed, I'll add these.

rick

ps: Although 203133 isn't the same LOR, the same story applies to both.

*** This bug has been marked as a duplicate of bug 203133 ***
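
For anyone following the reasoning above, here is a rough sketch of the argument, written as a small Python model rather than kernel code. It is not how WITNESS is implemented. It assumes each code path is a list of locks acquired in nested fashion (each one still held when the next is taken), and it only looks for two-lock reversals, not longer cycles. The first path is the order shown in the backtrace (newnfs, then allproc). The second path is hypothetical, invented only to show what would have to exist for a deadlock to be possible; nothing in the report shows such a path. The names lock_orders and reversed_pairs are made up for this sketch.

```python
from itertools import combinations


def lock_orders(path):
    """Return every (earlier, later) lock pair for one path of nested acquisitions."""
    return {(a, b) for a, b in combinations(path, 2) if a != b}


def reversed_pairs(paths):
    """Return the lock pairs that are acquired in both orders across all paths."""
    seen = set()
    for path in paths.values():
        seen |= lock_orders(path)
    # A pair is a deadlock candidate only if the opposite order was also seen.
    return {tuple(sorted(pair)) for pair in seen if (pair[1], pair[0]) in seen}


# Order taken from the backtrace in the report: the NLM recovery thread
# holds the newnfs vnode lock and then takes allproc via pfind().
nlm_only = {"nlm_client_recover_lock": ["newnfs", "allproc"]}

# HYPOTHETICAL path, invented for illustration only: some code that would
# take allproc first and then an NFS vnode lock.
with_reverse = dict(nlm_only, hypothetical_path=["allproc", "newnfs"])

print(reversed_pairs(nlm_only))      # no pair is taken in both orders
print(reversed_pairs(with_reverse))  # the newnfs/allproc pair is flagged
```

In this model, as long as every path takes the vnode lock before the proc lock, no pair is ever flagged. The pair only becomes a deadlock candidate once some path takes the locks in the opposite order, which is the case asked about above.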