Bug 86427 - [lor] Deadlock with FASTIPSEC and nat
Status: Closed Works As Intended
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 6.0-BETA5
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-net (Nobody)
Reported: 2005-09-21 21:10 UTC by mike
Modified: 2015-07-26 08:54 UTC
Description mike 2005-09-21 21:10:05 UTC
	A box running OpenVPN alongside a FastIPSEC tunnel will deadlock under the right conditions.

Fix: 

	Using pf instead of ipfw seems to work around the issue, although I don't know
	whether that just makes it harder to trigger.

	Using regular IPSEC seems to work just fine, but then
	debug.mpsafenet is forced to 0, as ipsec requires Giant.
How-To-Repeat: 	Client machines come in via tun0 (the OpenVPN interface)
	and are NATed going out vlan7:
	/sbin/ipfw add 3400 divert natd ip from any to any via vlan7
	/sbin/natd -unregistered_only -n vlan7
	/sbin/ifconfig gif0 create tunnel xxx.yyy.zzz.86 aaa.bbb.ccc.118
	/sbin/ifconfig gif0 10.44.99.2 netmask 255.255.255.252 10.44.99.1 netmask 255.255.255.252

Then we add a simple IPSEC policy across the gif tunnel:
setkey -c <<EOF
add xxx.yyy.zzz.86 aaa.bbb.ccc.118 esp 1044 -m any -E blowfish-cbc "JOy1QIbr8swoTIiuZYCtkDiCSHeI1eb48rux7IzwbEXyA";
add aaa.bbb.ccc.118 xxx.yyy.zzz.86 esp 1044 -m any -E blowfish-cbc "JOy1QIbr8swoTIiuZYCtkDiCSHeI1eb48rux7IzwbEXyA";
spdadd xxx.yyy.zzz.86/32 aaa.bbb.ccc.118/32 any -P out ipsec esp/tunnel/xxx.yyy.zzz.86-aaa.bbb.ccc.118/require;
spdadd aaa.bbb.ccc.118/32 xxx.yyy.zzz.86/32 any -P in  ipsec esp/tunnel/aaa.bbb.ccc.118-xxx.yyy.zzz.86/require;
EOF


I then bring up the OpenVPN tunnels (just a few test clients).  I can then trigger the deadlock by
doing an scp across the gif tunnel.

On the serial console I see:


lock order reversal
 1st 0xc292a090 inp (divinp) @ /usr/src/sys/netinet/ip_divert.c:327
 2nd 0xc28bc950 ipsec request (ipsec request) @ /usr/src/sys/netipsec/ipsec_output.c:354
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c075ed70,c075ed98,c07261c4) at kdb_backtrace+0x29
witness_checkorder(c28bc950,9,c06f7aa6,162) at witness_checkorder+0x564
_mtx_lock_flags(c28bc950,0,c06f7aa6,162,0) at _mtx_lock_flags+0x5b
ipsec4_process_packet(c2a4a400,c28bc900,22,0,c28b7600) at ipsec4_process_packet+0x45
ip_output(c2a4a400,0,e741eb28,22,0) at ip_output+0x74f
div_output(c2926b20,c2a4a400,c255a280,0,e741ec08) at div_output+0x185
div_send(c2926b20,0,c2a4a400,c255a280,0) at div_send+0x3f
sosend(c2926b20,c255a280,e741ec3c,c2a4a400,0) at sosend+0x5e3
kern_sendit(c2731600,3,e741ecbc,0,0) at kern_sendit+0x104
sendit(c2731600,3,e741ecbc,0,bfbdec1c) at sendit+0x163
sendto(c2731600,e741ed04,6,2,296) at sendto+0x4d
syscall(3b,3b,3b,2,7c) at syscall+0x22f
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (133, FreeBSD ELF32, sendto), eip = 0x280c5d97, esp = 0xbfbdeb0c, ebp = 0xbfbeebb8 ---
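
A rough user-space sketch of the kind of check that produces a "lock order reversal" report like the one above. This is not the kernel's witness code: the enum names, the `order` table and `lor_acquire()` are all hypothetical, and the first two acquisitions used in the usage note are only an assumed history of the other code path. Each time a lock is taken while another is held, the pair is recorded; a later acquisition in the opposite direction is flagged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the locks in the report above. */
enum { LK_DIV, LK_DIVINP, LK_IPSEC_REQUEST, MAX_LOCKS };

/* order[a][b] is true once lock a has been held while acquiring lock b. */
static bool order[MAX_LOCKS][MAX_LOCKS];

/* Is there a recorded chain of acquisitions leading from 'from' to 'to'? */
bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++)
        if (order[from][next] && !seen[next] && reaches(next, to, seen))
            return true;
    return false;
}

/*
 * Record that 'second' is acquired while 'first' is held.
 * Returns true if this contradicts an order recorded earlier (a reversal);
 * in that case the new pair is not recorded.
 */
bool lor_acquire(int first, int second)
{
    bool seen[MAX_LOCKS] = { false };

    assert(first >= 0 && first < MAX_LOCKS);
    assert(second >= 0 && second < MAX_LOCKS);
    assert(first != second);

    if (reaches(second, first, seen))
        return true;
    order[first][second] = true;
    return false;
}
```

If one path has already taken `LK_IPSEC_REQUEST` before `LK_DIV`, and `LK_DIV` before `LK_DIVINP`, then `lor_acquire(LK_DIVINP, LK_IPSEC_REQUEST)` returns true, which corresponds to the "1st divinp, 2nd ipsec request" pair in the report.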


At this point the box is frozen.  On the serial console, I can break into the debugger:


telnet> send break
KDB: enter: Line break on console
[thread pid 12 tid 100004 ]
Stopped at      kdb_enter+0x2b: nop     
db> 


db> show pcpu
cpuid        = 0
curthread    = 0xc22a9900: pid 12 "idle: cpu0"
curpcb       = 0xe3481d90
fpcurthread  = none
idlethread   = 0xc22a9900: pid 12 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x50
spin locks held:
db> show alllocks
Process 682 (ssh) thread 0xc2aa6000 (100137)
exclusive sleep mutex crypto (crypto op queues) r = 0 (0xc07aa420) locked @ /usr/src/sys/opencrypto/crypto.c:669
exclusive sleep mutex ipsec request r = 1 (0xc28bc950) locked @ /usr/src/sys/netipsec/xform_esp.c:872
exclusive sleep mutex inp (tcpinp) r = 0 (0xc28c6a68) locked @ /usr/src/sys/netinet/tcp_usrreq.c:651
Process 586 (natd) thread 0xc2731600 (100080)
exclusive sleep mutex inp (divinp) r = 0 (0xc292a090) locked @ /usr/src/sys/netinet/ip_divert.c:327
exclusive sleep mutex div r = 0 (0xc07a0b6c) locked @ /usr/src/sys/netinet/ip_divert.c:325
Process 37 (swi4: clock sio) thread 0xc2317d80 (100036)
exclusive sleep mutex tcp r = 0 (0xc07a1ecc) locked @ /usr/src/sys/netinet/tcp_timer.c:457
db> ps
  pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  682 c2aa4830    0   681   681 0004002 [LOCK    div c29f57c0] ssh
  681 c2aa4a3c    0   679   681 0004002 [SLPQ piperd 0xc272f000][SLP] scp
  679 c2894830    0   678   679 0004002 [SLPQ pause 0xc2894864][SLP] csh
  678 c2894624 1001   675   678 0004102 [SLPQ wait 0xc2894624][SLP] su
  675 c26efa3c 1001   674   675 0004002 [SLPQ pause 0xc26efa70][SLP] csh
  674 c2894a3c 1001   672   672 0000100 [SLPQ select 0xc079d3c4][SLP] sshd
  672 c2734830    0   523   672 0004100 [SLPQ sbwait 0xc2a2920c][SLP] sshd
  663 c29f320c    0   657   663 0004102 [SLPQ select 0xc079d3c4][SLP] openvpn
  657 c2894c48    0   656   657 0004002 [SLPQ pause 0xc2894c7c][SLP] csh
  656 c29f3000 1001   653   656 0004102 [SLPQ wait 0xc29f3000][SLP] su
  653 c29f3624 1001   652   653 0004002 [SLPQ pause 0xc29f3658][SLP] csh
  652 c26ec624 1001   650   650 0000100 [SLPQ select 0xc079d3c4][SLP] sshd
  650 c29f3a3c    0   523   650 0004100 [SLPQ sbwait 0xc2a2a20c][SLP] sshd
  649 c29f3c48    0     1   649 0004002 [SLPQ ttyin 0xc254d410][SLP] getty
  648 c29f7000    0     1   648 0004002 [SLPQ ttyin 0xc254f010][SLP] getty
  647 c29f720c    0     1   647 0004002 [SLPQ ttyin 0xc254c410][SLP] getty
  646 c29f7418    0     1   646 0004002 [SLPQ ttyin 0xc254fc10][SLP] getty
  645 c29f7624    0     1   645 0004002 [SLPQ ttyin 0xc2555010][SLP] getty
  644 c29f7830    0     1   644 0004002 [SLPQ ttyin 0xc254f810][SLP] getty
  643 c2734000    0     1   643 0004002 [SLPQ ttyin 0xc2555810][SLP] getty
  642 c2733000    0     1   642 0004002 [SLPQ ttyin 0xc2557010][SLP] getty
  641 c26efc48    0     1   641 0004002 [SLPQ ttyin 0xc2556010][SLP] getty
  595 c2730a3c    0     1    64 0000002 [SLPQ select 0xc079d3c4][SLP] 3dm2
  586 c2730624    0     1   586 0000000 [LOCK ipsec request c26bacc0] natd
  570 c2734624    0     1   570 0000101 [SLPQ select 0xc079d3c4][SLP] bgpd
  568 c289020c    0     1   568 0000101 [SLPQ select 0xc079d3c4][SLP] zebra
  545 c2734c48    0     1   545 0000000 [SLPQ nanslp 0xc07500ac][SLP] cron
  532 c2733830   25     1   532 0000100 [SLPQ pause 0xc2733864][SLP] sendmail
  528 c2730418    0     1   528 0000100 [SLPQ select 0xc079d3c4][SLP] sendmail
  523 c2733c48    0     1   523 0000100 [SLPQ select 0xc079d3c4][SLP] sshd
  521 c2890c48    0   505   505 0000000 [SLPQ pause 0xc2890c7c][SLP] ntpd
  505 c26eca3c    0     1   505 0000000 [SLPQ select 0xc079d3c4][SLP] ntpd
  407 c2890624   53     1   407 0000100 [SLPQ select 0xc079d3c4][SLP] named
  346 c2890000    0     1   346 0000000 [SLPQ select 0xc079d3c4][SLP] syslogd
  313 c26ec000    0     1   313 0000000 [SLPQ select 0xc079d3c4][SLP] devd
  191 c26ef000    0     1   191 0000000 [SLPQ pause 0xc26ef034][SLP] adjkerntz
   63 c238d20c    0     0     0 0000204 [SLPQ - 0xe4f4fd04][SLP] schedcpu
   62 c238d418    0     0     0 0000204 [SLPQ - 0xc07a502c][SLP] nfsiod 3
   61 c238d624    0     0     0 0000204 [SLPQ - 0xc07a5028][SLP] nfsiod 2
   60 c238d830    0     0     0 0000204 [SLPQ - 0xc07a5024][SLP] nfsiod 1
   59 c238da3c    0     0     0 0000204 [SLPQ - 0xc07a5020][SLP] nfsiod 0
   58 c238dc48    0     0     0 0000204 [SLPQ vlruwt 0xc238dc48][SLP] vnlru
   57 c23f4000    0     0     0 0000204 [SLPQ syncer 0xc074fe1c][SLP] syncer
   56 c23f420c    0     0     0 0000204 [SLPQ psleep 0xc079d90c][SLP] bufdaemon
   55 c23f4418    0     0     0 000020c [SLPQ pgzero 0xc07ab684][SLP] pagezero
   54 c23f4624    0     0     0 0000204 [SLPQ psleep 0xc07ab1d4][SLP] vmdaemon
   53 c23f4830    0     0     0 0000204 [SLPQ psleep 0xc07ab190][SLP] pagedaemon
   52 c23f4a3c    0     0     0 0000204 [IWAIT] swi0: sio
   51 c23f4c48    0     0     0 0000204 [SLPQ usbevt 0xc238b210][SLP] usb4
   50 c2316624    0     0     0 0000204 [SLPQ usbevt 0xc23d4210][SLP] usb3
   49 c2316830    0     0     0 0000204 [SLPQ usbevt 0xc23c0210][SLP] usb2
   48 c2316a3c    0     0     0 0000204 [SLPQ usbevt 0xc2394210][SLP] usb1
   47 c2316c48    0     0     0 0000204 [SLPQ usbtsk 0xc074ad64][SLP] usbtask
   46 c238c000    0     0     0 0000204 [SLPQ usbevt 0xc2398210][SLP] usb0
   45 c238c20c    0     0     0 0000204 [IWAIT] swi6: task queue
   44 c238c418    0     0     0 0000204 [SLPQ - 0xc233a980][SLP] acpi_task2
   43 c238c624    0     0     0 0000204 [SLPQ - 0xc233a980][SLP] acpi_task1
    9 c238c830    0     0     0 0000204 [SLPQ - 0xc233a980][SLP] acpi_task0
    8 c238ca3c    0     0     0 0000204 [SLPQ - 0xc233aa00][SLP] kqueue taskq
   42 c238cc48    0     0     0 0000204 [IWAIT] swi2: cambio
   41 c238d000    0     0     0 0000204 [IWAIT] swi5:+
    7 c2309c48    0     0     0 0000204 [SLPQ - 0xc233ac80][SLP] thread taskq
   40 c2315000    0     0     0 0000204 [IWAIT] swi6:+
   39 c231520c    0     0     0 0000204 [SLPQ - 0xc074a5a0][SLP] yarrow
    6 c2315418    0     0     0 0000204 [SLPQ - 0xc074d5a8][SLP] g_down
    5 c2315624    0     0     0 0000204 [SLPQ - 0xc074d5a4][SLP] g_up
    4 c2315830    0     0     0 0000204 [SLPQ - 0xc074d59c][SLP] g_event
    3 c2315a3c    0     0     0 0000204 [SLPQ crypto_ret_wait 0xc07aa444][SLP] crypto returns
    2 c2315c48    0     0     0 0000204 [SLPQ crypto_wait 0xc07aa404][SLP] crypto
   38 c2316000    0     0     0 0000204 [IWAIT] swi3: vm
   37 c231620c    0     0     0 000020c [LOCK    inp c22f8dc0] swi4: clock sio
   36 c2316418    0     0     0 0000204 [LOCK    div c29f57c0] swi1: net
   35 c2300624    0     0     0 0000204 [IWAIT] irq23: uhci0 ehci0
   34 c2300830    0     0     0 0000204 [IWAIT] irq22: em0
   33 c2300a3c    0     0     0 0000204 [IWAIT] irq21: twe0
   32 c2300c48    0     0     0 0000204 [IWAIT] irq20: fxp0
   31 c2309000    0     0     0 0000204 [IWAIT] irq19: uhci1++
   30 c230920c    0     0     0 0000204 [IWAIT] irq18: uhci2
   29 c2309418    0     0     0 0000204 [IWAIT] irq17: fwohci0
   28 c2309624    0     0     0 0000204 [IWAIT] irq16: uhci3
   27 c2309830    0     0     0 0000204 [IWAIT] irq15: ata1
   26 c2309a3c    0     0     0 0000204 [IWAIT] irq14: ata0
   25 c22ad20c    0     0     0 0000204 [IWAIT] irq13:
   24 c22ad418    0     0     0 0000204 [IWAIT] irq12:
   23 c22ad624    0     0     0 0000204 [IWAIT] irq11:
   22 c22ad830    0     0     0 0000204 [IWAIT] irq10:
   21 c22ada3c    0     0     0 0000204 [IWAIT] irq9: acpi0
   20 c22adc48    0     0     0 0000204 [IWAIT] irq8:
   19 c2300000    0     0     0 0000204 [IWAIT] irq7: ppc0
   18 c230020c    0     0     0 0000204 [IWAIT] irq6:
   17 c2300418    0     0     0 0000204 [IWAIT] irq5:
   16 c22a8000    0     0     0 0000204 [IWAIT] irq4: sio0
   15 c22a820c    0     0     0 0000204 [IWAIT] irq3:
   14 c22a8418    0     0     0 0000204 [IWAIT] irq0:
   13 c22a8624    0     0     0 0000204 [IWAIT] irq1: atkbd0
   12 c22a8830    0     0     0 000020c [CPU 0] idle: cpu0
   11 c22a8a3c    0     0     0 000020c [CPU 1] idle: cpu1
    1 c22a8c48    0     0     1 0004200 [SLPQ wait 0xc22a8c48][SLP] init
   10 c22ad000    0     0     0 0000204 [SLPQ ktrace 0xc074dff8][SLP] ktrace
    0 c074d6a0    0     0     0 0000200 [IWAIT] swapper
db> show lockedvnods
Locked vnodes
db>  show lockedbufs
db> trace
Tracing pid 12 tid 100004 td 0xc22a9900
kdb_enter(c07056df) at kdb_enter+0x2b
siointr1(c254b000,c07ad5c0,0,c07054eb,56e) at siointr1+0xce
siointr(c254b000) at siointr+0x21
intr_execute_handlers(c229f490,e3481c94,4,e3481cd8,c068c0e3) at intr_execute_handlers+0xa5
lapic_handle_intr(34) at lapic_handle_intr+0x2e
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc08a18fd, esp = 0xe3481cd8, ebp = 0xe3481cd8 ---
acpi_cpu_c1(c074f780,1,e3481cf8,1,c22a8830) at acpi_cpu_c1+0x5
acpi_cpu_idle(e3481d0c,c0516a51,c05169f4,e3481d24,c0516834) at acpi_cpu_idle+0x13e
cpu_idle(c05169f4,e3481d24,c0516834,0,e3481d38) at cpu_idle+0x28
idle_proc(0,e3481d38,0,c05169f4,0) at idle_proc+0x5d
fork_exit(c05169f4,0,e3481d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe3481d6c, ebp = 0 ---
db> show lockedvnods
Locked vnodes
db>
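
To make the cycle in the debugger output easier to see, here is a small sketch that transcribes the owners from "show alllocks" and the [LOCK ...] waiters from "ps" into tables and follows the wait chain. The struct and function names are hypothetical. It also assumes that the "div" wait message in "ps" refers to the same "div" mutex that natd holds; the "inp" wait of "swi4: clock sio" is left out because the output does not say which inp lock it is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct held    { const char *lock;   const char *owner; };
struct waiting { const char *thread; const char *lock;  };

/* From "show alllocks": which process owns which lock. */
static const struct held held_locks[] = {
    { "crypto", "ssh" }, { "ipsec request", "ssh" }, { "tcpinp", "ssh" },
    { "divinp", "natd" }, { "div", "natd" },
    { "tcp", "swi4: clock sio" },
};

/* From the [LOCK ...] entries of "ps": which process waits for which lock. */
static const struct waiting waiters[] = {
    { "ssh", "div" }, { "natd", "ipsec request" }, { "swi1: net", "div" },
};

#define N_HELD    (sizeof(held_locks) / sizeof(held_locks[0]))
#define N_WAITERS (sizeof(waiters) / sizeof(waiters[0]))

const char *owner_of(const char *lock)
{
    for (size_t i = 0; i < N_HELD; i++)
        if (strcmp(held_locks[i].lock, lock) == 0)
            return held_locks[i].owner;
    return NULL;
}

const char *blocked_on(const char *thread)
{
    for (size_t i = 0; i < N_WAITERS; i++)
        if (strcmp(waiters[i].thread, thread) == 0)
            return waiters[i].lock;
    return NULL;
}

/* Follow thread -> lock it waits for -> owner; 1 if we arrive back at start. */
int in_wait_cycle(const char *start)
{
    const char *t = start;

    for (size_t hops = 0; hops <= N_WAITERS; hops++) {
        const char *lock = blocked_on(t);
        if (lock == NULL)
            return 0;           /* t is not waiting for a lock */
        t = owner_of(lock);
        if (t == NULL)
            return 0;           /* nobody recorded as holding it */
        if (strcmp(t, start) == 0)
            return 1;
    }
    return 0;
}
```

With these tables, ssh and natd are each in a cycle (ssh waits for "div" held by natd, natd waits for "ipsec request" held by ssh), while "swi1: net" is only stuck behind that cycle.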
Comment 1 Bjoern A. Zeeb 2005-09-21 23:24:22 UTC
For the archives: I added this LOR, with ID 163, to "the LOR page".
See http://sources.zabbadoz.net/freebsd/lor.html#163
Comment 2 roberthuff 2006-01-13 01:23:11 UTC
	I'm getting what seems to be either the same problem or its fraternal 
twin ... only without either IPSEC (any flavor) or any vpn.
	Running

FreeBSD 7.0-CURRENT #0: Wed Jan  4 13:41:21 EST 20

	I get this

Jan 12 19:27:51 jerusalem kernel: lock order reversal:
Jan 12 19:27:51 jerusalem kernel: 1st 0xc364a090 inp (divinp) @ /usr/src/sys/netinet/ip_divert.c:327
Jan 12 19:27:51 jerusalem kernel: 2nd 0xc07655cc in_multi_mtx (in_multi_mtx) @ /usr/src/sys/netinet/ip_output.c:291
Jan 12 19:27:51 jerusalem kernel: KDB: stack backtrace:
Jan 12 19:27:51 jerusalem kernel: kdb_backtrace(c06b7a91,c07655cc,c06b7470,c06b7470,c06c0886) at kdb_backtrace+0x2f
Jan 12 19:27:51 jerusalem kernel: witness_checkorder(c07655cc,9,c06c0886,123,c06bead6) at witness_checkorder+0x6e1
Jan 12 19:27:51 jerusalem kernel: _mtx_lock_flags(c07655cc,0,c06c0886,123,c05427bd) at _mtx_lock_flags+0x85
Jan 12 19:27:51 jerusalem kernel: ip_output(c33c0b00,0,d56e5afc,22,0) at ip_output+0x460
Jan 12 19:27:51 jerusalem kernel: div_output(c35ff000,c33c0b00,c341d970,0,d56e5bb8) at div_output+0x1d5
Jan 12 19:27:51 jerusalem kernel: div_send(c35ff000,0,c33c0b00,c341d970,0) at div_send+0x5d
Jan 12 19:27:51 jerusalem kernel: sosend(c35ff000,c341d970,d56e5be4,c33c0b00,0) at sosend+0x49e
Jan 12 19:27:51 jerusalem kernel: kern_sendit(c339c900,3,d56e5c64,0,0) at kern_sendit+0x106
Jan 12 19:27:51 jerusalem kernel: sendit(c339c900,3,d56e5c64,0,bfbeedb0) at sendit+0x1a8
Jan 12 19:27:51 jerusalem kernel: sendto(c339c900,d56e5d04,18,43c,6) at sendto+0x5b
Jan 12 19:27:51 jerusalem kernel: syscall(3b,3b,3b,bfbeed90,2) at syscall+0x2a6
Jan 12 19:27:51 jerusalem kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
Jan 12 19:27:51 jerusalem kernel: --- syscall (133, FreeBSD ELF32, sendto), eip

	and startup continues.
	If this is a problem with ipfw, it's one that appeared around
the middle of December.
Comment 3 Robert Watson freebsd_committer freebsd_triage 2006-01-13 17:52:39 UTC
On Fri, 13 Jan 2006, Robert Huff wrote:

> The following reply was made to PR kern/86427; it has been noted by GNATS.
>
> From: Robert Huff <roberthuff@rcn.com>
> To: bug-followup@FreeBSD.org,  mike@sentex.net
> Cc:
> Subject: Re: kern/86427: LOR / Deadlock with FASTIPSEC and nat
> Date: Thu, 12 Jan 2006 20:23:11 -0500
>
> 	I'm getting what seems to be either the same problem or its fraternal
> twin ... only without either IPSEC (any flavor) or any vpn.
> 	Running

This may in part be due to a bug in IP divert sockets, resulting in recursion 
in the network stack.  The attached untested patch may help, or at least 
eliminate part of the problem.  This hasn't yet been committed because I've 
had trouble finding someone to test it.

Index: ip_divert.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_divert.c,v
retrieving revision 1.113
diff -u -r1.113 ip_divert.c
--- ip_divert.c	13 May 2005 11:44:37 -0000	1.113
+++ ip_divert.c	13 Nov 2005 19:27:32 -0000
@@ -61,6 +61,7 @@
 #include <vm/uma.h>
 
 #include <net/if.h>
+#include <net/netisr.h>
 #include <net/route.h>
 
 #include <netinet/in.h>
@@ -378,7 +379,7 @@
 		SOCK_UNLOCK(so);
 #endif
 		/* Send packet to input processing */
-		ip_input(m);
+		netisr_queue(NETISR_IP, m);
 	}
 
 	return error;
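
The difference between the two calls in the patch can be pictured with a small user-space model. This is only an illustration of direct call versus deferred queueing, not kernel code: `input_handler()`, `enqueue()`, `drain_queue()` and `send_packet()` are hypothetical stand-ins for `ip_input()`, `netisr_queue()`, the netisr software interrupt thread and the divert send path, and packets are reduced to plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LEN 16

static bool in_send_path;          /* true while the sender's call chain runs */
static int  handled_in_send_path;  /* packets input-processed inside that chain */
static int  handled_total;

static int    queue[QUEUE_LEN];
static size_t queue_head, queue_tail;

/* Stand-in for ip_input(): notes whether it was re-entered from the sender. */
void input_handler(int pkt)
{
    (void)pkt;
    if (in_send_path)
        handled_in_send_path++;
    handled_total++;
}

/* Stand-in for netisr_queue(): only stores the packet for later. */
bool enqueue(int pkt)
{
    if (queue_tail - queue_head == QUEUE_LEN)
        return false;               /* queue full, packet dropped */
    queue[queue_tail++ % QUEUE_LEN] = pkt;
    return true;
}

/* Stand-in for the software interrupt thread that processes the queue. */
void drain_queue(void)
{
    while (queue_head != queue_tail)
        input_handler(queue[queue_head++ % QUEUE_LEN]);
}

/* Stand-in for the divert send path, with and without the patch. */
void send_packet(int pkt, bool defer)
{
    in_send_path = true;
    if (defer)
        enqueue(pkt);               /* patched: hand off, return immediately */
    else
        input_handler(pkt);         /* unpatched: recurse into input path */
    in_send_path = false;
}
```

Calling `send_packet(1, false)` runs the input handler inside the sender's own call chain, whereas `send_packet(2, true)` followed by `drain_queue()` runs it later from a separate context, which is the recursion the patch is meant to avoid.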
Comment 4 roberthuff 2006-02-15 18:41:32 UTC
	(I don't know whether this will be useful, but just in case ...)

	Running:

huff@jerusalem>> uname -v 
FreeBSD 7.0-CURRENT #0: Fri Jan 13 13:21:14 EST 2006

	at the last re-boot I got this:

Feb 14 07:42:47 jerusalem kernel: lock order reversal:
Feb 14 07:42:47 jerusalem kernel: 1st 0xc3628090 inp (divinp) @ /usr/src/sys/netinet/ip_divert.c:328
Feb 14 07:42:47 jerusalem kernel: 2nd 0xc0764fe8 in_multi_mtx (in_multi_mtx) @ /usr/src/sys/netinet/ip_output.c:291
Feb 14 07:42:47 jerusalem kernel: KDB: stack backtrace:
Feb 14 07:42:47 jerusalem kernel: kdb_backtrace(c06b7d5c,c0764fe8,c06b772c,c06b772c,c06c0c61) at kdb_backtrace+0x2f
Feb 14 07:42:47 jerusalem kernel: witness_checkorder(c0764fe8,9,c06c0c61,123,c06beeb1) at witness_checkorder+0x6e4
Feb 14 07:42:47 jerusalem kernel: _mtx_lock_flags(c0764fe8,0,c06c0c61,123,c0542b9f) at _mtx_lock_flags+0x8b
Feb 14 07:42:47 jerusalem kernel: ip_output(c394d700,0,d56daafc,22,0) at ip_output+0x460
Feb 14 07:42:47 jerusalem kernel: div_output(c35fd3e4,c394d700,c3522760,0,d56dabb8) at div_output+0x1d5
Feb 14 07:42:47 jerusalem kernel: div_send(c35fd3e4,0,c394d700,c3522760,0) at div_send+0x5d
Feb 14 07:42:47 jerusalem kernel: sosend(c35fd3e4,c3522760,d56dabe4,c394d700,0) at sosend+0x49e
Feb 14 07:42:47 jerusalem kernel: kern_sendit(c339b300,3,d56dac64,0,0) at kern_sendit+0x106
Feb 14 07:42:47 jerusalem kernel: sendit(c339b300,3,d56dac64,0,bfbeedb0) at sendit+0x1a8
Feb 14 07:42:47 jerusalem kernel: sendto(c339b300,d56dad04,18,43c,6) at sendto+0x5b
Feb 14 07:42:47 jerusalem kernel: syscall(3b,3b,3b,bfbeed90,2) at syscall+0x2a6
Feb 14 07:42:47 jerusalem kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
Feb 14 07:42:47 jerusalem kernel: --- syscall (133, FreeBSD ELF32, sendto), eip = 0x4814230b, esp = 0xbfbeecfc, ebp = 0xbfbfeda8 ---

	which was expected, and then this, which was new:

Feb 14 07:43:22 jerusalem kernel: lock order reversal:
Feb 14 07:43:22 jerusalem kernel: 1st 0xc3629480 inp (rawinp) @ /usr/src/sys/netinet/raw_ip.c:202
Feb 14 07:43:22 jerusalem kernel: 2nd 0xc36293d8 inp (raw6inp) @ /usr/src/sys/netinet/raw_ip.c:202
Feb 14 07:43:22 jerusalem kernel: KDB: stack backtrace:
Feb 14 07:43:22 jerusalem kernel: kdb_backtrace(c06b7d5c,c36293d8,c06c4e3d,c06c4ca8,c06c0cdb) at kdb_backtrace+0x2f
Feb 14 07:43:22 jerusalem kernel: witness_checkorder(c36293d8,9,c06c0cdb,ca,246) at witness_checkorder+0x6e4
Feb 14 07:43:22 jerusalem kernel: _mtx_lock_flags(c36293d8,0,c06c0cdb,ca,1) at _mtx_lock_flags+0x8b
Feb 14 07:43:22 jerusalem kernel: rip_input(c394d600,14,0,d4477be8,c0542b9f) at rip_input+0x7b
Feb 14 07:43:22 jerusalem kernel: icmp_input(c394d600,14,c3429000,1,0) at icmp_input+0x511
Feb 14 07:43:22 jerusalem kernel: ip_input(c394d600,0,c06bec13,e9,c0764bd8) at ip_input+0x656
Feb 14 07:43:22 jerusalem kernel: netisr_processqueue(c0764bd8,0,c06bec13,153,c32b41c0) at netisr_processqueue+0x8a
Feb 14 07:43:22 jerusalem kernel: swi_net(0,d4477cdc,c050e8f0,c0719970,1) at swi_net+0xa4
Feb 14 07:43:22 jerusalem kernel: ithread_execute_handlers(c32acac8,c32a0500,c06b14b0,2f9,c32ad780) at ithread_execute_handlers+0x10d
Feb 14 07:43:22 jerusalem kernel: ithread_loop(c327e6c0,d4477d38,c06b12e3,30e,c327e6c0) at ithread_loop+0x77
Feb 14 07:43:22 jerusalem kernel: fork_exit(c0502d1c,c327e6c0,d4477d38) at fork_exit+0xc5
Feb 14 07:43:22 jerusalem kernel: fork_trampoline() at fork_trampoline+0x8
Feb 14 07:43:22 jerusalem kernel: --- trap 0x1, eip = 0, esp = 0xd4477d6c, ebp = 0 ---

	The machine in question is still functional:

huff@jerusalem>> uptime
 1:39PM  up 1 day,  5:57, 6 users, load averages: 1.39, 2.44, 2.99

	and I believe (but have no hard data) that Robert's patch has
reduced the impact (time between failures was smallnum days, now
smallnum weeks).


				Robert Huff
Comment 5 roberthuff 2006-03-28 01:58:13 UTC
	I have some new and hopefully useful information.  Now running:

	FreeBSD 7.0-CURRENT #0: Mon Mar 13 09:23:39 EST 2006

	a reboot today produced:

Mar 27 18:14:44 jerusalem kernel: lock order reversal:
Mar 27 18:14:44 jerusalem kernel: 1st 0xc362a090 inp (divinp) @ /usr/src/sys/netinet/ip_divert.c:327
Mar 27 18:14:44 jerusalem kernel: 2nd 0xc076c618 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:73
Mar 27 18:14:44 jerusalem kernel: KDB: stack backtrace:
Mar 27 18:14:44 jerusalem kernel: kdb_backtrace(c06bdee4,c076c618,c06c5154,c06c5154,c06c5120) at kdb_backtrace+0x2f
Mar 27 18:14:44 jerusalem kernel: witness_checkorder(c076c618,1,c06c5120,49,c051034d) at witness_checkorder+0x6e4
Mar 27 18:14:44 jerusalem kernel: _rw_rlock(c076c618,c06c5120,49,c36049c0,0) at _rw_rlock+0x6d
Mar 27 18:14:44 jerusalem kernel: pfil_run_hooks(c076c600,d5917b34,c34f8c00,2,0) at pfil_run_hooks+0x37
Mar 27 18:14:44 jerusalem kernel: ip_output(c33bc100,0,d5917b00,22,0) at ip_output+0x6f4
Mar 27 18:14:44 jerusalem kernel: div_output(c36003e4,c33bc100,c334f9b0,0,d5917bbc) at div_output+0x1d5
Mar 27 18:14:44 jerusalem kernel: div_send(c36003e4,0,c33bc100,c334f9b0,0) at div_send+0x5d
Mar 27 18:14:44 jerusalem kernel: sosend(c36003e4,c334f9b0,d5917be8,c33bc100,0) at sosend+0x49e
Mar 27 18:14:44 jerusalem kernel: kern_sendit(c3507870,3,d5917c68,0,0) at kern_sendit+0x106
Mar 27 18:14:44 jerusalem kernel: sendit(c3507870,3,d5917c68,0,bfbeedd1) at sendit+0x1a8
Mar 27 18:14:44 jerusalem kernel: sendto(c3507870,d5917d04,18,c35068d0,6) at sendto+0x5b
Mar 27 18:14:44 jerusalem kernel: syscall(3b,3b,3b,bfbeed90,2) at syscall+0x2a4
Mar 27 18:14:44 jerusalem kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
Mar 27 18:14:44 jerusalem kernel: --- syscall (133, FreeBSD ELF32, sendto), eip = 0x4814d283, esp = 0xbfbeecfc, ebp = 0xbfbfeda8 ---
Comment 6 Bjoern A. Zeeb freebsd_committer freebsd_triage 2006-04-08 14:02:42 UTC
Responsible Changed
From-To: freebsd-bugs->gnn

LOR #163 was assigned to gnn as he is working on the locking in netipsec.
Comment 7 George V. Neville-Neil freebsd_committer freebsd_triage 2010-06-15 18:18:21 UTC
Responsible Changed
From-To: gnn->freebsd-net

I believe this is fixed but others can comment on it at will.
Comment 8 Ermal Luçi freebsd_committer freebsd_triage 2015-07-26 08:54:20 UTC
This is certainly no longer an issue.
Closing; it can be re-opened if needed.