Created attachment 161149 [details]
core.txt from /var/crash

Hi,

We have four jail host machines that all experience this problem. We first suspected hardware, but since all four crash regularly, it should be a software (kernel?) problem.

The jail hosts are running 10.2-p2 with GENERIC+VIMAGE and netgraph bridging, and they encounter frequent kernel crashes. See attached core.txt; there is also a core dump available if someone wants to debug it.

The hosts contain around 20 jails, half of them running apache2 serving a couple of thousand users. ZFS on root. The web servers run a Java web app (behind apache using mod_jk) whose process consumes a few GB of RAM and frequently executes posix_spawn(3).

We also experienced the problem when running 10.1-p15. The crashes do not appear on 10.1-p4, which we have temporarily reverted to. Could the SCTP kernel memory corruption fix in 10.1-p5 be relevant? Or perhaps the 10.1-p15 TCP resource exhaustion one?
We have now seen the same problem in the older kernel as well. Something with forking and TCP. Need help here...

(kgdb)
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff80949a42 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451
#2  0xffffffff80949e25 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758
#3  0xffffffff80949cb3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:687
#4  0xffffffff80d5cf9b in trap_fatal (frame=<value optimized out>, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
#5  0xffffffff80d5d29d in trap_pfault (frame=0xfffffe1760bc1840, usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
#6  0xffffffff80d5c93a in trap (frame=0xfffffe1760bc1840) at /usr/src/sys/amd64/amd64/trap.c:440
#7  0xffffffff80d42cb2 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff809984cc in turnstile_broadcast (ts=0x0, queue=1) at /usr/src/sys/kern/subr_turnstile.c:838
#9  0xffffffff809480c0 in __rw_wunlock_hard (c=0xfffff8123b287618, tid=1, file=0x1 <Address 0x1 out of bounds>, line=1) at /usr/src/sys/kern/kern_rwlock.c:988
#10 0xffffffff80b066a4 in tcp_twclose (tw=<value optimized out>, reuse=<value optimized out>) at /usr/src/sys/netinet/tcp_timewait.c:540
#11 0xffffffff80b06ceb in tcp_tw_2msl_scan (reuse=0) at /usr/src/sys/netinet/tcp_timewait.c:748
#12 0xffffffff80b049be in tcp_slowtimo () at /usr/src/sys/netinet/tcp_timer.c:198
#13 0xffffffff809b78b4 in pfslowtimo (arg=0x0) at /usr/src/sys/kern/uipc_domain.c:508
#14 0xffffffff8095f8db in softclock_call_cc (c=0xffffffff81620bf0, cc=0xffffffff8169dc00, direct=0) at /usr/src/sys/kern/kern_timeout.c:685
#15 0xffffffff8095fd04 in softclock (arg=0xffffffff8169dc00) at /usr/src/sys/kern/kern_timeout.c:814
#16 0xffffffff809158eb in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff801102e0d00) at /usr/src/sys/kern/kern_intr.c:1264
#17 0xffffffff80915d36 in ithread_loop (arg=0xfffff801102adee0) at /usr/src/sys/kern/kern_intr.c:1277
#18 0xffffffff8091343a in fork_exit (callout=0xffffffff80915ca0 <ithread_loop>, arg=0xfffff801102adee0, frame=0xfffffe1760bc1c00) at /usr/src/sys/kern/kern_fork.c:1018
#19 0xffffffff80d431ee in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#20 0x0000000000000000 in ?? ()
It might be useful to go to frame 8 and run disassemble there.
Hi!

This is a fresh core dump. This is beyond the scope of my experience, so please advise what to do next. Thanks! :-)

# kgdb kernel /var/crash/vmcore.2
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
cpuid = 16
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe183d9e97e0
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe183d9e9890
vpanic() at vpanic+0x126/frame 0xfffffe183d9e98d0
kassert_panic() at kassert_panic+0x139/frame 0xfffffe183d9e9940
tcp_usr_detach() at tcp_usr_detach+0xf9/frame 0xfffffe183d9e9970
sofree() at sofree+0x1f1/frame 0xfffffe183d9e99a0
soclose() at soclose+0x3a0/frame 0xfffffe183d9e99f0
_fdrop() at _fdrop+0x29/frame 0xfffffe183d9e9a10
closef() at closef+0x1e2/frame 0xfffffe183d9e9aa0
closefp() at closefp+0x9d/frame 0xfffffe183d9e9ae0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe183d9e9bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe183d9e9bf0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x801c8d94a, rsp = 0x7ffff91c8668, rbp = 0x7ffff91c8680 ---
KDB: enter: panic
Uptime: 18h57m59s
Dumping 23085 out of 98263 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/ng_bridge.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_bridge.ko.symbols
Reading symbols from /boot/kernel/netgraph.ko.symbols...done.
Loaded symbols for /boot/kernel/netgraph.ko.symbols
Reading symbols from /boot/kernel/ng_eiface.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_eiface.ko.symbols
Reading symbols from /boot/kernel/ng_ether.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_ether.ko.symbols
Reading symbols from /boot/kernel/accf_data.ko.symbols...done.
Loaded symbols for /boot/kernel/accf_data.ko.symbols
Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
Loaded symbols for /boot/kernel/accf_http.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/ng_socket.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_socket.ko.symbols
Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
Loaded symbols for /boot/kernel/fdescfs.ko.symbols
#0  doadump (textdump=1) at pcpu.h:219
219 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff8094b337 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451
#2  0xffffffff8094b845 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758
#3  0xffffffff8094b6d9 in kassert_panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:646
#4  0xffffffff80b1ee59 in tcp_usr_detach (so=<value optimized out>) at /usr/src/sys/netinet/tcp_usrreq.c:202
#5  0xffffffff809cd291 in sofree (so=0xfffff801dd302000) at /usr/src/sys/kern/uipc_socket.c:747
#6  0xffffffff809cdb00 in soclose (so=<value optimized out>) at /usr/src/sys/kern/uipc_socket.c:849
#7  0xffffffff808fe659 in _fdrop (fp=0xfffff802a593db40, td=0x0) at file.h:343
#8  0xffffffff80901092 in closef (fp=0xfffff802a593db40, td=0xfffff80eebc894a0) at /usr/src/sys/kern/kern_descrip.c:2338
#9  0xffffffff808feb5d in closefp (fdp=0xfffff80b20cce000, fd=<value optimized out>, fp=0xfffff802a593db40, td=0xfffff80eebc894a0, holdleaders=<value optimized out>) at /usr/src/sys/kern/kern_descrip.c:1194
#10 0xffffffff80d7bc3a in amd64_syscall (td=0xfffff80eebc894a0, traced=0) at subr_syscall.c:134
#11 0xffffffff80d5f1db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#12 0x0000000801c8d94a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb) f 8
#8  0xffffffff80901092 in closef (fp=0xfffff802a593db40, td=0xfffff80eebc894a0) at /usr/src/sys/kern/kern_descrip.c:2338
2338 return (fdrop(fp, td));
(kgdb) help
List of classes of commands:

aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands

Type "help" followed by a class name for a list of commands in that class.
Type "help" followed by command name for full documentation.
Command name abbreviations are allowed if unambiguous.
(kgdb) disassemble
Dump of assembler code for function closef:
0xffffffff80900eb0 <closef+0>: push %rbp
0xffffffff80900eb1 <closef+1>: mov %rsp,%rbp
0xffffffff80900eb4 <closef+4>: push %r15
0xffffffff80900eb6 <closef+6>: push %r14
0xffffffff80900eb8 <closef+8>: push %r13
0xffffffff80900eba <closef+10>: push %r12
0xffffffff80900ebc <closef+12>: push %rbx
0xffffffff80900ebd <closef+13>: sub $0x58,%rsp
0xffffffff80900ec1 <closef+17>: mov %rsi,%r12
0xffffffff80900ec4 <closef+20>: mov %rdi,%r14
0xffffffff80900ec7 <closef+23>: cmpw $0x1,0x20(%r14)
0xffffffff80900ecd <closef+29>: jne 0xffffffff80901077 <closef+455>
0xffffffff80900ed3 <closef+35>: test %r12,%r12
0xffffffff80900ed6 <closef+38>: je 0xffffffff80901077 <closef+455>
0xffffffff80900edc <closef+44>: mov 0x8(%r12),%rax
0xffffffff80900ee1 <closef+49>: mov 0x428(%rax),%rcx
0xffffffff80900ee8 <closef+56>: testb $0x1,0xb0(%rcx)
0xffffffff80900eef <closef+63>: je 0xffffffff80900f50 <closef+160>
0xffffffff80900ef1 <closef+65>: mov 0x18(%r14),%rcx
0xffffffff80900ef5 <closef+69>: movw $0x0,-0x62(%rbp)
0xffffffff80900efb <closef+75>: movq $0x0,-0x78(%rbp)
0xffffffff80900f03 <closef+83>: movq $0x0,-0x70(%rbp)
0xffffffff80900f0b <closef+91>: movw $0x2,-0x64(%rbp)
0xffffffff80900f11 <closef+97>: mov 0x428(%rax),%rax
0xffffffff80900f18 <closef+104>: movq $0xffffffff81557f68,-0x58(%rbp)
0xffffffff80900f20 <closef+112>: mov %rcx,-0x50(%rbp)
0xffffffff80900f24 <closef+116>: mov %rax,-0x48(%rbp)
0xffffffff80900f28 <closef+120>: movl $0x2,-0x40(%rbp)
0xffffffff80900f2f <closef+127>: lea -0x78(%rbp),%rax
0xffffffff80900f33 <closef+131>: mov %rax,-0x38(%rbp)
0xffffffff80900f37 <closef+135>: movl $0x40,-0x30(%rbp)
0xffffffff80900f3e <closef+142>: mov 0x8(%rcx),%rdi
0xffffffff80900f42 <closef+146>: lea -0x58(%rbp),%rsi
0xffffffff80900f46 <closef+150>: callq 0xffffffff80ea8870 <VOP_ADVLOCK_APV>
0xffffffff80900f4b <closef+155>: mov 0x8(%r12),%rax
0xffffffff80900f50 <closef+160>: mov 0x50(%rax),%rbx
0xffffffff80900f54 <closef+164>: test %rbx,%rbx
0xffffffff80900f57 <closef+167>: je 0xffffffff80901077 <closef+455>
0xffffffff80900f5d <closef+173>: mov 0x48(%rax),%r15
0xffffffff80900f61 <closef+177>: add $0x40,%r15
0xffffffff80900f65 <closef+181>: xor %esi,%esi
0xffffffff80900f67 <closef+183>: mov $0xffffffff810042e9,%rdx
0xffffffff80900f6e <closef+190>: mov $0x906,%ecx
0xffffffff80900f73 <closef+195>: mov %r15,%rdi
0xffffffff80900f76 <closef+198>: callq 0xffffffff80952ba0 <_sx_xlock>
0xffffffff80900f7b <closef+203>: mov 0x20(%rbx),%rbx
0xffffffff80900f7f <closef+207>: mov 0x8(%r12),%rax
0xffffffff80900f84 <closef+212>: cmp 0x50(%rax),%rbx
0xffffffff80900f88 <closef+216>: je 0xffffffff80901063 <closef+435>
0xffffffff80900f8e <closef+222>: lea -0x58(%rbp),%r13
0xffffffff80900f92 <closef+226>: nopw %cs:0x0(%rax,%rax,1)
0xffffffff80900fa0 <closef+240>: mov 0x10(%rbx),%rax
0xffffffff80900fa4 <closef+244>: testb $0x1,0xb0(%rax)
0xffffffff80900fab <closef+251>: je 0xffffffff80901050 <closef+416>
0xffffffff80900fb1 <closef+257>: incl 0x4(%rbx)
0xffffffff80900fb4 <closef+260>: mov $0xffffffff810042e9,%rsi
0xffffffff80900fbb <closef+267>: mov $0x90e,%edx
0xffffffff80900fc0 <closef+272>: mov %r15,%rdi
0xffffffff80900fc3 <closef+275>: callq 0xffffffff80952f90 <_sx_xunlock>
0xffffffff80900fc8 <closef+280>: movw $0x0,-0x62(%rbp)
0xffffffff80900fce <closef+286>: movq $0x0,-0x78(%rbp)
0xffffffff80900fd6 <closef+294>: movq $0x0,-0x70(%rbp)
0xffffffff80900fde <closef+302>: movw $0x2,-0x64(%rbp)
0xffffffff80900fe4 <closef+308>: mov 0x18(%r14),%rax
0xffffffff80900fe8 <closef+312>: mov 0x10(%rbx),%rcx
0xffffffff80900fec <closef+316>: movq $0xffffffff81557f68,-0x58(%rbp)
0xffffffff80900ff4 <closef+324>: mov %rax,-0x50(%rbp)
0xffffffff80900ff8 <closef+328>: mov %rcx,-0x48(%rbp)
0xffffffff80900ffc <closef+332>: movl $0x2,-0x40(%rbp)
0xffffffff80901003 <closef+339>: lea -0x78(%rbp),%rcx
0xffffffff80901007 <closef+343>: mov %rcx,-0x38(%rbp)
0xffffffff8090100b <closef+347>: movl $0x40,-0x30(%rbp)
0xffffffff80901012 <closef+354>: mov 0x8(%rax),%rdi
0xffffffff80901016 <closef+358>: mov %r13,%rsi
0xffffffff80901019 <closef+361>: callq 0xffffffff80ea8870 <VOP_ADVLOCK_APV>
0xffffffff8090101e <closef+366>: xor %esi,%esi
0xffffffff80901020 <closef+368>: mov $0xffffffff810042e9,%rdx
0xffffffff80901027 <closef+375>: mov $0x917,%ecx
0xffffffff8090102c <closef+380>: mov %r15,%rdi
0xffffffff8090102f <closef+383>: callq 0xffffffff80952ba0 <_sx_xlock>
0xffffffff80901034 <closef+388>: decl 0x4(%rbx)
0xffffffff80901037 <closef+391>: jne 0xffffffff80901050 <closef+416>
0xffffffff80901039 <closef+393>: cmpl $0x0,0x8(%rbx)
0xffffffff8090103d <closef+397>: je 0xffffffff80901050 <closef+416>
0xffffffff8090103f <closef+399>: movl $0x0,0x8(%rbx)
0xffffffff80901046 <closef+406>: mov %rbx,%rdi
0xffffffff80901049 <closef+409>: callq 0xffffffff80954a40 <wakeup>
0xffffffff8090104e <closef+414>: xchg %ax,%ax
0xffffffff80901050 <closef+416>: mov 0x20(%rbx),%rbx
0xffffffff80901054 <closef+420>: mov 0x8(%r12),%rax
0xffffffff80901059 <closef+425>: cmp 0x50(%rax),%rbx
0xffffffff8090105d <closef+429>: jne 0xffffffff80900fa0 <closef+240>
0xffffffff80901063 <closef+435>: mov $0xffffffff810042e9,%rsi
0xffffffff8090106a <closef+442>: mov $0x91f,%edx
0xffffffff8090106f <closef+447>: mov %r15,%rdi
0xffffffff80901072 <closef+450>: callq 0xffffffff80952f90 <_sx_xunlock>
0xffffffff80901077 <closef+455>: mov $0xffffffff,%eax
0xffffffff8090107c <closef+460>: lock xadd %eax,0x28(%r14)
0xffffffff80901082 <closef+466>: cmp $0x1,%eax
0xffffffff80901085 <closef+469>: jne 0xffffffff809010a5 <closef+501>
0xffffffff80901087 <closef+471>: mov %r14,%rdi
0xffffffff8090108a <closef+474>: mov %r12,%rsi
0xffffffff8090108d <closef+477>: callq 0xffffffff808fe630 <_fdrop>
0xffffffff80901092 <closef+482>: mov %eax,%ebx
0xffffffff80901094 <closef+484>: mov %ebx,%eax
0xffffffff80901096 <closef+486>: add $0x58,%rsp
0xffffffff8090109a <closef+490>: pop %rbx
0xffffffff8090109b <closef+491>: pop %r12
0xffffffff8090109d <closef+493>: pop %r13
0xffffffff8090109f <closef+495>: pop %r14
0xffffffff809010a1 <closef+497>: pop %r15
0xffffffff809010a3 <closef+499>: pop %rbp
0xffffffff809010a4 <closef+500>: retq
0xffffffff809010a5 <closef+501>: xor %ebx,%ebx
0xffffffff809010a7 <closef+503>: test %eax,%eax
0xffffffff809010a9 <closef+505>: jne 0xffffffff80901094 <closef+484>
0xffffffff809010ab <closef+507>: add $0x28,%r14
0xffffffff809010af <closef+511>: xor %ebx,%ebx
0xffffffff809010b1 <closef+513>: mov $0xffffffff80ebcddb,%rdi
0xffffffff809010b8 <closef+520>: xor %eax,%eax
0xffffffff809010ba <closef+522>: mov %r14,%rsi
0xffffffff809010bd <closef+525>: callq 0xffffffff8094b5a0 <kassert_panic>
0xffffffff809010c2 <closef+530>: jmp 0xffffffff80901094 <closef+484>
End of assembler dump.
The core dump has

options DDB
options DEADLKRES
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options WITNESS_SKIPSPIN

so it should be possible to get more information, right?
Created attachment 161296 [details] output from kgdb "info threads"
I got a fresh core dump from another machine with the same setup. WITNESS, INVARIANTS etc. all turned on. Attached the quite long output from "info threads" as well.

We *really* need to find a way to stop our servers from crashing, so any help is appreciated!

(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff80945d02 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:451
#2  0xffffffff809460e5 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758
#3  0xffffffff80945f73 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:687
#4  0xffffffff80d595cb in trap_fatal (frame=<value optimized out>, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
#5  0xffffffff80d598cd in trap_pfault (frame=0xfffffe00003cc700, usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
#6  0xffffffff80d58f6a in trap (frame=0xfffffe00003cc700) at /usr/src/sys/amd64/amd64/trap.c:440
#7  0xffffffff80d3ef72 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff8099487c in turnstile_broadcast (ts=0x0, queue=1) at /usr/src/sys/kern/subr_turnstile.c:838
#9  0xffffffff80944380 in __rw_wunlock_hard (c=0xfffff803755e9928, tid=1, file=0x1 <Address 0x1 out of bounds>, line=1) at /usr/src/sys/kern/kern_rwlock.c:988
#10 0xffffffff80b02b24 in tcp_twclose (tw=<value optimized out>, reuse=<value optimized out>) at /usr/src/sys/netinet/tcp_timewait.c:540
#11 0xffffffff80b0316b in tcp_tw_2msl_scan (reuse=0) at /usr/src/sys/netinet/tcp_timewait.c:748
#12 0xffffffff80b00e7e in tcp_slowtimo () at /usr/src/sys/netinet/tcp_timer.c:198
#13 0xffffffff809b3c74 in pfslowtimo (arg=0x0) at /usr/src/sys/kern/uipc_domain.c:508
#14 0xffffffff8095bb7b in softclock_call_cc (c=0xffffffff8161bbf0, cc=0xffffffff81698d00, direct=0) at /usr/src/sys/kern/kern_timeout.c:685
#15 0xffffffff8095bfa4 in softclock (arg=0xffffffff81698d00) at /usr/src/sys/kern/kern_timeout.c:814
#16 0xffffffff80911c3b in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff8010bbabc00) at /usr/src/sys/kern/kern_intr.c:1264
#17 0xffffffff80912086 in ithread_loop (arg=0xfffff801102abf40) at /usr/src/sys/kern/kern_intr.c:1277
#18 0xffffffff8090f78a in fork_exit (callout=0xffffffff80911ff0 <ithread_loop>, arg=0xfffff801102abf40, frame=0xfffffe00003ccac0) at /usr/src/sys/kern/kern_fork.c:1018
#19 0xffffffff80d3f4ae in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#20 0x0000000000000000 in ?? ()
Current language: auto; currently minimal
(kgdb) info threads
Perhaps this is helpful as well?

Fatal trap 12: page fault while in kernel mode
cpuid = 14; apic id = 0e
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8099487c
stack pointer = 0x28:0xfffffe00003cc7b0
frame pointer = 0x28:0xfffffe00003cc7e0
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 12 (swi4: clock)
trap number = 12
panic: page fault
cpuid = 14
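One way to read this trap message together with the backtrace: the instruction pointer 0xffffffff8099487c is frame 8, turnstile_broadcast (ts=0x0, queue=1), and the fault virtual address is 0x30. That would be consistent with a NULL ts being dereferenced at a member 0x30 bytes into the structure. The Python sketch below only checks that arithmetic. The layout in it is an assumption, written from memory of the amd64 struct turnstile (a struct mtx followed by a two-element array of TAILQ heads); it is not taken from this dump and should be verified against the sources of the running kernel. The class names and blocked_queue_offset() are hypothetical helpers.

```python
import ctypes

# ASSUMED amd64 layout, for illustration only; pointers are modelled as
# 64-bit integers so the result does not depend on the host platform.

class LockObject(ctypes.Structure):
    _fields_ = [
        ("lo_name", ctypes.c_uint64),     # pointer
        ("lo_flags", ctypes.c_uint32),
        ("lo_data", ctypes.c_uint32),
        ("lo_witness", ctypes.c_uint64),  # pointer
    ]

class Mtx(ctypes.Structure):
    _fields_ = [
        ("lock_object", LockObject),
        ("mtx_lock", ctypes.c_uint64),
    ]

class ThreadQueue(ctypes.Structure):
    # TAILQ_HEAD: first-element pointer plus pointer to last "next" slot
    _fields_ = [
        ("tqh_first", ctypes.c_uint64),
        ("tqh_last", ctypes.c_uint64),
    ]

class Turnstile(ctypes.Structure):
    # Only the leading members matter for this offset calculation.
    _fields_ = [
        ("ts_lock", Mtx),
        ("ts_blocked", ThreadQueue * 2),
    ]

def blocked_queue_offset(queue):
    """Byte offset of ts_blocked[queue] inside the assumed layout."""
    return Turnstile.ts_blocked.offset + queue * ctypes.sizeof(ThreadQueue)

# With ts == NULL (address 0) and queue == 1, the first read would hit:
fault_address = 0 + blocked_queue_offset(1)
print(hex(fault_address))
```

If the assumed layout matches the kernel in question, the printed address lines up with the reported fault virtual address, which points at a freed or never-assigned turnstile rather than random memory corruption.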
Sorry, comments 6 and 7 were core dumps *without* WITNESS et al. My bad.
(In reply to Palle Girgensohn from comment #7) this was without WITNESS/INVARIANTS.
Created attachment 161327 [details]
First tentative patch for this issue

Waiting to see if it improves things, and for feedback from -net to get a better overview of what is going on here.
Just an update: The first tentative patch seems to address this issue completely. I am working on a more long-term patch following advice from -net. I will add the corresponding review here.
Making progress on this issue:

The most useful stack trace so far:

panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
cpuid = 4
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff8032467b = db_trace_self_wrapper+0x2b/frame 0xfffffe1f9e1f8730
vpanic() at 0xffffffff804b5672 = vpanic+0x182/frame 0xfffffe1f9e1f87b0
kassert_panic() at 0xffffffff804b54e6 = kassert_panic+0x126/frame 0xfffffe1f9e1f8820
tcp_usr_detach() at 0xffffffff806564dc = tcp_usr_detach+0x1bc/frame 0xfffffe1f9e1f8850
sofree() at 0xffffffff8053de66 = sofree+0x1a6/frame 0xfffffe1f9e1f8880
tcp_close() at 0xffffffff8064dd8e = tcp_close+0x11e/frame 0xfffffe1f9e1f88b0
tcp_timer_2msl() at 0xffffffff80653c28 = tcp_timer_2msl+0x278/frame 0xfffffe1f9e1f88e0
softclock_call_cc() at 0xffffffff804cbacc = softclock_call_cc+0x19c/frame 0xfffffe1f9e1f89c0
softclock() at 0xffffffff804cbec7 = softclock+0x47/frame 0xfffffe1f9e1f89e0
intr_event_execute_handlers() at 0xffffffff8047aa86 = intr_event_execute_handlers+0x96/frame 0xfffffe1f9e1f8a20
ithread_loop() at 0xffffffff8047b106 = ithread_loop+0xa6/frame 0xfffffe1f9e1f8a70
fork_exit() at 0xffffffff804781b4 = fork_exit+0x84/frame 0xfffffe1f9e1f8ab0
fork_trampoline() at 0xffffffff80713fce = fork_trampoline+0xe/frame 0xfffffe1f9e1f8ab0

The scenario:

1. thread1: tcp_timer_2msl() expires and tcp_close() is called to clean up this TCP connection.

2. thread1: In tcp_close() the inp is marked with the INP_DROPPED flag, the process continues and calls INP_WUNLOCK() here:

https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568

3. thread2: Now that INP_WLOCK is released, the inp can transition to the INP_TIMEWAIT state and nothing prevents it.

4. thread2: During the INP_TIMEWAIT state transition, the inp is marked with the INP_TIMEWAIT flag.

5. thread1: Back in business, the tcp_close() call continues with sofree() -> tcp_usr_detach() -> tcp_detach(). Then, as the inp is marked with the INP_DROPPED|INP_TIMEWAIT flags, in_pcbfree() is called. With INVARIANTS you hit an assertion here; without INVARIANTS the process continues.

6. Later: tcp_twclose() cleans up this INP_TIMEWAIT inp and calls in_pcbfree() again, achieving a fancy inp double-free.

This issue is a tricky one and seems to have been here for quite a while. It has been witnessed at least once in 10.1 and by two different people in 11.0.

Astute questions:

o Why is the INP_DROPPED flag not tested in tcp_input() in the first place? When you are marked as INP_DROPPED, you are almost dead, and you should not be allowed to transition to a different state!

Good point. tcp_input() relies on the fact that INP_DROPPED inps are no longer in the TCP hash table. But in some cases tcp_input() does relock the inp (see the relocked: label), and although it checks a lot of things after having relocked the inp, it does not check for a recently added INP_DROPPED flag.

o Why does tcp_detach() do an unconditional in_pcbfree() for inps in TIMEWAIT state?

Because inps in TIMEWAIT state have only one exit: being freed. And it is the duty of tcp_detach() to free all inps with INP_DROPPED|INP_TIMEWAIT.

o Why is this issue so rare?

Good question. I can see how to craft specific TCP traffic to make it more frequent, but no definitive answer yet.

Fix proposal:

This issue description is still a bit fresh, but I would enforce that an inp with the INP_DROPPED flag is not allowed to change state.
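The six-step scenario and the proposed fix can be replayed as a small deterministic model. The Python sketch below is an illustration only, not kernel code: the Inp class, the flag values, enter_timewait() and run_interleaving() are hypothetical names, locking is reduced to a fixed ordering of steps, and the model simply assumes that tcp_detach() frees a dropped inp once. It counts how many times in_pcbfree() would run, with and without an INP_DROPPED check before the TIMEWAIT transition.

```python
# Flag values are arbitrary in this model; only the names mirror the kernel.
INP_TIMEWAIT = 0x1
INP_DROPPED = 0x2

class Inp:
    """Toy stand-in for an inpcb that only tracks flags and free calls."""
    def __init__(self):
        self.flags = 0
        self.free_calls = 0

    def in_pcbfree(self):
        self.free_calls += 1

def enter_timewait(inp, check_dropped):
    """thread2: try to move the inp to TIMEWAIT (steps 3 and 4).

    With check_dropped=True this models the proposed fix: a dropped inp
    is not allowed to change state.
    """
    if check_dropped and (inp.flags & INP_DROPPED):
        return False
    inp.flags |= INP_TIMEWAIT
    return True

def run_interleaving(check_dropped):
    """Replay the interleaving and return the number of in_pcbfree() calls."""
    inp = Inp()

    # Steps 1-2, thread1: tcp_close() marks the inp dropped, then unlocks.
    inp.flags |= INP_DROPPED

    # Steps 3-4, thread2: the unlocked window lets the transition happen.
    in_timewait = enter_timewait(inp, check_dropped)

    # Step 5, thread1: sofree() -> tcp_usr_detach() -> tcp_detach()
    # frees the dropped inp (model assumption: exactly one free here).
    inp.in_pcbfree()

    # Step 6, later: tcp_twclose() frees every inp sitting in TIMEWAIT.
    if in_timewait:
        inp.in_pcbfree()

    return inp.free_calls

print("without INP_DROPPED check:", run_interleaving(False), "free(s)")
print("with INP_DROPPED check:   ", run_interleaving(True), "free(s)")
```

In this model the unchecked path frees the inp twice and the checked path frees it once, which is the behaviour the fix proposal aims for. The real code has more states and locks than this sketch captures.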
The review for this issue fix: Fix a double-free when an inp transitions to INP_TIMEWAIT state after having been dropped https://reviews.freebsd.org/D8211
Thanks Julian. Might be worth obsoleting the existing patch here if the review is the canonical/final patch source from here on. Attachment -> Details -> Edit Details -> Obsolete [X]
(In reply to Kubilay Kocak from comment #15) You are totally right, first patch marked as "obsolete", current proposed patch pushed in D8211.
A commit references this bug:

Author: jch
Date: Tue Oct 18 07:16:50 UTC 2016
New revision: 307551
URL: https://svnweb.freebsd.org/changeset/base/307551

Log:
Fix a double-free when an inp transitions to INP_TIMEWAIT state after having been dropped.

This fixes enforces in_pcbdrop() logic in tcp_input():

"in_pcbdrop() is used by TCP to mark an inpcb as unused and avoid future packet delivery or event notification when a socket remains open but TCP has closed."

PR: 203175
Reported by: Palle Girgensohn, Slawa Olhovchenkov
Tested by: Slawa Olhovchenkov
Reviewed by: Slawa Olhovchenkov
Approved by: gnn, Slawa Olhovchenkov
Differential Revision: https://reviews.freebsd.org/D8211
MFC after: 1 week
Sponsored by: Verisign, inc

Changes:
head/sys/netinet/tcp_input.c
head/sys/netinet/tcp_timewait.c
head/sys/netinet/tcp_usrreq.c
A commit references this bug:

Author: jch
Date: Tue Oct 25 12:53:15 UTC 2016
New revision: 307905
URL: https://svnweb.freebsd.org/changeset/base/307905

Log:
MFC r307551:

Fix a double-free when an inp transitions to INP_TIMEWAIT state after having been dropped.

This change enforces in_pcbdrop() logic in tcp_input():

"in_pcbdrop() is used by TCP to mark an inpcb as unused and avoid future packet delivery or event notification when a socket remains open but TCP has closed."

PR: 203175
Reported by: Palle Girgensohn, Slawa Olhovchenkov
Tested by: Slawa Olhovchenkov
Reviewed by: Slawa Olhovchenkov
Approved by: gnn, Slawa Olhovchenkov
Differential Revision: https://reviews.freebsd.org/D8211
Sponsored by: Verisign, inc

Changes:
_U stable/11/
stable/11/sys/netinet/tcp_input.c
stable/11/sys/netinet/tcp_timewait.c
stable/11/sys/netinet/tcp_usrreq.c
A commit references this bug:

Author: jch
Date: Tue Oct 25 12:58:37 UTC 2016
New revision: 307906
URL: https://svnweb.freebsd.org/changeset/base/307906

Log:
MFC r307551:

Fix a double-free when an inp transitions to INP_TIMEWAIT state after having been dropped.

This change enforces in_pcbdrop() logic in tcp_input():

"in_pcbdrop() is used by TCP to mark an inpcb as unused and avoid future packet delivery or event notification when a socket remains open but TCP has closed."

PR: 203175
Reported by: Palle Girgensohn, Slawa Olhovchenkov
Tested by: Slawa Olhovchenkov
Reviewed by: Slawa Olhovchenkov
Approved by: gnn, Slawa Olhovchenkov
Differential Revision: https://reviews.freebsd.org/D8211
Sponsored by: Verisign, inc

Changes:
stable/10/sys/netinet/tcp_input.c
stable/10/sys/netinet/tcp_timewait.c
stable/10/sys/netinet/tcp_usrreq.c