__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xffffffff805ee215 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3 0xffffffff805ee680 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4 0xffffffff805ee483 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5 0xffffffff808e58f7 in trap_fatal (frame=0xfffffe01140f6540, eva=24) at /usr/src/sys/amd64/amd64/trap.c:915
#6 0xffffffff808e594f in trap_pfault (frame=frame@entry=0xfffffe01140f6540, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7 0xffffffff808e5116 in trap (frame=0xfffffe01140f6540) at /usr/src/sys/amd64/amd64/trap.c:398
#8 <signal handler called>
#9 m_copydata (m=0x0, m@entry=0xfffff80576236400, off=0, len=1, cp=<optimized out>) at /usr/src/sys/kern/uipc_mbuf.c:656
#10 0xffffffff8076eb5a in tcp_output (tp=0xfffffe0169d87ca8) at /usr/src/sys/netinet/tcp_output.c:1068
#11 0xffffffff80765fbb in tcp_do_segment (m=<optimized out>, th=<optimized out>, so=0xfffff802a3cb03b0, tp=0xfffffe0169d87ca8, drop_hdrlen=52, tlen=<optimized out>, iptos=0 '\000') at /usr/src/sys/netinet/tcp_input.c:2817
#12 0xffffffff80763588 in tcp_input (mp=<optimized out>, offp=<optimized out>, proto=<optimized out>) at /usr/src/sys/netinet/tcp_input.c:1135
#13 0xffffffff80757912 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:833
#14 0xffffffff8072c318 in netisr_process_workstream_proto (nwsp=<optimized out>, proto=1) at /usr/src/sys/net/netisr.c:919
#15 swi_net (arg=<optimized out>) at /usr/src/sys/net/netisr.c:966
#16 0xffffffff805bb045 in intr_event_execute_handlers (p=<optimized out>, ie=0xfffff800029d6500) at /usr/src/sys/kern/kern_intr.c:1168
#17 ithread_execute_handlers (p=<optimized out>, ie=0xfffff800029d6500) at /usr/src/sys/kern/kern_intr.c:1181
#18 ithread_loop (arg=0xfffff80002a05020) at /usr/src/sys/kern/kern_intr.c:1269
#19 0xffffffff805b7c77 in fork_exit (callout=0xffffffff805bad30 <ithread_loop>, arg=0xfffff80002a05020, frame=0xfffffe01140f6c00) at /usr/src/sys/kern/kern_fork.c:1069
#20 <signal handler called>

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xffffffff805ee6a5 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3 0xffffffff805eeb10 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4 0xffffffff805ee913 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5 0xffffffff808e5d57 in trap_fatal (frame=0xfffffe0113ba9540, eva=80) at /usr/src/sys/amd64/amd64/trap.c:915
#6 0xffffffff808e5daf in trap_pfault (frame=frame@entry=0xfffffe0113ba9540, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7 0xffffffff808e5576 in trap (frame=0xfffffe0113ba9540) at /usr/src/sys/amd64/amd64/trap.c:398
#8 <signal handler called>
#9 0xffffffff80650d6c in turnstile_wait (ts=0xfffff8000229c780, owner=<optimized out>, queue=queue@entry=0) at /usr/src/sys/kern/subr_turnstile.c:794
#10 0xffffffff805d5d75 in __mtx_lock_sleep (c=0xfffff80004cf5618, v=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:664
#11 0xffffffff80771f9a in tcp_hpts_thread (ctx=0xfffff80004cf5600) at /usr/src/sys/netinet/tcp_hpts.c:1816
#12 0xffffffff80609166 in softclock_call_cc (c=0xfffff80004cf56c0, cc=cc@entry=0xffffffff80c6bd40 <cc_cpu+4800>, direct=direct@entry=1) at /usr/src/sys/kern/kern_timeout.c:696
#13 0xffffffff80608f3f in callout_process (now=now@entry=8227146370453) at /usr/src/sys/kern/kern_timeout.c:479
#14 0xffffffff80590355 in handleevents (now=8227146370453, fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
#15 0xffffffff8059011c in hardclockintr () at /usr/src/sys/kern/kern_clocksource.c:148
#16 0xffffffff808b78a1 in ipi_bitmap_handler (frame=...) at /usr/src/sys/x86/x86/mp_x86.c:1318
#17 <signal handler called>
#18 acpi_cpu_c1 () at /usr/src/sys/x86/x86/cpu_machdep.c:211
#19 0xffffffff804180cb in acpi_cpu_idle (sbt=<optimized out>) at /usr/src/sys/dev/acpica/acpi_cpu.c:1185
#20 0xffffffff808acde1 in cpu_idle_acpi (sbt=0) at /usr/src/sys/x86/x86/cpu_machdep.c:509
#21 0xffffffff808ace97 in cpu_idle (busy=0) at /usr/src/sys/x86/x86/cpu_machdep.c:629
#22 0xffffffff8061fcb4 in sched_idletd (dummy=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2874
#23 0xffffffff805b7c77 in fork_exit (callout=0xffffffff8061f920 <sched_idletd>, arg=0x0, frame=0xfffffe0113ba9c00) at /usr/src/sys/kern/kern_fork.c:1069
#24 <signal handler called>
Can you include more information about your FreeBSD version (git hash), tcp configuration (congestion control, tcp stack) and maybe comment on what your workload is and when the panics are triggered?
It would be great to see the panic message and to have a way to reproduce it. Can you describe how to reproduce the issue? Does the same problem occur when you are using CURRENT?
Created attachment 223295 [details]
crash log

http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/usr/src/sys/amd64/conf/ srv+base
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/etc/sysctl.conf
Other configs available: http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/ base+srv

This is a home NAS (and more) server running a web server, Samba, rtorrent, etc., connected to the Internet via IPv4 + IPv6.
The panic has happened a few times, about once per day. I cannot reproduce it on demand.
FreeBSD 13 amd64, built from sources that are a few days old.
Created attachment 223296 [details] log 2
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
	ether ***********
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=9<PERFORMNUD,IFDISABLED>
I got the first crash (in m_copydata (m=0x0, m@entry=0xfffff80576236400, off=0, len=1, cp=<optimized out>) at /usr/src/sys/kern/uipc_mbuf.c:656) on 2 different installations.
I suspect it is caused either by some change in the last 1-2 months or by me, since I added:
options RATELIMIT #o TX rate limiting support
#10 0xffffffff8076ec3a in tcp_output (tp=0xfffffe01137c60c0) at /usr/src/sys/netinet/tcp_output.c:1068
1068 m_copydata(mb, moff, len,
(kgdb) info locals
moff = 0
mb = 0xfffff802a20ba800
msb = <optimized out>
opt = "\001\001\b\n\325\035\060\267+\223/\264\001\376\377\377\001M\373\267\000\000\000\000\340`|\023\001\376\377\377\250\005\000\000\000\000\000"
to = {to_flags = 16, to_tsval = 3073383893, to_tsecr = 3023016747, to_sacks = 0xfffffe001eacac00 "\300P\254\036", to_signature = 0x11ea8d0c0 <error: Cannot access memory at address 0x11ea8d0c0>, to_tfo_cookie = 0xfe <error: Cannot access memory at address 0xfe>, to_mss = 8288, to_wscale = 229 '\345', to_nsacks = 197 '\305', to_tfo_len = 0 '\000', to_spare = 3535675904}
hw_tls = false
isipv6 = <optimized out>
ip6 = 0x0
dont_sendalot = 0
wanted_cookie = 0
ip = 0xfffff8020e774668
if_hw_tsomaxsegsize = 0
if_hw_tsomaxsegcount = 0
error = <optimized out>
so = <optimized out>
idle = 0
sendalot = 1
tso = 0
flags = 17
recwin = 2098020
sack_rxmit = 1
p = 0xfffff802a256ef20
off = 32622
mtu = 0
sendwin = <optimized out>
sack_bytes_rxmt = <optimized out>
len = 1
ipoptlen = <optimized out>
optlen = <optimized out>
hdrlen = <optimized out>
curticks = <optimized out>
m = 0xfffff8020eecec00
th = <optimized out>
I tried a kernel without:
options RATELIMIT #o TX rate limiting support
options TCP_OFFLOAD #o TCP offload
options TCP_BLACKBOX #o Enhanced TCP event logging
options TCP_HHOOK #o hhook(9) framework for TCP
options TCP_RFC7413 #o Server-side implementation of TCP Fast Open (TFO) [RFC7413]
options TCP_RFC7413_MAX_KEYS=2 #o
options TCPHPTS #o high precision timer system for tcp.

It did not help.
Hi,

Try to set:

sysctl net.inet.tcp.sack.enable=0

For now.

--HPS
Better: set net.inet.tcp.rfc6675_pipe=0 to disable it while retaining SACK.
See https://reviews.freebsd.org/D29315

flags = 17 -> 0x11: 0x10 is TF_SENTFIN and 0x01 is TF_ACKNOW. rfc6675_pipe is enabled, which enables the new rescue retransmission.

Further, this is stated to be a web server, where HTTP/1.0 TCP sessions are likely to be closed right after an object is sent. If the very last segment, the one carrying the FIN, is dropped by the network, the rescue retransmission code tries to include the "data byte" of the FIN. That byte does not really exist: the FIN only occupies the last octet of the sequence space, with no corresponding data in the send buffer.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e9f029831fa5747ae1b405f5716c52cb4ebf1e04

commit e9f029831fa5747ae1b405f5716c52cb4ebf1e04
Author: Richard Scheffenegger <rscheff@FreeBSD.org>
AuthorDate: 2021-03-17 15:44:29 +0000
Commit: Richard Scheffenegger <rscheff@FreeBSD.org>
CommitDate: 2021-03-17 16:12:04 +0000

    fix panic when rescue retransmission and FIN overlap

    PR: 254244
    PR: 254309
    Reviewed By: #transport, hselasky, tuexen
    MFC after: 3 days
    Sponsored By: NetApp, Inc.
    Differential Revision: https://reviews.freebsd.org/D29315

 sys/netinet/tcp_sack.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=703419774f86525a2441d615733993a6fddcd047

commit 703419774f86525a2441d615733993a6fddcd047
Author: Richard Scheffenegger <rscheff@FreeBSD.org>
AuthorDate: 2021-03-17 15:44:29 +0000
Commit: Richard Scheffenegger <rscheff@FreeBSD.org>
CommitDate: 2021-03-17 19:05:33 +0000

    fix panic when rescue retransmission and FIN overlap

    PR: 254244
    PR: 254309
    Reviewed By: #transport, hselasky, tuexen
    Approved by: re (cperciva)
    MFC after: immediately
    Sponsored By: NetApp, Inc.
    Differential Revision: https://reviews.freebsd.org/D29315

    (cherry picked from commit e9f029831fa5747ae1b405f5716c52cb4ebf1e04)

 sys/netinet/tcp_sack.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
2 days of uptime without a panic; it looks fixed. Thanks!