Bug 254735 - [tcp] rack and bbr panic
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net (Nobody)
Reported: 2021-04-03 06:18 UTC by rozhuk.im
Modified: 2021-04-05 09:46 UTC
Description rozhuk.im 2021-04-03 06:18:14 UTC
I am trying to use the rack and bbr TCP stacks and get a panic with each.
FreeBSD 13 amd64, built from git a few days ago.


Configs/tunings:
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/boot/loader.conf
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/etc/make.conf
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/etc/sysctl.conf
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/usr/src/sys/amd64/conf/RIM_BASE
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/usr/src/sys/amd64/conf/RIM_SRV


igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
	ether 70:87:a2:48:27:51
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=9<PERFORMNUD,IFDISABLED>

Both IPv4 and IPv6 are configured and in use.


BBR:

#0 0xffffffff80638a4b at kdb_backtrace+0x6b
#1 0xffffffff805eeaa1 at vpanic+0x181
#2 0xffffffff805ee913 at panic+0x43
#3 0xffffffff805d6384 at _mtx_lock_indefinite_check+0x64
#4 0xffffffff805d60a5 at _mtx_lock_spin_cookie+0xc5
#5 0xffffffff8065096f at turnstile_lookup+0x5f
#6 0xffffffff805ea98c at __rw_wunlock_hard+0x7c
#7 0xffffffff8144b146 at bbr_do_segment+0x116
#8 0xffffffff80763a18 at tcp_input+0x9f8
#9 0xffffffff80757da2 at ip_input+0xd2
#10 0xffffffff8072c7a8 at swi_net+0x128
#11 0xffffffff805bb045 at ithread_loop+0x315
#12 0xffffffff805b7c77 at fork_exit+0x77
#13 0xffffffff808bebde at fork_trampoline+0xe


__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff805ee6a5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff805eeb10 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff805ee913 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff805d6384 in _mtx_lock_indefinite_check (m=<optimized out>,
    ldap=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:1248
#6  0xffffffff805d60a5 in _mtx_lock_spin_cookie (c=c@entry=0xfffff8000229c618,
    v=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:773
#7  0xffffffff8065096f in turnstile_lookup (lock=lock@entry=0xfffff8042fff0020)
    at /usr/src/sys/kern/subr_turnstile.c:664
#8  0xffffffff805ea98c in __rw_wunlock_hard (c=0xfffff8042fff0038,
    v=<optimized out>) at /usr/src/sys/kern/kern_rwlock.c:1266
#9  0xffffffff8144b146 in bbr_do_segment (m=0xfffff800376c0000,
    th=0xfffff800376c007c, so=0xfffff801b0c6e000, tp=0xfffffe01691858f0,
    drop_hdrlen=52, tlen=0, iptos=0 '\000')
    at /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:11760
#10 0xffffffff80763a18 in tcp_input (mp=<optimized out>, offp=<optimized out>,
    proto=<optimized out>) at /usr/src/sys/netinet/tcp_input.c:1135
#11 0xffffffff80757da2 in ip_input (m=0x0)
    at /usr/src/sys/netinet/ip_input.c:833
#12 0xffffffff8072c7a8 in netisr_process_workstream_proto (
    nwsp=<optimized out>, proto=1) at /usr/src/sys/net/netisr.c:919
#13 swi_net (arg=<optimized out>) at /usr/src/sys/net/netisr.c:966
#14 0xffffffff805bb045 in intr_event_execute_handlers (p=<optimized out>,
    ie=0xfffff800029da000) at /usr/src/sys/kern/kern_intr.c:1168
#15 ithread_execute_handlers (p=<optimized out>, ie=0xfffff800029da000)
    at /usr/src/sys/kern/kern_intr.c:1181
#16 ithread_loop (arg=0xfffff80002a05160) at /usr/src/sys/kern/kern_intr.c:1269
#17 0xffffffff805b7c77 in fork_exit (
    callout=0xffffffff805bad30 <ithread_loop>, arg=0xfffff80002a05160,
    frame=0xfffffe01140b0c00) at /usr/src/sys/kern/kern_fork.c:1069
#18 <signal handler called>




RACK:

#0 0xffffffff80638a4b at kdb_backtrace+0x6b
#1 0xffffffff805eeaa1 at vpanic+0x181
#2 0xffffffff805ee913 at panic+0x43
#3 0xffffffff808e5d57 at trap_fatal+0x387
#4 0xffffffff808e5daf at trap_pfault+0x4f
#5 0xffffffff808e5576 at trap+0x496
#6 0xffffffff808bdb4e at calltrap+0x8
#7 0xffffffff805d5d75 at __mtx_lock_sleep+0x125
#8 0xffffffff80771f9a at tcp_hpts_thread+0x9a
#9 0xffffffff80609166 at softclock_call_cc+0x126
#10 0xffffffff80608f3f at callout_process+0x1cf
#11 0xffffffff80590355 at handleevents+0x185
#12 0xffffffff8059011c at hardclockintr+0x1ac
#13 0xffffffff808b78a1 at ipi_bitmap_handler+0x91
#14 0xffffffff808bfc73 at Xipi_intr_bitmap_handler+0xb3
#15 0xffffffff804180cb at acpi_cpu_idle+0x33b
#16 0xffffffff808acde1 at cpu_idle_acpi+0x41
#17 0xffffffff808ace97 at cpu_idle+0xa7

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff805ee6a5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff805eeb10 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff805ee913 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff808e5d57 in trap_fatal (frame=0xfffffe0113ba9540, eva=80)
    at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff808e5daf in trap_pfault (frame=frame@entry=0xfffffe0113ba9540, 
    usermode=false, signo=<optimized out>, signo@entry=0x0, 
    ucode=<optimized out>, ucode@entry=0x0)
    at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff808e5576 in trap (frame=0xfffffe0113ba9540)
    at /usr/src/sys/amd64/amd64/trap.c:398
#8  <signal handler called>
#9  0xffffffff80650d6c in turnstile_wait (ts=0xfffff8000229c780, 
    owner=<optimized out>, queue=queue@entry=0)
    at /usr/src/sys/kern/subr_turnstile.c:794
#10 0xffffffff805d5d75 in __mtx_lock_sleep (c=0xfffff80004cf5618, 
    v=<optimized out>) at /usr/src/sys/kern/kern_mutex.c:664
#11 0xffffffff80771f9a in tcp_hpts_thread (ctx=0xfffff80004cf5600)
    at /usr/src/sys/netinet/tcp_hpts.c:1816
#12 0xffffffff80609166 in softclock_call_cc (c=0xfffff80004cf56c0, 
    cc=cc@entry=0xffffffff80c6bd40 <cc_cpu+4800>, direct=direct@entry=1)
    at /usr/src/sys/kern/kern_timeout.c:696
#13 0xffffffff80608f3f in callout_process (now=now@entry=8227146370453)
    at /usr/src/sys/kern/kern_timeout.c:479
#14 0xffffffff80590355 in handleevents (now=8227146370453, fake=fake@entry=0)
    at /usr/src/sys/kern/kern_clocksource.c:213
#15 0xffffffff8059011c in hardclockintr ()
    at /usr/src/sys/kern/kern_clocksource.c:148
#16 0xffffffff808b78a1 in ipi_bitmap_handler (frame=...)
    at /usr/src/sys/x86/x86/mp_x86.c:1318
#17 <signal handler called>
#18 acpi_cpu_c1 () at /usr/src/sys/x86/x86/cpu_machdep.c:211
#19 0xffffffff804180cb in acpi_cpu_idle (sbt=<optimized out>)
    at /usr/src/sys/dev/acpica/acpi_cpu.c:1185
#20 0xffffffff808acde1 in cpu_idle_acpi (sbt=0)
    at /usr/src/sys/x86/x86/cpu_machdep.c:509
#21 0xffffffff808ace97 in cpu_idle (busy=0)
    at /usr/src/sys/x86/x86/cpu_machdep.c:629
#22 0xffffffff8061fcb4 in sched_idletd (dummy=<optimized out>)
    at /usr/src/sys/kern/sched_ule.c:2874
#23 0xffffffff805b7c77 in fork_exit (
    callout=0xffffffff8061f920 <sched_idletd>, arg=0x0, 
    frame=0xfffffe0113ba9c00) at /usr/src/sys/kern/kern_fork.c:1069
#24 <signal handler called>
(kgdb)
Comment 1 Michael Tuexen freebsd_committer 2021-04-03 07:57:24 UTC
Is the problem reproducible? If yes, how?
Comment 2 rozhuk.im 2021-04-03 08:54:46 UTC
(In reply to Michael Tuexen from comment #1)

IMHO the configs linked above should be enough to reproduce the configuration.
The panic happens some time (1-10 min) after I set net.inet.tcp.functions_default=rack or bbr.
This is a home router/firewall/webserver/samba/rtorrent/ssh server.
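
For anyone trying to reproduce this, a minimal sketch of the settings that select the alternate stack. The lines below are an assumption and are not copied from the linked loader.conf or sysctl.conf, whose contents are not quoted in this report. They also assume the kernel was built with the high-precision timer support the stacks need (the RACK trace above goes through tcp_hpts_thread, so it is present here).

```
# /boot/loader.conf (assumed lines: load the alternate TCP stack modules at boot)
tcp_rack_load="YES"
tcp_bbr_load="YES"

# /etc/sysctl.conf (assumed line: make new connections use the stack; "rack" or "bbr")
net.inet.tcp.functions_default=rack
```

The same change can be made at runtime with kldload tcp_rack (or tcp_bbr) followed by sysctl net.inet.tcp.functions_default=rack. sysctl net.inet.tcp.functions_available lists the stacks that are currently registered.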
Comment 3 Michael Tuexen freebsd_committer 2021-04-03 09:09:26 UTC
(In reply to rozhuk.im from comment #2)
I agree about the config. I was asking whether there is something specific to do to reproduce it (like ssh into the box or something else).
Comment 4 rozhuk.im 2021-04-03 09:29:51 UTC
(In reply to Michael Tuexen from comment #3)

Nothing specific that I could describe.
Probably some strange TCP traffic from the internet via www/torrent/ssh.
One rule in PF uses "flags S/SA synproxy state", and a few use "flags S/SA modulate state".

Probably sysctl.conf alone would be enough.

This is an old bug for me, going back to 12. I did not report it because the first reply was "try with current", and with current in vbox it does not panic.
Comment 5 Michael Tuexen freebsd_committer 2021-04-03 09:49:02 UTC
(In reply to rozhuk.im from comment #4)
So to be clear:
(a) This bug happens on stable/13 of some days ago
(b) This bug does not happen on current
Are both statements correct?
Comment 6 rozhuk.im 2021-04-05 09:46:57 UTC
(In reply to Michael Tuexen from comment #5)

a) Yes.
b) I do not know now. I tested on current more than a year ago.