Bug 276674 - [panic] [htcp] sysctl net.inet.tcp.cc.algorithm=htcp produces kernel panic
Summary: [panic] [htcp] sysctl net.inet.tcp.cc.algorithm=htcp produces kernel panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 13.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2024-01-28 02:05 UTC by Vladyslav V. Prodan
Modified: 2024-05-09 08:07 UTC
CC List: 6 users

See Also:


Attachments
Crash dump file (99.50 KB, application/x-tar)
2024-01-28 02:05 UTC, Vladyslav V. Prodan
Core.9 (41.59 KB, application/x-xz)
2024-02-06 14:47 UTC, Vladyslav V. Prodan
New crashdump (38.64 KB, application/x-xz)
2024-02-11 18:05 UTC, Vladyslav V. Prodan

Description Vladyslav V. Prodan 2024-01-28 02:05:42 UTC
Created attachment 248024 [details]
Crash dump file

VPS in OVH.
FreeBSD 13.2-STABLE 2cd20d9bc SUPPORT-13-2-0 amd64

The panic occurs when the network opens net.inet.tcp.cc.algorithm=htcp

sysctl options are also used:
net.inet.tcp.cc.htcp.rtt_scaling=1
net.inet.tcp.delayed_ack=0

After applying the option
net.inet.tcp.cc.algorithm=newreno
the panic disappeared.


cpuid        = 0
dynamic pcpu = 0x115b000
curthread    = 0xfffff8000478b740: pid 12 tid 100063 critnest 1 "irq25: virtio_pci0"
curpcb       = 0xfffff8000478bc50
fpcurthread  = none
idlethread   = 0xfffff80004472740: tid 100003 "idle: cpu0"
self         = 0xffffffff83210000
curpmap      = 0xffffffff82368138
tssp         = 0xffffffff83210384
rsp0         = 0xfffffe0044c4e000
kcr3         = 0xffffffffffffffff
ucr3         = 0xffffffffffffffff
scr3         = 0x0
gs32p        = 0xffffffff83210404
ldt          = 0xffffffff83210444
tss          = 0xffffffff83210434
curvnet      = 0xfffff8000416f280
db:0:kdb.enter.panic>  bt
Tracing pid 12 tid 100063 td 0xfffff8000478b740
kdb_enter() at 0xffffffff80d34547 = kdb_enter+0x37/frame 0xfffffe0044c4d650
vpanic() at 0xffffffff80ce6ac3 = vpanic+0x183/frame 0xfffffe0044c4d6a0
panic() at 0xffffffff80ce6933 = panic+0x43/frame 0xfffffe0044c4d700
trap_fatal() at 0xffffffff812faacd = trap_fatal+0x38d/frame 0xfffffe0044c4d760
calltrap() at 0xffffffff812d0bb8 = calltrap+0x8/frame 0xfffffe0044c4d760
--- trap 0x12, rip = 0xffffffff82a732f1, rsp = 0xfffffe0044c4d838, rbp = 0xfffffe0044c4d840 ---
htcp_ack_received() at 0xffffffff82a732f1 = htcp_ack_received+0x231/frame 0xfffffe0044c4d840
cc_ack_received() at 0xffffffff80f5cadc = cc_ack_received+0x28c/frame 0xfffffe0044c4d8a0
tcp_do_segment() at 0xffffffff80f61590 = tcp_do_segment+0x2be0/frame 0xfffffe0044c4d970
tcp_input_with_port() at 0xffffffff80f5dc2d = tcp_input_with_port+0xabd/frame 0xfffffe0044c4dab0
tcp6_input_with_port() at 0xffffffff80f5d10a = tcp6_input_with_port+0x6a/frame 0xfffffe0044c4dae0
tcp6_input() at 0xffffffff80f5e47b = tcp6_input+0xb/frame 0xfffffe0044c4daf0
ip6_input() at 0xffffffff80fb11a4 = ip6_input+0x9b4/frame 0xfffffe0044c4dbd0
netisr_dispatch_src() at 0xffffffff80e6d6af = netisr_dispatch_src+0xaf/frame 0xfffffe0044c4dc20
ether_demux() at 0xffffffff80e37bc9 = ether_demux+0x149/frame 0xfffffe0044c4dc50
ether_nh_input() at 0xffffffff80e38ee9 = ether_nh_input+0x379/frame 0xfffffe0044c4dcb0
netisr_dispatch_src() at 0xffffffff80e6d6af = netisr_dispatch_src+0xaf/frame 0xfffffe0044c4dd00
ether_input() at 0xffffffff80e37f39 = ether_input+0x69/frame 0xfffffe0044c4dd60
vtnet_rxq_eof() at 0xffffffff80a93f2f = vtnet_rxq_eof+0x72f/frame 0xfffffe0044c4de20
ithread_loop() at 0xffffffff80ca3957 = ithread_loop+0x257/frame 0xfffffe0044c4def0
fork_exit() at 0xffffffff80ca039d = fork_exit+0x7d/frame 0xfffffe0044c4df30
fork_trampoline() at 0xffffffff812d1c2e = fork_trampoline+0xe/frame 0xfffffe0044c4df30
--- trap 0xa57f2f0d, rip = 0xb1378a3faf069d41, rsp = 0x7a5d5f3c4604970a, rbp = 0x2b154069806bd89e ---
Comment 1 Vladyslav V. Prodan 2024-01-28 02:17:39 UTC
Sorry, there were problems with translation.
This is more correct:
Panic appears when using the sysctl option net.inet.tcp.cc.algorithm=htcp
Comment 2 Richard Scheffenegger freebsd_committer freebsd_triage 2024-01-28 07:30:11 UTC
Would the core be available? The backtrace and the minidump don't reveal anything obvious.
Comment 3 Vladyslav V. Prodan 2024-01-28 14:13:05 UTC
(In reply to Richard Scheffenegger from comment #2)

I added the option
sysrc savecore="YES"

And unpacked the debug kernel on the target machine.
# env LANG=en_EN.UTF-8 ls -l /usr/lib/debug/boot/kernel/kernel.debug
-r--r--r--  1 root  wheel  122662424 Jan  9 06:11 /usr/lib/debug/boot/kernel/kernel.debug


Changed sysctl option to trigger crashdump
# sysctl net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.cc.algorithm: newreno -> htcp

It takes 2-6 days for a crash dump to appear...
Comment 4 Vladyslav V. Prodan 2024-01-28 14:17:56 UTC
Those interested can deploy the archives of my system snapshot and try to reproduce the panic.

https://freebsd.support.org.ua/snapshots/13.2/13.2-STABLE%202cd20d9bc%2008-01-2024/
Comment 5 Vladyslav V. Prodan 2024-02-06 14:46:21 UTC
Fresh panic.

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer	= 0x20:0xffffffff830172f1
stack pointer	        = 0x28:0xfffffe0044c4d838
frame pointer	        = 0x28:0xfffffe0044c4d840
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 12 (irq25: virtio_pci0)
trap number		= 18
panic: integer divide fault
cpuid = 0
time = 1707207718
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff804e375b = db_trace_self_wrapper+0x2b/frame 0xfffffe0044c4d650
vpanic() at 0xffffffff80ce6a92 = vpanic+0x152/frame 0xfffffe0044c4d6a0
panic() at 0xffffffff80ce6933 = panic+0x43/frame 0xfffffe0044c4d700
trap_fatal() at 0xffffffff812faacd = trap_fatal+0x38d/frame 0xfffffe0044c4d760
calltrap() at 0xffffffff812d0bb8 = calltrap+0x8/frame 0xfffffe0044c4d760
--- trap 0x12, rip = 0xffffffff830172f1, rsp = 0xfffffe0044c4d838, rbp = 0xfffffe0044c4d840 ---
htcp_ack_received() at 0xffffffff830172f1 = htcp_ack_received+0x231/frame 0xfffffe0044c4d840
cc_ack_received() at 0xffffffff80f5cadc = cc_ack_received+0x28c/frame 0xfffffe0044c4d8a0
tcp_do_segment() at 0xffffffff80f61590 = tcp_do_segment+0x2be0/frame 0xfffffe0044c4d970
tcp_input_with_port() at 0xffffffff80f5dc2d = tcp_input_with_port+0xabd/frame 0xfffffe0044c4dab0
tcp6_input_with_port() at 0xffffffff80f5d10a = tcp6_input_with_port+0x6a/frame 0xfffffe0044c4dae0
tcp6_input() at 0xffffffff80f5e47b = tcp6_input+0xb/frame 0xfffffe0044c4daf0
ip6_input() at 0xffffffff80fb11a4 = ip6_input+0x9b4/frame 0xfffffe0044c4dbd0
netisr_dispatch_src() at 0xffffffff80e6d6af = netisr_dispatch_src+0xaf/frame 0xfffffe0044c4dc20
ether_demux() at 0xffffffff80e37bc9 = ether_demux+0x149/frame 0xfffffe0044c4dc50
ether_nh_input() at 0xffffffff80e38ee9 = ether_nh_input+0x379/frame 0xfffffe0044c4dcb0
netisr_dispatch_src() at 0xffffffff80e6d6af = netisr_dispatch_src+0xaf/frame 0xfffffe0044c4dd00
ether_input() at 0xffffffff80e37f39 = ether_input+0x69/frame 0xfffffe0044c4dd60
vtnet_rxq_eof() at 0xffffffff80a93f2f = vtnet_rxq_eof+0x72f/frame 0xfffffe0044c4de20
vtnet_rx_vq_process() at 0xffffffff80a936f8 = vtnet_rx_vq_process+0xb8/frame 0xfffffe0044c4de60
ithread_loop() at 0xffffffff80ca3957 = ithread_loop+0x257/frame 0xfffffe0044c4def0
fork_exit() at 0xffffffff80ca039d = fork_exit+0x7d/frame 0xfffffe0044c4df30
fork_trampoline() at 0xffffffff812d1c2e = fork_trampoline+0xe/frame 0xfffffe0044c4df30
--- trap 0xa57f2f0d, rip = 0xb1378a3faf069d41, rsp = 0x7a5d5f3c4604970a, rbp = 0x2b154069806bd89e ---
KDB: enter: panic
Comment 6 Vladyslav V. Prodan 2024-02-06 14:47:50 UTC
Created attachment 248219 [details]
Core.9
Comment 7 Vladyslav V. Prodan 2024-02-06 14:50:34 UTC
(In reply to Vladyslav V. Prodan from comment #6)

vmcore.9.xz - 52MB
https://mega.nz/file/Y0YSVTDJ#oDOM_UTqcud8JjnJaQ4Y6fcvpLETibLq_6EVKab0FlE
Comment 8 Richard Scheffenegger freebsd_committer freebsd_triage 2024-02-08 17:12:31 UTC
I cannot find the vmcore.9.xz upload. Also, that host appears to require HTML5 and doesn't serve plain HTTP objects without a lot of fuss (curl and wget don't work with it).
Comment 9 Richard Scheffenegger freebsd_committer freebsd_triage 2024-02-08 17:38:14 UTC
Looking at the code in question, a div/0 can happen when cwnd < t_maxseg, because the integer division cwnd / t_maxseg then evaluates to 0 and that quotient is used as a divisor.
While it's not clear why or how cwnd gets that small, the div/0 itself is easy to address: using max(t_maxseg, cwnd) / t_maxseg makes this term at least 1.

HTCP is not actively maintained, so tracking down why cwnd can end up smaller than maxseg would be more involved (running with BBLog active and extracting the relevant data once another crash happens; running BBLog continuously will probably cost some performance).


diff --git a/sys/netinet/cc/cc_htcp.c b/sys/netinet/cc/cc_htcp.c
index d31720d0099f..a858558d7aa5 100644
--- a/sys/netinet/cc/cc_htcp.c
+++ b/sys/netinet/cc/cc_htcp.c
@@ -229,9 +229,9 @@ htcp_ack_received(struct cc_var *ccv, uint16_t type)
                                 * per RTT.
                                 */
                                CCV(ccv, snd_cwnd) += (((htcp_data->alpha <<
-                                   HTCP_SHIFT) / (CCV(ccv, snd_cwnd) /
-                                   CCV(ccv, t_maxseg))) * CCV(ccv, t_maxseg))
-                                   >> HTCP_SHIFT;
+                                   HTCP_SHIFT) / (max(CCV(ccv, t_maxseg),
+                                   CCV(ccv, snd_cwnd)) / CCV(ccv, t_maxseg))) *
+                                   CCV(ccv, t_maxseg))  >> HTCP_SHIFT;
                }
        }
 }
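
For readers who want to see the arithmetic in isolation, here is a minimal userland sketch of the divisor term before and after the patch. It is not the kernel code: the helper names (segs_unpatched, segs_patched, htcp_increment), the plain uint32_t parameters and the HTCP_SHIFT value of 8 are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point shift; the real value is defined in cc_htcp.c. */
#define HTCP_SHIFT 8

/*
 * Hypothetical helper: the divisor term as the unpatched code computes it.
 * Integer division truncates, so this is 0 whenever cwnd < maxseg.
 */
static uint32_t
segs_unpatched(uint32_t cwnd, uint32_t maxseg)
{
	return (cwnd / maxseg);
}

/*
 * Hypothetical helper: the divisor term with the max() clamp from the diff.
 * The numerator is at least maxseg, so the result is at least 1.
 */
static uint32_t
segs_patched(uint32_t cwnd, uint32_t maxseg)
{
	uint32_t w = (cwnd > maxseg) ? cwnd : maxseg;

	return (w / maxseg);
}

/*
 * Hypothetical model of the patched congestion-avoidance increment:
 * ((alpha << SHIFT) / segments) * maxseg, scaled back down by SHIFT.
 */
static uint32_t
htcp_increment(uint32_t alpha, uint32_t cwnd, uint32_t maxseg)
{
	assert(maxseg > 0);	/* the clamp does not protect against maxseg == 0 */
	return ((((alpha << HTCP_SHIFT) / segs_patched(cwnd, maxseg)) *
	    maxseg) >> HTCP_SHIFT);
}
```

With cwnd = 1000 and maxseg = 1448, segs_unpatched() yields 0, which is the divisor that raises trap 0x12 (integer divide fault) in the backtraces above, while segs_patched() yields 1. For cwnd >= maxseg both helpers return the same value, so the clamp only changes behaviour in the case that used to panic.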
Comment 10 Vladyslav V. Prodan 2024-02-09 05:02:54 UTC
(In reply to Richard Scheffenegger from comment #8)

New link:
https://support.org.ua/crashdump/vmcore.9.xz
Comment 11 Vladyslav V. Prodan 2024-02-11 18:05:10 UTC
Created attachment 248367 [details]
New crashdump

And vmcore:
https://support.org.ua/crashdump/vmcore.0.xz
Comment 12 commit-hook freebsd_committer freebsd_triage 2024-02-24 16:16:19 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=38983d40c18ec5705dcba19ac320b86c5efe8e7e

commit 38983d40c18ec5705dcba19ac320b86c5efe8e7e
Author:     Richard Scheffenegger <rscheff@FreeBSD.org>
AuthorDate: 2024-02-24 15:35:23 +0000
Commit:     Richard Scheffenegger <rscheff@FreeBSD.org>
CommitDate: 2024-02-24 15:35:59 +0000

    tcp: prevent div by zero in cc_htcp

    Make sure the divident is at least one. While cwnd should
    never be smaller than t_maxseg, this can happen during
    Path MTU Discovery, or when TCP options are considered
    in other parts of the stack.

    PR:                     276674
    MFC after:              3 days
    Reviewed By:            tuexen, #transport
    Sponsored by:           NetApp, Inc.
    Differential Revision:  https://reviews.freebsd.org/D43797

 sys/netinet/cc/cc_htcp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
Comment 13 commit-hook freebsd_committer freebsd_triage 2024-02-27 11:03:10 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=419848219b408cc52befcaa7849a2905f3812a83

commit 419848219b408cc52befcaa7849a2905f3812a83
Author:     Richard Scheffenegger <rscheff@FreeBSD.org>
AuthorDate: 2024-02-24 15:35:23 +0000
Commit:     Richard Scheffenegger <rscheff@FreeBSD.org>
CommitDate: 2024-02-27 11:00:55 +0000

    tcp: prevent div by zero in cc_htcp

    Make sure the divident is at least one. While cwnd should
    never be smaller than t_maxseg, this can happen during
    Path MTU Discovery, or when TCP options are considered
    in other parts of the stack.

    PR:                     276674
    MFC after:              3 days
    Reviewed By:            tuexen, #transport
    Sponsored by:           NetApp, Inc.
    Differential Revision:  https://reviews.freebsd.org/D43797

    (cherry picked from commit 38983d40c18ec5705dcba19ac320b86c5efe8e7e)

 sys/netinet/cc/cc_htcp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
Comment 14 commit-hook freebsd_committer freebsd_triage 2024-02-28 00:26:57 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6e298c3612da3d04a75d380cf457774cb1a25a47

commit 6e298c3612da3d04a75d380cf457774cb1a25a47
Author:     Richard Scheffenegger <rscheff@FreeBSD.org>
AuthorDate: 2024-02-24 15:35:23 +0000
Commit:     Richard Scheffenegger <rscheff@FreeBSD.org>
CommitDate: 2024-02-28 00:21:47 +0000

    tcp: prevent div by zero in cc_htcp

    Make sure the divident is at least one. While cwnd should
    never be smaller than t_maxseg, this can happen during
    Path MTU Discovery, or when TCP options are considered
    in other parts of the stack.

    PR:                     276674
    MFC after:              3 days
    Reviewed By:            tuexen, #transport
    Sponsored by:           NetApp, Inc.
    Differential Revision:  https://reviews.freebsd.org/D43797

    (cherry picked from commit 38983d40c18ec5705dcba19ac320b86c5efe8e7e)

 sys/netinet/cc/cc_htcp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)