Hello,

A few days ago, after the merge of the TCP RACK changes to 13-STABLE, I discovered two problems:

1.
kernel config:
    makeoptions WITH_EXTRA_TCP_STACKS=1
    options TCPHPTS
    options RATELIMIT

loader.conf:
    cc_htcp_load="YES"
    tcp_rack_load="YES"

sysctl.conf:
    net.inet.tcp.cc.algorithm=htcp

After building stable/13-n245948-fc53b7269fed, I see constantly repeating messages in dmesg:

    kernel: cc_algo:htcp is not NEWRENO:newreno

2. If I disable INET6 in the kernel config, the kernel build fails with:

    cc -target x86_64-unknown-freebsd13.0 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -O2 -pipe -O2 -pipe -march=native -mtune=native -fno-common -DMODNAME=tcp_rack -DSTACKNAME=rack -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -DKLD_TIED -nostdinc -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/amd64.amd64/sys/IRON/opt_global.h -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -fno-common -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -I/usr/obj/usr/src/amd64.amd64/sys/IRON -MD -MF.depend.rack.o -MTrack.o -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member -Wno-format-zero-length -mno-aes -mno-avx -std=iso9899:1999 -c /usr/src/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c -o rack.o
    /usr/src/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c:17676:8: error: use of undeclared identifier 'isipv6'
        if (isipv6)
            ^
    1 error generated.
    *** Error code 1

Please let me know if you need any additional info.
(In reply to iron.udjin from comment #0)
> After build stable/13-n245948-fc53b7269fed in dmesg I see constantly repeating messages:
> kernel: cc_algo:htcp is not NEWRENO:newreno

This can be useful, please compare:
https://lists.freebsd.org/archives/freebsd-current/2021-May/000070.html
The first issue reported is covered by the e-mail Marek is citing. I'll have a look at the second one. Thanks for reporting.
(In reply to Marek Zarychta from comment #1)
Does that mean that ECN won't function properly with HTCP and RACK? If so, in my opinion it would be better to mention it once at boot time instead of writing constantly repeating messages to dmesg.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=224cf7b35b9bbe8d075f6004249d850c620b7855

commit 224cf7b35b9bbe8d075f6004249d850c620b7855
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2021-06-11 07:50:46 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2021-06-11 07:50:46 +0000

    tcp: fix compilation of IPv4-only builds

    PR:             256538
    Reported by:    iron.udjin@gmail.com
    MFC after:      3 days
    Sponsored by:   Netflix, Inc.

 sys/netinet/tcp_stacks/rack.c | 2 ++
 1 file changed, 2 insertions(+)
It is cluttering the kernel message buffer a bit. Can't it be silenced somewhat?

# dmesg | grep NEWRENO | wc -l
575
(In reply to Marek Zarychta from comment #5) It will be taken out: https://reviews.freebsd.org/D30723 BTW: Any specific reason why you use HTCP?
(In reply to Michael Tuexen from comment #6)
According to tests I found a few years ago, HTCP recovers the congestion window faster than NEWRENO in high-throughput, low-latency networks (1G/10G links). But I'm not quite sure whether it works correctly with RACK enabled. I should probably try NEWRENO with RACK. It would be nice if you could clarify this question.
(In reply to iron.udjin from comment #7) I'm only aware of substantial usage and testing of RACK in combination with new-reno.
HTCP, which brings some performance improvements, already existed when RACK was introduced. After some tests I found RACK better than the native FreeBSD TCP stack, so I switched to it on some machines without touching the congestion control algorithm. It has worked this way from the introduction of RACK in 12-STABLE until now. It still works, but it is a bit noisy in dmesg. TBH, I have not tested whether RACK performs better with the native FreeBSD CC algorithm or with HTCP. I know RACK breaks MD5, but an HTCP incompatibility has not been mentioned anywhere so far.
(In reply to Marek Zarychta from comment #9) I wouldn't say that RACK is incompatible with HTCP, it just optimises New Reno when it uses it. This optimisation is just skipped when using a different CC. The debug printf() stating that will be removed soon.
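The relationship described in this comment can be sketched as a small shell helper. This is only an illustration of the statement above, not code from the RACK sources: `rack_cc_note` is a hypothetical name, and the wording of its messages is an assumption. The sysctl names it reads on a FreeBSD host (`net.inet.tcp.cc.algorithm`, `net.inet.tcp.functions_available`, `net.inet.tcp.functions_default`) are the standard ones for the congestion-control module and the TCP stack.

```sh
#!/bin/sh
# rack_cc_note ALGO: hypothetical helper, not part of FreeBSD.
# Restates the comment above: RACK optimises New Reno when New Reno is the
# selected congestion-control module and skips that step for any other one.
rack_cc_note() {
    case "$1" in
        newreno) echo "newreno: RACK applies its New Reno optimisation" ;;
        *)       echo "$1: RACK works, New Reno optimisation is skipped" ;;
    esac
}

# Only on a FreeBSD host: report the live setting and the loaded TCP stacks.
if [ "$(uname -s)" = "FreeBSD" ]; then
    rack_cc_note "$(sysctl -n net.inet.tcp.cc.algorithm)"
    sysctl net.inet.tcp.functions_available net.inet.tcp.functions_default
fi
```

With the reporter's configuration (`net.inet.tcp.cc.algorithm=htcp`), the helper would print the second message, which matches the behaviour described here: the combination works, only the New Reno specific path is not taken.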
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f1536bb53898b12e2d19938f8fe2d04b5e5d12a6

commit f1536bb53898b12e2d19938f8fe2d04b5e5d12a6
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2021-06-11 13:43:38 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2021-06-11 18:23:39 +0000

    tcp: remove debug output from RACK

    Reported by:            iron.udjin@gmail.com, Marek Zarychta
    Reviewed by:            rrs
    PR:                     256538
    MFC after:              3 days
    Differential Revision:  https://reviews.freebsd.org/D30723
    Sponsored by:           Netflix, Inc.

 sys/netinet/tcp_stacks/rack.c | 2 --
 1 file changed, 2 deletions(-)
Both issues are resolved in the main branch. I will MFC the changes to stable/13 in 3 days, which will be on Monday. Thanks for reporting the issues!
(In reply to Michael Tuexen from comment #12)
Thank you for removing the debugging code and for explaining the RACK <--> HTCP compatibility. I have cherry-picked this fix. Such fast-tracking is really appreciated!
After a few hours of testing I got a kernel panic, which indicates that the problem is somewhere in TCP HPTS:

kernel: Fatal trap 12: page fault while in kernel mode
kernel: cpuid = 19; apic id = 17
kernel: fault virtual address = 0x18
kernel: fault code = supervisor read data, page not present
kernel: instruction pointer = 0x20:0xffffffff80f75570
kernel: stack pointer = 0x28:0xfffffe0320f52e80
kernel: frame pointer = 0x28:0xfffffe0320f52ec0
kernel: code segment = base rx0, limit 0xfffff, type 0x1b
kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
kernel: processor eflags = interrupt enabled, resume, IOPL = 0
kernel: current process = 11 (swi1: hpts)
kernel: trap number = 12
kernel: panic: page fault
kernel: cpuid = 19
kernel: time = 1623467607

How can I debug this issue further? Should I recompile the kernel with debug options enabled? For now I have rolled back to the previous kernel.

works fine: stable/13-n245916-d1b7ff3dac57
panic: stable/13-n245984-4b707591838d
(In reply to iron.udjin from comment #14) Can you dump the image and provide a stacktrace with line numbers? How are you testing? What kind of load is triggering the issue?
(In reply to Michael Tuexen from comment #15)
> Can you dump the image and provide a stacktrace with line numbers?

Yesterday I recompiled the kernel with debug options enabled. So far the server has been running for 24 hours without a panic. If it panics, I'll provide a stack trace.

> How are you testing? What kind of load is triggering the issue?

The workload is a heavily loaded web server (nginx) that runs in a cluster, so a temporary server outage is not a big problem. The server also hosts a few VMs (bhyve).
Created attachment 225773 [details] panic trace
Oh... right after I wrote my previous message, I got a kernel panic. I'm lucky today :) I have attached a screenshot of the trace. Please tell me what I should do next to debug this issue.
Here is my kernel config https://pastebin.com/3r5e9m9F Should I add or remove something there?
(In reply to iron.udjin from comment #18) Can you type dump at the debug prompt such that the kernel dump is written?
After the panic I had no debug console, only the stack trace that I attached. Also, /var/crash is empty after the restart. In rc.conf I have dumpdev="AUTO". Should I add anything else?
(In reply to iron.udjin from comment #21)
How large is your swap partition? Normally this is used to store the core...
swap size is 100Gb.
(In reply to iron.udjin from comment #23)
OK. Can you manually panic the VM by issuing

sudo sysctl debug.kdb.panic=1

Does this panic the machine? Does it write a core in this case? (I'm not that familiar with non-debug builds...)
(In reply to Michael Tuexen from comment #24)
Yes, it panics the server, but it doesn't write a core file. I'll try to find out why it doesn't save a crash dump.
(In reply to iron.udjin from comment #25) That is great. Once saving the core on intentional panics is working, I guess it will also write a core next time you can trigger the RACK bug.
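A minimal sketch of how the crash-dump setup discussed above could be inspected before triggering a test panic. It assumes a stock FreeBSD host and root privileges; `check_dump_setup` is a hypothetical helper name, while `sysrc`, `dumpon -l`, and `swapinfo` are the standard base-system tools. It only prints settings and does not explain why the dump was missing in this report.

```sh
#!/bin/sh
# check_dump_setup [DIR]: hypothetical helper, not part of FreeBSD.
# Prints the settings that decide whether a panic can leave a core behind.
# DIR is the directory where savecore(8) stores dumps (default /var/crash).
check_dump_setup() {
    sysrc -n dumpdev            # rc.conf value; the reporter uses "AUTO"
    dumpon -l                   # dump device the running kernel will write to
    swapinfo -h                 # swap size; the dump is written to swap first
    ls -l "${1:-/var/crash}"    # vmcore.N files appear here after the reboot
}

# Only on a FreeBSD host: run the checks. Nothing here triggers a panic.
if [ "$(uname -s)" = "FreeBSD" ]; then
    check_dump_setup
fi

# Deliberate test panic as suggested in the thread (this reboots the machine):
#   sysctl debug.kdb.panic=1
```

If the checks look sane and the intentional panic produces a vmcore file under /var/crash, the same path should capture the next real panic as well.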
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7a2030a10686d156270c770d96f4ba4f86d4c58e

commit 7a2030a10686d156270c770d96f4ba4f86d4c58e
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2021-06-11 07:50:46 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2021-06-13 23:27:17 +0000

    tcp: fix compilation of IPv4-only builds

    PR:             256538
    Reported by:    iron.udjin@gmail.com
    Sponsored by:   Netflix, Inc.

    (cherry picked from commit 224cf7b35b9bbe8d075f6004249d850c620b7855)

 sys/netinet/tcp_stacks/rack.c | 2 ++
 1 file changed, 2 insertions(+)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=fce16041a86cfc75daea3eaeefa22a30b03df0d6

commit fce16041a86cfc75daea3eaeefa22a30b03df0d6
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2021-06-11 13:43:38 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2021-06-13 23:28:19 +0000

    tcp: remove debug output from RACK

    Reported by:            iron.udjin@gmail.com, Marek Zarychta
    Reviewed by:            rrs
    PR:                     256538
    Differential Revision:  https://reviews.freebsd.org/D30723
    Sponsored by:           Netflix, Inc.

    (cherry picked from commit f1536bb53898b12e2d19938f8fe2d04b5e5d12a6)

 sys/netinet/tcp_stacks/rack.c | 2 --
 1 file changed, 2 deletions(-)
(In reply to Michael Tuexen from comment #26)
Hello Michael,

A week ago I configured the kernel for a trace dump, but I haven't caught a panic yet.

I did catch a very strange network issue after installing the recent RACK changes on the second test server. When I connect to the server over SSH and run the command "dmesg" several times, which produces long console output, the output hangs in the middle of the list, the connection freezes, and I see in /var/log/auth.log:

sshd[40044]: Fssh_packet_write_poll: Connection from user root XXX.XXX.XXX.XXX port 34314: Permission denied

On the client side I see:

Fssh_packet_write_wait: Connection to XXX.XXX.XXX.XXX port 22: Broken pipe

When I roll the kernel back to stable/13-n245866-75683ed20b70, the issue disappears. How can I debug it?
(In reply to iron.udjin from comment #29) Please compare bug 256657. There is some guidance on how to debug this and maybe it's the same issue.
(In reply to iron.udjin from comment #29)
Michael, I have sent a .pcap file to your email. I logged into the target server, ran "top -aHIwt -s 1", and after a few seconds I lost the connection. So in that .pcap you'll find all server-side traffic between the start of the SSH session and the disconnect (timeout).
(In reply to iron.udjin from comment #31) Got it. Thanks. Will look into it and report back tomorrow.
(In reply to Michael Tuexen from comment #32)
OK. The system just stops responding. Let me put together a description of how you can enable BlackBox logging on the server. That will hopefully give an indication of what is going on.
Are you able to reproduce the issue, or does it just happen once in a while?
(In reply to Michael Tuexen from comment #34)
I was able to reproduce the issue. But the server I tested on is currently in production, so I can't test right now. I'll reboot it into the new kernel in a week or so.
(In reply to iron.udjin from comment #35) OK. If you are able to reproduce the issue, I would like to know if setting net.inet.tcp.tolerate_missing_ts=1 would help.
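A minimal sketch of trying the suggested setting and keeping it across reboots. The sysctl name comes from the comment above; `persist_sysctl` is a hypothetical helper introduced only for this example, and the default path /etc/sysctl.conf is the usual FreeBSD location for boot-time sysctl values.

```sh
#!/bin/sh
# persist_sysctl NAME VALUE [FILE]: hypothetical helper, not a FreeBSD tool.
# Appends NAME=VALUE to FILE (default /etc/sysctl.conf) unless a line
# starting with NAME= is already there. The dots in NAME are matched as
# regex wildcards, which is loose but adequate for a sketch.
persist_sysctl() {
    file=${3:-/etc/sysctl.conf}
    grep -q "^$1=" "$file" 2>/dev/null || printf '%s=%s\n' "$1" "$2" >> "$file"
}

# On the affected FreeBSD host, as root:
#   sysctl net.inet.tcp.tolerate_missing_ts=1            # takes effect at once
#   persist_sysctl net.inet.tcp.tolerate_missing_ts 1    # applied again at boot
```

Setting the value at runtime first makes it easy to see whether the SSH stalls go away before anything is written to the configuration file.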
(In reply to Michael Tuexen from comment #36) BTW: Can we close this issue, since the issues reported are fixed, I think. If you still have the connectivity problems or any other problems, it would be great to open a new PR. OK?
(In reply to Michael Tuexen from comment #37) Yes, sure.