Bug 268400 - Page fault kernel panic with KTLS enabled
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.1-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Mark Johnston
URL:
Keywords: crash
Duplicates: 271550
Depends on:
Blocks: 14.0r
Reported: 2022-12-15 22:30 UTC by Daniel Ponte
Modified: 2023-07-07 18:56 UTC (History)

Attachments
backtrace (5.04 KB, text/plain)
2022-12-15 22:38 UTC, Daniel Ponte
workaround (1.08 KB, patch)
2023-03-08 15:59 UTC, Mark Johnston
proposed patch (1.09 KB, patch)
2023-06-17 16:06 UTC, Mark Johnston
proposed patch (9.89 KB, patch)
2023-06-17 16:07 UTC, Mark Johnston
proposed patch (1.09 KB, patch)
2023-06-17 16:09 UTC, Mark Johnston

Description Daniel Ponte 2022-12-15 22:30:38 UTC
I enabled KTLS in nginx today on this machine, which is a reverse proxy and pf firewall. I am using base openssl.

FreeBSD 13.1-STABLE #14 stable/13-n253193-e84ae60fc510: Wed Nov 30 11:00:39 EST 2022     root@argon.h.c907:/usr/obj/usr/src/amd64.amd64/sys/FARWARL amd64

A few hours later it panicked:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b81ea0
stack pointer           = 0x28:0xfffffe0083ed7440
frame pointer           = 0x28:0xfffffe0083ed7470
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2 (thr_2)
trap number             = 12
panic: page fault
cpuid = 1
time = 1671140551
KDB: stack backtrace:
#0 0xffffffff8094b025 at kdb_backtrace+0x65
#1 0xffffffff808fdf01 at vpanic+0x151
#2 0xffffffff808fdda3 at panic+0x43
#3 0xffffffff80dc4d87 at trap_fatal+0x387
#4 0xffffffff80dc4ddf at trap_pfault+0x4f
#5 0xffffffff80d9c648 at calltrap+0x8
#6 0xffffffff80b78c2c at icmp6_reflect+0x2ac
#7 0xffffffff80b7875c at icmp6_error+0x37c
#8 0xffffffff80be4d61 at pf_route6+0x651
#9 0xffffffff80be4086 at pf_test6+0xa36
#10 0xffffffff80bf7ab0 at pf_check6_out+0x40
#11 0xffffffff80a4f527 at pfil_run_hooks+0x97
#12 0xffffffff80b947f9 at ip6_output+0x1149
#13 0xffffffff80b5b930 at tcp_output+0x1ea0
#14 0xffffffff80b6ddfb at tcp_usr_ready+0x15b
#15 0xffffffff8098f6a7 at ktls_encrypt+0x2a7
#16 0xffffffff8098e8e8 at ktls_work_thread+0x188
#17 0xffffffff808badae at fork_exit+0x7e
Timeout initializing vt_vga
Uptime: 13d6h0m24s
Dumping 1240 out of 8062 MB: (CTRL-C to abort) ..2%..11%..21%..31%..42%..51%..61%..71%..82%..91%

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  dump_savectx () at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff808fdaf8 in dumpsys (di=0x0) at /usr/src/sys/x86/include/dump.h:87
#3  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:423
#4  kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:497
#5  0xffffffff808fdf6e in vpanic (fmt=<optimized out>, ap=ap@entry=0xfffffe0083ed7290) at /usr/src/sys/kern/kern_shutdown.c:930
#6  0xffffffff808fdda3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:854
#7  0xffffffff80dc4d87 in trap_fatal (frame=0xfffffe0083ed7380, eva=0) at /usr/src/sys/amd64/amd64/trap.c:942
#8  0xffffffff80dc4ddf in trap_pfault (frame=0xfffffe0083ed7380, usermode=false, signo=<optimized out>, ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:761
#9  <signal handler called>
#10 in6_cksum_partial (m=<optimized out>, m@entry=0xfffff8003985f900, nxt=nxt@entry=58 ':', off=<optimized out>, off@entry=40, len=len@entry=1240, cov=<optimized out>, cov@entry=1240)
    at /usr/src/sys/netinet6/in6_cksum.c:319
#11 0xffffffff80b820cd in in6_cksum (m=0x488, m@entry=0xfffff8003985f900, nxt=0 '\000', nxt@entry=58 ':', off=8, off@entry=40, len=893953, len@entry=1240)
    at /usr/src/sys/netinet6/in6_cksum.c:366
#12 0xffffffff80b78c2c in icmp6_reflect (m=m@entry=0xfffff8003985f900, off=off@entry=40) at /usr/src/sys/netinet6/icmp6.c:2159
#13 0xffffffff80b7875c in icmp6_error (m=0xfffff8003985f900, type=type@entry=2, code=code@entry=0, param=1280) at /usr/src/sys/netinet6/icmp6.c:384
#14 0xffffffff80be4d61 in pf_route6 (m=m@entry=0xfffffe0083ed7a20, r=0xfffff8005814a800, dir=dir@entry=2, oifp=<optimized out>, s=0xfffff800a3bfb000, pd=pd@entry=0xfffffe0083ed7698,
    inp=0xfffff801b47313e0) at /usr/src/sys/netpfil/pf/pf.c:6188
#15 0xffffffff80be4086 in pf_test6 (dir=dir@entry=2, pflags=131072, ifp=0xfffff80003bb0800, m0=m0@entry=0xfffffe0083ed7a20, inp=0xfffff801b47313e0) at /usr/src/sys/netpfil/pf/pf.c:7181
#16 0xffffffff80bf7ab0 in pf_check6_out (m=0xfffffe0083ed7a20, ifp=0x488, flags=0, ruleset=<optimized out>, inp=0x200) at /usr/src/sys/netpfil/pf/pf_ioctl.c:5617
#17 0xffffffff80a4f527 in pfil_run_hooks (head=<optimized out>, p=..., ifp=0xfffff80003bb0800, flags=flags@entry=131072, inp=0xfffff801b47313e0) at /usr/src/sys/net/pfil.c:187
#18 0xffffffff80b947f9 in ip6_output (m0=m0@entry=0xfffff80034029600, opt=0x0, ro=0xfffff801b4731570, flags=0, im6o=im6o@entry=0x0, ifpp=ifpp@entry=0x0, inp=0xfffff801b47313e0)
    at /usr/src/sys/netinet6/ip6_output.c:1014
#19 0xffffffff80b5b930 in tcp_output (tp=0xfffffe00c82f90e0) at /usr/src/sys/netinet/tcp_output.c:1501
#20 0xffffffff80b6ddfb in tcp_usr_ready (so=<optimized out>, m=0xfffff800186c5d00, count=<optimized out>) at /usr/src/sys/netinet/tcp_usrreq.c:1302
#21 0xffffffff8098f6a7 in ktls_encrypt (top=0x0, top@entry=0xfffff800186c5d00) at /usr/src/sys/kern/uipc_ktls.c:2332
#22 0xffffffff8098e8e8 in ktls_work_thread (ctx=ctx@entry=0xfffff800031ab700) at /usr/src/sys/kern/uipc_ktls.c:2380
#23 0xffffffff808badae in fork_exit (callout=0xffffffff8098e760 <ktls_work_thread>, arg=0xfffff800031ab700, frame=0xfffffe0083ed7f40) at /usr/src/sys/kern/kern_fork.c:1093
#24 <signal handler called>
#25 mi_startup () at /usr/src/sys/kern/init_main.c:322
#26 0xffffffff80d14399 in swapper () at /usr/src/sys/vm/vm_swapout.c:755
#27 0xffffffff80340022 in btext () at /usr/src/sys/amd64/amd64/locore.S:80


I will keep the core around for a bit; please let me know if any more information is required.
Comment 1 Daniel Ponte 2022-12-15 22:38:32 UTC
Created attachment 238823
backtrace

Attaching raw backtrace
Comment 2 Zhenlei Huang freebsd_committer freebsd_triage 2023-01-31 04:25:33 UTC
Can you please share your network interface and firewall configuration (`pf.conf`)?
Comment 3 Daniel Ponte 2023-03-06 14:54:51 UTC
My interface configuration and pf policy are rather large and complex and it would take some time to sanitize them. Could I send directly to someone?

That said, I can confirm this is still happening in 13.2-STABLE FreeBSD 13.2-STABLE #17 stable/13-n254582-c0e7e1848360: Thu Feb 16 15:19:52 EST 2023.

Digging deeper, the panic appears to be triggered by a pf rule with route-to/reply-to that reroutes traffic to an IPv6 gif(4) interface with an MTU of only 1280. KTLS sends a packet that pf reroutes (likely via reply-to); the packet is too large for that interface, so pf generates an ICMPv6 type 2 (Packet Too Big) error, and the panic occurs while that error is being built.
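
For illustration, the sizes in the backtrace are consistent with how an ICMPv6 error is sized. RFC 4443 limits an error message to the minimum IPv6 MTU of 1280 bytes, so once the 40-byte IPv6 header is set aside, 1240 bytes remain for the ICMPv6 header plus the quoted packet. That matches off=40, len=1240 in the in6_cksum_partial() frame. A minimal sketch of the arithmetic follows; the constant and function names are made up for this example and are not the kernel's:

```c
#include <assert.h>

/* Illustrative names only; the kernel headers use their own constants. */
enum {
    EX_IPV6_MIN_MTU  = 1280, /* minimum IPv6 link MTU */
    EX_IPV6_HDR_LEN  = 40,   /* fixed IPv6 header */
    EX_ICMP6_HDR_LEN = 8     /* ICMPv6 type, code, checksum, MTU field */
};

/* Bytes covered by the ICMPv6 checksum in a maximum-size error message. */
static int ex_icmp6_error_len(void)
{
    return EX_IPV6_MIN_MTU - EX_IPV6_HDR_LEN;
}

/* Bytes of the offending packet that can be quoted after the ICMPv6 header. */
static int ex_icmp6_quote_len(void)
{
    assert(ex_icmp6_error_len() > EX_ICMP6_HDR_LEN);
    return ex_icmp6_error_len() - EX_ICMP6_HDR_LEN;
}
```

With these values, ex_icmp6_error_len() is 1240 and ex_icmp6_quote_len() is 1232, so most of the checksummed data is quoted from the original packet sent by KTLS.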
Comment 4 Zhenlei Huang freebsd_committer freebsd_triage 2023-03-06 15:28:28 UTC
(In reply to Daniel Ponte from comment #3)
It is best to have a minimal setup (pf rules / interfaces) so that it will be easy to reproduce and debug.

If it is production usage, then you may want to send privately.

From what you describe I suspect it may be a corner case, a combination of KTLS / pf route-to / reduced MTU. I'll CC @kp to see if he has any insights.

You can send it to me at zlei@FreeBSD.org.
Comment 5 Daniel Ponte 2023-03-06 15:33:07 UTC
Yes, the entire issue is the complexity of the setup, which is a production setup, and is bringing out a corner case. I do not need to use kTLS but I figured the project would be interested in fixing this panic, and I am happy to make myself available to the degree I am able.
Comment 6 Kristof Provost freebsd_committer freebsd_triage 2023-03-06 15:48:32 UTC
It looks like this happens while we're computing the checksum for the icmp6 header, and that we're dereferencing a NULL pointer.

I'm relatively confident that we're processing an unmapped mbuf, and that's why we're panicking here.

Mark taught in_cksum_skip() to handle those, but I think in6_cksum() doesn't handle them. So we either need to insert an mb_unmapped_to_ext() call in in6_cksum(), or we need to teach in6_cksum() to deal with unmapped mbufs.
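
To make the two options concrete, here is a userland analogy, not kernel code. It assumes a hypothetical struct buf whose data pointer is NULL when the bytes are not directly addressable, which is roughly the situation described above. naive_sum() stands in for a checksum loop that reads through the data pointer directly, and buf_apply() stands in for an m_apply()-style walker that resolves each segment to a usable pointer before calling back. All names in the sketch are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland stand-in for an mbuf; not the kernel structure. */
struct buf {
    struct buf    *next;
    const uint8_t *data;     /* NULL when the buffer is "unmapped" */
    const uint8_t *backing;  /* where the bytes really live */
    size_t         len;
    int            unmapped;
};

typedef int (*seg_fn)(void *arg, const uint8_t *p, size_t len);

/* Walker in the spirit of m_apply(): resolve each buffer, then call f. */
static int buf_apply(const struct buf *b, seg_fn f, void *arg)
{
    for (; b != NULL; b = b->next) {
        const uint8_t *p = b->unmapped ? b->backing : b->data;
        int error;

        assert(p != NULL || b->len == 0);
        error = f(arg, p, b->len);
        if (error != 0)
            return error;
    }
    return 0;
}

/* Per-segment callback: add every byte into a 32-bit accumulator. */
static int sum_bytes(void *arg, const uint8_t *p, size_t len)
{
    uint32_t *sum = arg;

    for (size_t i = 0; i < len; i++)
        *sum += p[i];
    return 0;
}

/* Naive walker: reports -1 where the real code would fault on NULL. */
static int naive_sum(const struct buf *b, uint32_t *sum)
{
    for (; b != NULL; b = b->next) {
        if (b->data == NULL && b->len > 0)
            return -1;
        sum_bytes(sum, b->data, b->len);
    }
    return 0;
}
```

On a chain that contains one unmapped buffer, naive_sum() gives up with -1 while buf_apply() with sum_bytes() visits every byte. Converting the buffer up front (the mb_unmapped_to_ext() option) would instead make the naive walker safe at the cost of an extra conversion step.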
Comment 7 Mark Johnston freebsd_committer freebsd_triage 2023-03-06 16:00:49 UTC
(In reply to Kristof Provost from comment #6)
I think you're right.  I thought there was a reason for not converting the in6_cksum() routines to use m_apply(), but I can't find it.
Comment 8 Daniel Ponte 2023-03-07 23:56:58 UTC
Would this basically involve porting dfd5240189ca0 to netinet6, or am I missing the big picture?
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2023-03-08 15:59:46 UTC
Created attachment 240668
workaround

(In reply to Daniel Ponte from comment #8)
That's right.  That would take a bit of time.  In the meantime, if you're willing to test a patch, the attached one should work around the problem.
Comment 10 Daniel Ponte 2023-03-08 16:14:07 UTC
(In reply to Mark Johnston from comment #9)

Thank you. I have applied and am building now.
Comment 11 Mark Johnston freebsd_committer freebsd_triage 2023-05-24 15:01:08 UTC
(In reply to Daniel Ponte from comment #10)
Have you had success with the patch applied?
Comment 12 Daniel Ponte 2023-05-24 19:16:37 UTC
Yes and no. I have had no further crashes from this issue with the patch applied. However, it appears that KTLS is now totally broken for me (clients never receive the response from nginx). I don't think it is because of this patch, though.
Comment 13 Daniel Ponte 2023-05-24 19:28:47 UTC
Actually, I slightly recant that statement. Doing more testing:

* KTLS and IPv4 works fine
* KTLS and IPv6 works fine unless the response is large.

So, maybe this is related to this patch.
Comment 14 Daniel Ponte 2023-05-24 19:48:24 UTC
It looks like hosts are receiving incorrect TCP checksums, both over IPv4 and IPv6.
Comment 15 Daniel Ponte 2023-05-24 20:06:05 UTC
Rolling back this patch appears to resolve the incorrect checksum issues.
Comment 16 Mark Johnston freebsd_committer freebsd_triage 2023-05-24 20:28:59 UTC
(In reply to Daniel Ponte from comment #14)
The patch should not have any effect on IPv4 packets.
Comment 17 Daniel Ponte 2023-05-24 20:49:28 UTC
I know, which is why I am somewhat stumped here.

15:49:07.223007 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    71.xxx.xxx.xxx.443 > 67.yyy.yyy.yyy.13889: Flags [R], cksum 0xbbe1 (correct), seq 1468723210, win 0, length 0
15:49:13.723854 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 1500)
    71.xxx.xxx.xxx.443 > 67.yyy.yyy.yyy.52767: Flags [.], cksum 0x06e1 (incorrect -> 0xe541), seq 6527:7975, ack 823, win 1027, options [nop,nop,TS val 739004335 ecr 3884435406], length 1448
15:49:29.282814 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    71.xxx.xxx.xxx.443 > 67.yyy.yyy.yyy.52767: Flags [R.], cksum 0x2746 (correct), seq 14417, ack 823, win 0, options [nop,nop,TS val 739019895 ecr 3884435406], length 0
15:49:29.282858 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    67.yyy.yyy.yyy.52767 > 71.xxx.xxx.xxx.443: Flags [.], cksum 0x7494 (incorrect -> 0x4bbf), seq 823, ack 6527, win 1026, options [nop,nop,TS val 3884512321 ecr 738940511], length 0
15:49:29.297044 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    71.xxx.xxx.xxx.443 > 67.yyy.yyy.yyy.52767: Flags [R], cksum 0xeabd (correct), seq 190739836, win 0, length 0

The 71.xxx host is the machine that had this patch applied. I have modified the patch to wrap the mb_unmapped_to_ext and goto in a conditional for a sysctl. KTLS appears to work again with no checksum problems, but I will also wait to see whether the machine crashes, in case my sysctl is broken.
Comment 18 Daniel Ponte 2023-05-24 20:51:06 UTC
Note that the tcpdump was being run on the 67 host, so NIC checksum offload would not apply to the incorrect received checksum (but certainly may to the incorrect sent one).
Comment 19 Daniel Ponte 2023-05-24 22:25:48 UTC
I disabled all hardware offloads on the 67 machine as well (the 71 machine always had them disabled) and now I am seeing correct checksums everywhere. I will continue to monitor the situation with this patch and KTLS enabled on the latest 13-STABLE. I am keeping my sysctl in place so that, if weird things happen again, I can simply toggle it to see what is going on.
Comment 20 Mark Johnston freebsd_committer freebsd_triage 2023-06-16 20:05:00 UTC
(In reply to Daniel Ponte from comment #19)
Sorry for the delay.  To be clear, the patch now appears to be holding up?  I am working on a proper patch now, for inclusion into 14.0.
Comment 21 Daniel Ponte 2023-06-16 20:06:31 UTC
(In reply to Mark Johnston from comment #20)

Yes, the patch has been stable for some time now.
Comment 22 Mark Johnston freebsd_committer freebsd_triage 2023-06-17 16:06:12 UTC
Created attachment 242830
proposed patch

This is a proper solution to the problem.  The patch is only lightly tested so far, but please give it a try in a test environment if possible.
Comment 23 Mark Johnston freebsd_committer freebsd_triage 2023-06-17 16:07:36 UTC
Created attachment 242831
proposed patch

Uploaded the correct patch this time.
Comment 24 Mark Johnston freebsd_committer freebsd_triage 2023-06-17 16:09:46 UTC
Created attachment 242832
proposed patch

Sigh.  Third time's the charm.  Sorry for the noise.
Comment 25 Daniel Ponte 2023-06-18 02:19:40 UTC
I am now using this patch and will keep an eye on the machine.
Comment 26 Mark Johnston freebsd_committer freebsd_triage 2023-06-18 14:25:34 UTC
(In reply to Daniel Ponte from comment #25)
Thanks.  The main thing to watch for would be incorrect v6 checksums in packets originating from the patched machine, or drops due to "incorrect" checksums in received packets.

The review is here: https://reviews.freebsd.org/D40598
Comment 27 Mark Johnston freebsd_committer freebsd_triage 2023-06-18 14:27:18 UTC
*** Bug 271550 has been marked as a duplicate of this bug. ***
Comment 28 commit-hook freebsd_committer freebsd_triage 2023-06-23 15:10:08 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6775ef4188b4d4c023e76ebd2b71fa8c2c7e7cd2

commit 6775ef4188b4d4c023e76ebd2b71fa8c2c7e7cd2
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2023-06-23 13:55:43 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2023-06-23 13:55:43 +0000

    netinet6: Implement in6_cksum_partial() using m_apply()

    This ensures that in6_cksum_partial() can be applied to unmapped mbufs,
    which can happen at least when icmp6_reflect() quotes a packet.

    The basic idea is to restructure in6_cksum_partial() to operate on one
    mbuf at a time.  If the buffer length is odd or unaligned, an extra
    residual byte may be returned, to be incorporated into the checksum when
    processing the next buffer.

    PR:             268400
    Reviewed by:    cy
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D40598

 sys/netinet6/in6.h       |   6 +-
 sys/netinet6/in6_cksum.c | 300 +++++++++++++++++++++--------------------------
 2 files changed, 139 insertions(+), 167 deletions(-)
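
The "one buffer at a time, with a residual byte" idea in the commit message can be sketched in userland as follows. This is not the committed code: it is a simplified Internet checksum that works byte-wise in big-endian order and ignores the alignment and native-endian word handling a kernel implementation has to deal with. The struct and function names are invented for this example. The callback signature is chosen so it could be driven by a per-segment walker:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running state carried from one buffer to the next (illustrative). */
struct cksum_state {
    uint64_t sum;       /* sum of 16-bit big-endian words seen so far */
    int      have_odd;  /* nonzero if a leftover high-order byte is pending */
    uint8_t  odd;       /* the leftover byte */
};

/* Fold one contiguous buffer into the running sum. */
static int cksum_chunk(void *arg, const uint8_t *p, size_t len)
{
    struct cksum_state *st = arg;
    size_t i = 0;

    /* Pair a byte left over from the previous buffer with our first byte. */
    if (st->have_odd && len > 0) {
        st->sum += (uint32_t)st->odd << 8 | p[0];
        st->have_odd = 0;
        i = 1;
    }
    for (; i + 1 < len; i += 2)
        st->sum += (uint32_t)p[i] << 8 | p[i + 1];
    /* An unpaired trailing byte becomes the residual for the next call. */
    if (i < len) {
        st->odd = p[i];
        st->have_odd = 1;
    }
    return 0;
}

/* Pad any final residual with zero, fold carries, and complement. */
static uint16_t cksum_finish(const struct cksum_state *st)
{
    uint64_t sum = st->sum;

    if (st->have_odd)
        sum += (uint32_t)st->odd << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Feeding the bytes 00 01 f2 03 f4 f5 f6 f7 in a single call, or split as 3 + 5 bytes, gives the same result (0x220d, the worked example from RFC 1071), which is the property the residual byte exists to preserve.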
Comment 29 commit-hook freebsd_committer freebsd_triage 2023-07-07 18:53:18 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=db6978e02401cc3c1ea6e965fffd2482b1dd6461

commit db6978e02401cc3c1ea6e965fffd2482b1dd6461
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2023-06-23 13:55:43 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2023-07-07 18:46:41 +0000

    netinet6: Implement in6_cksum_partial() using m_apply()

    This ensures that in6_cksum_partial() can be applied to unmapped mbufs,
    which can happen at least when icmp6_reflect() quotes a packet.

    The basic idea is to restructure in6_cksum_partial() to operate on one
    mbuf at a time.  If the buffer length is odd or unaligned, an extra
    residual byte may be returned, to be incorporated into the checksum when
    processing the next buffer.

    PR:             268400
    Reviewed by:    cy
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D40598

    (cherry picked from commit 6775ef4188b4d4c023e76ebd2b71fa8c2c7e7cd2)

 sys/netinet6/in6.h       |   6 +-
 sys/netinet6/in6_cksum.c | 300 +++++++++++++++++++++--------------------------
 2 files changed, 139 insertions(+), 167 deletions(-)
Comment 30 Mark Johnston freebsd_committer freebsd_triage 2023-07-07 18:56:56 UTC
Thank you for the report.