After upgrading FreeBSD from 12.2-RELEASE to 13.0-RELEASE, I started to get kernel panics.

The server configuration is as follows:
- 13.0-RELEASE amd64 GENERIC
- NIC x2
- GATEWAY / ipfw + NAT
- nginx as reverse proxy

/etc/rc.conf:
ifconfig_vmx0="inet xxx.yyy.zzz.28 netmask 255.255.255.224"
ifconfig_vmx1="inet 192.168.0.1 netmask 255.255.255.0"
defaultrouter="xxx.yyy.zzz.1"
gateway_enable="YES"
firewall_enable="YES"
firewall_logging="YES"
firewall_quiet="NO"
firewall_script="/etc/ipfw.rules"
natd_enable="YES"
natd_interface="vmx0"
natd_flags="-f /etc/natd.conf"

/etc/ipfw.conf:
ipfw add divert natd all from any to any via ${natd_interface}
ipfw add check-state
ipfw add allow tcp from me to any established
ipfw add allow tcp from any to me established
ipfw add allow tcp from me to any setup keep-state
ipfw add allow udp from me to any keep-state
ipfw add allow icmp from me to any keep-state
ipfw add allow tcp from any to me 25,80,443 in
ipfw add allow tcp from 192.168.0.0/24 to any established
ipfw add allow all from 192.168.0.0/24 to any setup keep-state
ipfw add deny log all from any to any

/var/log/messages:
Apr 17 21:10:14 gateway kernel: Fatal trap 12: page fault while in kernel mode
Apr 17 21:10:14 gateway kernel: cpuid = 0; apic id = 00
Apr 17 21:10:14 gateway kernel: fault virtual address = 0x0
Apr 17 21:10:14 gateway kernel: fault code = supervisor read data,page not present
Apr 17 21:10:14 gateway kernel: instruction pointer = 0x20:0xffffffff810659f6
Apr 17 21:10:14 gateway kernel: stack pointer = 0x28:0xfffffe008a8a1110
Apr 17 21:10:14 gateway kernel: frame pointer = 0x28:0xfffffe008a8a1120
Apr 17 21:10:14 gateway kernel: code segment = base rx0, limit 0xfffff, type 0x1b
Apr 17 21:10:14 gateway kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Apr 17 21:10:14 gateway kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Apr 17 21:10:14 gateway kernel: current process = 872 (nginx)
Apr 17 21:10:14 gateway kernel: trap number = 12
Apr 17 21:10:14 gateway kernel: panic: page fault
Apr 17 21:10:14 gateway kernel: cpuid = 0
Apr 17 21:10:14 gateway kernel: time = 1618661379
Apr 17 21:10:14 gateway kernel: KDB: stack backtrace:
Apr 17 21:10:14 gateway kernel: #0 0xffffffff80c57345 at kdb_backtrace+0x65
Apr 17 21:10:14 gateway kernel: #1 0xffffffff80c09d21 at vpanic+0x181
Apr 17 21:10:14 gateway kernel: #2 0xffffffff80c09b93 at panic+0x43
Apr 17 21:10:14 gateway kernel: #3 0xffffffff8108b187 at trap_fatal+0x387
Apr 17 21:10:14 gateway kernel: #4 0xffffffff8108b1df at trap_pfault+0x4f
Apr 17 21:10:14 gateway kernel: #5 0xffffffff8108a83d at trap+0x27d
Apr 17 21:10:14 gateway kernel: #6 0xffffffff810617a8 at calltrap+0x8
Apr 17 21:10:14 gateway kernel: #7 0xffffffff81065907 at in_cksum_skip+0x77
Apr 17 21:10:14 gateway kernel: #8 0xffffffff80db359d at in_delayed_cksum+0x3d
Apr 17 21:10:14 gateway kernel: #9 0xffffffff82350ea3 at divert_packet+0x73
Apr 17 21:10:14 gateway kernel: #10 0xffffffff8232dc81 at ipfw_check_packet+0x2c1
Apr 17 21:10:14 gateway kernel: #11 0xffffffff80d41f87 at pfil_run_hooks+0x97
Apr 17 21:10:14 gateway kernel: #12 0xffffffff80db2d71 at ip_output+0xb61
Apr 17 21:10:14 gateway kernel: #13 0xffffffff80dc94b4 at tcp_output+0x1b04
Apr 17 21:10:14 gateway kernel: #14 0xffffffff80ddab89 at tcp_usr_send+0x229
Apr 17 21:10:14 gateway kernel: #15 0xffffffff80c07c3a at vn_sendfile+0x197a
Apr 17 21:10:14 gateway kernel: #16 0xffffffff80c08637 at sendfile+0x127
Apr 17 21:10:14 gateway kernel: #17 0xffffffff8108c0d5 at amd64_syscall+0x755

On another server (multi-homed, with no GATEWAY/NAT), the upgrade to 13.0-RELEASE made the following ipfw rules necessary:

ipfw add check-state
ipfw add allow tcp from me to any established
ipfw add allow tcp from any to me established

(These were not necessary in 12.2-RELEASE.) In 13.0-RELEASE, if these rules are not present, the SYN+ACK packets from the internal server are rejected.

Have there been any changes to ipfw in 13.0-RELEASE?
Did you try disabling sendfile for nginx? I think this may be related to the lack of an mb_unmapped_to_ext() call in the ip_divert() code. ipfw_nat and ipfw_nat64 also seem to need modification. Do you have a saved core dump from this panic?
Created attachment 224248: proposed patch (untested)
This might be related to the issue I reported in Bug #255104, where I get random crashes/panics shortly after activating a divert(4) rule in my IPFW firewall to route packets to Snort for inline inspection. WLAN traffic seems to more easily trigger it than wired LAN traffic. I'll look at trying to test this patch in the next few days to see if it resolves the issue somewhat (or makes it less likely to happen).
Thanks for your advice. I disabled sendfile for nginx and confirmed that the server runs stably with it off. I'm building a kernel with the patch you provided, and I will apply it and test.
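For anyone who needs the same stopgap, sendfile can be switched off with nginx's "sendfile" directive. This is a minimal fragment, assuming the directive is set in the http context of nginx.conf; the actual nginx configuration of this server is not shown in the report, so the placement is an assumption.

```nginx
http {
    # Workaround only: stop nginx from using sendfile(2), so file data
    # reaches the TCP stack through ordinary read/write paths instead.
    sendfile off;
}
```

nginx has to be reloaded for the change to take effect. This only avoids the code path seen in the backtrace (sendfile -> tcp_output -> divert_packet); it does not fix the kernel.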
I installed the patched ipdivert.ko and re-enabled sendfile for nginx. A few hours have passed with no panic so far. I will report back if one occurs.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=652908599b6fa7285ee60cb567b97e70b648ac29

commit 652908599b6fa7285ee60cb567b97e70b648ac29
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-04-21 19:38:01 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-04-21 19:47:05 +0000

    Add required checks for unmapped mbufs in ipdivert and ipfw

    Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
    are mapped. Use it in ipfw_nat, which operates on a chain returned by
    m_megapullup().

    PR:             255164
    Reviewed by:    ae, gallatin
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D29838

 sys/netinet/ip_divert.c                  |  6 ++++++
 sys/netpfil/ipfw/ip_fw_nat.c             |  1 +
 sys/netpfil/ipfw/nat64/nat64_translate.c | 10 ++++++++++
 sys/sys/mbuf.h                           | 11 +++++++++++
 4 files changed, 28 insertions(+)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2b826286c3b951df0bb3b4250eecbb7adc5c860b

commit 2b826286c3b951df0bb3b4250eecbb7adc5c860b
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-04-21 19:38:01 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-04-28 14:00:13 +0000

    Add required checks for unmapped mbufs in ipdivert and ipfw

    Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
    are mapped. Use it in ipfw_nat, which operates on a chain returned by
    m_megapullup().

    PR:             255164
    Reviewed by:    ae, gallatin
    Sponsored by:   The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D29838

    (cherry picked from commit 652908599b6fa7285ee60cb567b97e70b648ac29)

 sys/netinet/ip_divert.c                  |  6 ++++++
 sys/netpfil/ipfw/ip_fw_nat.c             |  1 +
 sys/netpfil/ipfw/nat64/nat64_translate.c | 10 ++++++++++
 sys/sys/mbuf.h                           | 11 +++++++++++
 4 files changed, 28 insertions(+)
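For readers who have not met unmapped mbufs, the idea behind the commit message can be sketched in userspace C. This is a toy model, not the kernel code and not the committed patch: struct toy_mbuf and every toy_* function are hypothetical names made up for illustration, and the toy "maps" a buffer by copying its bytes, which is not meant to describe how the kernel does it. It shows two things: a walk over the chain checking that every buffer is mapped (the kind of check the commit message attributes to M_ASSERTMAPPED()), and converting unmapped buffers before code such as a checksum routine reads the data pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a kernel mbuf; NOT the real struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;  /* next buffer in the chain */
    uint8_t *data;          /* directly addressable bytes; NULL while unmapped */
    const uint8_t *pages;   /* backing bytes of an unmapped buffer */
    size_t len;             /* number of payload bytes */
    int unmapped;           /* plays the role of an "unmapped" flag */
};

/* Return 1 if every buffer in the chain is mapped, 0 otherwise. */
int toy_chain_is_mapped(const struct toy_mbuf *m)
{
    for (; m != NULL; m = m->next)
        if (m->unmapped)
            return 0;
    return 1;
}

/* Give every unmapped buffer an addressable copy of its bytes.
 * Returns 0 on success, -1 if an allocation fails. */
int toy_map_chain(struct toy_mbuf *m)
{
    for (; m != NULL; m = m->next) {
        if (!m->unmapped)
            continue;
        assert(m->pages != NULL || m->len == 0);
        uint8_t *copy = malloc(m->len > 0 ? m->len : 1);
        if (copy == NULL)
            return -1;
        if (m->len > 0)
            memcpy(copy, m->pages, m->len);
        m->data = copy;
        m->unmapped = 0;
    }
    return 0;
}

/* Sum all payload bytes. Refuses (returns -1) if any buffer is still
 * unmapped, instead of reading through a NULL data pointer. */
long toy_sum(const struct toy_mbuf *m)
{
    long sum = 0;

    if (!toy_chain_is_mapped(m))
        return -1;
    for (; m != NULL; m = m->next)
        for (size_t i = 0; i < m->len; i++)
            sum += m->data[i];
    return sum;
}
```

In this model, calling toy_sum() on a chain that still contains an unmapped buffer is the analogue of the fault at virtual address 0x0 in the report; calling toy_map_chain() first is the analogue of the conversion step suggested earlier in the thread. The caller owns, and must free, any data pointers that toy_map_chain() allocated.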