Bug 255164 - Panic with ipfw/nat under 13.0-RELEASE amd64
Summary: Panic with ipfw/nat under 13.0-RELEASE amd64
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords: panic, regression
Depends on:
Blocks:
 
Reported: 2021-04-18 01:21 UTC by 0xcdcdcdcd
Modified: 2021-04-28 14:16 UTC

See Also:


Attachments
proposed patch (untested) (732 bytes, patch)
2021-04-19 11:11 UTC, Andrey V. Elsukov

Description 0xcdcdcdcd 2021-04-18 01:21:43 UTC
After upgrading FreeBSD from 12.2-RELEASE to 13.0-RELEASE, I started to get kernel panics.

The server configuration is as follows:
- 13.0-RELEASE amd64 GENERIC
- NIC x2
- GATEWAY / ipfw + NAT
- nginx as reverse proxy

/etc/rc.conf:
  ifconfig_vmx0="inet xxx.yyy.zzz.28 netmask 255.255.255.224"
  ifconfig_vmx1="inet 192.168.0.1 netmask 255.255.255.0"
  defaultrouter="xxx.yyy.zzz.1"
  gateway_enable="YES"
  firewall_enable="YES"
  firewall_logging="YES"
  firewall_quiet="NO"
  firewall_script="/etc/ipfw.rules"
  natd_enable="YES"
  natd_interface="vmx0"
  natd_flags="-f /etc/natd.conf"

/etc/ipfw.conf:
  ipfw add divert natd all from any to any via ${natd_interface}
  ipfw add check-state
  ipfw add allow tcp  from me to any established
  ipfw add allow tcp  from any to me established
  ipfw add allow tcp  from me to any setup keep-state
  ipfw add allow udp  from me to any       keep-state
  ipfw add allow icmp from me to any       keep-state
  ipfw add allow tcp from any to me 25,80,443 in
  ipfw add allow tcp from 192.168.0.0/24 to any established
  ipfw add allow all from 192.168.0.0/24 to any setup keep-state
  ipfw add deny log all from any to any


/var/log/messages:
Apr 17 21:10:14 gateway kernel: Fatal trap 12: page fault while in kernel mode
Apr 17 21:10:14 gateway kernel: cpuid = 0; apic id = 00
Apr 17 21:10:14 gateway kernel: fault virtual address   = 0x0
Apr 17 21:10:14 gateway kernel: fault code              = supervisor read data,page not present
Apr 17 21:10:14 gateway kernel: instruction pointer     = 0x20:0xffffffff810659f6
Apr 17 21:10:14 gateway kernel: stack pointer           = 0x28:0xfffffe008a8a1110
Apr 17 21:10:14 gateway kernel: frame pointer           = 0x28:0xfffffe008a8a1120
Apr 17 21:10:14 gateway kernel: code segment            = base rx0, limit 0xfffff, type 0x1b
Apr 17 21:10:14 gateway kernel:                         = DPL 0, pres 1, long 1, def32 0, gran 1
Apr 17 21:10:14 gateway kernel: processor eflags        = interrupt enabled, resume, IOPL = 0
Apr 17 21:10:14 gateway kernel: current process         = 872 (nginx)
Apr 17 21:10:14 gateway kernel: trap number             = 12
Apr 17 21:10:14 gateway kernel: panic: page fault
Apr 17 21:10:14 gateway kernel: cpuid = 0
Apr 17 21:10:14 gateway kernel: time = 1618661379
Apr 17 21:10:14 gateway kernel: KDB: stack backtrace:
Apr 17 21:10:14 gateway kernel: #0 0xffffffff80c57345 at kdb_backtrace+0x65
Apr 17 21:10:14 gateway kernel: #1 0xffffffff80c09d21 at vpanic+0x181
Apr 17 21:10:14 gateway kernel: #2 0xffffffff80c09b93 at panic+0x43
Apr 17 21:10:14 gateway kernel: #3 0xffffffff8108b187 at trap_fatal+0x387
Apr 17 21:10:14 gateway kernel: #4 0xffffffff8108b1df at trap_pfault+0x4f
Apr 17 21:10:14 gateway kernel: #5 0xffffffff8108a83d at trap+0x27d
Apr 17 21:10:14 gateway kernel: #6 0xffffffff810617a8 at calltrap+0x8
Apr 17 21:10:14 gateway kernel: #7 0xffffffff81065907 at in_cksum_skip+0x77
Apr 17 21:10:14 gateway kernel: #8 0xffffffff80db359d at in_delayed_cksum+0x3d
Apr 17 21:10:14 gateway kernel: #9 0xffffffff82350ea3 at divert_packet+0x73
Apr 17 21:10:14 gateway kernel: #10 0xffffffff8232dc81 at ipfw_check_packet+0x2c1
Apr 17 21:10:14 gateway kernel: #11 0xffffffff80d41f87 at pfil_run_hooks+0x97
Apr 17 21:10:14 gateway kernel: #12 0xffffffff80db2d71 at ip_output+0xb61
Apr 17 21:10:14 gateway kernel: #13 0xffffffff80dc94b4 at tcp_output+0x1b04
Apr 17 21:10:14 gateway kernel: #14 0xffffffff80ddab89 at tcp_usr_send+0x229
Apr 17 21:10:14 gateway kernel: #15 0xffffffff80c07c3a at vn_sendfile+0x197a
Apr 17 21:10:14 gateway kernel: #16 0xffffffff80c08637 at sendfile+0x127
Apr 17 21:10:14 gateway kernel: #17 0xffffffff8108c0d5 at amd64_syscall+0x755




On another server (multi-homed with no GATEWAY/NAT), the upgrade to 13.0-RELEASE requires the following ipfw rules.
  ipfw add check-state
  ipfw add allow tcp  from me to any established
  ipfw add allow tcp  from any to me established  (This was not necessary in 12.2-RELEASE.)
In 13.0-RELEASE, if this rule is not present, the SYN+ACK packet from the internal server is rejected.
Have there been any changes to ipfw in 13.0-RELEASE?
Comment 1 Andrey V. Elsukov freebsd_committer 2021-04-19 10:55:36 UTC
Did you try disabling sendfile for nginx?
I think this may be related to a missing mb_unmapped_to_ext() call in the ip_divert code. ipfw_nat and ipfw_nat64 also seem to need modification. Do you have a saved core dump from this panic?
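
To illustrate the suspected failure mode in userland terms, here is a minimal sketch in C. It is not kernel code. It assumes a toy buffer type (toy_mbuf) whose payload is either reachable through a data pointer or held only in backing pages; all toy_* names are hypothetical and do not exist in FreeBSD. A checksum-style walk has nothing valid to dereference until a conversion step, standing in for the role mb_unmapped_to_ext() is expected to play, makes the payload readable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf; NOT the kernel's struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;
    int unmapped;           /* 1: payload is only reachable via "pages" */
    const uint8_t *pages;   /* backing store for an unmapped payload */
    uint8_t *data;          /* NULL while the buffer is unmapped */
    size_t len;
};

/*
 * Toy conversion step: give every unmapped buffer a readable copy of its
 * payload. Returns the chain, or NULL if an allocation fails (this model
 * does not free anything on failure).
 */
struct toy_mbuf *toy_unmapped_to_mapped(struct toy_mbuf *top)
{
    for (struct toy_mbuf *m = top; m != NULL; m = m->next) {
        if (!m->unmapped)
            continue;
        assert(m->pages != NULL);
        uint8_t *copy = malloc(m->len ? m->len : 1);
        if (copy == NULL)
            return NULL;
        memcpy(copy, m->pages, m->len);
        m->data = copy;
        m->unmapped = 0;
    }
    return top;
}

/*
 * Byte sum standing in for a checksum routine. It reads through the data
 * pointer, so it returns -1 where a kernel routine would instead fault.
 */
int toy_sum(const struct toy_mbuf *top, uint32_t *out)
{
    uint32_t sum = 0;

    for (const struct toy_mbuf *m = top; m != NULL; m = m->next) {
        if (m->data == NULL)
            return -1;
        for (size_t i = 0; i < m->len; i++)
            sum += m->data[i];
    }
    *out = sum;
    return 0;
}
```

In this model, calling toy_sum() on an unmapped buffer fails, and the same call succeeds once toy_unmapped_to_mapped() has run on the chain.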
Comment 2 Andrey V. Elsukov freebsd_committer 2021-04-19 11:11:20 UTC
Created attachment 224248
proposed patch (untested)
Comment 3 Joshua Kinard 2021-04-19 23:43:56 UTC
This might be related to the issue I reported in Bug #255104, where I get random crashes/panics shortly after activating a divert(4) rule in my IPFW firewall to route packets to Snort for inline inspection.  WLAN traffic seems to more easily trigger it than wired LAN traffic.  I'll look at trying to test this patch in the next few days to see if it resolves the issue somewhat (or makes it less likely to happen).
Comment 4 0xcdcdcdcd 2021-04-20 13:58:18 UTC
Thanks for your advice.

I disabled sendfile for nginx and confirmed that the system runs stably.

I'm building a kernel with the patch you provided and will test it.
Comment 5 0xcdcdcdcd 2021-04-21 14:21:30 UTC
I installed the patched ipdivert.ko and re-enabled sendfile for nginx.
A few hours have passed with no panic.
I will report back if one occurs.
Comment 6 commit-hook freebsd_committer 2021-04-21 20:00:43 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=652908599b6fa7285ee60cb567b97e70b648ac29

commit 652908599b6fa7285ee60cb567b97e70b648ac29
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-04-21 19:38:01 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-04-21 19:47:05 +0000

    Add required checks for unmapped mbufs in ipdivert and ipfw

    Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
    are mapped.  Use it in ipfw_nat, which operates on a chain returned by
    m_megapullup().

    PR:             255164
    Reviewed by:    ae, gallatin
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D29838

 sys/netinet/ip_divert.c                  |  6 ++++++
 sys/netpfil/ipfw/ip_fw_nat.c             |  1 +
 sys/netpfil/ipfw/nat64/nat64_translate.c | 10 ++++++++++
 sys/sys/mbuf.h                           | 11 +++++++++++
 4 files changed, 28 insertions(+)
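
For readers unfamiliar with the idea behind a chain-wide assertion such as the M_ASSERTMAPPED() macro mentioned in the commit message, the following is a rough userland sketch in C. It is an assumption-laden model, not the macro added to sys/sys/mbuf.h: the struct, the TOY_EXTPG flag bit, and the TOY_ASSERTMAPPED name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit marking a buffer whose payload is unmapped. */
#define TOY_EXTPG 0x1u

/* Hypothetical stand-in for an mbuf; NOT the kernel's struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;
    unsigned flags;
};

/* Returns 1 if no buffer in the chain carries the unmapped flag, else 0. */
int toy_chain_is_mapped(const struct toy_mbuf *top)
{
    for (const struct toy_mbuf *m = top; m != NULL; m = m->next) {
        if (m->flags & TOY_EXTPG)
            return 0;
    }
    return 1;
}

/* Debug-only check: aborts if any buffer in the chain is unmapped. */
#define TOY_ASSERTMAPPED(top) assert(toy_chain_is_mapped(top))
```

A consumer that is about to read payload bytes directly would place TOY_ASSERTMAPPED(chain) at its entry point, so that a missing conversion shows up as an assertion failure in debug builds instead of a page fault later on.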
Comment 7 commit-hook freebsd_committer 2021-04-28 14:10:44 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2b826286c3b951df0bb3b4250eecbb7adc5c860b

commit 2b826286c3b951df0bb3b4250eecbb7adc5c860b
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-04-21 19:38:01 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-04-28 14:00:13 +0000

    Add required checks for unmapped mbufs in ipdivert and ipfw

    Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
    are mapped.  Use it in ipfw_nat, which operates on a chain returned by
    m_megapullup().

    PR:             255164
    Reviewed by:    ae, gallatin
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D29838

    (cherry picked from commit 652908599b6fa7285ee60cb567b97e70b648ac29)

 sys/netinet/ip_divert.c                  |  6 ++++++
 sys/netpfil/ipfw/ip_fw_nat.c             |  1 +
 sys/netpfil/ipfw/nat64/nat64_translate.c | 10 ++++++++++
 sys/sys/mbuf.h                           | 11 +++++++++++
 4 files changed, 28 insertions(+)