Bug 235607

Summary: Incorrect checksums with NAT on vtnet with offloading
Product: Base System Reporter: Jorge Schrauwen <sjorge+signup>
Component: kern Assignee: freebsd-net mailing list <net>
Status: New ---    
Severity: Affects Only Me CC: eugen, kp
Priority: ---    
Version: 12.0-STABLE   
Hardware: amd64   
OS: Any   

Description Jorge Schrauwen 2019-02-08 17:08:14 UTC
### description

The issue only shows up when the source traffic carries no valid TCP checksum; it 'works' when the checksum is valid. I checked with the illumos people whether a bad checksum is passed along as-is when offloading is enabled:

[17:03:15] <sjorge> rzezeski qq, if a guest with csum enabled sends a checksummed packet does it get discarded and recalculate when it hits the physical nic before it goes on the wire? Or is it kept as is?
[17:03:26] <sjorge> From my captures it seems to be kept as is?
[17:05:26] <rzezeski> sjorge: It depends on the guest OS. In illumos, either the IP stack calcs the checksum on Tx or places a flag on the dblk to have it calculated by hardware. If you do both then the hardware will calculate a second sum over the current one and it will be incorrect.
[17:05:42] <rzezeski> 1) <reply to diff question> 2) I have no idea how FBSD/Linux work in that regard, but they would have the same issue to deal with.

I believe the latter case is what is happening here: pf nat computes its checksum from the empty initial checksum, and the result is then passed as-is out the NIC. That would also explain why a packet that comes in with a valid checksum goes back out correctly after pf nat.
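
A minimal Python sketch of that hypothesis, not taken from the pf or vtnet sources. It assumes NAT adjusts the TCP checksum field incrementally (RFC 1624 style) when it rewrites the source address, and that with offloading the field holds only a placeholder that nothing completes afterwards. The addresses, ports, the zero placeholder, and the names `full_checksum` and `nat_adjust` are all made up for illustration.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum over big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(src: bytes, dst: bytes, segment: bytes) -> int:
    """TCP checksum over the IPv4 pseudo-header plus a segment whose
    checksum field is currently zero."""
    pseudo = src + dst + struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


def nat_adjust(field: int, old_addr: bytes, new_addr: bytes) -> int:
    """Incremental update per RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m')."""
    total = ~field & 0xFFFF
    old_words = struct.unpack("!2H", old_addr)
    new_words = struct.unpack("!2H", new_addr)
    for old, new in zip(old_words, new_words):
        total += (~old & 0xFFFF) + new
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical example values: a LAN guest, the firewall's WAN address, a remote host.
lan_src = bytes([10, 23, 10, 5])
wan_src = bytes([192, 168, 0, 212])
dst = bytes([93, 184, 216, 34])
# Bare TCP SYN header (no options, no payload), checksum field set to zero.
syn = struct.pack("!HHIIHHHH", 40000, 80, 1, 0, 0x5002, 0xFFFF, 0, 0)

# What must be on the wire after the source address is rewritten.
expected = full_checksum(wan_src, dst, syn)

# Case 1: the guest filled in a complete checksum before NAT saw the packet.
from_valid = nat_adjust(full_checksum(lan_src, dst, syn), lan_src, wan_src)

# Case 2 (assumption): the field was left as an empty placeholder for offload.
from_empty = nat_adjust(0x0000, lan_src, wan_src)

print(f"expected after NAT:    {expected:#06x}")
print(f"adjusted from valid:   {from_valid:#06x}")
print(f"adjusted from empty:   {from_empty:#06x}")
```

Adjusting a complete checksum lands on the expected value, while adjusting the placeholder does not, which would match the observation that only traffic arriving without a finished checksum breaks.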

<-> = physical link, i.e. cat5/cat6 plugged into a NIC or switch
<~> = loopback link, i.e. traffic between bhyve guests, between host and bhyve guest, ... anything that never hits the MAC layer.
<.> = wireless link

This flow is currently broken (only when pf nat is involved):

fbsd_guest1 <~> fbsd_guest_fw <-> switch(1) <-> modem
win10_guest <~> fbsd_guest_fw <-> switch(1) <-> modem

This flow is OK:

macbook <-> switch <->(2) fbsd_guest_fw <-> switch <-> modem
macbook <.> AP <-> switch <-> fbsd_guest_fw <-> switch <-> modem

1) using port-mirror on the switch I was able to confirm packets with a bad checksum end up on the wire
2) the bhyve guest has a vnic on the physical nic (the host OS treats them as bridged)
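
For anyone who wants to repeat check 1), here is a rough Python sketch of verifying the TCP checksum of one captured packet by hand. It assumes the input is a raw, unfragmented IPv4 packet with the Ethernet header already stripped; `tcp_checksum_ok` is a made-up helper name, and getting the bytes out of the capture is left out. Note that a capture taken on the sending host itself normally shows bad checksums whenever TX offloading is on, which is why the port mirror matters.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum over big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum_ok(ip_packet: bytes) -> bool:
    """Return True if the TCP checksum of a raw IPv4 packet verifies.

    Assumes an unfragmented IPv4 packet without the Ethernet header.
    """
    header_len = (ip_packet[0] & 0x0F) * 4           # IHL is in 32-bit words
    total_len = struct.unpack("!H", ip_packet[2:4])[0]
    if ip_packet[9] != 6:                            # IP protocol 6 = TCP
        raise ValueError("not a TCP packet")
    segment = ip_packet[header_len:total_len]
    # Pseudo-header: source address, destination address, zero, protocol, TCP length.
    pseudo = ip_packet[12:20] + struct.pack("!BBH", 0, 6, len(segment))
    # Summing the segment including its checksum field gives 0xFFFF when valid.
    return ones_complement_sum(pseudo + segment) == 0xFFFF
```

Running this over the same packet as seen on the mirror port with offloading on and with it off should show whether the bad checksum really leaves the host.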

#### workaround (comes at a performance hit)
root@nattest:~ # ifconfig vtnet0 -rxcsum -txcsum -rxcsum6 -txcsum6 -tso4 -lro
root@nattest:~ # ifconfig vtnet1 -rxcsum -txcsum -rxcsum6 -txcsum6 -tso4 -lro

Re-enabling these makes the behavior return; this can be done live!


### uname -a
FreeBSD nattest 12.0-RELEASE FreeBSD 12.0-RELEASE r341666 GENERIC amd64

### ifconfig output

vtnet0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=6c05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:22:06:05:01:0a
        inet 192.168.0.212 netmask 0xffffff00 broadcast 192.168.0.255
        media: Ethernet 10Gbase-T <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vtnet1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=6c05bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:22:06:0a:01:01
        inet 10.23.10.87 netmask 0xffffff00 broadcast 10.23.10.255
        media: Ethernet 10Gbase-T <full-duplex>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
        groups: pflog


### /etc/pf.conf
if_wan="vtnet0"
if_lan="vtnet1"
net_lan=$if_lan:network

scrub on lo0 all random-id
scrub on vtnet0 all random-id
scrub on vtnet1 all random-id

nat on $if_wan inet from $net_lan to any -> ($if_wan) port 1024:65535

antispoof log for vtnet0
antispoof log for vtnet1

block in all
pass on lo0 all
pass inet proto icmp all icmp-type { echoreq, unreach } keep state
pass out all keep state
pass in on $if_lan proto tcp from $net_lan to any port { 22, 80, 443 }

### /etc/rc.conf

hostname="nattest"
gateway_enable="YES"
ifconfig_vtnet0="DHCP"
ifconfig_vtnet1="10.23.10.87 netmask 255.255.255.0"

zfs_enable="YES"
clear_tmp_enable="YES"
sshd_enable="YES"
dumpdev="AUTO"
pf_enable="YES"
pflog_enable="YES"
Comment 1 Jorge Schrauwen 2019-02-08 17:09:15 UTC
I was also able to reproduce the bug using ipfw with this config:

firewall_enable="YES"
firewall_type="OPEN"
firewall_logging="YES"
natd_enable="YES"
natd_interface="vtnet0"
natd_flags="-dynamic -m"
Comment 2 Eugene Grosbein freebsd_committer 2019-02-08 20:50:57 UTC
(In reply to Jorge Schrauwen from comment #1)

For ipfw nat and/or natd both based on libalias, this is known and documented in the ipfw(8) manual page:

     Due to the architecture of libalias(3), ipfw nat is not compatible with
     the TCP segmentation offloading (TSO).  Thus, to reliably nat your
     network traffic, please disable TSO on your NICs using ifconfig(8).
Comment 3 Jorge Schrauwen 2019-02-08 21:58:01 UTC
Good to know about ipfw. I was discussing this with kp, and he suggested trying a different firewall to confirm or rule out pf nat issues.
Comment 4 Kristof Provost freebsd_committer 2019-02-09 11:04:33 UTC
(In reply to Eugene Grosbein from comment #2)
Right, that makes sense, and I keep forgetting that about ipfw.

Does ipf have the same limitation? I'd quite like to work out if the problem is in vtnet or in pf.

There's another report of issues that look similar and where pf doesn't appear to be a factor:
https://lists.freebsd.org/pipermail/freebsd-questions/2019-February/284348.html
Comment 5 Eugene Grosbein freebsd_committer 2019-02-09 11:20:22 UTC
(In reply to Kristof Provost from comment #4)

I do not know, never used pfnat.
Comment 6 Jorge Schrauwen 2019-04-18 18:49:10 UTC
So for ipf it's https://www.freebsd.org/doc/handbook/firewalls-ipf.html right?
I'm a bit busy but with the long holiday weekend I might have a few hours to try and replicate this with ipf.