Hello,

I'm trying to rewrite the MSS option on one of my gateways. I have two interfaces (each one is a lagg on top of ix VFs), both are in fib 1, and traffic is NATed from lagg1 to lagg0.

ifconfig output:

ixv0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=405bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO>
	ether 00:16:3e:22:ac:63
	hwaddr 00:16:3e:22:ac:63
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
ixv1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=405bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO>
	ether 00:16:3e:22:ac:63
	hwaddr 00:16:3e:fd:31:cb
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
ixv2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=405bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO>
	ether 00:16:3e:26:17:b5
	hwaddr 00:16:3e:26:17:b5
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
ixv3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=405bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO>
	ether 00:16:3e:26:17:b5
	hwaddr 00:16:3e:3a:73:21
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
	inet 127.0.0.1 netmask 0xff000000
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	groups: lo
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=405bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO>
	ether 00:16:3e:22:ac:63
	inet XXX.XXX.XXX.XXX netmask 0xffffff80 broadcast 155.133.140.127
	inet XXX.XXX.XXX.XXX netmask 0xffffffff broadcast 155.133.142.65
	nd6 options=2b<PERFORMNUD,ACCEPT_RTADV,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
	fib: 1
	groups: lagg
	laggproto failover lagghash l2,l3,l4
	laggport: ixv0 flags=5<MASTER,ACTIVE>
	laggport: ixv1 flags=0<>
lagg1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=405bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,VLAN_HWTSO>
	ether 00:16:3e:26:17:b5
	inet 172.23.0.253 netmask 0xffff8000 broadcast 172.23.127.255
	inet 172.23.0.254 netmask 0xffff0000 broadcast 172.23.255.255
	nd6 options=2b<PERFORMNUD,ACCEPT_RTADV,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
	fib: 1
	groups: lagg
	laggproto failover lagghash l2,l3,l4
	laggport: ixv2 flags=5<MASTER,ACTIVE>
	laggport: ixv3 flags=0<>

ipfw rules:

ipfw pipe 1 config bw 2000Mbit/s
ipfw pipe 2 config bw 2000Mbit/s
ipfw queue 1 config pipe 1 mask src-ip 0xffffffff
ipfw queue 2 config pipe 2 mask dst-ip 0xffffffff

# Setup tables
ipfw table blacklist create type addr
ipfw table nonat create type addr
ipfw table nonat add XXX.XXX.XXX.XXX/24
ipfw table nat create type addr
ipfw table nat add 172.23.0.0/17

# Setup rules
ipfw add 00100 allow ip from any to any via lo0
ipfw add 00200 deny ip from any to 127.0.0.0/8
ipfw add 00201 deny ip from 127.0.0.0/8 to any
ipfw add 00202 deny ip from 'table(blacklist)' to any
ipfw add 00203 deny ip from any to 'table(blacklist)'
ipfw add 00500 queue 1 ip from any to any xmit lagg1 out
ipfw add 00501 queue 2 ip from any to any recv lagg1 in
ipfw add 02100 nat 123 ip from any to not 'table(nonat)' fib 1
ipfw add 64999 allow ip from any to any fib 1
ipfw add 65000 allow ip from any to any fib 0
ipfw add 65535 deny ip from any to any
ipfw nat 123 config ip XXX.XXX.XXX.XXX log reset

All the configuration above works correctly.
If I add:

ipfw add 02005 tcp-setmss 1460 tcp from any to any fib 1

I can see that the rule is hit, but the MSS isn't updated.

This capture is from lagg1, so I guess it's normal that the MSS isn't rewritten at this point:

15:17:34.928408 IP 172.23.6.163.58048 > 83.166.144.237.http: Flags [S], seq 1940485466, win 26880, options [mss 8960,sackOK,TS val 414737643 ecr 0,nop,wscale 9], length 0

From lagg0:

15:17:34.929409 IP XXX.XXX.XXX.XXX.53942 > 83.166.144.237.http: Flags [S], seq 1940485466, win 26880, options [mss 8960,sackOK,TS val 414737643 ecr 0,nop,wscale 9], length 0

Is there something I'm missing?

Thanks
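For reference, the MSS values in these captures follow from the interface MTUs. The sketch below is only that arithmetic, assuming IPv4 and TCP headers without options (20 bytes each); the function name `mss_for_mtu` is made up for illustration and is not part of ipfw.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20   # bytes, TCP header without options (assumption)


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one IPv4 packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


# lagg1 is the internal side (mtu 9000), lagg0 the external side (mtu 1500).
for name, mtu in (("lagg1", 9000), ("lagg0", 1500)):
    print(f"{name}: mtu {mtu} -> mss {mss_for_mtu(mtu)}")
```

This is why a SYN arriving on lagg1 advertises mss 8960, and why 1460 is the natural clamp value for traffic leaving through lagg0.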
This works if I set, for example, a value of 1400 and the traffic originates from the same machine:

15:41:17.834140 IP 155.133.142.65.37976 > 83.166.144.237.http: Flags [S], seq 1088266918, win 65535, options [mss 1400,nop,wscale 6,sackOK,TS val 4711532 ecr 0], length 0
I've just tested without NAT (bypassing it by adding the destination IP to the table excluded from NAT) and the situation is the same. So maybe the FIBs are the problem?
(In reply to Emmanuel Vadot from comment #2)
> I've just tested without NAT (just bypassing by adding the destination IP on
> a discarded table) and the situation is the same.
> So maybe fib are the problems ?

Can you clarify a bit: does tcp-setmss work for locally originated traffic but not for forwarded traffic, regardless of whether NAT is present?

Also, since you use NAT, you need to disable TSO on the NICs in use.
(In reply to Andrey V. Elsukov from comment #3)

Yes, exactly. I didn't know about disabling TSO when using NAT; I'll test that tomorrow. Thanks.
(In reply to Emmanuel Vadot from comment #4)

By the way, what is the reason for disabling TSO when using NAT?
(In reply to Emmanuel Vadot from comment #5)

Emmanuel, I can't tell you why it matters, but I can say that it does make a significant difference. See my mailing list thread about this from last year at https://lists.freebsd.org/pipermail/freebsd-ipfw/2017-August/006578.html (disabling TSO). Slightly related is another thread, https://lists.freebsd.org/pipermail/freebsd-ipfw/2017-September/006585.html (disable TXCSUM).

Also, see the BUGS section of the ipfw(8) man page:

"Due to the architecture of libalias(3), ipfw nat is not compatible with the TCP segmentation offloading (TSO). Thus, to reliably nat network traffic, please disable TSO on your NICs using ifconfig(8)."

Graham
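To see which interfaces still advertise TSO, one option is to scan the ifconfig output for TSO capability flags. The following is a rough sketch rather than an official tool: the function name `interfaces_with_tso` and the sample text are illustrative, and the parsing assumes the `options=...<...>` line format shown in the ifconfig output earlier in this report.

```python
import re


def interfaces_with_tso(ifconfig_text: str) -> list:
    """Return interface names whose options= line lists a TSO capability."""
    found = []
    current = None
    for line in ifconfig_text.splitlines():
        header = re.match(r"(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # Only the indented capability line starts with "options=";
        # the "nd6 options=" line is deliberately not matched.
        opts = re.match(r"\s+options=[0-9a-f]+<([^>]*)>", line)
        if opts and current:
            caps = opts.group(1).split(",")
            if any("TSO" in cap for cap in caps):
                found.append(current)
    return found


# Trimmed sample in the same format as the report's ifconfig output.
sample = (
    "lagg0: flags=8943<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
    "\toptions=405bb<RXCSUM,TXCSUM,VLAN_MTU,TSO4,LRO,VLAN_HWTSO>\n"
    "\tnd6 options=2b<PERFORMNUD,ACCEPT_RTADV,IFDISABLED,AUTO_LINKLOCAL>\n"
    "lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384\n"
    "\toptions=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>\n"
)
print(interfaces_with_tso(sample))
```

Any interface reported here would then be a candidate for `ifconfig <interface> -tso`, as the man page text above recommends.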
Created attachment 195605
Fix for little endian machines

The tcp-setmss option is broken on little endian machines. The comparison between oldmss and mss needs to be done in host byte order.
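The idea behind the fix can be sketched outside the kernel. The Python below is not the code in tcpmod.c; it is a simplified model, under the assumption that the clamp walks the TCP options of a SYN, decodes the MSS into an ordinary integer (the equivalent of ntohs()) and only then compares it. The function name `clamp_mss` is hypothetical, and the TCP checksum update that a real implementation needs is left out.

```python
import struct

TCPOPT_EOL = 0     # end of option list
TCPOPT_NOP = 1     # one-byte padding
TCPOPT_MAXSEG = 2  # maximum segment size, length 4


def clamp_mss(options: bytes, new_mss: int) -> bytes:
    """Return a copy of the TCP options with the MSS lowered to new_mss.

    The options are returned unchanged if there is no MSS option or if
    the advertised MSS is already less than or equal to new_mss.
    """
    buf = bytearray(options)
    i = 0
    while i < len(buf):
        kind = buf[i]
        if kind == TCPOPT_EOL:
            break
        if kind == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(buf):
            break  # truncated option
        length = buf[i + 1]
        if length < 2 or i + length > len(buf):
            break  # malformed option
        if kind == TCPOPT_MAXSEG and length == 4:
            # "!H" decodes the two bytes as big-endian (network order),
            # so old_mss is a plain host integer before the comparison.
            (old_mss,) = struct.unpack_from("!H", buf, i + 2)
            if old_mss > new_mss:
                struct.pack_into("!H", buf, i + 2, new_mss)
            break
        i += length
    return bytes(buf)


# Options of the SYN captured on lagg1:
# [mss 8960,sackOK,TS val 414737643 ecr 0,nop,wscale 9]
syn_options = (
    struct.pack("!BBH", 2, 4, 8960)
    + bytes([4, 2])
    + struct.pack("!BBII", 8, 10, 414737643, 0)
    + bytes([1, 3, 3, 9])
)
clamped = clamp_mss(syn_options, 1460)
print(struct.unpack_from("!H", clamped, 2)[0])
```

Because the comparison happens on decoded integers, the result is the same on little endian and big endian machines.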
The incorrect network-byte-order comparison on a little endian processor explains exactly what Emmanuel is seeing. An MSS of 8960 (0x0023 network order) compared <= against the desired MSS of 1460 (0xB405 network order) is true, so the code breaks out before setting the MSS. With the test value of 1400 (0x7805 network order) it also breaks out.

In the case where the locally generated traffic worked with the 1400 rule, I suspect the MSS was already 1460. 1460 (0xB405 network order) <= 1400 (0x7805 network order) is false because we're comparing the network order values, so we don't break and go on to set the MSS in the packet.

People using it for the usual PPPoE clamp would normally not notice on ordinary packets with an MSS of 1460: the 1460 (0xB405 network order) <= 1452 (0xAC05 network order) test is false, so the MSS is changed. However, any device using an MSS less than 1460 might have its MSS increased.

This bug also affects me, as I also use MTU 9000 (MSS 8960) on my internal network. After I applied my patch and reloaded the kernel module, tcp-setmss started working and set the MSS properly.
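The hexadecimal values above can be reproduced with a short simulation. This is a sketch of the comparison as described in this comment, not the kernel code: `as_seen_by_little_endian` and the two `*_skips_rewrite` helpers are hypothetical names, and the buggy check is modelled as comparing both values in their raw network-order form.

```python
def as_seen_by_little_endian(value: int) -> int:
    """Value a little endian CPU reads from a 16-bit network-order field."""
    return int.from_bytes(value.to_bytes(2, "big"), "little")


def buggy_skips_rewrite(old_mss: int, new_mss: int) -> bool:
    """Pre-fix check: both sides compared as raw network-order values."""
    return as_seen_by_little_endian(old_mss) <= as_seen_by_little_endian(new_mss)


def fixed_skips_rewrite(old_mss: int, new_mss: int) -> bool:
    """Post-fix check: plain host-order comparison."""
    return old_mss <= new_mss


# (old MSS in the packet, MSS configured in the tcp-setmss rule)
cases = ((8960, 1460), (8960, 1400), (1460, 1400), (1460, 1452), (1279, 1452))
for old, new in cases:
    print(
        f"old {old} (0x{as_seen_by_little_endian(old):04X}) "
        f"new {new} (0x{as_seen_by_little_endian(new):04X}): "
        f"buggy skips={buggy_skips_rewrite(old, new)}, "
        f"fixed skips={fixed_skips_rewrite(old, new)}"
    )
```

The last case, an old MSS of 1279 against a 1452 clamp, is an added illustration of the "might have its MSS increased" remark: the buggy comparison does not skip the rewrite, so the MSS would be raised to 1452.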
A commit references this bug:

Author: ae
Date: Wed Aug 8 17:32:03 UTC 2018
New revision: 337469
URL: https://svnweb.freebsd.org/changeset/base/337469

Log:
  Use host byte order when comparing mss values.

  This fixes tcp-setmss action on little endian machines.

  PR: 225536
  Submitted by: John Zielinski

Changes:
  head/sys/netpfil/ipfw/pmod/tcpmod.c
A commit references this bug:

Author: ae
Date: Thu Aug 16 09:42:09 UTC 2018
New revision: 337902
URL: https://svnweb.freebsd.org/changeset/base/337902

Log:
  MFC r337469:
    Use host byte order when comparing mss values.

    This fixes tcp-setmss action on little endian machines.

  PR: 225536
  Submitted by: John Zielinski

Changes:
_U  stable/11/
  stable/11/sys/netpfil/ipfw/pmod/tcpmod.c
Fixed in head/ and stable/11. Thanks!