Bug 260260 - igb(4) I35{0,4} parent <--> vlan jumbo frame mtu mismatch
Summary: igb(4) I35{0,4} parent <--> vlan jumbo frame mtu mismatch
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2021-12-07 06:29 UTC by Marek Zarychta
Modified: 2021-12-17 10:10 UTC (History)
Description Marek Zarychta 2021-12-07 06:29:01 UTC
The vlan(4) children of an LACP lagg(4) consisting of two igb(4) I350 or I354 ports have to use a reduced MTU to work.

To reproduce:

ifconfig_igb0="mtu 9000 up"
ifconfig_igb1="mtu 9000 up"
ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 -lacp_strict"
vlans_lagg0="vlan0 vlan1 ..."
ifconfig_vlan0="inet x.x.x.x/y"

# iperf3 -R -c y.y.y.y
Connecting to host y.y.y.y, port 5201
Reverse mode, remote host y.y.y.y is sending
[  5] local x.x.x.x port 52750 connected to y.y.y.y port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.02   sec  0.00 Bytes  0.00 bits/sec
[  5]   1.02-2.02   sec  0.00 Bytes  0.00 bits/sec
[  5]   2.02-3.02   sec  0.00 Bytes  0.00 bits/sec
[  5]   3.02-3.55   sec  0.00 Bytes  0.00 bits/sec

# ifconfig vlan0 mtu 8996

# iperf3 -R -c y.y.y.y
Connecting to host y.y.y.y, port 5201
Reverse mode, remote host y.y.y.y is sending
[  5] local x.x.x.x port 49056 connected to y.y.y.y port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   118 MBytes   989 Mbits/sec
[  5]   1.00-2.00   sec   118 MBytes   990 Mbits/sec
[  5]   2.00-3.00   sec   118 MBytes   990 Mbits/sec
[  5]   3.00-3.69   sec  81.8 MBytes   989 Mbits/sec
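
For anyone reproducing this without iperf3, a don't-fragment ping sized to the MTU may be a quicker check. This is only a sketch: the helper name icmp_payload is made up for illustration, it assumes IPv4 without options, and it assumes the peer is also configured for jumbo frames. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Illustrative only: largest ICMP echo payload that fits in one IPv4 packet
# of the given MTU (20-byte IPv4 header without options + 8-byte ICMP header).
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Print (not run) the FreeBSD ping invocations; -D sets the Don't Fragment bit.
for mtu in 9000 8996; do
    echo "ping -D -s $(icmp_payload "$mtu") y.y.y.y"
done
```

If the behaviour matches the iperf3 runs, the larger size would be expected to get no replies on an affected box while the smaller one goes through.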

There is no problem with sending jumbo frames; only receiving them is broken. It is not a hardware limitation, since bumping the MTU on the parents also solves the issue, and the configuration below works fine:

ifconfig_igb0="mtu 9004 up"
ifconfig_igb1="mtu 9004 up"
ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 -lacp_strict"
vlans_lagg0="vlan0 vlan1 ..."
ifconfig_vlan0="inet x.x.x.x/y mtu 9000"
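
The 4-byte difference matches the size of an 802.1Q tag. Below is a rough sketch of the on-wire arithmetic, assuming standard Ethernet framing (14-byte header, 4-byte tag, 4-byte FCS). The function name frame_len is made up for illustration, and nothing here is taken from the igb(4) source, so it says nothing definite about how the driver sizes its receive limit.

```shell
#!/bin/sh
# Illustrative only: on-wire frame length for a given L3 MTU.
# Usage: frame_len MTU TAGGED   (TAGGED is 1 for an 802.1Q-tagged frame, else 0)
frame_len() {
    mtu=$1
    tagged=$2
    eth_hdr=14   # destination MAC + source MAC + EtherType
    vlan_tag=4   # 802.1Q tag
    fcs=4        # frame check sequence
    echo $(( mtu + eth_hdr + fcs + tagged * vlan_tag ))
}

frame_len 9000 0   # untagged frame on the parent at MTU 9000
frame_len 9000 1   # tagged frame carrying a vlan(4) MTU of 9000
frame_len 8996 1   # tagged frame carrying a vlan(4) MTU of 8996
```

A tagged frame at vlan MTU 8996 comes out the same length as an untagged frame at MTU 9000, while a tagged frame at vlan MTU 9000 is 4 bytes longer. That would be consistent with a receive-size limit that does not account for the tag, but this is only a guess from the numbers.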

The issue looks specific either to igb(4) in general or only to the I35{0,4}. I have more machines with em(4) running similar setups, but only a few of them, those with igb(4) I35{0,4} NICs, seem to be affected. Moreover, they all worked fine while running either 11.4-STABLE or even 12.1-STABLE at the beginning of 2021.
Comment 1 Marek Zarychta 2021-12-07 07:16:38 UTC
Last tested on 13.0-STABLE stable/13-n248421-3b936a8c889, where the issue persists. It is also worth mentioning that turning off vlanmtu, vlanhwtag, vlanhwfilter, vlanhwtso and vlanhwcsum on the parents doesn't solve it.
Comment 2 Zhenlei Huang 2021-12-07 10:07:50 UTC
That is odd.

If VLANMTU is disabled on the parent interface, the MTU of the VLAN interface is automatically reduced by 4. The MTU of the vlans should be 8996 in your case.
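
The rule above can be written down as a small sketch. The function name expected_vlan_mtu and its 1/0 flag argument are made up for illustration, and this only restates the rule as described here, not the actual vlan(4) code.

```shell
#!/bin/sh
# Illustrative only: expected vlan(4) MTU for a given parent MTU.
# Usage: expected_vlan_mtu PARENT_MTU VLANMTU_ENABLED   (1 = enabled, 0 = disabled)
expected_vlan_mtu() {
    parent_mtu=$1
    vlanmtu=$2
    if [ "$vlanmtu" -eq 1 ]; then
        # Parent advertises VLAN_MTU: the child can use the full parent MTU.
        echo "$parent_mtu"
    else
        # Parent lacks VLAN_MTU: leave room for the 4-byte 802.1Q tag.
        echo $(( parent_mtu - 4 ))
    fi
}

expected_vlan_mtu 9000 1   # VLAN_MTU enabled on the parent
expected_vlan_mtu 9000 0   # VLAN_MTU disabled on the parent
```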

Try the following steps:

1. Disable all VLAN hardware offloading features in rc.conf:

ifconfig_igb0="mtu 9000 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"
ifconfig_igb1="mtu 9000 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"

2. Reboot.

Instead of rebooting, you could also destroy the cloned interfaces and restart the netif service:

# ifconfig vlan0 destroy
# ifconfig vlan1 destroy
# ifconfig ... destroy
# ifconfig lagg0 destroy
# service netif restart
Comment 3 Marek Zarychta 2021-12-07 12:24:13 UTC
(In reply to Zhenlei Huang from comment #2)
1. Indeed, bringing up the interfaces this way:
ifconfig_igb0="mtu 9000 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"
ifconfig_igb1="mtu 9000 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up"
brings the vlan(4) children up with mtu 8996, but IMHO it's not the right solution to the problem.

2. I also tried applying D33154, as suggested by Özkan KIRIK on the net@ mailing list:
> Please see the https://reviews.freebsd.org/D33154,
> igb driver doesn't honor the interface capabilities like vlanhwtag,
> vlan* and etc now. If you want, you can apply the patches from D33154

The patch unfortunately doesn't solve this issue.

FYI, in another machine, I have the setup described below working flawlessly with the same 13.0-STABLE revision.

em0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>

em1: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>


lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
	ether 00:26:55:e4:d3:fa
	laggproto lacp lagghash l2,l3,l4
	laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>


vlan6: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
	vlan: 7 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
Comment 4 Marek Zarychta 2021-12-15 16:15:40 UTC
I was told to update the PR with some em vs igb details, so TL;DR:
1. em(4) works like before with the same MTU 9000 on parent em(4), lagg(4) and vlan(4) with the options: VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWFILTER enabled.
2. igb(4) doesn't work this way: with the options VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO enabled, the vlan(4) MTU has to be lowered by 4 bytes to unbreak receiving jumbo frames.
Comment 5 Zhenlei Huang 2021-12-17 10:10:14 UTC
Since vlan(4) works as expected when the VLAN hardware offloading features are disabled, this should be a bug in the igb(4) driver.