Hello,

In FreeBSD 11.0 LACP is not working with QLogic BCM57800. Under FreeBSD 10.3 it is working.

Tested with Juniper EX4550.

Tested with commands:

ifconfig bxe0 up
ifconfig bxe1 up
ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport bxe0 laggport bxe1 xxx.xxx.xxx.xxx/xx
what does ifconfig lagg0 report?
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether b0:83:fe:e5:fa:82
        inet 172.21.50.2 netmask 0xffffff00 broadcast 172.21.50.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        groups: lagg
        laggproto lacp lagghash l2,l3,l4
        laggport: bxe0 flags=0<>
        laggport: bxe1 flags=0<>
and for bxe0 / bxe1?
bxe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether b0:83:fe:e5:fa:82
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
bxe1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether b0:83:fe:e5:fa:82
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
That looks odd, no options= line for the NICs. Can you try something silly: assign a private IP to bxe0 and see if that changes things?
Looks like it may be an old issue, see: https://forums.freenas.org/index.php?threads/lacp-lagg-issues-with-9-3.26227/
Nothing changes, and yes, it seems to be the same problem. laggproto loadbalance is working fine. LACP was also working in 10.3, so something changed between 10.3 and 11.0.
Deja vu with https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=150249

The main symptom was lagg refusing to work in LACP mode. In this case, the reason was that the driver didn't detect media properly, and the "paperwork" with the kernel failed: the interface wasn't marked as full duplex. As a result, LACP (which checks the full-duplex flag for the interface) refused to use it. Remember that full-duplex is a prerequisite for LACP.

This seems to be a case of incomplete paperwork as well, although the necessary bits seem to be in place. In my case this was the problem with LACP (ieee8023ad_lacp.c):

---------
        /*
         * If the port is not an active full duplex Ethernet link then it can
         * not be aggregated.
         */
        if (IFM_TYPE(media) != IFM_ETHER || (media & IFM_FDX) == 0 ||
            ifp->if_link_state != LINK_STATE_UP) {
                lacp_port_disable(lp);
        } else {
                lacp_port_enable(lp);
        }
---------

But according to ifconfig the interface is marked as full duplex and media seems to be Ethernet. I would add some printf's here to check if this is really the case and some other check is failing.

What does ifconfig -m say of the interfaces?

But that lack of options looks like a driver bug. And it would help to see its capabilities as reported by ifconfig. This is an example with an "em" interface.
---------
% ifconfig -m -v -v em0
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
        capabilities=15399b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP>
        ether 68:05:ca:XX:YY:ZZ
        inet 192.168.1.202 netmask 0xffffff00 broadcast 192.168.1.255
        inet 192.168.1.203 netmask 0xffffffff broadcast 192.168.1.203
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        supported media:
                media autoselect
                media 1000baseT
                media 1000baseT mediaopt full-duplex
                media 100baseTX mediaopt full-duplex
                media 100baseTX
                media 10baseT/UTP mediaopt full-duplex
                media 10baseT/UTP
---------
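For anyone who wants to try the printf approach suggested above, here is a minimal userland sketch of the same three-part test, reporting which condition fails instead of silently disabling the port. The helper names `lacp_port_check` and `lacp_port_report` and the `FAIL_*` bits are hypothetical and do not exist in the kernel. The `IFM_*` and `LINK_STATE_UP` values are copied in by hand on the assumption that they match FreeBSD's `<net/if_media.h>` and `<net/if.h>`, so verify them against your source tree.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumption: these values mirror FreeBSD's <net/if_media.h> and <net/if.h>.
 * They are redefined here only so the sketch builds outside the kernel tree.
 */
#define IFM_NMASK     0x000000e0   /* network-type mask */
#define IFM_ETHER     0x00000020   /* Ethernet media type */
#define IFM_FDX       0x00100000   /* full-duplex option bit */
#define IFM_TYPE(x)   ((x) & IFM_NMASK)
#define LINK_STATE_UP 2

/* Hypothetical failure bits, one per condition in the kernel check. */
enum {
    FAIL_NOT_ETHER = 1,
    FAIL_NOT_FDX   = 2,
    FAIL_LINK_DOWN = 4
};

/* Returns 0 if the port could be aggregated, else a mask of failed tests. */
static int
lacp_port_check(int media, int link_state)
{
    int failed = 0;

    if (IFM_TYPE(media) != IFM_ETHER)
        failed |= FAIL_NOT_ETHER;
    if ((media & IFM_FDX) == 0)
        failed |= FAIL_NOT_FDX;
    if (link_state != LINK_STATE_UP)
        failed |= FAIL_LINK_DOWN;
    return failed;
}

/* Prints one line per failed condition, similar to a debug printf. */
static void
lacp_port_report(const char *ifname, int media, int link_state)
{
    int failed;

    assert(ifname != NULL);
    failed = lacp_port_check(media, link_state);
    if (failed == 0) {
        printf("%s: eligible for aggregation\n", ifname);
        return;
    }
    if (failed & FAIL_NOT_ETHER)
        printf("%s: media type 0x%x is not Ethernet\n", ifname, IFM_TYPE(media));
    if (failed & FAIL_NOT_FDX)
        printf("%s: IFM_FDX not set in media word 0x%x\n", ifname, media);
    if (failed & FAIL_LINK_DOWN)
        printf("%s: link state %d is not LINK_STATE_UP\n", ifname, link_state);
}
```

If all three conditions pass for bxe0 and bxe1, as the ifconfig output suggests they should, the problem is more likely somewhere other than this check.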
ifconfig -m bxe0
bxe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        capabilities=527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
        ether b0:83:fe:e5:fa:82
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
        supported media:
                media autoselect
                media 10Gbase-SR mediaopt full-duplex
And what if you try to mess with the options? Try this:

ifconfig bxe0 -rxcsum
ifconfig bxe1 -rxcsum

I'm just wondering, maybe by doing that you can coerce the driver to complete the paperwork properly.
Sorry, it does not make things better.
Upgraded the NIC firmware to the latest version and tried FreeBSD-12.0-CURRENT-amd64-20161117-r308737-disc1.iso. LACP is still not working. Is there anything more I can provide to help debug this problem?
Hi!

I'm having the same issue in recent versions of FreeBSD, also using the bxe driver. It worked before the upgrade.

When I put the interfaces in promiscuous mode, they start working again for some unknown reason, but when I take them out of promiscuous mode they stop working after about a minute.

If I start a tcpdump (and therefore put the interface in promiscuous mode), I get around 10 LACPv1 packets, then I start seeing other traffic coming in and out as it should, and the interface flags become ACTIVE,COLLECTING,DISTRIBUTING.

Using laggproto loadbalance or failover works fine.

NIC: QLogic NetXtreme II BCM57810 10GbE (B0) BXE v:1.78.81

LACP works on:
10.3-RELEASE-p7 (and at the very least, some earlier versions as well)

LACP does NOT work on:
FreeBSD 11.0-RELEASE-p1
FreeBSD 11.0-RELEASE-p2
The issues are still there in 11.0-RELEASE-p7. bxe and lagg worked perfectly before upgrading from 10.3 to 11.0. I hope this problem can get some priority. Happy to help with debugging.
Still seems to be a problem with 11.1
Everyone having this problem should run "tcpdump -enp -i bxe0" to see whether it shows incoming and/or outgoing LACP Ethernet frames while lagg negotiates the protocol with the partner switch, and report back.
Also related, probably: https://community.emc.com/thread/222482?start=0&tstart=0
LACP relies on receiving ethernet multicasts. It seems that bxe(4) hardware fails to deliver incoming multicasts to its host unless switched to promiscuous mode.
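To make the multicast dependency concrete: LACPDUs are sent to the Slow Protocols group address 01:80:c2:00:00:02 with EtherType 0x8809 and subtype 1 (these constants come from IEEE 802.3, formerly 802.3ad). A NIC that drops multicast frames it has not been told to accept would never hand them to lagg. The sketch below shows how such a frame can be recognized in a raw capture. The function names `is_multicast` and `is_lacpdu` are hypothetical, and the code assumes an untagged Ethernet frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Slow Protocols constants from IEEE 802.3. */
static const uint8_t SLOW_PROTO_MCAST[6] = {0x01, 0x80, 0xc2, 0x00, 0x00, 0x02};
#define ETHERTYPE_SLOW    0x8809
#define SLOW_SUBTYPE_LACP 0x01

/* A destination MAC is a group (multicast) address if the low bit of
 * its first octet is set. */
static bool
is_multicast(const uint8_t *dst)
{
    assert(dst != NULL);
    return (dst[0] & 0x01) != 0;
}

/* Hypothetical helper: true if the untagged Ethernet frame is an LACPDU. */
static bool
is_lacpdu(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    assert(frame != NULL);
    if (len < 15)                 /* 14-byte header plus the subtype octet */
        return false;
    if (memcmp(frame, SLOW_PROTO_MCAST, sizeof(SLOW_PROTO_MCAST)) != 0)
        return false;
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == ETHERTYPE_SLOW && frame[14] == SLOW_SUBTYPE_LACP;
}
```

Because the destination is a group address rather than the port's own MAC, the frame only reaches the host if the hardware multicast filter accepts it or the interface is in promiscuous mode, which would match the behaviour reported earlier in this thread.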
Here is a suspicious commit that might have broken bxe(4) multicast processing: https://svnweb.freebsd.org/base/head/sys/dev/bxe/bxe.c?revision=266979&view=markup

See also the older PR https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=174850 describing the same problem for FreeBSD 8.x.

Current bxe(4) code has some processing for IFF_ALLMULTI but it does not seem functional: https://svnweb.freebsd.org/base/head/sys/dev/bxe/bxe.c?revision=266979&view=markup#l12664

Adding edavis, who may have some thoughts.
*** Bug 227743 has been marked as a duplicate of this bug. ***