Bug 287395 - bnxt(4): BCM57416 not active after 14.2 to 14.3 upgrade
Summary: bnxt(4): BCM57416 not active after 14.2 to 14.3 upgrade
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.3-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Security Team
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2025-06-09 12:48 UTC by mickael.maillot
Modified: 2025-09-16 16:56 UTC

See Also:
kbowling: mfc-stable14+
kbowling: mfc-stable13-
delphij: needs_errata+


Attachments
sysctl dev.bnxt output on 14.3 (79.39 KB, text/plain)
2025-06-09 12:48 UTC, mickael.maillot
fix media list for BASE-T (743 bytes, patch)
2025-06-13 20:40 UTC, cyric@mm.st

Description mickael.maillot 2025-06-09 12:48:17 UTC
Created attachment 261116 [details]
sysctl dev.bnxt output on 14.3

I just booted the 14.3-RELEASE kernel and could not reach the box, so here is some info I collected before going back to the 14.2 kernel.

dmesg bnxt lines show no diff.
sysctl output attached.

pciconf -vl:
bnxt0@pci0:71:0:0:      class=0x020000 rev=0x01 hdr=0x00 vendor=0x14e4 device=0x16d8 subvendor=0x15d9 subdevice=0x16d8
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller'
    class      = network
    subclass   = ethernet
bnxt1@pci0:71:0:1:      class=0x020000 rev=0x01 hdr=0x00 vendor=0x14e4 device=0x16d8 subvendor=0x15d9 subdevice=0x16d8
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller'
    class      = network
    subclass   = ethernet


ifconfig diff:
bnxt0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
        ether 3c:ec:ef:a5:d2:50
-       media: Ethernet autoselect (1000baseT <full-duplex>)
+       media: Ethernet autoselect (Unknown <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
 bnxt1: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
        ether 3c:ec:ef:a5:d2:50
        hwaddr 3c:ec:ef:a5:d2:51
-       media: Ethernet autoselect (1000baseT <full-duplex>)
+       media: Ethernet autoselect (Unknown <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
 lagg0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
@@ -23,8 +23,8 @@
        ether 3c:ec:ef:a5:d2:50
        hwaddr 00:00:00:00:00:00
        laggproto lacp lagghash l2,l3,l4
-       laggport: bnxt0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
-       laggport: bnxt1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
+       laggport: bnxt0 flags=18<COLLECTING,DISTRIBUTING>
+       laggport: bnxt1 flags=18<COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
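
The only change in the laggport lines above is the flags value, 0x1c versus 0x18. The sketch below decodes those two values, assuming the bit assignments implied by the ifconfig labels in this report (ACTIVE = 0x04, COLLECTING = 0x08, DISTRIBUTING = 0x10). The macro and function names are illustrative and are not copied from a kernel header.

```c
#include <stddef.h>
#include <string.h>

/*
 * Bit values inferred from the ifconfig output in this report
 * (0x1c = ACTIVE|COLLECTING|DISTRIBUTING, 0x18 = COLLECTING|DISTRIBUTING).
 * The names are illustrative, not taken from a kernel header.
 */
#define PORT_ACTIVE		0x04u
#define PORT_COLLECTING		0x08u
#define PORT_DISTRIBUTING	0x10u

/* Write a comma-separated list of the flag names set in `flags` into `buf`. */
const char *
decode_laggport_flags(unsigned int flags, char *buf, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} names[] = {
		{ PORT_ACTIVE, "ACTIVE" },
		{ PORT_COLLECTING, "COLLECTING" },
		{ PORT_DISTRIBUTING, "DISTRIBUTING" },
	};
	size_t i;

	if (len == 0)
		return (buf);
	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if ((flags & names[i].bit) == 0)
			continue;
		/* Separate entries with a comma once the buffer is non-empty. */
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, names[i].name, len - strlen(buf) - 1);
	}
	return (buf);
}
```

Under these assumptions, decoding 0x1c and 0x18 with a 64-byte buffer reproduces the two label sets shown in the diff, and the values differ only in the ACTIVE bit: on 14.3 both ports are missing the ACTIVE flag.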
Comment 1 Einar Bjarni Halldórsson 2025-06-13 14:54:42 UTC
We're seeing the same thing after upgrading to 14.3-RELEASE.

bnxt0@pci0:198:0:0:	class=0x020000 rev=0x01 hdr=0x00 vendor=0x14e4 device=0x16d8 subvendor=0x15d9 subdevice=0x16d8
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller'
    class      = network
    subclass   = ethernet
bnxt1@pci0:198:0:1:	class=0x020000 rev=0x01 hdr=0x00 vendor=0x14e4 device=0x16d8 subvendor=0x15d9 subdevice=0x16d8
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller'
    class      = network
    subclass   = ethernet

Exactly the same symptoms. We are now running on only one interface, without lagg, until we can fix LACP.
Comment 2 cyric@mm.st 2025-06-13 20:40:47 UTC
Created attachment 261245 [details]
fix media list for BASE-T

I'm not sure whether fixing the difference in media output will solve the real LACP problem here; I can't test it, as I don't have an LACP-capable switch, but the regression should be fixed anyway.  The patch is against the main branch but should likely apply to releng/14.3 as well. Please test whether it helps with LACP.
Comment 3 Einar Bjarni Halldórsson 2025-06-13 22:18:49 UTC
I tried the patch and it works!

$ ifconfig
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
	options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
	inet 127.0.0.1 netmask 0xff000000
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	groups: lo
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bnxt0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
	options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
	ether 3c:ec:ef:e5:ca:de
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bnxt1: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
	options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
	ether 3c:ec:ef:e5:ca:de
	hwaddr 3c:ec:ef:e5:ca:df
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lagg0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
	options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
	ether 3c:ec:ef:e5:ca:de
	hwaddr 00:00:00:00:00:00
	inet 185.93.156.40 netmask 0xffffff80 broadcast 185.93.156.127
	inet6 fe80::3eec:efff:fee5:cade%lagg0 prefixlen 64 scopeid 0x4
	inet6 2001:67c:6c:56::40 prefixlen 64
	laggproto lacp lagghash l2,l3,l4
	laggport: bnxt0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	laggport: bnxt1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
	groups: lagg
	media: Ethernet autoselect
	status: active
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Comment 4 commit-hook freebsd_committer freebsd_triage 2025-06-14 23:55:35 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5e6e4f752833acc96f1efc893318d3f6b74b9689

commit 5e6e4f752833acc96f1efc893318d3f6b74b9689
Author:     Kevin Bowling <kbowling@FreeBSD.org>
AuthorDate: 2025-06-14 23:46:05 +0000
Commit:     Kevin Bowling <kbowling@FreeBSD.org>
CommitDate: 2025-06-14 23:54:22 +0000

    bnxt: Fix BASE-T, 40G AOC, 1G-CX, autoneg and unknown media lists

    This was broken in c63d67e137f3, the early returns prevent building the
    media lists as expected.

    The BASE-T parts of the patch were suggested by "cyric@mm.st", while I
    am adding the additional 40G AOC, 1CX, autoneg and unknown PHY fixes
    based on code inspection.  There may be additional work left here for
    Broadcom but this is certainly better than the returns.

    PR:             287395

    Reported by:    mickael.maillot@gmail.com, cyric@mm.st
    Tested by:      Einar Bjarni Halldórsson <einar@isnic.is>
    MFC after:      1 week

 sys/dev/bnxt/bnxt_en/if_bnxt.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
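
The "early returns" mentioned in the commit message can be illustrated with a minimal sketch. It assumes a two-stage function: a switch classifies the PHY, and code after the switch registers the link modes. All identifiers here (phy_type, media_class, media_list, build_media_list_*) are hypothetical stand-ins, not the names used in if_bnxt.c, and the media entries are placeholders.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from if_bnxt.c. */
enum phy_type { PHY_OPTICAL_SR, PHY_BASE_T };
enum media_class { MEDIA_CLASS_NONE, MEDIA_CLASS_SR, MEDIA_CLASS_T };

struct media_list {
	const char *entries[4];
	size_t count;
};

static void
media_add(struct media_list *ml, const char *name)
{
	if (ml->count < sizeof(ml->entries) / sizeof(ml->entries[0]))
		ml->entries[ml->count++] = name;
}

/* Second stage: register the link modes for the class the switch chose. */
static void
register_modes(struct media_list *ml, enum media_class mc)
{
	if (mc == MEDIA_CLASS_SR)
		media_add(ml, "10Gbase-SR");
	if (mc == MEDIA_CLASS_T) {
		media_add(ml, "1000baseT");
		media_add(ml, "10Gbase-T");
	}
	media_add(ml, "autoselect");
}

/* Regressed shape: the BASE-T case leaves the whole function. */
void
build_media_list_buggy(struct media_list *ml, enum phy_type phy)
{
	enum media_class mc = MEDIA_CLASS_NONE;

	ml->count = 0;
	switch (phy) {
	case PHY_OPTICAL_SR:
		mc = MEDIA_CLASS_SR;
		break;
	case PHY_BASE_T:
		mc = MEDIA_CLASS_T;
		return;		/* skips register_modes(): list stays empty */
	}
	register_modes(ml, mc);
}

/* Fixed shape: break leaves only the switch, so registration still runs. */
void
build_media_list_fixed(struct media_list *ml, enum phy_type phy)
{
	enum media_class mc = MEDIA_CLASS_NONE;

	ml->count = 0;
	switch (phy) {
	case PHY_OPTICAL_SR:
		mc = MEDIA_CLASS_SR;
		break;
	case PHY_BASE_T:
		mc = MEDIA_CLASS_T;
		break;
	}
	register_modes(ml, mc);
}
```

In this sketch the buggy variant leaves the list empty for PHY_BASE_T, while the fixed variant registers three entries. An empty list of this kind would be consistent with the "Unknown" media that ifconfig reported on 14.3.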
Comment 5 Kevin Bowling freebsd_committer freebsd_triage 2025-06-15 08:52:10 UTC
(In reply to cyric@mm.st from comment #2)
Thanks, this indeed looks right. The ambiguity in the commit message is that I am not sure whether something else was intended for PAM4, but there shouldn't be much overlap between the media types in question and situations where the newer encoding is in use.
Comment 6 Einar Bjarni Halldórsson 2025-06-15 23:31:06 UTC
(In reply to commit-hook from comment #4)

That's great, but it will have to be merged into releng/14.3 branch as well (no idea about 13.4 or 13.5).
Comment 7 commit-hook freebsd_committer freebsd_triage 2025-06-22 07:19:41 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=33f65f12eba10588827a13d232337616f6f4facf

commit 33f65f12eba10588827a13d232337616f6f4facf
Author:     Kevin Bowling <kbowling@FreeBSD.org>
AuthorDate: 2025-06-14 23:46:05 +0000
Commit:     Kevin Bowling <kbowling@FreeBSD.org>
CommitDate: 2025-06-22 07:18:41 +0000

    bnxt: Fix BASE-T, 40G AOC, 1G-CX, autoneg and unknown media lists

    This was broken in c63d67e137f3, the early returns prevent building the
    media lists as expected.

    The BASE-T parts of the patch were suggested by "cyric@mm.st", while I
    am adding the additional 40G AOC, 1CX, autoneg and unknown PHY fixes
    based on code inspection.  There may be additional work left here for
    Broadcom but this is certainly better than the returns.

    PR:             287395

    Reported by:    mickael.maillot@gmail.com, cyric@mm.st
    Tested by:      Einar Bjarni Halldórsson <einar@isnic.is>

    (cherry picked from commit 5e6e4f752833acc96f1efc893318d3f6b74b9689)

 sys/dev/bnxt/bnxt_en/if_bnxt.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
Comment 8 Kevin Bowling freebsd_committer freebsd_triage 2025-06-22 07:22:15 UTC
Thank you for the patch and testing.  stable/13 is not affected.
Comment 9 Kevin Bowling freebsd_committer freebsd_triage 2025-06-23 00:17:41 UTC
(In reply to Einar Bjarni Halldórsson from comment #6)
We'd need additional feedback from at least the patch submitter to confirm my changes still work to consider an -EN.
Comment 10 mickael.maillot 2025-06-24 12:06:32 UTC
Switched to 14-STABLE today and everything works now.
Comment 11 crest 2025-07-13 15:42:53 UTC
This needs an errata because it fucking breaks all network connectivity for affected users.
Comment 12 geoffroy desvernay 2025-07-15 14:44:03 UTC
I tried this patch after hitting the same error (upgrade to FreeBSD 14.3: lagg0 lost connectivity, media unknown, each bnxt* working alone but not in the lagg) and it fixed the problem for me.

Tested on hardware with up-to-date Dell firmware:
dev.bnxt.0.ver.hwrm_min_ver: 1.10.3
dev.bnxt.0.ver.package_ver: 23.21.14.14
dev.bnxt.0.ver.chip_type: ASIC
dev.bnxt.0.ver.chip_bond_id: 0
dev.bnxt.0.ver.chip_metal: 1
dev.bnxt.0.ver.chip_rev: 1
dev.bnxt.0.ver.chip_num: 5848
dev.bnxt.0.ver.phy_partnumber: 
dev.bnxt.0.ver.phy_vendor: 
dev.bnxt.0.ver.roce_fw_name: BONO_FW
dev.bnxt.0.ver.netctrl_fw_name: KONG_FW
dev.bnxt.0.ver.mgmt_fw_name: AFW_232.0.164.4
dev.bnxt.0.ver.hwrm_fw_name: CHIMP_FW
dev.bnxt.0.ver.phy: 13.1.11
dev.bnxt.0.ver.fw_ver: 232.0.164.4/pkg 23.21.14.14
dev.bnxt.0.ver.roce_fw: 232.0.164.4
dev.bnxt.0.ver.netctrl_fw: 232.0.164.4
dev.bnxt.0.ver.mgmt_fw: 232.0.164.4
dev.bnxt.0.ver.driver_hwrm_if: 1.10.3.61
dev.bnxt.0.ver.hwrm_if: 1.10.3
dev.bnxt.0.iflib.override_nrxds: 0,0,0
dev.bnxt.0.iflib.override_ntxds: 0,0,0
dev.bnxt.0.iflib.override_qs_enable: 0
dev.bnxt.0.iflib.override_nrxqs: 0
dev.bnxt.0.iflib.override_ntxqs: 0
dev.bnxt.0.iflib.driver_version: 230.0.133.0
Comment 13 Aurélien Méré 2025-08-31 13:06:03 UTC
Hi

I have also been affected by this bug with bnxt.
The 25G SFP28 interfaces were not affected, but all 10GBASE-T interfaces on our production servers were no longer syncing in LACP.

I confirm that the patch has corrected the problem.

We are currently upgrading all production systems to 14.3, as 14.2 will no longer be supported after the end of next month. I strongly suggest releasing an errata notice quickly, as this bug completely breaks connectivity in a production release on Broadcom interfaces that are widespread in Dell servers.

Best regards
Comment 14 Philippe BEAUMONT 2025-09-02 09:30:37 UTC
Hi,

Same here. The patch corrects the regression.

For those of us affected by this bug, it is critical: we can't upgrade to 14.3 without breaking the network. Please release the errata.
Comment 15 Xin LI freebsd_committer freebsd_triage 2025-09-04 21:39:02 UTC
Potential errata draft snippet, please review / comment.

======

I. Background

The bnxt(4) driver provides support for the Broadcom NetXtreme-C/NetXtreme-E family of Ethernet controllers. A key function of the driver is to report the supported physical media types and operational modes (e.g., 1000BASE-T, 40GBASE-AOC, full-duplex, autoselect) to the operating system's ifmedia interface. This allows network administrators to view and configure the interface link settings.

II. Problem Description

A logic error was introduced into the bnxt(4) driver which prevented the proper population of the supported media list for several physical connection types. Inside the function responsible for building this list, a switch statement incorrectly used return statements instead of break statements. This caused the function to exit prematurely after identifying certain media types, including common BASE-T (copper), 40G Active Optical Cable (AOC), and 1G-CX connections, before the corresponding speed and duplex options could be registered with the network subsystem.

III. Impact

For network controllers using the affected media types, the driver fails to advertise any supported link modes. An administrator running ifconfig(8) on the interface would see incorrect media (unknown). Because of this, the network interface may be unable to establish a link, as the operating system cannot properly configure it or initiate auto-negotiation. The network port will be unusable.

IV.  Workaround

No workaround is available.  Only systems that use a bnxt(4) device with the affected media types are affected.

V.   Solution

Upgrade your system to a supported FreeBSD stable or release / security
branch (releng) dated after the correction date.

A reboot is required for this fix to take effect.

Perform one of the following:

1) To update your system via a binary patch:

Systems running a RELEASE version of FreeBSD on the amd64 or arm64 platforms,
or the i386 platform on FreeBSD 13, can be updated via the freebsd-update(8)
utility:

# freebsd-update fetch
# freebsd-update install

Reboot the system.

2) To update your system via a source code patch:

The following patches have been verified to apply to the applicable
FreeBSD release branches.

a) Download the relevant patch from the location below, and verify the
detached PGP signature using your PGP utility.

[FreeBSD 14.3]
# fetch https://security.FreeBSD.org/patches/EN-XX:XX/XXXX.patch
# fetch https://security.FreeBSD.org/patches/EN-XX:XX/XXXX.patch.asc
# gpg --verify XXXX.patch.asc

b) Apply the patch.  Execute the following commands as root:

# cd /usr/src
# patch < /path/to/patch

c) Recompile your kernel as described in
<URL:https://www.FreeBSD.org/handbook/kernelconfig.html> and reboot the
system.

====
Comment 16 Einar Bjarni Halldórsson 2025-09-04 21:57:30 UTC
(In reply to Xin LI from comment #15)

> For network controllers using the affected media types, the driver fails to
> advertise any supported link modes. An administrator running ifconfig(8) on the
> interface would see incorrect media (unknown). Because of this, the network
> interface may be unable to establish a link, as the operating system cannot
> properly configure it or initiate auto-negotiation. The network port will be
> unusable.

For me at least, the network ports themselves were usable; I just wasn't able to use them in LACP bonds.
Comment 17 Mark Johnston freebsd_committer freebsd_triage 2025-09-08 14:03:27 UTC
(In reply to Xin LI from comment #15)
Thanks.  I've queued that up internally so the next secteam release will include this patch.
Comment 18 commit-hook freebsd_committer freebsd_triage 2025-09-16 16:31:40 UTC
A commit in branch releng/14.3 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c07b1838f9c9e31696716b188a832ec35003ec2d

commit c07b1838f9c9e31696716b188a832ec35003ec2d
Author:     Kevin Bowling <kbowling@FreeBSD.org>
AuthorDate: 2025-06-14 23:46:05 +0000
Commit:     Gordon Tetlow <gordon@FreeBSD.org>
CommitDate: 2025-09-14 00:24:43 +0000

    bnxt: Fix BASE-T, 40G AOC, 1G-CX, autoneg and unknown media lists

    This was broken in c63d67e137f3, the early returns prevent building the
    media lists as expected.

    The BASE-T parts of the patch were suggested by "cyric@mm.st", while I
    am adding the additional 40G AOC, 1CX, autoneg and unknown PHY fixes
    based on code inspection.  There may be additional work left here for
    Broadcom but this is certainly better than the returns.

    PR:             287395

    Reported by:    mickael.maillot@gmail.com, cyric@mm.st
    Tested by:      Einar Bjarni Halldórsson <einar@isnic.is>
    Approved by:    so
    Security:       FreeBSD-EN-25:17.bnxt

    (cherry picked from commit 5e6e4f752833acc96f1efc893318d3f6b74b9689)
    (cherry picked from commit 33f65f12eba10588827a13d232337616f6f4facf)

 sys/dev/bnxt/bnxt_en/if_bnxt.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)