Bug 284073 - bnxt: kernel panic on 14.2-RELEASE
Summary: bnxt: kernel panic on 14.2-RELEASE
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.2-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2025-01-15 08:19 UTC by Daniel Porsch
Modified: 2025-01-27 17:22 UTC
CC List: 10 users

See Also:


Attachments
kernel panic image (276.85 KB, image/png)
2025-01-15 08:19 UTC, Daniel Porsch
kernel crash with GENERIC-DEBUG (48.33 KB, image/png)
2025-01-20 08:08 UTC, Daniel Porsch
kernel panic with debug (44.29 KB, image/png)
2025-01-20 23:34 UTC, Daniel Porsch
Patch to assert OOB write on-stack allocated variable (1.23 KB, patch)
2025-01-21 10:40 UTC, Zhenlei Huang
new panic (28.24 KB, image/png)
2025-01-21 20:10 UTC, Daniel Porsch

Description Daniel Porsch 2025-01-15 08:19:41 UTC
Created attachment 256708 [details]
kernel panic image

Hi,

Since upgrading from FreeBSD 14.1 to FreeBSD 14.2, we have been getting random kernel panics on multiple servers. The panic message refers to bnxt_dcb_list_app in the bnxt driver.
See the attached screenshot.


Servers: Dell PowerEdge R6615
NIC: 540-BCOD : Broadcom 57416 Dual Port 10GbE BASE-T Adapter, OCP NIC 3.0
Firmware:
dev.bnxt.0.ver.roce_fw_name: BONO_FW
dev.bnxt.0.ver.netctrl_fw_name: KONG_FW
dev.bnxt.0.ver.mgmt_fw_name: AFW_231.0.153.0
dev.bnxt.0.ver.hwrm_fw_name: CHIMP_FW
dev.bnxt.0.ver.fw_ver: 231.0.153.0/pkg 23.11.16.22
dev.bnxt.0.ver.roce_fw: 231.0.153
dev.bnxt.0.ver.netctrl_fw: 231.0.153
dev.bnxt.0.ver.mgmt_fw: 231.0.153
dev.bnxt.0.ver.hwrm_fw: 231.0.153

Has anyone seen something similar, or does anyone have an idea how to solve it? We tried turning off LRO/TSO, but that didn't help.
Downgrading to 14.1 fixes the crashes.
Comment 1 Ed Maste freebsd_committer freebsd_triage 2025-01-15 13:09:41 UTC
The faulting code was added in ac940a8b92ac79df7bab71f50ae3b9aa7cff145d:

    bnxt_en: Add PFC, ETS & App TLVs protocols support
    
    Created new directory "bnxt_en" in /dev/bnxt and /modules/bnxt
    and moved source files and Makefile into respective directory.
    
    ETS support:
    
       - Added new files bnxt_dcb.c & bnxt_dcb.h
       - Added sysctl node 'dcb' and created handlers 'ets' and
         'dcbx_cap'
       - Add logic to validate user input and configure ETS in
         the firmware
       - Updated makefile to include bnxt_dcb.c & bnxt_dcb.h
    
    PFC support:
    
       - Created sysctl handlers 'pfc' under node 'dcb'
       - Added logic to validate user input and configure PFC in
         the firmware.
    
    App TLV support:
    
       - Created 3 new sysctl handlers under node 'dcb'
           - set_apptlv (write only): Sets a specified TLV
           - del_apptlv (write only): Deletes a specified TLV
           - list_apptlv (read only): Lists all APP TLVs configured
       - Added logic to validate user input and configure APP TLVs
         in the firmware.
    
    Added Below DCB ops for management interface:
    
       - Set PFC, Get PFC, Set ETS, Get ETS, Add App_TLV, Del App_TLV
         Lst App_TLV
    
    Reviewed by:            imp
    Approved by:            imp
    Differential revision:  https://reviews.freebsd.org/D45005
    
    (cherry picked from commit 35b53f8c989f62286aad075ef2e97bba358144f8)
Comment 2 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-17 09:04:22 UTC
addr2line shows that the faulting source line is https://cgit.freebsd.org/src/tree/sys/dev/bnxt/bnxt_en/bnxt_sysctl.c?h=releng/14.2#n1959 , but I have no clue how that line can panic.

Maybe the kernel actually panicked within `sysctl_handle_string()`?

Hi Daniel, can you please build the kernel with INVARIANTS enabled, or directly with the `GENERIC-DEBUG` kernel configuration, and test with the new kernel and driver?


```
% readelf -s if_bnxt.ko.debug | grep bnxt_dcb_list_app
   163: 000000000001a5c0   368 FUNC    LOCAL  DEFAULT    1 bnxt_dcb_list_app
% echo "obase=16; ibase=16; 1A5C0 + 144" | bc
1A704
% addr2line -fip -e if_bnxt.ko.debug -j .text 0x1A704
bnxt_dcb_list_app at /usr/src/sys/dev/bnxt/bnxt_en/bnxt_sysctl.c:1959
```
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2025-01-18 16:31:55 UTC
(In reply to Zhenlei Huang from comment #2)
I don't have debug symbols handy to check myself, but maybe the %rip value 0xffffffff80b4dee7 gives a further hint?
Comment 4 Daniel Porsch 2025-01-20 08:08:43 UTC
Created attachment 256835 [details]
kernel crash with GENERIC-DEBUG
Comment 5 Daniel Porsch 2025-01-20 08:09:45 UTC
(In reply to Zhenlei Huang from comment #2)

I have now built and installed the GENERIC-DEBUG kernel, but it crashes directly on boot. I am not sure whether I did something wrong when I built the kernel or whether this is related to the bug; see the new screenshot.
Comment 6 Kristof Provost freebsd_committer freebsd_triage 2025-01-20 08:52:32 UTC
(In reply to Daniel Porsch from comment #5)
That last crash is because the garp timer callback doesn't enter epoch. I've got a fix in progress for that problem.
Comment 7 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-20 09:29:26 UTC
(In reply to Daniel Porsch from comment #5)
That is apparently another genuine bug :)

Did you enable `net.link.ether.inet.garp_rexmit_count`? You can set it back to 0 to prevent that panic, if I read the code right.
Comment 8 Daniel Porsch 2025-01-20 09:46:19 UTC
(In reply to Zhenlei Huang from comment #7)
That helped, and it now boots with the debug kernel. I will send the error when it crashes again. It might take a day or so for it to crash.
Comment 9 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-20 09:59:59 UTC
(In reply to Mark Johnston from comment #3)
> I don't have debug symbols handy to check myself, but maybe the %rip value 
> 0xffffffff80b4dee7 gives a further hint?

```
% addr2line -fip -e kernel.debug 0xffffffff80b4dee7
sysctl_handle_string at /usr/src/sys/kern/kern_sysctl.c:1787
```

See https://cgit.freebsd.org/src/tree/sys/kern/kern_sysctl.c?h=releng/14.2#n1787

That is interesting. The parameter `req` actually lives on the kernel stack (it is allocated on the stack in userland_sysctl()); the call stack is
```
sys___sysctl()
  userland_sysctl()
    sysctl_root()
      sysctl_root_handler_locked()
        bnxt_dcb_list_app()
          sysctl_handle_string()
```
, but the faulting virtual address `0x500000015` appears to be a userland one.
Comment 10 commit-hook freebsd_committer freebsd_triage 2025-01-20 15:11:45 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=38fdcca05d09b4d5426a253d3c484f9481a73ac2

commit 38fdcca05d09b4d5426a253d3c484f9481a73ac2
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2025-01-20 13:24:48 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2025-01-20 13:28:39 +0000

    netinet: enter epoch in garp_rexmit()

    garp_rexmit() is a callback, so is not in net_epoch, which
    arprequest_internal() expects.
    Enter and exit the net_epoch.

    PR:             284073
    MFC after:      1 week
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

 sys/netinet/if_ether.c | 3 +++
 1 file changed, 3 insertions(+)
Comment 11 commit-hook freebsd_committer freebsd_triage 2025-01-20 15:11:58 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b4bd97ec168e97360cf9511b975a20f677864661

commit b4bd97ec168e97360cf9511b975a20f677864661
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2025-01-20 13:27:05 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2025-01-20 13:28:39 +0000

    netinet tests: basic garp test

    Excercise the garp code.
    This doesn't actively verify anything, but is sufficient to trigger the
    panic reported in PR 284073, so it's a useful test case to keep.

    PR:             284073
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

 tests/sys/netinet/arp.sh | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)
Comment 12 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-20 17:06:39 UTC
(In reply to Daniel Porsch from comment #5)
> That helped, and now it boots with the debug kernel, I will send the error
> when it crashes again. It might a day or so for it to crash.

To speed things up: after you get and report the crash again, you can apply https://reviews.freebsd.org/D48495 and https://reviews.freebsd.org/D48496 to the releng/14.2 branch locally and test whether that helps.
Comment 13 Daniel Porsch 2025-01-20 23:34:51 UTC
Created attachment 256867 [details]
kernel panic with debug

Here is the crash with debug.
I will try the patches.
Comment 14 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-21 01:49:12 UTC
(In reply to Zhenlei Huang from comment #9)
Hmm, I was wrong: RIP points to the next instruction. Stupid ...

```
% objdump --disassemble-symbols=sysctl_handle_string /boot/kernel/kernel 

/boot/kernel/kernel:	file format elf64-x86-64

Disassembly of section .text:

ffffffff80b4de20 <sysctl_handle_string>:
ffffffff80b4de20: 55                   	pushq	%rbp
ffffffff80b4de21: 48 89 e5             	movq	%rsp, %rbp
ffffffff80b4de24: 41 57                	pushq	%r15
ffffffff80b4de26: 41 56                	pushq	%r14
ffffffff80b4de28: 41 55                	pushq	%r13
ffffffff80b4de2a: 41 54                	pushq	%r12
ffffffff80b4de2c: 53                   	pushq	%rbx
ffffffff80b4de2d: 50                   	pushq	%rax
ffffffff80b4de2e: 48 89 cb             	movq	%rcx, %rbx
ffffffff80b4de31: 49 89 f6             	movq	%rsi, %r14
ffffffff80b4de34: 48 85 d2             	testq	%rdx, %rdx
ffffffff80b4de37: 0f 84 9c 00 00 00    	je	0xffffffff80b4ded9 <sysctl_handle_string+0xb9>
ffffffff80b4de3d: b8 00 00 08 40       	movl	$0x40080000, %eax       # imm = 0x40080000
ffffffff80b4de42: 23 47 2c             	andl	0x2c(%rdi), %eax
ffffffff80b4de45: 0f 84 8e 00 00 00    	je	0xffffffff80b4ded9 <sysctl_handle_string+0xb9>
ffffffff80b4de4b: 80 3d 5f 22 cb 00 00 	cmpb	$0x0, 0xcb225f(%rip)    # 0xffffffff818000b1 <kdb_active>
ffffffff80b4de52: 0f 85 81 00 00 00    	jne	0xffffffff80b4ded9 <sysctl_handle_string+0xb9>
ffffffff80b4de58: 49 89 d7             	movq	%rdx, %r15
ffffffff80b4de5b: 48 83 7b 10 00       	cmpq	$0x0, 0x10(%rbx)
ffffffff80b4de60: 0f 84 a0 00 00 00    	je	0xffffffff80b4df06 <sysctl_handle_string+0xe6>
ffffffff80b4de66: 4c 89 ff             	movq	%r15, %rdi
ffffffff80b4de69: 48 c7 c6 c0 5c 8d 81 	movq	$-0x7e72a340, %rsi      # imm = 0x818D5CC0
ffffffff80b4de70: ba 02 00 00 00       	movl	$0x2, %edx
ffffffff80b4de75: e8 e6 51 fc ff       	callq	0xffffffff80b13060 <malloc>
ffffffff80b4de7a: 49 89 c4             	movq	%rax, %r12
ffffffff80b4de7d: 48 c7 c7 70 21 bb 81 	movq	$-0x7e44de90, %rdi      # imm = 0x81BB2170
ffffffff80b4de84: 31 f6                	xorl	%esi, %esi
ffffffff80b4de86: e8 f5 bb ff ff       	callq	0xffffffff80b49a80 <_sx_slock_int>
ffffffff80b4de8b: 4c 89 e7             	movq	%r12, %rdi
ffffffff80b4de8e: 4c 89 f6             	movq	%r14, %rsi
ffffffff80b4de91: 4c 89 fa             	movq	%r15, %rdx
ffffffff80b4de94: e8 00 00 00 00       	callq	0xffffffff80b4de99 <sysctl_handle_string+0x79>
ffffffff80b4de99: 48 c7 c7 70 21 bb 81 	movq	$-0x7e44de90, %rdi      # imm = 0x81BB2170
ffffffff80b4dea0: e8 db c2 ff ff       	callq	0xffffffff80b4a180 <_sx_sunlock_int>
ffffffff80b4dea5: 4c 89 e7             	movq	%r12, %rdi
ffffffff80b4dea8: e8 d3 3c 4d 00       	callq	0xffffffff81021b80 <strlen>
ffffffff80b4dead: 48 8d 50 01          	leaq	0x1(%rax), %rdx
ffffffff80b4deb1: 48 89 df             	movq	%rbx, %rdi
ffffffff80b4deb4: 4c 89 e6             	movq	%r12, %rsi
ffffffff80b4deb7: ff 53 28             	callq	*0x28(%rbx)
ffffffff80b4deba: 41 89 c5             	movl	%eax, %r13d
ffffffff80b4debd: 4c 89 e7             	movq	%r12, %rdi
ffffffff80b4dec0: 48 c7 c6 c0 5c 8d 81 	movq	$-0x7e72a340, %rsi      # imm = 0x818D5CC0
ffffffff80b4dec7: e8 34 50 fc ff       	callq	0xffffffff80b12f00 <free>
ffffffff80b4decc: 44 89 e8             	movl	%r13d, %eax
ffffffff80b4decf: 85 c0                	testl	%eax, %eax
ffffffff80b4ded1: 0f 85 de 00 00 00    	jne	0xffffffff80b4dfb5 <sysctl_handle_string+0x195>
ffffffff80b4ded7: eb 64                	jmp	0xffffffff80b4df3d <sysctl_handle_string+0x11d>
ffffffff80b4ded9: 4c 89 f7             	movq	%r14, %rdi
ffffffff80b4dedc: e8 9f 3c 4d 00       	callq	0xffffffff81021b80 <strlen>
ffffffff80b4dee1: 49 89 c7             	movq	%rax, %r15
ffffffff80b4dee4: 49 ff c7             	incq	%r15
ffffffff80b4dee7: 4c 8b 63 10          	movq	0x10(%rbx), %r12
ffffffff80b4deeb: 4c 89 f7             	movq	%r14, %rdi
ffffffff80b4deee: e8 8d 3c 4d 00       	callq	0xffffffff81021b80 <strlen>
ffffffff80b4def3: 48 89 c2             	movq	%rax, %rdx
ffffffff80b4def6: 4d 85 e4             	testq	%r12, %r12
ffffffff80b4def9: 74 33                	je	0xffffffff80b4df2e <sysctl_handle_string+0x10e>
ffffffff80b4defb: 48 ff c2             	incq	%rdx
ffffffff80b4defe: 48 89 df             	movq	%rbx, %rdi
...
```

The current instruction should be `0xffffffff80b4dee4`.

```
% addr2line -fip -e kernel.debug 0xffffffff80b4dee4
sysctl_handle_string at /usr/src/sys/kern/kern_sysctl.c:1783
```

https://cgit.freebsd.org/src/tree/sys/kern/kern_sysctl.c?h=releng/14.2#n1783

Then that makes sense.
Comment 15 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-21 09:47:40 UTC
Update: After carefully reading the disassembled code, I can confirm the faulting instruction is the one at RIP (0xffffffff80b4dee7).

For `sysctl_handle_string()`, `req` is the last argument, which is passed via register %rcx.

```
ffffffff80b4de2e: 48 89 cb             	movq	%rcx, %rbx
```

It is saved to the callee-saved register %rbx, and the following code does not touch it. Its value was `0000000500000005` when passed in, so the indirect memory access
```
ffffffff80b4dee7: 4c 8b 63 10          	movq	0x10(%rbx), %r12
```
will panic.

Part of the disassembled code of if_bnxt.ko:
```
$ objdump --disassemble-symbols=bnxt_dcb_list_app -r /boot/kernel/if_bnxt.ko
...
   1a5cc: 53                           	pushq	%rbx
   1a5cd: 48 81 ec 28 02 00 00         	subq	$0x228, %rsp            # imm = 0x228, reserve app[128] and other local vars. 
   1a5d4: 48 89 cb                     	movq	%rcx, %rbx # save req
...
   1a622: 48 89 5d c8                  	movq	%rbx, -0x38(%rbp)
...
   1a6f0: ba 00 10 00 00               	movl	$0x1000, %edx           # imm = 0x1000
   1a6f5: 4c 89 f7                     	movq	%r14, %rdi
   1a6f8: 4c 89 fe                     	movq	%r15, %rsi
   1a6fb: 48 8b 4d c8                  	movq	-0x38(%rbp), %rcx  # previously saved req
   1a6ff: e8 00 00 00 00               	callq	0x1a704 <bnxt_dcb_list_app+0x144>
		000000000001a700:  R_X86_64_PLT32	sysctl_handle_string-0x4
```

If `bnxt_dcb_ieee_listapp()` writes out of bounds past the on-stack variable app[128], then it makes sense that we get `%rbx == 0000000500000005`. We can add an assertion for that.
Comment 16 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-21 10:40:55 UTC
Created attachment 256871 [details]
Patch to assert OOB write on-stack allocated variable
Comment 17 Daniel Porsch 2025-01-21 12:18:04 UTC
(In reply to Zhenlei Huang from comment #16)

I have applied this patch now; hopefully it doesn't crash again.
Comment 18 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-21 14:46:10 UTC
(In reply to Daniel Porsch from comment #17)
No, the last patch only helps with debugging the OOB write to the on-stack variable. I expect one more kernel panic :), and I believe this time I have found the root cause.

Once that is confirmed, I'll prepare the final fix.
Comment 19 Daniel Porsch 2025-01-21 20:10:56 UTC
Created attachment 256886 [details]
new panic

New panic with the latest patch.
Comment 20 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-22 04:49:08 UTC
(In reply to Daniel Porsch from comment #19)
So my previous assumption 
> If `bnxt_dcb_ieee_listapp()` writes out of bounds past the on-stack variable
> app[128], then it makes sense that we get `%rbx == 0000000500000005`
is right.
Comment 21 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-22 12:19:26 UTC
Hi Daniel, you can now apply https://reviews.freebsd.org/D48495, https://reviews.freebsd.org/D48496 and https://reviews.freebsd.org/D48589 and test again.

Actually, applying only the last one (D48589) should be enough. D48496 prevents a potential OOB write to heap-allocated memory, but there is currently no sign of that happening. I do not have that hardware to test with, so I'll give you credit for D48495 and D48496.
Comment 22 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-22 12:31:27 UTC
Hi Daniel,

BTW, I guess you enabled DCBx on the switch port. To work around this, you can disable DCBx on either the interface [1] or the switch port, at the cost of breaking your current setup (traffic flow priority etc.).

1. https://techdocs.broadcom.com/us/en/storage-and-ethernet-connectivity/ethernet-nic-controllers/bcm957xxx/adapters/Configuration-adapter/RoCE/manually-reconfiguring-network-parameters/enable-rdma-and-disable-dcbx.html
Comment 23 Zhenlei Huang freebsd_committer freebsd_triage 2025-01-25 18:16:41 UTC
Hi Daniel, any good news?
Comment 24 Daniel Porsch 2025-01-27 06:52:28 UTC
(In reply to Zhenlei Huang from comment #23)
Hi,
No crashes since applying D48495.diff, D48496.diff and D48589.diff, so it seems to have worked.
Comment 25 commit-hook freebsd_committer freebsd_triage 2025-01-27 17:22:40 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1c465e52920848dec6a76f0672fa209db7d5e5b5

commit 1c465e52920848dec6a76f0672fa209db7d5e5b5
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2025-01-20 13:24:48 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2025-01-27 09:04:31 +0000

    netinet: enter epoch in garp_rexmit()

    garp_rexmit() is a callback, so is not in net_epoch, which
    arprequest_internal() expects.
    Enter and exit the net_epoch.

    PR:             284073
    MFC after:      1 week
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 38fdcca05d09b4d5426a253d3c484f9481a73ac2)

 sys/netinet/if_ether.c | 3 +++
 1 file changed, 3 insertions(+)
Comment 26 commit-hook freebsd_committer freebsd_triage 2025-01-27 17:22:43 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e69309223199e56397df3b6f750f012eb729d904

commit e69309223199e56397df3b6f750f012eb729d904
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2025-01-20 13:24:48 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2025-01-27 09:04:34 +0000

    netinet: enter epoch in garp_rexmit()

    garp_rexmit() is a callback, so is not in net_epoch, which
    arprequest_internal() expects.
    Enter and exit the net_epoch.

    PR:             284073
    MFC after:      1 week
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 38fdcca05d09b4d5426a253d3c484f9481a73ac2)

 sys/netinet/if_ether.c | 3 +++
 1 file changed, 3 insertions(+)