Bug 155004 - [bce] [panic] kernel panic in bce0 driver
Summary: [bce] [panic] kernel panic in bce0 driver
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2011-02-24 14:10 UTC by Konstantin
Modified: 2022-10-17 12:18 UTC (History)
Description Konstantin 2011-02-24 14:10:07 UTC
We have some IBM System x3550 servers with Broadcom NetXtreme II BCM5708 NICs.
A kernel panic sometimes occurs on them under high load (about 900-950 Mbit/s and 70-90 kpps).

# kgdb kernel.debug /var/crash/vmcore.6

Unread portion of the kernel message buffer:
<118>Feb 22 15:39:51 d-ca1 syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...1 0 done
All buffers synced.
Uptime: 29m2s

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff802bed2a
stack pointer           = 0x10:0xffffff80001b7b70
frame pointer           = 0x10:0x0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 30 (irq257: bce1)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 29m2s
Physical memory: 4082 MB
Dumping 1385 MB: 1370 1354 1338 1322 1306 1290 1274 1258 1242 1226 1210 1194 1178 1162 1146 1130 1114 1098 1082 1066 1050 1034 1018 1002 986 970 954 938 922 906 890 874 858 842 826 810 794 778 762 746 730 714 698 682 666 650 634 618 602 586 570 554 538 522 506 490 474 458 442 426 410 394 378 362 346 330 314 298 282 266 250 234 218 202 186 170 154 138 122 106 90 74 58 42 26 10

(kgdb) where 
#0  doadump () at pcpu.h:195
#1  0x0000000000000004 in ?? ()
#2  0xffffffff805285f9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff80528a02 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
#4  0xffffffff807ec813 in trap_fatal (frame=0xffffff000158aae0, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:777
#5  0xffffffff807ecbe5 in trap_pfault (frame=0xffffff80001b7ac0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:693
#6  0xffffffff807ed50c in trap (frame=0xffffff80001b7ac0) at /usr/src/sys/amd64/amd64/trap.c:464
#7  0xffffffff807d614e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:218
#8  0xffffffff802bed2a in bce_intr (xsc=Variable "xsc" is not available.
) at /usr/src/sys/dev/bce/if_bce.c:5771
#9  0xffffffff80506a92 in ithread_loop (arg=0xffffff000158ea60) at /usr/src/sys/kern/kern_intr.c:1181
#10 0xffffffff805034e3 in fork_exit (callout=0xffffffff80506930 <ithread_loop>, arg=0xffffff000158ea60, frame=0xffffff80001b7c80)
    at /usr/src/sys/kern/kern_fork.c:811
#11 0xffffffff807d652e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:554
#12 0x0000000000000000 in ?? ()
#13 0x0000000000000000 in ?? ()
#14 0x0000000000000001 in ?? ()
#15 0x0000000000000000 in ?? ()
#16 0x0000000000000000 in ?? ()
#17 0x0000000000000000 in ?? ()
#18 0x0000000000000000 in ?? ()
#19 0x0000000000000000 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()

(kgdb) up 8
#8  0xffffffff802bed2a in bce_intr (xsc=Variable "xsc" is not available.
) at /usr/src/sys/dev/bce/if_bce.c:5771
5771                    sc->rx_mbuf_ptr[sw_rx_cons_idx] = NULL;

(kgdb) p *sc
$2 = {bce_ifp = 0xffffff0001591800, bce_dev = 0xffffff0001575100, bce_unit = 1 '\001', bce_res_mem = 0xffffff0001688c00, bce_ifmedia = {ifm_mask = 0,
    ifm_media = 0, ifm_cur = 0x0, ifm_list = {lh_first = 0x0}, ifm_change = 0, ifm_status = 0}, bce_btag = 1, bce_bhandle = 18446742977654030336,
  bce_vhandle = 18446742977654030336, bce_res_irq = 0xffffff0001688580, bce_mtx = {lock_object = {lo_name = 0xffffff00015856d0 "bce1",
      lo_type = 0xffffffff80856f26 "network driver", lo_flags = 16973824, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}},
    mtx_lock = 18446742974220511970, mtx_recurse = 0}, bce_intr = 0xffffffff802bea80 <bce_intr>, bce_intrhand = 0xffffff0001688b00, bce_irq_rid = 1,
  bce_msi_count = 1, bce_chipid = 1460146208, bce_flags = 97, bce_cap_flags = 9, bce_phy_flags = 2, bce_shared_hw_cfg = 325, bce_port_hw_cfg = 0,
  max_bus_addr = 1099511627775, bus_speed_mhz = 133, link_width = 0, link_speed = 0, bce_flash_info = 0xffffffff80a64330, bce_flash_size = 270336,
  bce_shmem_base = 1481728, bce_name = 0x0, bce_bc_ver = 67109637, bce_fw_timed_out = 0, bce_fw_wr_seq = 12, bce_fw_drv_pulse_wr_seq = 1761,
  eaddr = "\000!^o▒ ", bce_tx_quick_cons_trip_int = 20, bce_tx_quick_cons_trip = 20, bce_rx_quick_cons_trip_int = 6, bce_rx_quick_cons_trip = 6,
  bce_comp_prod_trip_int = 0, bce_comp_prod_trip = 0, bce_tx_ticks_int = 80, bce_tx_ticks = 80, bce_rx_ticks_int = 18, bce_rx_ticks = 18,
  bce_com_ticks_int = 0, bce_com_ticks = 0, bce_cmd_ticks_int = 0, bce_cmd_ticks = 0, bce_stats_ticks = 999936, bce_phy_addr = 1,
  bce_miibus = 0xffffff0001560500, rx_prod = 4030, rx_cons = 3519, rx_prod_bseq = 2815260672, tx_prod = 3257, tx_cons = 3257,
  tx_prod_bseq = 1054315439, bce_link = 0, bce_tick_callout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0,
        tqe_prev = 0xffffff804035ad90}}, c_time = 1760719, c_arg = 0xffffff800023f000, c_func = 0xffffffff802be580 <bce_tick>,
    c_mtx = 0xffffff800023f068, c_flags = 0}, bce_pulse_callout = {c_links = {sle = {sle_next = 0xffffffff80b77880}, tqe = {
        tqe_next = 0xffffffff80b77880, tqe_prev = 0xffffff8000223180}}, c_time = 1761038, c_arg = 0xffffff800023f000,
    c_func = 0xffffffff802bbe50 <bce_pulse>, c_mtx = 0xffffff800023f068, c_flags = 6}, watchdog_timer = 0, max_frame_size = 0,
  rx_bd_mbuf_alloc_size = 2048, rx_bd_mbuf_data_len = 2048, rx_bd_mbuf_align_pad = 0, rx_mode = 4096, parent_tag = 0xffffff0001688500,
  tx_bd_chain_tag = 0xffffff0001688380, tx_bd_chain_map = {0x0, 0x0}, tx_bd_chain = {0xffffff80001af000, 0xffffff80001b0000}, tx_bd_chain_paddr = {
    23539712, 23646208}, rx_bd_chain_tag = 0xffffff0001688280, rx_bd_chain_map = {0x0, 0x0}, rx_bd_chain = {0xffffff80001b1000, 0xffffff80001b2000},
  rx_bd_chain_paddr = {23650304, 23654400}, status_tag = 0xffffff0001688480, status_map = 0x0, status_block = 0xffffff000168c140,
  status_block_paddr = 23642432, last_status_idx = 57955, hw_rx_cons = 3550, hw_tx_cons = 3257, stats_tag = 0xffffff0001688400, stats_map = 0x0,
  stats_block = 0xffffff0001680800, stats_block_paddr = 23595008, ctx_pages = 0, ctx_tag = 0x0, ctx_map = {0x0, 0x0, 0x0, 0x0}, ctx_block = {0x0, 0x0,
    0x0, 0x0}, ctx_paddr = {0, 0, 0, 0}, rx_mbuf_tag = 0xffffff0001688200, tx_mbuf_tag = 0xffffff0001688300, tx_mbuf_map = {0x0 <repeats 512 times>},
  tx_mbuf_ptr = {0x0 <repeats 512 times>}, rx_mbuf_map = {0x0 <repeats 512 times>}, rx_mbuf_ptr = {0x0 <repeats 512 times>}, free_rx_bd = 511,
  max_rx_bd = 510, used_tx_bd = 0, max_tx_bd = 510, stat_IfHCInOctets = 1624963858, stat_IfHCInBadOctets = 11558, stat_IfHCOutOctets = 22619194581,
  stat_IfHCOutBadOctets = 0, stat_IfHCInUcastPkts = 18156296, stat_IfHCInMulticastPkts = 0, stat_IfHCInBroadcastPkts = 59,
  stat_IfHCOutUcastPkts = 19297495, stat_IfHCOutMulticastPkts = 0, stat_IfHCOutBroadcastPkts = 3,
  stat_emac_tx_stat_dot3statsinternalmactransmiterrors = 0, stat_Dot3StatsCarrierSenseErrors = 0, stat_Dot3StatsFCSErrors = 0,
  stat_Dot3StatsAlignmentErrors = 0, stat_Dot3StatsSingleCollisionFrames = 0, stat_Dot3StatsMultipleCollisionFrames = 0,
  stat_Dot3StatsDeferredTransmissions = 0, stat_Dot3StatsExcessiveCollisions = 0, stat_Dot3StatsLateCollisions = 0, stat_EtherStatsCollisions = 0,
  stat_EtherStatsFragments = 0, stat_EtherStatsJabbers = 0, stat_EtherStatsUndersizePkts = 0, stat_EtherStatsOversizePkts = 0,
  stat_EtherStatsPktsRx64Octets = 9279747, stat_EtherStatsPktsRx65Octetsto127Octets = 7403467, stat_EtherStatsPktsRx128Octetsto255Octets = 229375,
  stat_EtherStatsPktsRx256Octetsto511Octets = 1168297, stat_EtherStatsPktsRx512Octetsto1023Octets = 10473,
  stat_EtherStatsPktsRx1024Octetsto1522Octets = 64996, stat_EtherStatsPktsRx1523Octetsto9022Octets = 0, stat_EtherStatsPktsTx64Octets = 2279418,
  stat_EtherStatsPktsTx65Octetsto127Octets = 1215338, stat_EtherStatsPktsTx128Octetsto255Octets = 78344,
  stat_EtherStatsPktsTx256Octetsto511Octets = 388296, stat_EtherStatsPktsTx512Octetsto1023Octets = 771841,
  stat_EtherStatsPktsTx1024Octetsto1522Octets = 14564261, stat_EtherStatsPktsTx1523Octetsto9022Octets = 0, stat_XonPauseFramesReceived = 0,
  stat_XoffPauseFramesReceived = 0, stat_OutXonSent = 0, stat_OutXoffSent = 0, stat_FlowControlDone = 0, stat_MacControlFramesReceived = 0,
  stat_XoffStateEntered = 0, stat_IfInFramesL2FilterDiscards = 175, stat_IfInRuleCheckerDiscards = 0, stat_IfInFTQDiscards = 0,
  stat_IfInMBUFDiscards = 0, stat_IfInRuleCheckerP4Hit = 0, stat_CatchupInRuleCheckerDiscards = 0, stat_CatchupInFTQDiscards = 0,
  stat_CatchupInMBUFDiscards = 0, stat_CatchupInRuleCheckerP4Hit = 0, com_no_buffers = 10335, mbuf_alloc_failed_count = 0, fragmented_mbuf_count = 0,
  unexpected_attention_count = 0, l2fhdr_error_count = 0, dma_map_addr_tx_failed_count = 0, dma_map_addr_rx_failed_count = 0, hc_command = 1572865}
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2011-02-25 02:17:28 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-net

Over to maintainer(s).
Comment 2 Konstantin 2011-03-01 13:12:47 UTC
It looks like bug report 131209 has the same root cause as 155004:
http://www.freebsd.org/cgi/query-pr.cgi?pr=131209


--
Konstantin Malov
External Services Group system administrator
Kaspersky Lab. - Russian Federation
Tel: +7(495)797-8700 (ext. 2867)
http://www.kaspersky.com/

Comment 3 Pyun YongHyeon freebsd_committer freebsd_triage 2011-12-12 19:26:05 UTC
State Changed
From-To: open->feedback

As you can see, this backtrace looks wrong. Line 5771 can't generate
a NULL pointer dereference. Are you using stock bce(4) without any
changes?
Probably m0 is NULL and is dereferenced later.
Could you go to frame 8 and run the following?
p m0
p sw_rx_cons_idx
p sc->rx_mbuf_ptr[sw_rx_cons_idx]

By chance, did the panic happen when you rebooted your box or brought
the interface down/up? Or, if you happen to know a way to reproduce
the issue, could you share it with us?
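
The NULL m0 suspicion above fits the trap frame in the description, which reports fault virtual address 0x10: reading a field that sits 16 bytes into a structure through a NULL pointer faults at exactly that address. The following is a minimal sketch of that arithmetic only. struct fake_mbuf and both helper functions are hypothetical stand-ins invented for illustration; they are not the real struct mbuf or any bce(4) code, and the 0x10 offset assumes an LP64 platform such as amd64.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an mbuf-like header. This is NOT the real
 * struct mbuf; it only places a pointer-sized field behind two pointers.
 */
struct fake_mbuf {
	struct fake_mbuf *next;		/* offset 0 */
	struct fake_mbuf *nextpkt;	/* offset 8 on LP64 (assumption) */
	char		 *data;		/* offset 16 (0x10) on LP64 (assumption) */
	int		  len;
};

/* The first member of a struct always sits at offset 0. */
static_assert(offsetof(struct fake_mbuf, next) == 0, "first member at 0");

/* Address the CPU would try to read for m->data when m == NULL. */
static size_t
null_deref_fault_addr(void)
{
	return (offsetof(struct fake_mbuf, data));
}

/* Nonzero if a reported fault address matches a NULL->data read. */
static int
consistent_with_null_deref(size_t fault_va)
{
	return (fault_va == null_deref_fault_addr());
}
```

Calling consistent_with_null_deref(0x10) on an LP64 build compares the reported address with the field offset. A match only shows that the fault address is compatible with a NULL structure pointer; the kgdb values requested above are still needed to tell which pointer it was.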


Comment 4 Pyun YongHyeon freebsd_committer freebsd_triage 2011-12-12 19:26:05 UTC
Responsible Changed
From-To: freebsd-net->yongari

Grab.
Comment 5 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:15 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 6 Graham Perrin freebsd_committer freebsd_triage 2022-10-17 12:18:03 UTC
Keyword: 

    crash

– in lieu of summary line prefix: 

    [panic]

* bulk change for the keyword
* summary lines may be edited manually (not in bulk). 

Keyword descriptions and search interface: 

    <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>