Bug 126688

Summary: [ixgbe] [patch] 1.4.7 ixgbe driver panic with 4GB and PAE
Product: Base System Reporter: lampa
Component: kern Assignee: jfv
Status: Closed Overcome By Events    
Severity: Affects Only Me CC: sbruno
Priority: Normal Keywords: IntelNetworking
Version: Unspecified   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.diff none
file.diff none
file.diff none
file.diff none

Description lampa 2008-08-20 17:40:01 UTC
Initialization of ixgbe buffers in ixgbe_get_buf() fails with
errno=ENOMEM from bus_dmamap_load_mbuf_sg(), and this error is ignored
in ixgbe_setup_receive_ring(). When this happens, some buffer pointers
are left NULL and the system later panics in ixgbe_rxeof():

        mp = rxr->rx_buffers[i].m_head;    /* m_head is NULL!!! */
        mp->m_len = mp->m_pkthdr.len =     /* <--- PANIC HERE */
                (rxr->rx_buffers[i].bigbuf ? MJUMPAGESIZE:MCLBYTES);

Fix: Change this line:
    if (ixgbe_get_buf(rxr, j) == ENOBUFS) {
to
    if (ixgbe_get_buf(rxr, j) != 0) {
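
To illustrate why the comparison matters, here is a minimal user-space
sketch, not driver code. All names in it (fake_rx_buf, fake_get_buf,
fake_setup_ring, count_null_heads, RING_SIZE) are hypothetical stand-ins
invented for this example; only the two comparisons mirror the lines quoted
above. It assumes a buffer-fill routine that can fail with an errno other
than ENOBUFS and shows that the old check then reports success while
leaving NULL slots behind:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RING_SIZE 8     /* hypothetical; the driver uses adapter->num_rx_desc */

/* Hypothetical stand-in for one RX buffer slot. */
struct fake_rx_buf {
	void *m_head;   /* plays the role of the mbuf pointer */
};

static char fake_mbuf;  /* dummy object so successful slots are non-NULL */

/*
 * Hypothetical stand-in for ixgbe_get_buf(): every slot at or beyond
 * fail_from fails with fail_errno and is left with a NULL m_head.
 */
int
fake_get_buf(struct fake_rx_buf *ring, int j, int fail_from, int fail_errno)
{
	if (j >= fail_from) {
		ring[j].m_head = NULL;
		return (fail_errno);
	}
	ring[j].m_head = &fake_mbuf;
	return (0);
}

/*
 * Fill the ring. With strict == 0 only ENOBUFS counts as a failure
 * (the original check); with strict != 0 any non-zero return does
 * (the proposed check). Returns 0 when setup believes it succeeded.
 */
int
fake_setup_ring(struct fake_rx_buf *ring, int n, int fail_from,
    int fail_errno, int strict)
{
	int error, failed, j;

	assert(n <= RING_SIZE);
	for (j = 0; j < n; j++) {
		error = fake_get_buf(ring, j, fail_from, fail_errno);
		failed = strict ? (error != 0) : (error == ENOBUFS);
		if (failed)
			return (error);
	}
	return (0);
}

/* Count slots that a later receive path would dereference as NULL. */
int
count_null_heads(const struct fake_rx_buf *ring, int n)
{
	int j, nulls = 0;

	for (j = 0; j < n; j++)
		if (ring[j].m_head == NULL)
			nulls++;
	return (nulls);
}
```

Calling fake_setup_ring() with fail_errno set to ENOMEM and strict set to 0
returns 0 even though count_null_heads() is non-zero afterwards, which is
the state that later leads to the NULL dereference. With strict set to 1
the same call returns ENOMEM, so the caller can bail out instead.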

I've also noticed some useless assignments and one incorrect comment,
so the full patch contains some other small fixes:

/* Next allocate the RX */
        if (!(adapter->rx_rings =
--- 2106,2111 ----
***************
/* For the ring itself */
        tsize = roundup2(adapter->num_tx_desc *
--- 2115,2120 ----
***************
!               /* Initialize the TX side lock */
                snprintf(name_string, sizeof(name_string), "%s:rx(%d)",
                    device_get_nameunit(dev), rxr->me);
                mtx_init(&rxr->rx_mtx, name_string, NULL, MTX_DEF);
--- 2167,2173 ----
                rxr->adapter = adapter;
                rxr->me = i;

!               /* Initialize the RX side lock */
                snprintf(name_string, sizeof(name_string), "%s:rx(%d)",
                    device_get_nameunit(dev), rxr->me);
                mtx_init(&rxr->rx_mtx, name_string, NULL, MTX_DEF);
***************
!       for (i = 0; i < adapter->num_rx_desc; i++, rxbuf++) {
                rxbuf = &rxr->rx_buffers[i];
                error = bus_dmamap_create(rxr->rxtag[0],
                    BUS_DMA_NOWAIT, &rxbuf->map[0]);
--- 2916,2922 ----
                goto fail;
        }

!       for (i = 0; i < adapter->num_rx_desc; i++) {
                rxbuf = &rxr->rx_buffers[i];
                error = bus_dmamap_create(rxr->rxtag[0],
                    BUS_DMA_NOWAIT, &rxbuf->map[0]);
***************


        for (j = 0; j < adapter->num_rx_desc; j++) {
!               if (ixgbe_get_buf(rxr, j) == ENOBUFS) {
                    rxr->rx_buffers[j].m_head = NULL;
                    rxr->rx_base[j].read.pkt_addr = 0;
                    /* If we fail some may have change size */
--- 2979,2985 ----
        }

        for (j = 0; j < adapter->num_rx_desc; j++) {
!               if (ixgbe_get_buf(rxr, j) != 0) {
                    rxr->rx_buffers[j].m_head = NULL;
                    rxr->rx_base[j].read.pkt_addr = 0;
                    /* If we fail some may have change size */

With this patch it correctly prints to syslog:

   ix0: Could not setup receive structures

and doesn't panic. However, this patch doesn't address the underlying problem:
why bus_dmamap_load_mbuf_sg() fails with ENOMEM (probably 256*4 bounce
buffers is too many, or something like that).

*** ixgbe.c     Wed Jul 30 20:15:18 2008
--- ixgbe.c.new Wed Aug 20 18:34:44 2008
***************
*** 2106,2112 ****
                error = ENOMEM;
                goto fail;
        }
-       txr = adapter->tx_rings;
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2008-08-21 10:34:08 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-net

Over to maintainer(s).
Comment 2 Andre Oppermann freebsd_committer freebsd_triage 2010-08-23 18:59:34 UTC
Responsible Changed
From-To: freebsd-net->jfv

Over to maintainer.
Comment 3 Sean Bruno freebsd_committer freebsd_triage 2015-06-30 16:22:10 UTC
ixgbe(4) doesn't seem to have the code referenced in the attached series of diffs any longer.  If possible, please retest 10.2r or stable/10 in a PAE configuration and reopen this ticket.