Bug 195078

Summary: em tx_dma_fails and dropped packets
Product: Base System
Reporter: DJ <fusionfoto>
Component: kern
Assignee: freebsd-net mailing list <net>
Status: Closed Overcome By Events
Severity: Affects Many People
CC: fusionfoto, sbruno
Priority: Normal
Keywords: IntelNetworking, easy, regression
Version: 9.2-RELEASE
Hardware: Any
OS: Any
URL: http://www.intel.com.au/content/dam/www/public/us/en/documents/specification-updates/82574-gbe-controller-spec-update.pdf

Description DJ 2014-11-16 19:29:11 UTC
It looks like FreeBSD may be affected by the erratum quoted below (see the URL field). This likely affects every FreeBSD version that defaults to a higher hw.em.rxd, which could be several.

I've turned tso on my running machine because I didn't want to reboot, which solved one set of problems, and then had to increase rx_processing_limit to hopefully solve the remaining packet drops.

I have another couple of machines scheduled to reboot with hw.em.rxd and hw.em.txd set to 256, which I think is the old value, and hopefully I'll then be able to set the rest of the sysctls back to normal.

Hope this helps.

17. Tx Data Corruption When Using TCP Segmentation Offload

Problem: When using TSO, a situation can occur where a PCIe MRd request is repeated with the same address, resulting in data corruption. At the end of the TCP packet, the Tx DMA hangs because the length doesn't match. This can only occur when the following are true:

• The first buffer of the packet is larger than [3 * (max_read_request - 4)].
• There is a 4 KB boundary within 64 bytes following the end of the header bytes in the buffer.

Implication: Possible data corruption since a TCP packet is transmitted containing the wrong data but with the correct checksum.
Data transmission halts as the Tx DMA module enters a hang state.

Workaround: The failure can be avoided by ensuring at least one of the following:

• The buffer containing the headers should not be larger than [3 * (max_read_request - 4)]. To meet this requirement even for the minimum value of 128 bytes for max_read_request, the buffer should not be larger than 372 bytes.
• The alignment of the buffer containing the headers should be such that there is no 4 KB boundary within 64 bytes following the end of the header bytes. Assuming standard Ethernet/IP/TCP headers of 54 bytes, this means that the buffer should not start 54-118 bytes before a 4 KB boundary. For example, 128-byte alignment for this buffer could be used to fulfill this condition.

This problem has not been reported when using an Intel Linux* or Windows* drivers. Current analysis shows it is very unlikely for a situation to exist that would cause the 82574 to be at risk for the errata when using the Intel Linux or Windows drivers.
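
To make the two conditions in the erratum concrete, here is a minimal Python sketch of the arithmetic. It is not driver code. The function names and the 54-byte default header length are assumptions taken from the erratum text, and the boundary test treats the "54-118 bytes" range as inclusive.

```python
PAGE = 4096    # the 4 KB boundary named in the erratum
WINDOW = 64    # bytes after the end of the headers that must stay clear of a boundary


def first_buffer_limit(max_read_request: int) -> int:
    """Largest first-buffer size the workaround allows: 3 * (max_read_request - 4)."""
    return 3 * (max_read_request - 4)


def boundary_after_headers(buf_addr: int, header_len: int = 54) -> bool:
    """True if a 4 KB boundary lies within 64 bytes after the end of the headers."""
    end = buf_addr + header_len
    # Distance from the end of the headers to the next 4 KB boundary (0 if on one).
    distance = (-end) % PAGE
    return distance <= WINDOW


def at_risk(buf_addr: int, first_buf_len: int, max_read_request: int,
            header_len: int = 54) -> bool:
    """Both erratum conditions hold, so the buffer could trigger the hang."""
    too_large = first_buf_len > first_buffer_limit(max_read_request)
    return too_large and boundary_after_headers(buf_addr, header_len)
```

With max_read_request at its minimum of 128, first_buffer_limit(128) gives the 372 bytes quoted above, and any 128-byte-aligned buffer address makes boundary_after_headers() return False.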

Linux and other operating systems seem to have worked around it. FreeBSD could be hitting it because the default hw.em.rxd/hw.em.txd value for this driver was recently raised above 256.

**** my comments below ****

Since I didn't want to reboot to try the lower buffer size, I turned off TSO on every machine I'd checked whose em interfaces were actively incrementing tx_dma_fail, then re-enabled their membership in the LACP bundle.

In brief testing (a few gigabits for a few minutes), tx_dma_fail has not incremented and throughput has not been negatively impacted (before vs. after re-enabling).
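
For anyone who wants to repeat the "is tx_dma_fail still incrementing" check, a rough Python sketch follows. It assumes two text snapshots of sysctl output in the `dev.em.N.tx_dma_fail: value` form shown in the dump below. The function names are made up for illustration.

```python
import re
from typing import Dict

# Matches lines such as "dev.em.0.tx_dma_fail: 1834648" (leading spaces allowed).
_COUNTER = re.compile(r"^\s*dev\.(em|igb)\.(\d+)\.tx_dma_fail:\s*(\d+)\s*$")


def tx_dma_fail_counts(sysctl_text: str) -> Dict[str, int]:
    """Map interface name (e.g. 'em0') to its tx_dma_fail counter."""
    counts: Dict[str, int] = {}
    for line in sysctl_text.splitlines():
        m = _COUNTER.match(line)
        if m:
            counts[m.group(1) + m.group(2)] = int(m.group(3))
    return counts


def still_incrementing(before: str, after: str) -> Dict[str, int]:
    """Interfaces whose counter grew between the two snapshots, with the delta."""
    old = tx_dma_fail_counts(before)
    new = tx_dma_fail_counts(after)
    return {nic: new[nic] - old[nic]
            for nic in new
            if nic in old and new[nic] > old[nic]}
```

An empty result from still_incrementing() corresponds to the "has not incremented" observation above.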

On Thu, Nov 13, 2014 at 1:52 PM, FF <fusionfoto@gmail.com> wrote:

    What knob do I need to turn to address this?

    This em0 is in an LACP bundle with an igb0 that isn't showing this problem.

    dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.3.8
    dev.em.0.%driver: em
    dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GLAN
    dev.em.0.%pnpinfo: vendor=0x8086 device=0x153b subvendor=0x15d9 subdevice=0x153b class=0x020000
    dev.em.0.%parent: pci0
    dev.em.0.nvm: -1
    dev.em.0.debug: -1
    dev.em.0.fc: 3
    dev.em.0.rx_int_delay: 0
    dev.em.0.tx_int_delay: 66
    dev.em.0.rx_abs_int_delay: 66
    dev.em.0.tx_abs_int_delay: 66
    dev.em.0.itr: 488
    dev.em.0.rx_processing_limit: 100
    dev.em.0.eee_control: 1
    dev.em.0.link_irq: 0
    dev.em.0.mbuf_alloc_fail: 52
    dev.em.0.cluster_alloc_fail: 0
    dev.em.0.dropped: 0
    dev.em.0.tx_dma_fail: 1834648
    dev.em.0.rx_overruns: 3109
    dev.em.0.watchdog_timeouts: 0
    dev.em.0.device_control: 1209532992
    dev.em.0.rx_control: 67141634
    dev.em.0.fc_high_water: 23584
    dev.em.0.fc_low_water: 20552
    dev.em.0.queue0.txd_head: 577
    dev.em.0.queue0.txd_tail: 577
    dev.em.0.queue0.tx_irq: 0
    dev.em.0.queue0.no_desc_avail: 0
    dev.em.0.queue0.rxd_head: 967
    dev.em.0.queue0.rxd_tail: 966
    dev.em.0.queue0.rx_irq: 0
    dev.em.0.mac_stats.excess_coll: 0
    dev.em.0.mac_stats.single_coll: 0
    dev.em.0.mac_stats.multiple_coll: 0
    dev.em.0.mac_stats.late_coll: 0
    dev.em.0.mac_stats.collision_count: 0
    dev.em.0.mac_stats.symbol_errors: 0
    dev.em.0.mac_stats.sequence_errors: 0
    dev.em.0.mac_stats.defer_count: 0
    dev.em.0.mac_stats.missed_packets: 61094
    dev.em.0.mac_stats.recv_no_buff: 60008
    dev.em.0.mac_stats.recv_undersize: 0
    dev.em.0.mac_stats.recv_fragmented: 0
    dev.em.0.mac_stats.recv_oversize: 0
    dev.em.0.mac_stats.recv_jabber: 0
    dev.em.0.mac_stats.recv_errs: 0
    dev.em.0.mac_stats.crc_errs: 0
    dev.em.0.mac_stats.alignment_errs: 0
    dev.em.0.mac_stats.coll_ext_errs: 0
    dev.em.0.mac_stats.xon_recvd: 40226659
    dev.em.0.mac_stats.xon_txd: 2132
    dev.em.0.mac_stats.xoff_recvd: 40241216
    dev.em.0.mac_stats.xoff_txd: 2073563
    dev.em.0.mac_stats.total_pkts_recvd: 3219537541
    dev.em.0.mac_stats.good_pkts_recvd: 3139008594
    dev.em.0.mac_stats.bcast_pkts_recvd: 3953817
    dev.em.0.mac_stats.mcast_pkts_recvd: 607157
    dev.em.0.mac_stats.rx_frames_64: 0
    dev.em.0.mac_stats.rx_frames_65_127: 0
    dev.em.0.mac_stats.rx_frames_128_255: 0
    dev.em.0.mac_stats.rx_frames_256_511: 0
    dev.em.0.mac_stats.rx_frames_512_1023: 0
    dev.em.0.mac_stats.rx_frames_1024_1522: 0
    dev.em.0.mac_stats.good_octets_recvd: 3527296369841
    dev.em.0.mac_stats.good_octets_txd: 14348531993101
    dev.em.0.mac_stats.total_pkts_txd: 10735190291
    dev.em.0.mac_stats.good_pkts_txd: 10733114595
    dev.em.0.mac_stats.bcast_pkts_txd: 14
    dev.em.0.mac_stats.mcast_pkts_txd: 54334
    dev.em.0.mac_stats.tx_frames_64: 0
    dev.em.0.mac_stats.tx_frames_65_127: 0
    dev.em.0.mac_stats.tx_frames_128_255: 0
    dev.em.0.mac_stats.tx_frames_256_511: 0
    dev.em.0.mac_stats.tx_frames_512_1023: 0
    dev.em.0.mac_stats.tx_frames_1024_1522: 0
    dev.em.0.mac_stats.tso_txd: 902605586
    dev.em.0.mac_stats.tso_ctx_fail: 0
    dev.em.0.interrupts.asserts: 1392541431
    dev.em.0.interrupts.rx_pkt_timer: 0
    dev.em.0.interrupts.rx_abs_timer: 0
    dev.em.0.interrupts.tx_pkt_timer: 0
    dev.em.0.interrupts.tx_abs_timer: 0
    dev.em.0.interrupts.tx_queue_empty: 0
    dev.em.0.interrupts.tx_queue_min_thresh: 0
    dev.em.0.interrupts.rx_desc_min_thresh: 0
    dev.em.0.interrupts.rx_overrun: 0
    dev.em.0.wake: 0

    dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.10
    dev.igb.0.%driver: igb
    dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.RP04.PXSX
    dev.igb.0.%pnpinfo: vendor=0x8086 device=0x1533 subvendor=0x15d9 subdevice=0x1533 class=0x020000
    dev.igb.0.%parent: pci5
    dev.igb.0.nvm: -1
    dev.igb.0.enable_aim: 1
    dev.igb.0.fc: 3
    dev.igb.0.rx_processing_limit: 100
    dev.igb.0.dmac: 0
    dev.igb.0.eee_disabled: 0
    dev.igb.0.link_irq: 33
    dev.igb.0.dropped: 0
    dev.igb.0.tx_dma_fail: 0
    dev.igb.0.rx_overruns: 0
    dev.igb.0.watchdog_timeouts: 0
    dev.igb.0.device_control: 1209795137
    dev.igb.0.rx_control: 71335938
    dev.igb.0.interrupt_mask: 4
    dev.igb.0.extended_int_mask: 2147483679
    dev.igb.0.tx_buf_alloc: 0
    dev.igb.0.rx_buf_alloc: 0
    dev.igb.0.fc_high_water: 31328
    dev.igb.0.fc_low_water: 31312
    dev.igb.0.queue0.no_desc_avail: 0
    dev.igb.0.queue0.tx_packets: 62464141
    dev.igb.0.queue0.rx_packets: 73012939
    dev.igb.0.queue0.rx_bytes: 22529663814
    dev.igb.0.queue0.lro_queued: 0
    dev.igb.0.queue0.lro_flushed: 0
    dev.igb.0.queue1.no_desc_avail: 0
    dev.igb.0.queue1.tx_packets: 404298046
    dev.igb.0.queue1.rx_packets: 307675818
    dev.igb.0.queue1.rx_bytes: 185919902229
    dev.igb.0.queue1.lro_queued: 0
    dev.igb.0.queue1.lro_flushed: 0
    dev.igb.0.queue2.no_desc_avail: 0
    dev.igb.0.queue2.tx_packets: 3441053015
    dev.igb.0.queue2.rx_packets: 5511826751
    dev.igb.0.queue2.rx_bytes: 3054219311510
    dev.igb.0.queue2.lro_queued: 0
    dev.igb.0.queue2.lro_flushed: 0
    dev.igb.0.queue3.no_desc_avail: 0
    dev.igb.0.queue3.tx_packets: 1047838830
    dev.igb.0.queue3.rx_packets: 1987495318
    dev.igb.0.queue3.rx_bytes: 2696179247028
    dev.igb.0.queue3.lro_queued: 0
    dev.igb.0.queue3.lro_flushed: 0
    dev.igb.0.mac_stats.excess_coll: 0
    dev.igb.0.mac_stats.single_coll: 0
    dev.igb.0.mac_stats.multiple_coll: 0
    dev.igb.0.mac_stats.late_coll: 0
    dev.igb.0.mac_stats.collision_count: 0
    dev.igb.0.mac_stats.symbol_errors: 0
    dev.igb.0.mac_stats.sequence_errors: 0
    dev.igb.0.mac_stats.defer_count: 283811
    dev.igb.0.mac_stats.missed_packets: 9449
    dev.igb.0.mac_stats.recv_no_buff: 340
    dev.igb.0.mac_stats.recv_undersize: 0
    dev.igb.0.mac_stats.recv_fragmented: 0
    dev.igb.0.mac_stats.recv_oversize: 0
    dev.igb.0.mac_stats.recv_jabber: 0
    dev.igb.0.mac_stats.recv_errs: 0
    dev.igb.0.mac_stats.crc_errs: 0
    dev.igb.0.mac_stats.alignment_errs: 0
    dev.igb.0.mac_stats.coll_ext_errs: 0
    dev.igb.0.mac_stats.xon_recvd: 46255557
    dev.igb.0.mac_stats.xon_txd: 261
    dev.igb.0.mac_stats.xoff_recvd: 46255994
    dev.igb.0.mac_stats.xoff_txd: 7027
    dev.igb.0.mac_stats.total_pkts_recvd: 7975033582
    dev.igb.0.mac_stats.good_pkts_recvd: 7880001465
    dev.igb.0.mac_stats.bcast_pkts_recvd: 5783868
    dev.igb.0.mac_stats.mcast_pkts_recvd: 563315
    dev.igb.0.mac_stats.rx_frames_64: 28412906
    dev.igb.0.mac_stats.rx_frames_65_127: 3310187919
    dev.igb.0.mac_stats.rx_frames_128_255: 784920450
    dev.igb.0.mac_stats.rx_frames_256_511: 17225962
    dev.igb.0.mac_stats.rx_frames_512_1023: 73415350
    dev.igb.0.mac_stats.rx_frames_1024_1522: 3665838878
    dev.igb.0.mac_stats.good_octets_recvd: 5990356613544
    dev.igb.0.mac_stats.good_octets_txd: 46326753008181
    dev.igb.0.mac_stats.total_pkts_txd: 33016014138
    dev.igb.0.mac_stats.good_pkts_txd: 33016006850
    dev.igb.0.mac_stats.bcast_pkts_txd: 834
    dev.igb.0.mac_stats.mcast_pkts_txd: 54331
    dev.igb.0.mac_stats.tx_frames_64: 30741691
    dev.igb.0.mac_stats.tx_frames_65_127: 2174824217
    dev.igb.0.mac_stats.tx_frames_128_255: 139804927
    dev.igb.0.mac_stats.tx_frames_256_511: 59190261
    dev.igb.0.mac_stats.tx_frames_512_1023: 386886648
    dev.igb.0.mac_stats.tx_frames_1024_1522: 30224559106
    dev.igb.0.mac_stats.tso_txd: 2384636909
    dev.igb.0.mac_stats.tso_ctx_fail: 0
    dev.igb.0.interrupts.asserts: 4556119857
    dev.igb.0.interrupts.rx_pkt_timer: 7879778770
    dev.igb.0.interrupts.rx_abs_timer: 0
    dev.igb.0.interrupts.tx_pkt_timer: 0
    dev.igb.0.interrupts.tx_abs_timer: 0
    dev.igb.0.interrupts.tx_queue_empty: 33015268817
    dev.igb.0.interrupts.tx_queue_min_thresh: 7880001470
    dev.igb.0.interrupts.rx_desc_min_thresh: 0
    dev.igb.0.interrupts.rx_overrun: 0
    dev.igb.0.host.breaker_tx_pkt: 0
    dev.igb.0.host.host_tx_pkt_discard: 0
    dev.igb.0.host.rx_pkt: 222702
    dev.igb.0.host.breaker_rx_pkts: 0
    dev.igb.0.host.breaker_rx_pkt_drop: 0
    dev.igb.0.host.tx_good_pkt: 738033
    dev.igb.0.host.breaker_tx_pkt_drop: 0
    dev.igb.0.host.rx_good_bytes: 5990357073320
    dev.igb.0.host.tx_good_bytes: 46326753008181
    dev.igb.0.host.length_errors: 0
    dev.igb.0.host.serdes_violation_pkt: 0
    dev.igb.0.host.header_redir_missed: 0
    dev.igb.0.wake: 0

    hw.em.eee_setting: 1
    hw.em.rx_process_limit: 100
    hw.em.enable_msix: 1
    hw.em.sbp: 0
    hw.em.smart_pwr_down: 0
    hw.em.txd: 1024
    hw.em.rxd: 1024
    hw.em.rx_abs_int_delay: 66
    hw.em.tx_abs_int_delay: 66
    hw.em.rx_int_delay: 0
    hw.em.tx_int_delay: 66

    hw.igb.rx_process_limit: 100
    hw.igb.num_queues: 0
    hw.igb.header_split: 0
    hw.igb.buf_ring_size: 4096
    hw.igb.max_interrupt_rate: 8000
    hw.igb.enable_msix: 1
    hw.igb.enable_aim: 1
    hw.igb.txd: 1024
    hw.igb.rxd: 1024

    FreeBSD systemname.com 9.2-RELEASE-p10 FreeBSD 9.2-RELEASE-p10 #0 r270148M: Mon Aug 18 23:14:36 EDT 2014     root@peta108:/usr/obj/usr/src/sys/CUSTOM10  amd64

    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            ether 00:25:90:f2:2d:24
            inet6 fe80::225:90ff:fef2:2d24%em0 prefixlen 64 scopeid 0x2
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            ether 00:25:90:f2:2d:24
            inet6 fe80::225:90ff:fef2:2d25%igb0 prefixlen 64 scopeid 0x4
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
            inet6 ::1 prefixlen 128
            inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
            inet netmask 0xff000000
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            ether 00:25:90:f2:2d:24
            inet netmask 0xffffff00 broadcast
            inet6 fe80::225:90ff:fef2:2d24%lagg0 prefixlen 64 scopeid 0x8
            media: Ethernet autoselect
            status: active
            laggproto lacp lagghash l2,l3,l4
            laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

    Thanks in advance!
Comment 1 DJ 2014-11-16 19:44:34 UTC
** turned tso off to fix, not on -- just to clarify.
Comment 2 DJ 2014-11-21 23:20:06 UTC
... and verified that turning off rxcsum and txcsum on the card stops the remaining errors.

On machines that have been rebooted with hw.em.rxd=256 and hw.em.txd=256, we have seen zero packet drops.
Comment 3 Sean Bruno freebsd_committer 2015-06-30 16:37:08 UTC
A workaround for this erratum has been implemented in the em(4) driver, the watchdog timer has been updated, and significant code improvements have been made to the 82574 code since this ticket was filed. Can you verify whether the behavior is the same failure case or improved in 10.2 beta, stable/10, or current?
Comment 4 Sean Bruno freebsd_committer 2015-08-03 16:46:14 UTC
This issue should be resolved in the current versions of em(4).

If you still see a problem, try using the patch at https://reviews.freebsd.org/D3192
Comment 5 commit-hook freebsd_committer 2015-08-16 19:44:30 UTC
A commit references this bug:

Author: sbruno
Date: Sun Aug 16 19:43:45 UTC 2015
New revision: 286831
URL: https://svnweb.freebsd.org/changeset/base/286831

  Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::segs[EM_MAX_SCATTER]
  doesn't get overrun by things like NFS that can and do shove more than 32 segs when
  being used with em(4) and TSO4.

  Update tso handling code in em_xmit() with update from jhb@ in email thread:

  set ifp->if_hw_tsomax, ifp->if_hw_tsomaxsegcount & ifp->if_hw_tsomaxsegsize
  to appropriate values.

  Define a TSO workaround "magic" number  of 4 that is used to avoid an
  alignment issue in hardware.

  Change a couple of integer values that were used as booleans to actual
  bool types.

  Ensure that em_enable_intr() enables the appropriate mask of interrupts
  and not just a hardcoded define of values.

  PR:		200221 199174 195078
  Differential Revision:	https://reviews.freebsd.org/D3192
  Reviewed by:	erj jhb hiren
  MFC after:	2 weeks
  Sponsored by:	Limelight Networks
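
The commit message does not spell out how the "magic" number of 4 is applied. One plausible reading, sketched here in Python and not taken from the driver source, is that the last DMA segment of a TSO packet is split so the descriptor chain ends in a short 4-byte sentinel. The function name, the (address, length) tuple layout, and the min_len threshold are all assumptions for illustration.

```python
from typing import List, Tuple

TSO_WORKAROUND = 4  # the "magic" number named in the commit message

Segment = Tuple[int, int]  # (bus address, length in bytes); hypothetical layout


def split_last_segment(segs: List[Segment], min_len: int = 8) -> List[Segment]:
    """Split the final segment so the chain ends with a TSO_WORKAROUND-byte piece.

    min_len is an assumed threshold: segments this short are left untouched.
    """
    if not segs:
        return []
    *head, (addr, length) = segs
    if length <= min_len:
        return list(segs)
    body = length - TSO_WORKAROUND
    # The sentinel starts right after the shortened body, so no bytes are lost.
    return head + [(addr, body), (addr + body, TSO_WORKAROUND)]
```

The split adds one descriptor per TSO packet, which is one reason a scatter limit needs headroom.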

Comment 6 commit-hook freebsd_committer 2016-01-27 22:32:09 UTC
A commit references this bug:

Author: marius
Date: Wed Jan 27 22:31:09 UTC 2016
New revision: 294958
URL: https://svnweb.freebsd.org/changeset/base/294958

  Sync the e1000 drivers with what's in head as of r294327, modulo parts
  that don't apply to stable/10 (driver API, if_inc_counter(), RSS changes
  etc.) and modulo r287465 (which reportedly breaks igb(4)), i. e. assorted
  fixes and improvements only:

  o MFC r267385 (partial):
    - Don't compare bus_dma map pointers for static DMA allocations against
      NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be
      called. Instead, check the associated bus and virtual addresses.
    - Don't clear static DMA maps to NULL.
  o MFC r284933:
    Delete the reference to VLAN handling being disabled by default. This is
    no longer the case. [1]
  o MFC r285639:
    Add an adapter CORE lock in the DDB hook em_dump_queue to avoid WITNESS
    panic in em_init_locked() while debugging.
  o MFC r285879:
    - Remove unused txd_saved.
    - Initialize txd_upper, txd_lower and txd_used at declaration.
  o MFC r286162:
    Free mbufs when busdma loading fails.
  o MFC r286829:
    Add capability to disable CRC stripping as it breaks IPMI/BMC capabilities
    on certain adapters. [2]
  o MFC r286831: [3]
    - Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::
      segs[EM_MAX_SCATTER] doesn't get overrun by things like NFS that can
      and do shove more than 32 segs when being used with em(4) and TSO4.
    - Update tso handling code in em_xmit() with update from jhb@
    - Set if_hw_tsomax, if_hw_tsomaxsegcount and if_hw_tsomaxsegsize to
      appropriate values.
    - Define a TSO workaround "magic" number of 4 that is used to avoid an
      alignment issue in hardware.
    - Change a couple of integer values that were used as booleans to actual
      bool types.
    - Ensure that em_enable_intr() enables the appropriate mask of interrupts
      and not just a hardcoded define of values.
  o MFC r286832:
    e1000/if_lem.c bump to 1.1.0
  o MFC r286833:
    Bump all copyright dates to 2015.
  o MFC r287112:
    Style/whitespace cleanup in shared/common code.
  o MFC r293331:
    - Switch em(4) to the extended RX descriptor format.
    - Split rxbuffer and txbuffer apart to support the new RX descriptor
      format structures. Move rxbuffer manipulation to em_setup_rxdesc() to
      unify the new behavior changes.
    - Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
      in the card.
    - Change em_receive_checksum() to process the new rxdescriptor format
      status bit.
  o MFC r293332:
    Disable the reuse of checksum offload context descriptors in the case
    of multiple queues in em(4). Document errata in the code.
  o MFC r293854:
    Given that em(4), lem(4) and igb(4) hardware doesn't require the
    alignment guarantees provided by m_defrag(9), use m_collapse(9)
    instead for performance reasons.
    While at it, sanitize the statistics softc members, i. e. retire
    unused ones and add SYSCTL nodes missing for actually used ones.

  PR:	118693 [1], 161277 [2], 195078 [3], 199174 [3], 200221 [3]

_U  stable/10/