I can easily reproduce the bug: using a Samba share, I calculate an MD5 checksum on 4000 pictures. The network freezes around the 3000th picture.

The hardware: ASRock C2550D4i
http://www.asrockrack.com/general/productdetail.asp?Model=C2550D4I#Specifications

FreeBSD 10.1-RELEASE-p10 FreeBSD 10.1-RELEASE-p10 #0 r282897M: Thu May 14 15:53:59 CEST 2015 root@dev.nas4free.org:/usr/obj/nas4free/usr/src/sys/NAS4FREE-amd64 amd64

dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.4.0
dev.igb.0.%driver: igb
dev.igb.0.%location: slot=0 function=0
dev.igb.0.%pnpinfo: vendor=0x8086 device=0x1533 subvendor=0x1849 subdevice=0x1533 class=0x020000
dev.igb.0.%parent: pci7
dev.igb.0.nvm: -1
dev.igb.0.enable_aim: 1
dev.igb.0.fc: 3
dev.igb.0.rx_processing_limit: 100
dev.igb.0.dmac: 0
dev.igb.0.eee_disabled: 0
dev.igb.0.link_irq: 4
dev.igb.0.dropped: 0
dev.igb.0.tx_dma_fail: 0
dev.igb.0.rx_overruns: 0
dev.igb.0.watchdog_timeouts: 1
dev.igb.0.device_control: 1075577409
dev.igb.0.rx_control: 71335938
dev.igb.0.interrupt_mask: 4
dev.igb.0.extended_int_mask: 2147483679
dev.igb.0.tx_buf_alloc: 0
dev.igb.0.rx_buf_alloc: 0
dev.igb.0.fc_high_water: 31328
dev.igb.0.fc_low_water: 31312
dev.igb.0.queue0.no_desc_avail: 0
dev.igb.0.queue0.tx_packets: 943452
dev.igb.0.queue0.rx_packets: 1019555
dev.igb.0.queue0.rx_bytes: 1047787
dev.igb.0.queue0.lro_queued: 0
dev.igb.0.queue0.lro_flushed: 0
dev.igb.0.queue1.no_desc_avail: 0
dev.igb.0.queue1.tx_packets: 2967552
dev.igb.0.queue1.rx_packets: 3052154
dev.igb.0.queue1.rx_bytes: 990821619
dev.igb.0.queue1.lro_queued: 0
dev.igb.0.queue1.lro_flushed: 0
dev.igb.0.queue2.no_desc_avail: 0
dev.igb.0.queue2.tx_packets: 1444357
dev.igb.0.queue2.rx_packets: 1521528
dev.igb.0.queue2.rx_bytes: 581413
dev.igb.0.queue2.lro_queued: 0
dev.igb.0.queue2.lro_flushed: 0
dev.igb.0.queue3.no_desc_avail: 0
dev.igb.0.queue3.tx_packets: 6118751
dev.igb.0.queue3.rx_packets: 6677679
dev.igb.0.queue3.rx_bytes: 579126
dev.igb.0.queue3.lro_queued: 0
dev.igb.0.queue3.lro_flushed: 0
dev.igb.0.mac_stats.excess_coll: 0
dev.igb.0.mac_stats.single_coll: 0
dev.igb.0.mac_stats.multiple_coll: 0
dev.igb.0.mac_stats.late_coll: 0
dev.igb.0.mac_stats.collision_count: 0
dev.igb.0.mac_stats.symbol_errors: 0
dev.igb.0.mac_stats.sequence_errors: 0
dev.igb.0.mac_stats.defer_count: 0
dev.igb.0.mac_stats.missed_packets: 0
dev.igb.0.mac_stats.recv_no_buff: 0
dev.igb.0.mac_stats.recv_undersize: 0
dev.igb.0.mac_stats.recv_fragmented: 0
dev.igb.0.mac_stats.recv_oversize: 0
dev.igb.0.mac_stats.recv_jabber: 0
dev.igb.0.mac_stats.recv_errs: 0
dev.igb.0.mac_stats.crc_errs: 0
dev.igb.0.mac_stats.alignment_errs: 0
dev.igb.0.mac_stats.coll_ext_errs: 0
dev.igb.0.mac_stats.xon_recvd: 0
dev.igb.0.mac_stats.xon_txd: 0
dev.igb.0.mac_stats.xoff_recvd: 0
dev.igb.0.mac_stats.xoff_txd: 0
dev.igb.0.mac_stats.total_pkts_recvd: 12320459
dev.igb.0.mac_stats.good_pkts_recvd: 12320241
dev.igb.0.mac_stats.bcast_pkts_recvd: 106770
dev.igb.0.mac_stats.mcast_pkts_recvd: 74526
dev.igb.0.mac_stats.rx_frames_64: 89382
dev.igb.0.mac_stats.rx_frames_65_127: 4036945
dev.igb.0.mac_stats.rx_frames_128_255: 478553
dev.igb.0.mac_stats.rx_frames_256_511: 56903
dev.igb.0.mac_stats.rx_frames_512_1023: 64832
dev.igb.0.mac_stats.rx_frames_1024_1522: 7593626
dev.igb.0.mac_stats.good_octets_recvd: 11970320160
dev.igb.0.mac_stats.good_octets_txd: 12279791794
dev.igb.0.mac_stats.total_pkts_txd: 18512524
dev.igb.0.mac_stats.good_pkts_txd: 18512524
dev.igb.0.mac_stats.bcast_pkts_txd: 1043
dev.igb.0.mac_stats.mcast_pkts_txd: 4064
dev.igb.0.mac_stats.tx_frames_64: 7877
dev.igb.0.mac_stats.tx_frames_65_127: 10769674
dev.igb.0.mac_stats.tx_frames_128_255: 63733
dev.igb.0.mac_stats.tx_frames_256_511: 29070
dev.igb.0.mac_stats.tx_frames_512_1023: 54471
dev.igb.0.mac_stats.tx_frames_1024_1522: 7587699
dev.igb.0.mac_stats.tso_txd: 647779
dev.igb.0.mac_stats.tso_ctx_fail: 0
dev.igb.0.interrupts.asserts: 8356963
dev.igb.0.interrupts.rx_pkt_timer: 12269756
dev.igb.0.interrupts.rx_abs_timer: 0
dev.igb.0.interrupts.tx_pkt_timer: 0
dev.igb.0.interrupts.tx_abs_timer: 0
dev.igb.0.interrupts.tx_queue_empty: 18492713
dev.igb.0.interrupts.tx_queue_min_thresh: 12270651
dev.igb.0.interrupts.rx_desc_min_thresh: 0
dev.igb.0.interrupts.rx_overrun: 0
dev.igb.0.host.breaker_tx_pkt: 0
dev.igb.0.host.host_tx_pkt_discard: 0
dev.igb.0.host.rx_pkt: 895
dev.igb.0.host.breaker_rx_pkts: 0
dev.igb.0.host.breaker_rx_pkt_drop: 0
dev.igb.0.host.tx_good_pkt: 571
dev.igb.0.host.breaker_tx_pkt_drop: 0
dev.igb.0.host.rx_good_bytes: 11959919100
dev.igb.0.host.tx_good_bytes: 12277353640
dev.igb.0.host.length_errors: 0
dev.igb.0.host.serdes_violation_pkt: 0
dev.igb.0.host.header_redir_missed: 0
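For anyone comparing dumps like the dev.igb.0 one above before and after a freeze, a small script can help pick out the counters that matter. The following is only a sketch: the function names and the `SUSPECT` shortlist are my own choices for illustration, not anything defined by the igb driver, and the sample is an excerpt of the values posted above.

```python
import re

# Matches "dev.igb.<unit>.<name>: <integer>" pairs. It works on one-per-line
# output and on a dump flattened onto a single line; string-valued entries
# such as %desc or %driver are skipped because their values are not integers.
PAIR = re.compile(r"(dev\.igb\.\d+\.[\w.%]+):\s+(-?\d+)(?=\s|$)")

# Hypothetical shortlist of counters worth a first look (an assumption of
# this sketch, not a list taken from the driver).
SUSPECT = ("watchdog_timeouts", "dropped", "tx_dma_fail", "rx_overruns")


def parse_counters(text):
    """Return {oid: int} for every integer-valued dev.igb OID in text."""
    return {name: int(value) for name, value in PAIR.findall(text)}


def nonzero_suspects(counters, unit=0):
    """Return the shortlisted counters of one unit that are not zero."""
    found = {}
    for name in SUSPECT:
        value = counters.get(f"dev.igb.{unit}.{name}", 0)
        if value:
            found[name] = value
    return found


def queue_totals(counters, unit=0):
    """Sum tx_packets and rx_packets over all queues of one igb unit."""
    totals = {"tx_packets": 0, "rx_packets": 0}
    pattern = re.compile(rf"dev\.igb\.{unit}\.queue\d+\.(tx_packets|rx_packets)")
    for name, value in counters.items():
        match = pattern.fullmatch(name)
        if match:
            totals[match.group(1)] += value
    return totals


# Excerpt copied from the dump above.
sample = """
dev.igb.0.%driver: igb
dev.igb.0.dropped: 0
dev.igb.0.tx_dma_fail: 0
dev.igb.0.rx_overruns: 0
dev.igb.0.watchdog_timeouts: 1
dev.igb.0.queue0.tx_packets: 943452
dev.igb.0.queue0.rx_packets: 1019555
dev.igb.0.queue1.tx_packets: 2967552
dev.igb.0.queue1.rx_packets: 3052154
dev.igb.0.queue2.tx_packets: 1444357
dev.igb.0.queue2.rx_packets: 1521528
dev.igb.0.queue3.tx_packets: 6118751
dev.igb.0.queue3.rx_packets: 6677679
"""

counters = parse_counters(sample)
print(nonzero_suspects(counters))
print(queue_totals(counters))
```

Feeding it the full `sysctl dev.igb.0` output instead of the excerpt should work the same way, since non-integer values are simply ignored.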
I can report the same for 10.3-RELEASE-p4 with the stock kernel.

# cat /boot/loader.conf
kern.geom.label.gptid.enable="0"
kern.ipc.nmbclusters="1000000"
hw.pci.enable_msi="1"
hw.pci.enable_msix="0"
zfs_load="YES"
aesni_load="YES"
ichsmb_load="YES"
ipmi_load="YES"
beastie_disable="YES"
autoboot_delay="3"

# pciconf -lcv
igb0@pci0:2:0:0: class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks enabled with 1 message
    cap 11[70] = MSI-X supports 5 messages
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) FLR RO NS
                 link x1(x1) speed 2.5(2.5) ASPM L1(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 002590ffffxxxxx2
    ecap 0017[1a0] = TPH Requester 1
igb1@pci0:5:0:0: class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks enabled with 1 message
    cap 11[70] = MSI-X supports 5 messages
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) FLR RO NS
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 002590ffffxxxxx3
    ecap 0017[1a0] = TPH Requester 1

This configuration is fairly stable (1h so far, no watchdog timeouts on igb1). Disabling MSI altogether leaves igb1 unable to pass any traffic; it doesn't even come up. Disabling MSI-X seems to help.

This is a SuperMicro X11SBA-F board (http://www.supermicro.com/products/motherboard/X11/X11SBA-F.cfm) with BIOS 1.0b (latest).
Spoke too soon, disabling MSI-X seems to help only marginally:

Jun 16 13:29:49 <kern.crit> fw1 kernel: igb1: Watchdog timeout -- resetting
Jun 16 13:29:49 <kern.crit> fw1 kernel: igb1: Queue(846295657) tdh = -1249464976, hw tdt = 589458993
Jun 16 13:29:49 <kern.crit> fw1 kernel: igb1: TX(846295657) desc avail = 0,Next TX to Clean = 0
Jun 16 13:29:49 <kern.notice> fw1 kernel: igb1: link state changed to DOWN
Jun 16 13:29:53 <kern.notice> fw1 kernel: igb1: link state changed to UP
Jun 16 13:29:53 <user.notice> fw1 devd: Executing '/etc/rc.d/dhclient quietstart igb1'
Jun 16 13:34:26 <kern.crit> fw1 kernel: igb1: Watchdog timeout -- resetting
Jun 16 13:34:26 <kern.crit> fw1 kernel: igb1: Queue(846295657) tdh = -1249464976, hw tdt = 589458993
Jun 16 13:34:26 <kern.crit> fw1 kernel: igb1: TX(846295657) desc avail = 0,Next TX to Clean = 0
Jun 16 13:34:26 <kern.notice> fw1 kernel: igb1: link state changed to DOWN
Jun 16 13:34:31 <kern.notice> fw1 kernel: igb1: link state changed to UP
Jun 16 13:34:31 <user.notice> fw1 devd: Executing '/etc/rc.d/dhclient quietstart igb1'
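To put a number on "only marginally", the gap between watchdog resets can be pulled out of the syslog lines above. This is a rough sketch, assuming the line layout shown in this comment; the function names are made up for illustration, and because syslog timestamps carry no year, the `year` default of 2016 is purely an assumption that only matters for logs spanning a year boundary.

```python
import re
from datetime import datetime

# Captures the timestamp and interface of each "Watchdog timeout" line.
# findall() is used so it also works if the log was pasted as one long line.
LINE = re.compile(
    r"(\w{3} +\d+ \d\d:\d\d:\d\d) <[\w.]+> \S+ kernel: (igb\d+): Watchdog timeout"
)


def watchdog_resets(log_text, year=2016):
    """Return {interface: [datetime, ...]} for every watchdog reset found.

    The year is an assumption, since the log lines do not include one.
    """
    resets = {}
    for stamp, nic in LINE.findall(log_text):
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        resets.setdefault(nic, []).append(when)
    return resets


def gaps_seconds(times):
    """Seconds between consecutive resets, in log order."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# The two reset lines from the log above, plus one unrelated line.
sample = """
Jun 16 13:29:49 <kern.crit> fw1 kernel: igb1: Watchdog timeout -- resetting
Jun 16 13:29:49 <kern.notice> fw1 kernel: igb1: link state changed to DOWN
Jun 16 13:34:26 <kern.crit> fw1 kernel: igb1: Watchdog timeout -- resetting
"""

for nic, times in watchdog_resets(sample).items():
    print(nic, gaps_seconds(times))
```

Run against a longer log, the list of gaps gives a quick way to compare one loader.conf setting against another.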
Possibly related to #200221
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM>
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM>

I have TSO, VLAN_HWTSO and LRO disabled.
Some PCI-E errors:

pcib4@pci0:3:0:0: class=0x060400 card=0x00000000 chip=0x260812d8 rev=0x00 hdr=0x01
    vendor     = 'Pericom Semiconductor'
    class      = bridge
    subclass   = PCI-PCI
    PCI-e errors = Correctable Error Detected
                   Unsupported Request Detected
       Corrected = Receiver Error
                   Bad TLP
                   Bad DLLP
                   REPLAY_NUM Rollover
                   Replay Timer Timeout
                   Advisory Non-Fatal Error
igb0@pci0:2:0:0: class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
       Corrected = Advisory Non-Fatal Error
igb1@pci0:5:0:0: class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    PCI-e errors = Correctable Error Detected
       Corrected = Advisory Non-Fatal Error
# netstat -m
2048/3772/5820 mbufs in use (current/cache/total)
2046/2514/4560/1000000 mbuf clusters in use (current/cache/total/max)
2046/2508 mbuf+clusters out of packet secondary zone in use (current/cache)
0/4/4/250101 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/74104 9k jumbo clusters in use (current/cache/total/max)
0/0/0/41683 16k jumbo clusters in use (current/cache/total/max)
4604K/5987K/10591K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
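One way to read the netstat -m output above is to check how close the mbuf cluster pool is to its limit and whether any requests were denied. The sketch below does that for the two relevant lines; the function names are hypothetical and the regular expressions assume the exact wording shown in this comment. With the posted numbers the pool is far from its limit and nothing was denied, which would argue against mbuf exhaustion, though that is my inference and not something stated in this thread.

```python
import re


def cluster_usage(netstat_text):
    """Return total/max for the 'mbuf clusters in use' line as a fraction."""
    match = re.search(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", netstat_text)
    if match is None:
        raise ValueError("no 'mbuf clusters in use' line found")
    current, cache, total, limit = map(int, match.groups())
    return total / limit


def denied_requests(netstat_text):
    """Return the (mbufs, clusters, mbuf+clusters) denied counts."""
    match = re.search(r"(\d+)/(\d+)/(\d+) requests for mbufs denied", netstat_text)
    if match is None:
        raise ValueError("no 'requests for mbufs denied' line found")
    return tuple(int(group) for group in match.groups())


# Two lines copied from the netstat -m output above.
sample = """
2046/2514/4560/1000000 mbuf clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
"""

print(f"cluster pool usage: {cluster_usage(sample):.3%}")
print("denied requests:", denied_requests(sample))
```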
There appears to be some indication that the X11SBA series has a hardware bug in a PCI-E controller supplying igb1-4 (not affecting igb0). See https://sourceforge.net/p/e1000/bugs/502/
With respect to the X11SBA-F board, I can confirm that the issue occurs with hardware version 1.01 of the board and is gone with 1.02. The issue with 1.01 cannot be fixed by any EEPROM, BIOS or other firmware update.
For the original submitter with the ASRock board, can you retry with -CURRENT?
This bug doesn't have enough information to go on as it stands. The driver has changed substantially in the time since the original submitter's report. If we can get a confirmation from them, this should be reopened; otherwise it's a new issue and should be reported in a new ticket.