Bug 193802

Summary: tso seems broken on RELENG10 for version 7.4.2 of em driver
Product: Base System Reporter: mike
Component: kern Assignee: freebsd-net (Nobody) <net>
Status: Closed FIXED    
Severity: Affects Many People CC: Karli.Sjoberg, bsd, cstef, erj, fredrikb, jfv, meyer.sydney, sbruno
Priority: --- Keywords: IntelNetworking
Version: 10.0-STABLE   
Hardware: amd64   
OS: Any   

Description mike 2014-09-21 01:00:30 UTC
The latest version of the em driver seems to have broken TSO support.  With default NFS mount options, the client's NIC wedges and throws watchdog errors.  Reverting to the previous version of em, or disabling TSO, works around the issue.

The bug seems to affect at least two variants of the NIC:

     vendor     = 'Intel Corporation'
     device     = '82574L Gigabit Network Connection'
     class      = network

and

em0@pci0:0:25:0:        class=0x020000 card=0x34ec8086 chip=0x10ef8086 rev=0x05 hdr=0x00
     vendor     = 'Intel Corporation'
     device     = '82578DM Gigabit Network Connection'


More discussion can be found in 
http://lists.freebsd.org/pipermail/freebsd-stable/2014-September/080081.html

To recreate the issue, mount an NFS share with default options on the NFS client. In my test case, the server was a RELENG10 box with igb NICs. Then generate a number of TCP streams over the share. Running these two scripts at the same time will wedge the client NIC in less than a few minutes:


#!/bin/sh

while true
do
  dd if=/dev/urandom ibs=64k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
  dd if=/dev/urandom ibs=63k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
  dd if=/dev/urandom ibs=66k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
done
root@backup3:/usr/home/mdtancsa # cat i3
#!/bin/sh

while true
do
dd if=/dev/zero of=/mnt/test2 bs=128k count=2000
sleep 10
done


When wedged, the NIC shows

Interface is RUNNING and ACTIVE
em1: hw tdh = 343, hw tdt = 838
em1: hw rdh = 512, hw rdt = 511
em1: Tx Queue Status = 1
em1: TX descriptors avail = 516
em1: Tx Descriptors avail failure = 1
em1: RX discarded packets = 0
em1: RX Next to Check = 512
em1: RX Next to Refresh = 511

Occasionally, this error message will show up

em0: Watchdog timeout -- resetting
em0: Queue(0) tdh = 349, hw tdt = 176
em0: TX(0) desc avail = 173,Next TX to Clean = 349
em0: link state changed to DOWN
em0: link state changed to UP

Once the NIC has wedged, it has to be reset with ifconfig em0 down; ifconfig em0 up.
The workaround is either to revert to the previous version of the driver, or to disable TSO:
ifconfig em0 -tso
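
A minimal sketch of applying the TSO workaround only where it is needed. It parses the options= line of ifconfig output and prints the disable command if TSO4 or TSO6 is listed. The helper names has_tso and tso_workaround are hypothetical, and the sketch only prints the command rather than running it.

```sh
#!/bin/sh
# has_tso: reads ifconfig output on stdin and succeeds if TSO4 or TSO6
# appears in an options=...<...> capability list.
# VLAN_HWTSO on its own is deliberately not treated as TSO being on.
has_tso() {
    grep 'options=' | tr '<>,' '\n\n\n' | grep -Eq '^TSO[46]$'
}

# tso_workaround IFNAME: reads ifconfig output for IFNAME on stdin and
# prints the command that would disable TSO, or a note if it is off.
tso_workaround() {
    ifname=$1
    if has_tso; then
        echo "ifconfig $ifname -tso"
    else
        echo "# $ifname: TSO already off"
    fi
}

# Usage on a live system (not executed here):
#   ifconfig em0 | tso_workaround em0
```

A setting made with ifconfig at run time does not survive a reboot. To keep it, -tso can be appended to the interface's ifconfig_em0 line in /etc/rc.conf.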

FreeBSD 10.1-BETA1 #10 r271466
on an Intel MB S3420GP
Latest BIOS 
Version: S3420GP.86B.01.00.0052.051620141338

Also tested on an AMD MB with a PCI-E NIC
Comment 1 Friedrich Volkmann 2015-02-25 17:19:43 UTC
Same here. I run into problems sending TCP data bigger than ~4 KB when the TSO option is enabled (which is the default!). See https://forums.freebsd.org/threads/error-sending-tcp-data-4kb.50431 for the details.

The problems occur with both an up-to-date i386 kernel and an up-to-date amd64 kernel (10.1-STABLE r278696).

dmesg reports:
em0: <Intel(R) PRO/1000 Network Connection 7.4.2> port 0xf080-0xf09f mem 0xf7f00000-0xf7f1ffff,0xf7f3c000-0xf7f3cfff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt

# ifconfig -v em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWTSO>
        ether ...
        inet ... netmask ... broadcast ...
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active

This is an onboard LAN controller (I218-V according to the manual) on an MSI Z97S SLI Plus motherboard (Intel Z97 Express chipset).

The importance has already been set to "Affects Many People". This is probably true, and it is difficult for affected users to get help as they cannot send data. To make it even worse, the symptoms (error messages such as "The server may be unavailable or is refusing SMTP connections") do not reveal the real cause of the problem.
Comment 2 Karli Sjöberg 2015-02-26 11:32:12 UTC
Same here, on a Supermicro X9SRL-F. It would seem for us that iSCSI is more affected by this than NFS is:

Feb 23 12:18:20 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 12:18:20 <fileserver> kernel: em1: link state changed to UP
Feb 23 12:18:25 <fileserver> kernel: Feb 23 12:18:25 <fileserver> ctld[9389]: <ipaddress>: read: connection lost
Feb 23 13:26:55 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 13:26:56 <fileserver> kernel: em0: link state changed to UP
Feb 23 13:26:56 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 13:26:59 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): connection error; dropping connection
Feb 23 14:46:23 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 14:46:23 <fileserver> kernel: em1: link state changed to UP
Feb 23 14:46:24 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 14:46:30 <fileserver> kernel: Feb 23 14:46:30 <fileserver> ctld[36377]: <ipaddress>: read: connection lost
Feb 23 15:13:40 <fileserver> kernel: em0: link state changed to UP
Feb 23 15:13:40 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 15:20:31 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection
Feb 23 15:20:31 <fileserver> kernel: em1: link state changed to UP
Feb 23 15:20:31 <fileserver> kernel: WARNING: <ipaddress> (<iqn>): no ping reply (NOP-Out) after 5 seconds; dropping connection

Feb 22 03:01:36 <fileserver> kernel: em0: Watchdog timeout -- resetting
Feb 22 03:01:36 <fileserver> kernel: em0: Queue(0) tdh = 669, hw tdt = 623
Feb 22 03:01:36 <fileserver> kernel: em0: TX(0) desc avail = 32,Next TX to Clean = 655

# pciconf -lvcb em0
em0@pci0:9:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfbe00000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xa000, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfbe20000, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit 
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
                 speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 11[a0] = MSI-X supports 5 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 0cc47affff0cc034

FreeBSD 10.1-STABLE #0 r278568M

Disabling TSO seems to have stabilized it, for now.

/K
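
One way to check whether disabling TSO has actually stabilized an interface is to count watchdog resets per interface in the system log before and after the change. The sketch below does that for log text on stdin. count_resets is a hypothetical helper name, and it matches only the "Watchdog timeout -- resetting" message shown in this report.

```sh
#!/bin/sh
# count_resets: reads syslog or dmesg text on stdin and prints
# "<ifname> <count>" for each interface that logged a watchdog reset.
# The interface name is taken from the field just before "Watchdog".
count_resets() {
    awk '/Watchdog timeout -- resetting/ {
        for (i = 2; i <= NF; i++)
            if ($i == "Watchdog") {
                ifn = $(i - 1)
                sub(/:$/, "", ifn)   # strip the trailing colon from "em0:"
                n[ifn]++
            }
    }
    END { for (k in n) print k, n[k] }' | sort
}

# Usage on a live system (not executed here):
#   count_resets < /var/log/messages
```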
Comment 3 Hiren Panchasara freebsd_committer freebsd_triage 2015-02-27 23:15:17 UTC
Moving to -net and CCing Jack and Eric from Intel.
Comment 4 mike 2015-02-28 15:16:04 UTC
https://lists.freebsd.org/pipermail/freebsd-stable/2014-September/080088.html
has some insights / discussion on it from Rick M
Comment 5 Sean Bruno freebsd_committer freebsd_triage 2015-06-30 16:29:41 UTC
I've committed enhancements to the watchdog handler and to significant error handlers for this specific chipset in em(4).

In addition, the EM_MULTIQUEUE kernel configuration option is available to turn on the 2 queues in the card.  If you feel like testing these, let me know.
Comment 6 Sean Bruno freebsd_committer freebsd_triage 2015-08-03 16:49:06 UTC
https://reviews.freebsd.org/D3192

I think we have a good fix for this problem. If you guys have time to validate my findings, please do.
Comment 7 Sean Bruno freebsd_committer freebsd_triage 2016-03-03 15:01:04 UTC
Please retest this issue with the 10.3 BETA version of the em(4) driver.

A bit of stuff around DMA finally got sorted out, hopefully for good, and that should make this problem a solved one.
Comment 8 mike 2016-03-03 15:07:56 UTC
Thanks, I was working with marius@ to test on certain hardware and media configs and it seems to have resolved this issue. See the discussion in the thread
https://lists.freebsd.org/pipermail/freebsd-stable/2016-January/084028.html