Bug 251267 - iflib/em Intel I217-LM NIC : sporadic connection loss with 12-STABLE
Status: Closed Feedback Timeout
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-net (Nobody)
URL:
Keywords: IntelNetworking
Depends on:
Blocks:
 
Reported: 2020-11-20 10:21 UTC by O. Hartmann
Modified: 2023-02-09 06:56 UTC

Description O. Hartmann 2020-11-20 10:21:18 UTC
OS: FreeBSD 12.2-STABLE #56 r367863: Fri Nov 20 08:12:33 CET 2020 amd64

The box in question is a Fujitsu Celsius M750 system from 2015. Ever since iflib was first introduced in FreeBSD, along with the related new driver design, this specific chipset has had massive problems keeping an ssh connection open, and the problem persists to this day. Where possible, we replaced all NICs of that type with dedicated NIC boards carrying two i350 ports (igb), which has mitigated the problem so far (there are still sporadic connection losses, but they are very rare).
For reference, we also used some Xubuntu 20.04 LTS hosts on the same box, connecting to the very same hosts, and those connections were rock stable and lasted for days. Even connections over the igb/i350 chipsets sometimes lasted for days, but not those over the i217-LM.

Below is some hardware-specific information for the i217-LM:

pciconf -lvbc:

em0@pci0:0:25:0:        class=0x020000 card=0x11ed1734 chip=0x153a8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I217-LM'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base rxfb400000, size 131072, enabled
    bar   [14] = type Memory, range 32, base rxfb439000, size 4096, enabled
    bar   [18] = type I/O Port, range 32, base rxf020, size 32, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP

dmesg extract:
pci1: <simple comms> at device 22.0 (no driver attached)
em0: <Intel(R) PRO/1000 Network Connection> port 0xf020-0xf03f mem 0xfb400000-0xfb41ffff,0xfb439000-0xfb439fff at device 25.0 numa-domain 0 on pci1
em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Using an MSI interrupt
em0: Ethernet address: XXXX
em0: netmap queues/slots: TX 1/1024, RX 1/1024


and ifconfig:

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether xx:xx:xx:xx:xx:xx
        inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
        inet6 fe80::xxxx:eff:xxxx:xxxx%em00 prefixlen 64 scopeid 0x1
        inet6 fd60:b403:101::233 prefixlen 64
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>
Comment 1 Kevin Bowling freebsd_committer freebsd_triage 2021-04-30 01:37:34 UTC
(In reply to O. Hartmann from comment #0)
Can you give me a dump of sysctl dev.em.0 during functional and non-functional states?  I'll take a look at this.
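
One possible way to collect and compare the two dumps requested above, as a minimal sketch only: it assumes Python 3 is installed on the affected host and that `sysctl dev.em.0` prints one `name: value` pair per line. The helper names `read_sysctl_tree`, `parse_sysctl` and `diff_sysctl` are hypothetical and not part of any FreeBSD tool.

```python
import subprocess


def read_sysctl_tree(prefix="dev.em.0"):
    """Return the text printed by `sysctl <prefix>` (only works on the FreeBSD host)."""
    result = subprocess.run(
        ["sysctl", prefix], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_sysctl(text):
    """Parse `name: value` lines into a dict.

    Assumption: one OID per line. Lines without a ': ' separator
    (for example continuation lines of multi-line values) are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            values[name.strip()] = value.strip()
    return values


def diff_sysctl(good_text, bad_text):
    """Return {name: (good_value, bad_value)} for every OID whose value differs."""
    good = parse_sysctl(good_text)
    bad = parse_sysctl(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }
```

Typical use would be to save `read_sysctl_tree()` once while the link works and once after a connection loss, then pass both strings to `diff_sysctl`. Counters that simply grow with traffic will show up as differences too, so the full dumps are probably still worth attaching to the bug.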