Bug 234766 - em(4) Intel 82579LM regression on Supermicro X9SCM-F
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-net mailing list
URL:
Keywords: IntelNetworking, regression
Depends on:
Blocks:
 
Reported: 2019-01-08 18:44 UTC by Henry David Bartholomew
Modified: 2019-01-10 01:57 UTC (History)

See Also:


Description Henry David Bartholomew 2019-01-08 18:44:48 UTC
12.0-RELEASE-p1 on a Supermicro X9SCM-F using the onboard em0, which is an 82579LM.

MSI-X is enabled by default even though driver initialisation reports a problem with it: "Unable to map MSIX table"

em0: <Intel(R) PRO/1000 Network Connection> port 0xf020-0xf03f mem 0xfbb00000-0xfbb1ffff,0xfbb24000-0xfbb24fff irq 20 at device 25.0 on pci0
em0: attach_pre capping queues at 1
em0: using 1024 tx descriptors and 1024 rx descriptors
em0: msix_init qsets capped at 1
em0: Unable to map MSIX table 
em0: Using an MSI interrupt
em0: allocated for 1 tx_queues
em0: allocated for 1 rx_queues
em0: Ethernet address: xx:xx:xx:xx:xx:xx
em0: netmap queues/slots: TX 1/1024, RX 1/1024
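
To compare attach output between releases, the two relevant messages above can be pulled out of dmesg with a small filter. This is only a sketch: the function name em_attach_summary is made up for illustration, and it matches nothing beyond the two message strings quoted in this report.

```shell
#!/bin/sh
# Summarise em(4) attach messages read from stdin.
# em_attach_summary is a hypothetical helper; it only knows the two
# message strings quoted in this report.
em_attach_summary() {
    awk '
        /Unable to map MSIX table/ { msix_fail = 1 }
        /Using an MSI interrupt/   { mode = "MSI" }
        END {
            printf "interrupt=%s msix_table_map_failed=%s\n",
                (mode == "" ? "unknown" : mode),
                (msix_fail ? "yes" : "no")
        }'
}

# Typical use on the affected host (not run here):
#   dmesg | grep '^em0:' | em_attach_summary
```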

# pciconf -lv
[...]
em0@pci0:0:25:0:        class=0x020000 card=0x150215d9 chip=0x15028086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82579LM Gigabit Network Connection (Lewisville)'
    class      = network
    subclass   = ethernet

This eventually leads to the interface going down. The box itself remains up and responsive, but the only way to restore the network is a reboot.

em0: TX(0) desc avail = 42, pidx = 988
em0: link state changed to DOWN
em0: TX(0) desc avail = 1024, pidx = 0
em0: TX(0) desc avail = 1024, pidx = 0
[...]
em0: TX(0) desc avail = 1024, pidx = 0
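
To judge how often this happens, the lines above can be counted from the system log. A minimal sketch, assuming the messages keep exactly the format shown here; the function name em_hang_summary is hypothetical.

```shell
#!/bin/sh
# Count TX descriptor status lines and link-down events on stdin.
# em_hang_summary is a hypothetical helper; the patterns are taken
# from the log lines quoted in this report.
em_hang_summary() {
    awk '
        /TX\([0-9]+\) desc avail = / { tx++ }
        /link state changed to DOWN/ { down++ }
        END { printf "tx_reports=%d link_down=%d\n", tx, down }'
}

# Typical use on the affected host (not run here):
#   grep 'em0:' /var/log/messages | em_hang_summary
```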


The workaround appears to be to disable MSI-X:

# sysctl dev.em.0.iflib.disable_msix=1
Comment 1 Henry David Bartholomew 2019-01-08 18:52:19 UTC
It's worth adding that this interface worked flawlessly with MSI-X enabled on 11.2-RELEASE.

dmesg portion from 11.2:
em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xf020-0xf03f mem 0xfbb00000-0xfbb1ffff,0xfbb24000-0xfbb24fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: Ethernet address: xx:xx:xx:xx:xx:xx
em0: netmap queues/slots: TX 1/1024, RX 1/1024
Comment 2 Henry David Bartholomew 2019-01-09 19:50:14 UTC
So disabling MSI-X does not actually prevent this from happening; it just appears to increase the time until it's triggered.

I've now disabled LRO to see if that helps.
Comment 3 Eric Joyner freebsd_committer 2019-01-09 21:24:42 UTC
(In reply to Henry David Bartholomew from comment #2)

I was going to post a comment about how setting that tunable doesn't really change anything other than suppressing that error message: without it, the only difference is that iflib tries and fails to map the MSI-X BAR.

Regardless, that error message shouldn't be appearing at all on most em(4) devices, since only the 82574 supports MSI-X, and so mapping the MSI-X BAR shouldn't even be attempted on the 82579.

I don't know what to do about the queue hangs, though. Have you tried disabling TSO, if it is enabled?
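
Whether TSO or LRO is currently enabled can be read from the options= line of ifconfig output. A rough sketch, assuming the usual options=<hex><FLAG,FLAG,...> layout; the function name if_has_cap is hypothetical.

```shell
#!/bin/sh
# Succeed if capability $1 (e.g. TSO4, LRO) is listed on the
# interface "options=" line read from stdin.
# if_has_cap is a hypothetical helper. The pattern is anchored so the
# "nd6 options=" line is ignored.
if_has_cap() {
    grep -Eq "^[[:space:]]*options=[0-9a-f]*<([^>]*,)?$1(,[^>]*)?>"
}

# Typical use on the affected host (not run here):
#   ifconfig em0 | if_has_cap TSO4 && ifconfig em0 -tso
#   ifconfig em0 | if_has_cap LRO  && ifconfig em0 -lro
```

The -tso and -lro flags are the ones documented in ifconfig(8). A change made this way does not persist across reboots unless it is also added to the interface's ifconfig line in rc.conf.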
Comment 4 Henry David Bartholomew 2019-01-10 01:56:00 UTC
Disabling LRO doesn't help.

TSO wasn't enabled.

I've switched to using the kernel module from net/intel-em-kmod in the hope that it's less buggy.