Bug 235147

Summary: em(4) driver not working for Intel 82583V Gigabit chip
Product: Base System Reporter: Joshua Kinard <freebsd>
Component: kern Assignee: Marius Strobl <marius>
Status: Closed FIXED    
Severity: Affects Only Me CC: brent, brewmeister2112, krzysztof.galazka, luigiitaliano, mail, marius, naito.yuichiro, ncrogers, net, wolfgang
Priority: --- Keywords: IntelNetworking, regression
Version: 12.0-RELEASE   
Hardware: Any   
OS: Any   

Description Joshua Kinard 2019-01-23 10:09:46 UTC
Trying to load FreeBSD 12.0-RELEASE on a small, six-port firewall appliance, a Protectli FW6A (https://protectli.com/product/fw6a/).  The device's six ports are powered by Intel's 82583V gigabit chipset, which is supposed to be supported by the em(4) driver.  I've opened a ticket with Protectli support, and they have confirmed that 11.2-RELEASE works, and have verified my observation that 12.0-RELEASE does not.  My suspicion is that this is a regression from the iflib updates done between 11.2 and 12.0.

I've tried a couple of things found in other bugs for the em(4) driver, including disabling TSO, several sysctl tweakables, disabling MSI-X, different ethernet cables, different ports, etc.  Nothing seems to work.  Also tried forcing the igb(4) driver, to see if that would pick the ports up, but no go on that.  Both the em(4) and igb(4) man pages say they can support the 82580 chipsets.

The port will take an IP address assigned statically, but cannot obtain one via DHCP.  It does seem capable of seeing ARP "who-has" requests, but cannot see the responses and does not update the ARP tables w/ new MAC addresses, even after fresh ping attempts (MAC and IPs below redacted).  It doesn't appear to process any ethertype other than ARP at all, though I have not yet verified that thoroughly via tcpdump.

# arp -a
? (192.168.w.x) at xx:xx:xx:xx:xx:xx on em0 permanent [ethernet]
? (192.168.w.y) at (incomplete) on em0 expired [ethernet]
? (192.168.w.z) at (incomplete) on em0 expired [ethernet]

Some additional info from various utilities, with addresses masked:

# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether xx:xx:xx:xx:xx:xx
        inet 192.168.w.x netmask 0xffffff00 broadcast 192.168.w.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


dmesg:
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection> port 0xe000-0xe01f mem 0xdfe00000-0xdfe1ffff,0xdfe20000-0xdfe23fff irq 16 at device 0.0 on pci1
em0: attach_pre capping queues at 1
em0: using 1024 tx descriptors and 1024 rx descriptors
em0: msix_init qsets capped at 1
em0: pxm cpus: 2 queue msgs: 6 admincnt: 1
em0: using 1 rx queues 1 tx queues
em0: Using MSIX interrupts with 2 vectors
em0: allocated for 1 tx_queues
em0: allocated for 1 rx_queues
em0: Ethernet address: xx:xx:xx:xx:xx:xx
em0: netmap queues/slots: TX 1/1024, RX 1/1024
(repeats five more times, through em5)
em0: link state changed to UP


If any additional information is needed to debug this, please let me know.
Comment 1 Joshua Kinard 2019-01-24 07:13:30 UTC
# pciconf -lvBbcV  (for em0 only, repeats through em5, only the BARs change)

em0@pci0:1:0:0: class=0x020000 card=0x00008086 chip=0x150c8086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82583V Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xdfe00000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xe000, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xdfe20000, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) RO NS
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 11[a0] = MSI-X supports 7 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 xxxxxxffffxxxxxx

(MAC redacted on ecap 0003)
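
For anyone matching this against their own hardware: the chip= field in the pciconf output packs two 16-bit PCI IDs, with the device ID in the upper half and the vendor ID in the lower half. A minimal sketch of the split, assuming a POSIX sh whose arithmetic accepts hex constants; split_chip is a made-up helper name, not part of pciconf(8):

```shell
#!/bin/sh
# split_chip is a hypothetical helper, not part of pciconf(8).
# It splits a pciconf "chip=" value into its vendor and device IDs.
split_chip() {
    # Low 16 bits: vendor ID. High 16 bits: device ID.
    vendor=$(( $1 & 0xffff ))
    device=$(( ($1 >> 16) & 0xffff ))
    printf 'vendor=0x%04x device=0x%04x\n' "$vendor" "$device"
}

# Value taken from the em0 entry in this comment.
split_chip 0x150c8086
```

For the value above this prints vendor=0x8086 (Intel) and device=0x150c, which pciconf labels as the 82583V.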
Comment 2 Niels Kristensen 2019-01-29 06:35:42 UTC
I'm having the exact same problem with the Minisys IBOX-501 N13.
Looks like the hardware is identical.
Comment 3 Luigi Italiano 2019-02-01 17:40:10 UTC
Same here.
Asrock 990FX Extreme 9 integrated LAN
Comment 4 Krzysztof Galazka 2019-02-07 12:13:46 UTC
(In reply to Joshua Kinard from comment #1)
According to the datasheet (https://www.intel.com/content/www/us/en/embedded/products/networking/82583v-gbe-controller-datasheet.html), the 82583V supports only MSI and legacy interrupts. That might be why it's not working when iflib tries to use MSI-X. I think this patch https://reviews.freebsd.org/rS343864 in 12-STABLE should fix that.
Comment 5 Joshua Kinard 2019-02-08 09:08:14 UTC
(In reply to Krzysztof Galazka from comment #4)

I gave it a really quick test by trying to apply the diff from D18980 via 'cat <patch> | patch -p2' while in /usr/src/sys, and rebuilding a GENERIC 12.0-RELEASE-p3 kernel on a secondary machine.  Copied the newer /boot/kernel over to the networking appliance via USB and booted into the new kernel.  Doesn't seem to have had any effect:

FreeBSD 12.0-RELEASE-p3 GENERIC amd64
FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)
[snip]
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection> port 0xe000-0xe01f mem 0xdfe00000-0xdfe1ffff,0xdfe20000-0xdfe23fff irq 16 at device 0.0 on pci1
em0: Using 1024 tx descriptors and 1024 rx descriptors
em0: Using 1 rx queues 1 tx queues
em0: Using MSI-X interrupts with 2 vectors
em0: Ethernet address: xx:xx:xx:xx:xx:xx
em0: netmap queues/slots: TX 1/1024, RX 1/1024
[snip]

# ifconfig -a
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether xx:xx:xx:xx:xx:xx
        inet a.b.c.d netmask 0xffffff00 broadcast a.b.c.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host

I can't rule out that I need to apply other parts of that patch, but I find that Phabricator interface to be incredibly confusing.  Is there a classic gitweb sitting around somewhere that will cough up a standard unidiff patch of the whole commit?  Or maybe the patch alone doesn't fix it?

---

So that said, the info on this chipset not being compatible w/ MSI-X was very helpful, thank you.  I had to dig around, as the iflib change made obsolete the old "hw.x.*" tunables in /boot/loader.conf (and that's all that stupid Google wants to find right now), but I found the correct tunable names for newer iflib stuff, and disabled MSI-X there.  After another reboot, we have contact:

# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=120 time=12.144 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=120 time=12.667 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=120 time=10.681 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=120 time=10.519 ms
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 10.519/11.503/12.667/0.923 ms
root@armenelos:~ #

So!  It looks like the fix for these specific chips would be for em(4) to automatically disable MSI-X when it detects the specific PCI ID of these chipsets.
Comment 6 Krzysztof Galazka 2019-02-08 12:18:40 UTC
(In reply to Joshua Kinard from comment #5)

There is a new patch under review, which disables MSI-X support in em(4) for devices other than 82574: https://reviews.freebsd.org/D19108. Thank you for all the details you have provided and thank Marius for the patch.
Comment 7 commit-hook freebsd_committer freebsd_triage 2019-02-09 11:59:52 UTC
A commit references this bug:

Author: marius
Date: Sat Feb  9 11:58:41 UTC 2019
New revision: 343934
URL: https://svnweb.freebsd.org/changeset/base/343934

Log:
  - Remove the redundant device disabled hint handling; ever since
    r241119 that's performed globally by device_attach(9).
  - As for the EM-class of devices, em(4) supports multiple queues
    and MSI-X respectively only with 82574 devices. However, since
    the conversion to iflib(4), em(4) relies on the interrupt type
    fallback mechanism, i. e. MSI-X -> MSI -> INTx, of iflib(4) to
    figure out the interrupt type to use for the EM-class (as well
    as the IGB-class) of MACs. Moreover, despite the datasheet for
    82583V not mentioning any support of MSI-X, there actually are
    82583V devices out there that report a varying number of MSI-X
    messages as supported. The interrupt type fallback of iflib(4)
    is causing two failure modes depending on the actual number of
    MSI-X messages supported for such instances of 82583V:
    1) With only one MSI-X message supported, none is left for the
       RX/TX queues as that one message gets assigned to the admin
       interrupt. Worse, later on - which will be addressed with a
       separate fix - iflib(4) interprets that one messages as MSI
       or INTx to be set up, but fails to actually do so as it has
       previously called pci_alloc_msix(9). [1, 2]
    2) With more message supported, their distribution is okay but
       then em_if_msix_intr_assign() doesn't work for 82583V, with
       the interface being left in a non-working state, too. [3]
    Thus, let em_if_attach_pre() indicate to iflib(4) to try MSI-X
    with 82574 only, and at most MSI for the remainder of EM-class
    devices.
    While at it, remove "try_second_bar" as it's polarity inverted
    and not actually needed.
  - Remove code from em_if_timer() that effectively is a NOP since
    the conversion to iflib(4) ("trigger" is no longer read).
    While at it, let the comment for em_if_timer() reflect reality
    after said conversion.
  - Implement an ifdi_watchdog_reset method which only updates the
    em(4) "watchdog_events" counter but doesn't perform any reset,
    so that the em(4) "watchdog_timeouts" SYSCTL (iflib(4) doesn't
    provide a counterpart) reflects reality and these timeouts add
    to IFCOUNTER_OERRORS again after the iflib(4) conversion.
  - Remove the "mbuf_defrag_fail" and "tx_dma_fail" SYSCTLS; since
    the iflib(4) conversion, associated counters are disconnected,
    but iflib(4) provides "mbuf_defrag_failed" and "tx_map_failed"
    respectively as equivalents.
  - Move the description preceding lem_smartspeed() to the correct
    spot before em_reset() and bring back appropriate comments for
    {igb,em}_initialize_rss_mapping() and lem_smartspeed() lost in
    the iflib(4) conversion.
  - Adapt some other function descriptions and INIT_DEBUGOUT() use
    to match reality after the iflib(4) conversion.
  - Put the debugging message of em_enable_vectors_82574() (missed
    in r343578) under bootverbose, too.

  PR:		219428 [1], 235246 [2], 235147 [3]
  Reviewed by:	erj (previous version)
  Differential Revision:	https://reviews.freebsd.org/D19108

Changes:
  head/sys/dev/e1000/if_em.c
  head/sys/dev/e1000/if_em.h
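
The two rules in the log above can be modeled in a few lines of shell. This is only an illustrative sketch: the helper names are invented, the model keys on the chip name for readability, and the real change is C code in sys/dev/e1000/if_em.c that is not reproduced here.

```shell
#!/bin/sh
# Hypothetical helpers modeling the commit log's description; the actual
# change is C code in sys/dev/e1000/if_em.c and is not reproduced here.

# Failure mode 1: one MSI-X message always goes to the admin interrupt,
# so print how many remain for the RX/TX queues.
queue_msgs_left() {
    supported=$1
    if [ "$supported" -gt 1 ]; then
        echo $(( supported - 1 ))
    else
        echo 0
    fi
}

# Policy after the fix: only the 82574 may try MSI-X; every other
# EM-class device is limited to MSI, with INTx as the remaining fallback.
max_intr_type() {
    case "$1" in
        82574*) echo msix ;;
        *)      echo msi ;;
    esac
}

queue_msgs_left 1      # the single-message case from [1, 2]
queue_msgs_left 7      # the 82583V in this report advertises 7 messages
max_intr_type 82574L
max_intr_type 82583V
```

The second call matches the original dmesg in this report, where 7 advertised messages showed up as "queue msgs: 6 admincnt: 1"; that is failure mode 2, where the distribution is fine but the MSI-X setup still does not work on the 82583V.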
Comment 8 wolfgang 2019-02-09 22:48:23 UTC
I had the same problem on what is probably the same hardware (Axiomtek NA-320) after updating from 11.2-STABLE to 12-STABLE r343942.
Adding hw.pci.enable_msix=0 to /boot/loader.conf and rebooting fixed it for me.
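
A sketch of applying that workaround without adding the line twice. set_loader_tunable is a made-up helper, and the example runs against a scratch file rather than the live /boot/loader.conf. Note that hw.pci.enable_msix is a global PCI tunable, so it turns MSI-X off for every device in the system, not only em(4); it should only be needed on kernels that predate the fix.

```shell
#!/bin/sh
# set_loader_tunable is a hypothetical helper. It appends name="value" to a
# loader.conf-style file unless that exact name is already set there.
set_loader_tunable() {
    file=$1 name=$2 value=$3
    # Exact match on the text before the first "=" (no regex surprises
    # from the dots in tunable names).
    if awk -F= -v n="$name" '$1 == n { found = 1 } END { exit !found }' \
        "$file" 2>/dev/null; then
        echo "$name already set in $file; left unchanged"
    else
        printf '%s="%s"\n' "$name" "$value" >> "$file"
        echo "added $name to $file"
    fi
}

# Example against a scratch copy; point it at /boot/loader.conf as root
# to apply the workaround for real, then reboot.
conf=$(mktemp)
set_loader_tunable "$conf" hw.pci.enable_msix 0
cat "$conf"
rm -f "$conf"
```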
Comment 9 commit-hook freebsd_committer freebsd_triage 2019-02-13 14:40:00 UTC
A commit references this bug:

Author: marius
Date: Wed Feb 13 14:39:17 UTC 2019
New revision: 344098
URL: https://svnweb.freebsd.org/changeset/base/344098

Log:
  MFC: r343934

  - Remove the redundant device disabled hint handling; ever since
    r241119 that's performed globally by device_attach(9).
  - As for the EM-class of devices, em(4) supports multiple queues
    and MSI-X respectively only with 82574 devices. However, since
    the conversion to iflib(4), em(4) relies on the interrupt type
    fallback mechanism, i. e. MSI-X -> MSI -> INTx, of iflib(4) to
    figure out the interrupt type to use for the EM-class (as well
    as the IGB-class) of MACs. Moreover, despite the datasheet for
    82583V not mentioning any support of MSI-X, there actually are
    82583V devices out there that report a varying number of MSI-X
    messages as supported. The interrupt type fallback of iflib(4)
    is causing two failure modes depending on the actual number of
    MSI-X messages supported for such instances of 82583V:
    1) With only one MSI-X message supported, none is left for the
       RX/TX queues as that one message gets assigned to the admin
       interrupt. Worse, later on - which will be addressed with a
       separate fix - iflib(4) interprets that one messages as MSI
       or INTx to be set up, but fails to actually do so as it has
       previously called pci_alloc_msix(9). [1, 2]
    2) With more message supported, their distribution is okay but
       then em_if_msix_intr_assign() doesn't work for 82583V, with
       the interface being left in a non-working state, too. [3]
    Thus, let em_if_attach_pre() indicate to iflib(4) to try MSI-X
    with 82574 only, and at most MSI for the remainder of EM-class
    devices.
    While at it, remove "try_second_bar" as it's polarity inverted
    and not actually needed.
  - Remove code from em_if_timer() that effectively is a NOP since
    the conversion to iflib(4) ("trigger" is no longer read).
    While at it, let the comment for em_if_timer() reflect reality
    after said conversion.
  - Implement an ifdi_watchdog_reset method which only updates the
    em(4) "watchdog_events" counter but doesn't perform any reset,
    so that the em(4) "watchdog_timeouts" SYSCTL (iflib(4) doesn't
    provide a counterpart) reflects reality and these timeouts add
    to IFCOUNTER_OERRORS again after the iflib(4) conversion.
  - Remove the "mbuf_defrag_fail" and "tx_dma_fail" SYSCTLS; since
    the iflib(4) conversion, associated counters are disconnected,
    but iflib(4) provides "mbuf_defrag_failed" and "tx_map_failed"
    respectively as equivalents.
  - Move the description preceding lem_smartspeed() to the correct
    spot before em_reset() and bring back appropriate comments for
    {igb,em}_initialize_rss_mapping() and lem_smartspeed() lost in
    the iflib(4) conversion.
  - Adapt some other function descriptions and INIT_DEBUGOUT() use
    to match reality after the iflib(4) conversion.
  - Put the debugging message of em_enable_vectors_82574() (missed
    in r343578) under bootverbose, too.

  PR:		219428 [1], 235246 [2], 235147 [3]
  Reviewed by:	erj (previous version)
  Differential Revision:	https://reviews.freebsd.org/D19108

Changes:
_U  stable/12/
  stable/12/sys/dev/e1000/if_em.c
  stable/12/sys/dev/e1000/if_em.h
Comment 10 Marius Strobl freebsd_committer freebsd_triage 2019-02-13 14:45:33 UTC
Close; thanks for the report!