Bug 235031 - [em] em0: poor NFS performance, strange behavior
Summary: [em] em0: poor NFS performance, strange behavior
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-net mailing list
URL:
Keywords: IntelNetworking, regression
Depends on:
Blocks:
 
Reported: 2019-01-17 20:59 UTC by Martin Birgmeier
Modified: 2019-08-31 13:35 UTC (History)
9 users (show)

See Also:


Attachments
loader.conf (1.35 KB, text/plain)
2019-01-18 17:42 UTC, Martin Birgmeier
no flags Details
sysctl.conf (610 bytes, text/plain)
2019-01-18 17:47 UTC, Martin Birgmeier
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Birgmeier 2019-01-17 20:59:41 UTC
Scenario
========
I have a laptop with the following for em0 in /var/run/dmesg.boot:

em0: <Intel(R) PRO/1000 Network Connection> port 0x6040-0x605f mem 0xd5300000-0xd531ffff,0xd532b000-0xd532bfff irq 20 at device 25.0 on pci0
em0: attach_pre capping queues at 1
em0: using 1024 tx descriptors and 1024 rx descriptors
em0: msix_init qsets capped at 1
em0: Unable to map MSIX table 
em0: Using an MSI interrupt
em0: allocated for 1 tx_queues
em0: allocated for 1 rx_queues

[0]# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether f0:de:f1:98:86:a9
        inet 192.168.1.19 netmask 0xffffff00 broadcast 192.168.1.255 
        inet6 fe80::f2de:f1ff:fe98:86a9%em0 prefixlen 64 scopeid 0x1 
        inet6 fec0:0:0:4d42::13 prefixlen 64 
        inet6 fec0::4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf 
        inet6 2002:5875:4c34:4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf 
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
[0]# 

This laptop exports a filesystem via NFS to a second machine running a 100 Mbit interface.

The second machine performs read requests via NFS.

Result
======
The result is that the NFS performance is very poor, on the order of 500 kBps to 1 MBps (5..10% of the 100 Mbps of the second machine).

Scenario (continued)
====================
There is a third machine with a gigabit interface.

Result
======
If I run iperf between the laptop and the third machine, the read performance for the second machine increases (!) to several MB/s (20..40% of the 100 Mbps of the second machine). The iperf performance itself is close to wire speed (914 Mbps).

Note
====
Maybe this problem is the one which underlies the issue I reported in bug #234520; machine B in that scenario is the laptop, machine A is the third machine.
Comment 1 Martin Birgmeier 2019-01-17 21:03:47 UTC
Another note: performance was fine under 11.2.
Comment 2 Martin Birgmeier 2019-01-17 21:13:08 UTC
Another scenario
================

Just for experimenting, I issued the following commands:

ifconfig em0 -rxcsum .............. result: connectivity to other machines lost for existing connections (TCP and UDP), new connections can be established
ifconfig em0 rxcsum ............... result: same
ifconfig em0 down ................. result: same
ifocnfig em0 up ................... result: same

In addition, these actions resulted in the following message being repeated on the console every few seconds:

TX(0) desc avail = 1024, pidx = 0

I had to reboot the machine to get it working normally again.

[0]# grep em0 /var/log/messages
...
Jan 17 21:29:18 mizar kernel: em0: link state changed to DOWN
Jan 17 21:29:20 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:30:02 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:36:14 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:36:22 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:36:25 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:36:28 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:36:31 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: Syncing disks, vnodes remaining... 11 12 em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: 8 8 em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: 8 8 8 em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: 8 2 2 em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: 1 0 0 em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: 0 0 0 em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: Waiting (max 60 seconds) for system thread `bufspacedaemon-6' to stop... em0: TX(0) desc avail = 1024, pidx = 0
Jan 17 21:37:59 mizar kernel: em0: <Intel(R) PRO/1000 Network Connection> port 0x6040-0x605f mem 0xd5300000-0xd531ffff,0xd532b000-0xd532bfff irq 20 at device 25.0 on pci0
Jan 17 21:37:59 mizar kernel: em0: attach_pre capping queues at 1
Jan 17 21:37:59 mizar kernel: em0: using 1024 tx descriptors and 1024 rx descriptors
Jan 17 21:37:59 mizar kernel: em0: msix_init qsets capped at 1
Jan 17 21:37:59 mizar kernel: em0: Unable to map MSIX table 
Jan 17 21:37:59 mizar kernel: em0: Using an MSI interrupt
Jan 17 21:37:59 mizar kernel: em0: allocated for 1 tx_queues
Jan 17 21:37:59 mizar kernel: em0: allocated for 1 rx_queues
Jan 17 21:37:59 mizar kernel: em0: Ethernet address: f0:de:f1:98:86:a9
Jan 17 21:37:59 mizar kernel: em0: link state changed to UP
[0]#
Comment 3 Martin Birgmeier 2019-01-17 21:21:30 UTC
Next scenario:
==============

If I run

iperf -c hal -b 500m,0

i.e., using only 50% of the available bandwidth between the laptop and the third machine, NFS read performance between the laptop and the second machine increases to 5..7 MBps, i.e., 50..70% of the available bandwidth of 100 Mbps.

It seems that in this case there is sufficient but not too much activity on em0 in order to optimally talk to the second machine.

Maybe this is a case of lost or stuck interrupts? Or some problems with handling UDP?
Comment 4 Martin Birgmeier 2019-01-17 21:40:37 UTC
A tcpdump running on the laptop filtering packets to the second machine shows the following (mizar = laptop, gandalf = second machine):

22:37:34.527491 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83752633, win 3583, options [nop,nop,TS val 830264279 ecr 2409434096], length 0
22:37:34.527739 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83755529, win 3537, options [nop,nop,TS val 830264279 ecr 2409434096], length 0
22:37:34.527984 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83758425, win 3492, options [nop,nop,TS val 830264280 ecr 2409434096], length 0
22:37:34.528230 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83761321, win 3447, options [nop,nop,TS val 830264280 ecr 2409434096], length 0
22:37:34.528476 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83764217, win 3402, options [nop,nop,TS val 830264280 ecr 2409434096], length 0
22:37:34.528722 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83767113, win 3356, options [nop,nop,TS val 830264280 ecr 2409434096], length 0
22:37:34.620273 IP mizar.xyzzy.x11 > gandalf.xyzzy.60338: Flags [.], ack 8228, win 1026, options [nop,nop,TS val 4065857024 ecr 3569015095], length 0
22:37:34.620470 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83769069, win 4352, options [nop,nop,TS val 830264372 ecr 2409434097], length 0
22:37:34.625029 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [P.], seq 171240:171368, ack 83769069, win 4352, options [nop,nop,TS val 830264377 ecr 2409434097], length 128: NFS request xid 1082723581 124 read fh Unknown/F6D49DE9DE48A22F0A00F50500000000194002000000000000000000 65536 bytes @ 2883584
22:37:34.625262 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83769069:83770517, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448: NFS reply xid 1082723581 reply ok 1444 read
22:37:34.625268 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83770517:83771965, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625270 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83771965:83773413, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625273 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83773413:83774861, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625276 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83774861:83776309, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625278 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83776309:83777757, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625281 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83777757:83779205, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625284 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83779205:83780653, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625286 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83780653:83782101, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625288 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83782101:83783549, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625291 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83783549:83784997, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625294 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83784997:83786445, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625296 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83786445:83787893, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625299 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83787893:83789341, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625302 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83789341:83790789, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625304 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83790789:83792237, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625306 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83792237:83793685, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625309 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83793685:83795133, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625311 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83795133:83796581, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625314 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83796581:83798029, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625316 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83798029:83799477, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625318 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83799477:83800925, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625320 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83800925:83802373, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625323 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83802373:83803821, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625325 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83803821:83805269, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625328 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83805269:83806717, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625330 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83806717:83808165, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625332 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83808165:83809613, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625335 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83809613:83811061, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625337 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83811061:83812509, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625340 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83812509:83813957, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625342 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83813957:83815405, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625344 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83815405:83816853, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625347 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83816853:83818301, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625349 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83818301:83819749, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625352 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83819749:83821197, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625354 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83821197:83822645, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625356 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83822645:83824093, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625359 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83824093:83825541, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625362 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83825541:83826989, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625364 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83826989:83828437, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625366 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83828437:83829885, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625369 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83829885:83831333, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625372 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83831333:83832781, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625375 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83832781:83834229, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 1448
22:37:34.625377 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [P.], seq 83834229:83834737, ack 171368, win 29127, options [nop,nop,TS val 2409434199 ecr 830264377], length 508
22:37:34.625627 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83771965, win 4307, options [nop,nop,TS val 830264377 ecr 2409434199], length 0
22:37:34.625876 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83774861, win 4261, options [nop,nop,TS val 830264378 ecr 2409434199], length 0
22:37:34.626204 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83777757, win 4216, options [nop,nop,TS val 830264378 ecr 2409434199], length 0
22:37:34.626448 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83780653, win 4171, options [nop,nop,TS val 830264378 ecr 2409434199], length 0
22:37:34.626691 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83783549, win 4126, options [nop,nop,TS val 830264378 ecr 2409434199], length 0
22:37:34.626938 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83786445, win 4080, options [nop,nop,TS val 830264379 ecr 2409434199], length 0
22:37:34.627185 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83789341, win 4035, options [nop,nop,TS val 830264379 ecr 2409434199], length 0
22:37:34.627438 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83792237, win 3990, options [nop,nop,TS val 830264379 ecr 2409434199], length 0
22:37:34.627677 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83795133, win 3945, options [nop,nop,TS val 830264379 ecr 2409434199], length 0
22:37:34.627921 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83798029, win 3899, options [nop,nop,TS val 830264380 ecr 2409434199], length 0
22:37:34.628166 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83800925, win 3854, options [nop,nop,TS val 830264380 ecr 2409434199], length 0
22:37:34.628411 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83803821, win 3809, options [nop,nop,TS val 830264380 ecr 2409434199], length 0
22:37:34.628658 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83806717, win 3764, options [nop,nop,TS val 830264380 ecr 2409434199], length 0
22:37:34.628920 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83809613, win 3718, options [nop,nop,TS val 830264381 ecr 2409434199], length 0
22:37:34.629152 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83812509, win 3673, options [nop,nop,TS val 830264381 ecr 2409434199], length 0
22:37:34.629396 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83815405, win 3628, options [nop,nop,TS val 830264381 ecr 2409434199], length 0
22:37:34.629643 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83818301, win 3583, options [nop,nop,TS val 830264381 ecr 2409434199], length 0
22:37:34.629896 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83821197, win 3537, options [nop,nop,TS val 830264382 ecr 2409434199], length 0
22:37:34.630145 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83824093, win 3492, options [nop,nop,TS val 830264382 ecr 2409434199], length 0
22:37:34.630388 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83826989, win 3447, options [nop,nop,TS val 830264382 ecr 2409434199], length 0
22:37:34.630634 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83829885, win 3402, options [nop,nop,TS val 830264382 ecr 2409434199], length 0
22:37:34.630881 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83832781, win 3356, options [nop,nop,TS val 830264382 ecr 2409434199], length 0
22:37:34.667289 IP mizar.xyzzy.46612 > gandalf.xyzzy.hosts2-ns: Flags [.], ack 1, win 1026, options [nop,nop,TS val 3514464195 ecr 3121876329], length 0
22:37:34.667481 IP gandalf.xyzzy.hosts2-ns > mizar.xyzzy.46612: Flags [.], ack 1, win 1026, options [nop,nop,TS val 3121886328 ecr 3511103843], length 0
22:37:34.667488 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83834737, win 4352, options [nop,nop,TS val 830264419 ecr 2409434199], length 0
22:37:34.668080 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [P.], seq 171368:171496, ack 83834737, win 4352, options [nop,nop,TS val 830264420 ecr 2409434199], length 128: NFS request xid 1082723582 124 read fh Unknown/F6D49DE9DE48A22F0A00F50500000000194002000000000000000000 65536 bytes @ 2949120
22:37:34.668287 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83834737:83836185, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448: NFS reply xid 1082723582 reply ok 1444 read
22:37:34.668292 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83836185:83837633, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668294 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83837633:83839081, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668297 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83839081:83840529, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668300 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83840529:83841977, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668302 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83841977:83843425, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668305 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83843425:83844873, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668307 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83844873:83846321, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668309 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83846321:83847769, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668312 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83847769:83849217, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668315 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83849217:83850665, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668317 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83850665:83852113, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668320 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83852113:83853561, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668322 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83853561:83855009, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668325 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83855009:83856457, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668327 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83856457:83857905, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668329 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83857905:83859353, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668331 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83859353:83860801, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668334 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83860801:83862249, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668337 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83862249:83863697, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668339 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83863697:83865145, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668342 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83865145:83866593, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668344 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83866593:83868041, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668346 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83868041:83869489, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668349 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83869489:83870937, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668351 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83870937:83872385, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668354 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83872385:83873833, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668357 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83873833:83875281, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668360 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83875281:83876729, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668363 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83876729:83878177, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668365 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83878177:83879625, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668367 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83879625:83881073, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668370 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83881073:83882521, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668372 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83882521:83883969, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668375 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83883969:83885417, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668378 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83885417:83886865, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668380 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83886865:83888313, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668383 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83888313:83889761, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668385 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83889761:83891209, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668388 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83891209:83892657, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668390 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83892657:83894105, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668392 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83894105:83895553, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668395 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83895553:83897001, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668397 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83897001:83898449, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668400 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [.], seq 83898449:83899897, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 1448
22:37:34.668403 IP mizar.xyzzy.nfsd > gandalf.xyzzy.uuidgen: Flags [P.], seq 83899897:83900405, ack 171496, win 29127, options [nop,nop,TS val 2409434242 ecr 830264420], length 508
22:37:34.668649 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83837633, win 4307, options [nop,nop,TS val 830264420 ecr 2409434242], length 0
22:37:34.668899 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83840529, win 4261, options [nop,nop,TS val 830264421 ecr 2409434242], length 0
22:37:34.669227 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83843425, win 4216, options [nop,nop,TS val 830264421 ecr 2409434242], length 0
22:37:34.669466 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83846321, win 4171, options [nop,nop,TS val 830264421 ecr 2409434242], length 0
22:37:34.669715 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83849217, win 4126, options [nop,nop,TS val 830264421 ecr 2409434242], length 0
22:37:34.669961 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83852113, win 4080, options [nop,nop,TS val 830264422 ecr 2409434242], length 0
22:37:34.670208 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83855009, win 4035, options [nop,nop,TS val 830264422 ecr 2409434242], length 0
22:37:34.670453 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83857905, win 3990, options [nop,nop,TS val 830264422 ecr 2409434242], length 0
22:37:34.670703 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83860801, win 3945, options [nop,nop,TS val 830264422 ecr 2409434242], length 0
22:37:34.670945 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83863697, win 3899, options [nop,nop,TS val 830264423 ecr 2409434242], length 0
22:37:34.671191 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83866593, win 3854, options [nop,nop,TS val 830264423 ecr 2409434242], length 0
22:37:34.671434 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83869489, win 3809, options [nop,nop,TS val 830264423 ecr 2409434242], length 0
22:37:34.671683 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83872385, win 3764, options [nop,nop,TS val 830264423 ecr 2409434242], length 0
22:37:34.671933 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83875281, win 3718, options [nop,nop,TS val 830264424 ecr 2409434242], length 0
22:37:34.672174 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83878177, win 3673, options [nop,nop,TS val 830264424 ecr 2409434242], length 0
22:37:34.672419 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83881073, win 3628, options [nop,nop,TS val 830264424 ecr 2409434242], length 0
22:37:34.672667 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83883969, win 3583, options [nop,nop,TS val 830264424 ecr 2409434242], length 0
22:37:34.672914 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83886865, win 3537, options [nop,nop,TS val 830264425 ecr 2409434242], length 0
22:37:34.673158 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83889761, win 3492, options [nop,nop,TS val 830264425 ecr 2409434242], length 0
22:37:34.673404 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83892657, win 3447, options [nop,nop,TS val 830264425 ecr 2409434242], length 0
22:37:34.673650 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83895553, win 3402, options [nop,nop,TS val 830264425 ecr 2409434242], length 0
22:37:34.673899 IP gandalf.xyzzy.uuidgen > mizar.xyzzy.nfsd: Flags [.], ack 83898449, win 3356, options [nop,nop,TS val 830264426 ecr 2409434242], length 0
Comment 5 Eugene Grosbein freebsd_committer 2019-01-18 08:50:32 UTC
The problem may be related to interrupt handling and/or timekeeping.

1) Do you run ntpd there? If yes, show contents of /var/db/ntpd.drift. If not, start it, make sure it synced time successfully (ntpq -np), wait an hour and show /var/db/ntpd.drift.

2) Show output of sysctl kern.timecounter and sysctl kern.eventtimer commands. If kern.eventtimer.periodic is 0, set it to 1 and repeat your tests and post results.

3) Show output of "grep irq /var/run/dmesg.boot" for 12.0 and use 11.2 live boot media (flash stick or CD/DVD) and repeat "grep irq /var/run/dmesg.boot", save output and post it too.
Comment 6 Martin Birgmeier 2019-01-18 10:18:05 UTC
Hi Eugene,

Thank you for looking into this.

I have always been using ntp on all (physical, not VMs) machines. But I am not using a drift file because in my experience, the machines do not run long enough to arrive at stable values in the drift file, leading to bad ntp startup behavior (drifting) on the next boot (I have my own config files; maybe I should restart using drift files because my experience is from very long ago ~ 15 years :-)). Here is the output of ntpq -p:

[0]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gandalf.xyzzy   192.53.103.103   2 u   11   64  377    0.187  -18.657   0.447
+hal.xyzzy       193.204.114.233  2 u   23   64  377    0.180   -2.254   0.096
[0]# 

I am syncing two machines to stratum-1 servers; they in turn are references for all the other machines (including the laptop with the em0).

[0]# sysctl kern.timecounter 
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(1000) ACPI-fast(900) HPET(950) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.alloweddeviation: 5
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.TSC.quality: 1000
kern.timecounter.tc.TSC.frequency: 1993570808
kern.timecounter.tc.TSC.counter: 3362030552
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 11926669
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.counter: 114481493
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 14744
kern.timecounter.tc.i8254.mask: 65535
[0]# sysctl kern.eventtimer 
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
kern.eventtimer.choice: LAPIC(600) HPET(550) i8254(100) RTC(0)
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.HPET.quality: 550
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.flags: 7
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.LAPIC.quality: 600
kern.eventtimer.et.LAPIC.frequency: 1993570808
kern.eventtimer.et.LAPIC.flags: 7
[0]# 

[0]# sysctl kern.eventtimer.periodic=1 
[0]# 
This does not improve the situation. Again, additionally running iperf between the laptop and the 3rd machine still improves it.

[0]# grep irq /var/run/dmesg.boot
ioapic0 <Version 2.0> irqs 0-23 on motherboard
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
vgapci0: <VGA-compatible display> port 0x5000-0x507f mem 0xd2000000-0xd2ffffff,0xc0000000-0xcfffffff,0xd0000000-0xd1ffffff irq 16 at device 0.0 on pci1
hdac0: <NVIDIA GT108 HDA Controller> mem 0xd3000000-0xd3003fff irq 17 at device 0.1 on pci1
em0: <Intel(R) PRO/1000 Network Connection> port 0x6040-0x605f mem 0xd5300000-0xd531ffff,0xd532b000-0xd532bfff irq 20 at device 25.0 on pci0
ehci0: <Intel Cougar Point USB 2.0 controller> mem 0xd532a000-0xd532a3ff irq 16 at device 26.0 on pci0
hdac1: <Intel Cougar Point HDA Controller> mem 0xd5320000-0xd5323fff irq 22 at device 27.0 on pci0
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
iwn0: <Intel Centrino Ultimate-N 6300> mem 0xd5200000-0xd5201fff irq 17 at device 0.0 on pci3
pcib4: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
sdhci_pci0: <RICOH R5CE823 SD> mem 0xd4201000-0xd42010ff irq 16 at device 0.0 on pci4
fwohci0: <1394 Open Host Controller Interface> mem 0xd4200000-0xd42007ff irq 19 at device 0.3 on pci4
pcib6: <ACPI PCI-PCI bridge> irq 18 at device 28.6 on pci0
xhci0: <NEC uPD720200 USB 3.0 controller> mem 0xd4100000-0xd4101fff irq 18 at device 0.0 on pci5
ehci1: <Intel Cougar Point USB 2.0 controller> mem 0xd5329000-0xd53293ff irq 23 at device 29.0 on pci0
ahci0: <Intel Cougar Point AHCI SATA controller> port 0x6068-0x606f,0x607c-0x607f,0x6060-0x6067,0x6078-0x607b,0x6020-0x603f mem 0xd5328000-0xd53287ff irq 19 at device 31.2 on pci0
ichsmb0: <Intel Cougar Point SMBus controller> port 0xefa0-0xefbf mem 0xd5324000-0xd53240ff irq 18 at device 31.3 on pci0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
[0]# 

[0]# cd /z/backup/rsync/mizar/ROOT/.zfs/snapshot/backup.2018-12-24.21:03:49
[0]# egrep -i 'FreeBSD|irq' var/run/dmesg.boot                             
Copyright (c) 1992-2018 The FreeBSD Project.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.2-RELEASE-p5 #4 r341248M: Thu Nov 29 19:30:14 CET 2018
    root@v904.xyzzy:/.../hal/z/OBJ/FreeBSD/amd64/releng/11.2/sys/XYZZY_SMP amd64
FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
ioapic0 <Version 2.0> irqs 0-23 on motherboard
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
vgapci0: <VGA-compatible display> port 0x5000-0x507f mem 0xd2000000-0xd2ffffff,0xc0000000-0xcfffffff,0xd0000000-0xd1ffffff irq 16 at device 0.0 on pci1
hdac0: <NVIDIA GT108 HDA Controller> mem 0xd3000000-0xd3003fff irq 17 at device 0.1 on pci1
em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0x6040-0x605f mem 0xd5300000-0xd531ffff,0xd532b000-0xd532bfff irq 20 at device 25.0 on pci0
ehci0: <Intel Cougar Point USB 2.0 controller> mem 0xd532a000-0xd532a3ff irq 16 at device 26.0 on pci0
hdac1: <Intel Cougar Point HDA Controller> mem 0xd5320000-0xd5323fff irq 22 at device 27.0 on pci0
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
iwn0: <Intel Centrino Ultimate-N 6300> mem 0xd5200000-0xd5201fff irq 17 at device 0.0 on pci3
pcib4: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
sdhci_pci0: <RICOH R5CE823 SD> mem 0xd4201000-0xd42010ff irq 16 at device 0.0 on pci4
fwohci0: <1394 Open Host Controller Interface> mem 0xd4200000-0xd42007ff irq 19 at device 0.3 on pci4
pcib6: <ACPI PCI-PCI bridge> irq 18 at device 28.6 on pci0
xhci0: <NEC uPD720200 USB 3.0 controller> mem 0xd4100000-0xd4101fff irq 18 at device 0.0 on pci5
ehci1: <Intel Cougar Point USB 2.0 controller> mem 0xd5329000-0xd53293ff irq 23 at device 29.0 on pci0
ahci0: <Intel Cougar Point AHCI SATA controller> port 0x6068-0x606f,0x607c-0x607f,0x6060-0x6067,0x6078-0x607b,0x6020-0x603f mem 0xd5328000-0xd53287ff irq 19 at device 31.2 on pci0
ichsmb0: <Intel Cougar Point SMBus controller> port 0xefa0-0xefbf mem 0xd5324000-0xd53240ff irq 18 at device 31.3 on pci0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
fuse-freebsd: version 0.4.4, FUSE ABI 7.8
[0]#
Comment 7 Eugene Grosbein freebsd_committer 2019-01-18 10:39:56 UTC
Routing of interrupts not changed between 11.2 and 12.0.

Post output of grep em0 /z/backup/rsync/mizar/ROOT/.zfs/snapshot/backup.2018-12-24.21:03:49/var/run/dmesg.boot

Please restart ntpd with drift file, wait an hour then post its contents.

Can you reboot the system and drop to loader prompt at boot time? If yes, do at loader prompt:

set hw.pci.enable_msi=0
boot

Then repeat your tests to see if it gets better with MSI disabled.
Comment 8 Martin Birgmeier 2019-01-18 11:16:05 UTC
[0]# egrep 'FreeBSD|em0' var/run/dmesg.boot
Copyright (c) 1992-2018 The FreeBSD Project.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.2-RELEASE-p5 #4 r341248M: Thu Nov 29 19:30:14 CET 2018
    root@v904.xyzzy:/.../hal/z/OBJ/FreeBSD/amd64/releng/11.2/sys/XYZZY_SMP amd64
FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0x6040-0x605f mem 0xd5300000-0xd531ffff,0xd532b000-0xd532bfff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: Ethernet address: f0:de:f1:98:86:a9
ahciem0: <AHCI enclosure management bridge> on ahci0
pass2 at ahciem0 bus 0 scbus4 target 0 lun 0
em0: link state changed to UP
[0]# 

Rebooting the system takes a little longer... ntp in progress
Comment 9 Martin Birgmeier 2019-01-18 16:14:24 UTC
I added

    hw.pci.enable_msi=0

to /boot/loader.conf, which was a bad idea, because with this, after loading the kernel, the system does not find ada0 anymore. (I recovered using a pxe netboot.)

Regarding ntpd.drift, after adding "driftfile /var/db/ntpd.drift" to /etc/ntp.conf ntpd only complains about "format error frequency file /var/db/ntpd.drift" in the syslog (I put a zero-sized file with owner/group ntpd there). I probably have to wait longer for it to write anything to it.

-- Martin
Comment 10 Eugene Grosbein freebsd_committer 2019-01-18 17:34:54 UTC
(In reply to Martin Birgmeier from comment #9)

>I added hw.pci.enable_msi=0 to /boot/loader.conf, which was a bad idea, because with this, after loading the kernel, the system does not find ada0 anymore.

That's why I suggested to use loader prompt. Next time, if you need to test something like this and cannot use loader prompt directly, use nextboot(8) feature that guarantees safe fallback to known working configuration even if boot is broken during experiment: you add "hw.pci.enable_msi=0" (or another setting) to /boot/nextboot.conf that is created with nextboot(8) command and loader uses nextboot.conf for one time only ignoring it for successive loading attempts and nextboot.conf is removed completely during first successfull boot.

You can even change root file system name, kernel name or set of loaded kernel modules using nextboot safely.
Comment 11 Eugene Grosbein freebsd_committer 2019-01-18 17:38:07 UTC
Do you have any non-default settings in your /boot/loader.conf or /etc/sysctl.conf ?
Comment 12 Martin Birgmeier 2019-01-18 17:42:24 UTC
Created attachment 201243 [details]
loader.conf
Comment 13 Martin Birgmeier 2019-01-18 17:42:40 UTC
Yes; I just thought it was going to help and wanted to make it permanent right away. Bad idea.

In the meantime:

[0]# cat /var/db/ntpd.drift
-6.596
[0]# 

What can you get from the ntp drift?

Is there a way to selectively disable MSI for em0 only?

-- Martin
Comment 14 Martin Birgmeier 2019-01-18 17:47:23 UTC
Created attachment 201244 [details]
sysctl.conf

This might be more interesting (kern.timecounter.hardware setting).
Comment 15 Eugene Grosbein freebsd_committer 2019-01-19 01:13:31 UTC
(In reply to Martin Birgmeier from comment #13)

> What can you get from the ntp drift?

Now we see this machine has good enough hardware timer to rule out problems linked with bad hardware timer.

> Is there a way to selectively disable MSI for em0 only?

No, driver does not support that.

However, you have cpufreq enabled so the box may run with lower CPU frequency unless there is something like iperf running making extra load. Can you temporary configure system to run at max all the time or just unload cpufreq and repeat your tests?
Comment 16 Martin Birgmeier 2019-01-20 08:22:31 UTC
In the tcpdump log shown in comment #4 it seems that there is always a lot of traffic in one direction followed by a lot of ACKs in the other (if I interpret it correctly).

Could it be that something is buffering too many packets, thereby destroying NFS "streaming"?

What about this strange "em0: TX(0) desc avail = 1024, pidx = 0" error in comment #2 when I tried to set -rxcsum? Does it mean that there are 1024 buffers available?

-- Martin
Comment 17 Martin Birgmeier 2019-01-20 09:04:18 UTC
Because bug #234550 might be related to this one, I tried to issue "ifconfig -lro em0". This resulted in the interface not working anymore, with the already known syslog messages:

Jan 20 09:29:18 mizar kernel: em0: link state changed to DOWN
Jan 20 09:29:20 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 20 09:34:36 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 20 09:34:42 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 20 09:34:51 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0
Jan 20 09:34:54 mizar kernel: em0: TX(0) desc avail = 1024, pidx = 0

None of the following commands (from the shell history) helped:

 4981  2019-01-20 09:29  0:01  ifconfig em0 -lro
 4982  2019-01-20 09:29  0:00  ifconfig em0 up
 4985  2019-01-20 09:30  0:00  ifconfig em0
 4987  2019-01-20 09:30  0:03  ifconfig em0 lro up
 4989  2019-01-20 09:31  0:00  ifconfig em0
 4990  2019-01-20 09:32  0:00  ifconfig em0 down
 4991  2019-01-20 09:32  0:02  ifconfig em0 up
 4999  2019-01-20 09:40  0:00  ifconfig em0

Then I tried adding "-lro" in rc.conf:

ifconfig_em0="-lro inet $hostname"

This resulted in a rather long wait for the interface to be available to some rc processes. When the machine was finally up, NFS to it did not work at all, it just hung.

I removed the -lro flag again.

What is the issue that using ifconfig on this interface breaks it?

-- Martin
Comment 18 Martin Birgmeier 2019-01-20 09:08:27 UTC
Another candidate for being related is bug #234570.
Comment 19 Martin Birgmeier 2019-01-21 18:14:57 UTC
I have installed net/intel-em-kmod and switched the kernel module from if_em to if_em_updated which is installed by that package. This results in the interface running at full speed again.

However, looking at the sources shipped with this port, they are dated 27 Sep 2017.

So it would be nice if the if_em shipped with the FreeBSD sources could be corrected.

-- Martin
Comment 20 Martin Birgmeier 2019-01-21 19:43:54 UTC
With the switch to the "if_em_updated" kernel module from net/intel-em-kmod, the issue described in bug #234520 is gone.
Comment 21 Martin Birgmeier 2019-01-27 17:00:56 UTC
Hi,

Is there anybody out there who might want to take on this issue? It is a regression from FreeBSD 11, so most likely should be fixable.

-- Martin
Comment 22 Rodney W. Grimes freebsd_committer 2019-01-27 19:12:54 UTC
(In reply to Martin Birgmeier from comment #21)
I wont take the bug as an assignee, but shall speak to what is in the works with respect to em(4) and other iflib'ed devices in the 12.0 release.

Release engineering has become aware of several regressions in drivers that are first released in 12.0 using the new iflib design.  There are several commits to ^head, some of which have been merged to stable/12 that address some of these issues.  There are several more issues being worked on as well.

There has been at least 2 formal requests for 2 different commits related to em(4) regressions to have errata issued.

Release engineering is waiting on the first group of fixes to be MFC'ed after  testing and plans to work with the security team (so@) to issue an EN for these fixes.

The additional fixes that are not in the first EN well be done in a future EN.

Regards,
Rod >RE@

PS  I do not know if this set of changes fixes your specific bug, but it does have a high probability of that.
Comment 23 Martin Birgmeier 2019-01-28 17:36:21 UTC
Hi Rodney,

Thank you for your feedback, I very much appreciate it.

Best regards,

Martin
Comment 24 Martin Birgmeier 2019-02-01 18:06:34 UTC
I have observed that using le0 in a VirtualBox virtual machine also has very poor NFS performance. Is le0 affected by the same issues?
Comment 25 Martin Birgmeier 2019-02-01 18:11:08 UTC
Just for the record, the le0 (le(4)) issue is with the client running head as of Dec. 8:

Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #0 r341726M: Sat Dec  8 19:24:08 CET 2018
    root@v903.xyzzy:/.../hal/z/OBJ/FreeBSD/amd64/head/amd64.amd64/sys/XYZZY_SMP_NOSPIN amd64
FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)


The exactly same client runs fine under Hyper-V using de0 (de(4)).
Comment 26 Eugene Grosbein freebsd_committer 2019-02-02 14:56:48 UTC
I've found that NFS client of FreeBSD 11.2-STABLE has bad NFS performance with FreeBSD NFS servers and distinct real hardware, so the problem may be more complex.

As workaround, use set of mount options at client side: rsize=10240,wsize=10240,readahead=4

This improves performance in many orders.
Comment 27 Rick Macklem freebsd_committer 2019-02-02 23:42:25 UTC
A little off topic, but rsize,wsize will be set to a power of 2 at or
below the argument. As such, rsize=10240,wsize=10240 will result in
a setting of 8K (or less if the server forces it) for both of them.
You can use "nfsstat -m" on the client to see what is actually being used.

I can't help much, but I would suggest trying increasing powers of 2 for
rsize,wsize until performance tanks.

A couple of reasons smaller rsize,wsize might perform better.
- TSO is broken. This can be tested by using the sysctl to disable TSO.
  (I suggest the generic sysctl and not a driver specific ifconfig, in
   case the driver config flag is broken.)
- Small send or receive rings that can't handle the burst of packets/mbufs
  that the larger I/O size causes.
  A write request/receive reply will typically be a chain of size/2K + 2 mbufs
  for the sender. (Sometimes +3 depending on how the driver handles the MAC
  header.)
  The receiver will typically see this as a burst of roughly size/1500 + 1 packets
  making up one TCP segment.
If jumbo packets are in use, all bets are off for the receive side, since
at least the >4K ones can fragment the mbuf cluster pool.
- Interrupt throttling. Since most NFS RPCs are small messages requiring
  a small reply message in the opposite direction, anything that increases
  latency (such as delaying interrupts assuming more packets of a stream
  will arrive soon) can hit NFS performance.
Comment 28 Eugene Grosbein freebsd_committer 2019-02-03 04:54:00 UTC
(In reply to Rick Macklem from comment #27)

Hardware-assisted TSO itself may be not useful in case of 1Gbps or less linerate for multiple reasons: overwhelming horsepower of modern (and even pretty old modern) amd64 CPUs, NFS mounts over UDP, extensive use of ipfw nat/libalias for routed jails/guest virtual machines as libalias is not compartible with TSO etc. So when not dealing with 40Gbps or faster hardware, I choose to disable TSO per-interface and not fight against corresponding bugs.

As for NIC rings, even pretty old Intel Gigabit have tunnables to setup big enough rings to not bother of this issue.

Again, even old Intel Gigabit NICs have multiple interrupt tunnables to setup right timings for local traffic profile.

As for testing different client mount option sets, I'll try it later and respond.
Comment 29 Martin Birgmeier 2019-04-28 15:42:52 UTC
I have just been reading https://lists.freebsd.org/pipermail/freebsd-stable/2019-April/090829.html which seems to be the same issue I am describing here.

To summarize:
- em(4) has very poor performance in 12.0
- I had to switch to net/intel-em-kmod to get back a decent performance
- net/intel-em-kmod is old, therefore I would like to see the real issue being fixed

Kris: I added you to this bug's CC list - hope you don't mind.

To summarize even further :-)
- Could you please fix this regression

-- Martin
Comment 30 Martin Birgmeier 2019-04-28 15:45:06 UTC
Kris, one more thing: To get the driver from ports loading, you must remove if_em_load="YES" from /boot/loader.conf and add if_em_updated_load="YES"

I am quite sure this is the reason you do not get good performance after installing the driver from ports.

-- Martin
Comment 31 Martin Birgmeier 2019-04-28 16:14:04 UTC
I guess bug #219428 is related?

Also https://lists.freebsd.org/pipermail/freebsd-stable/2019-April/090927.html ?

-- Martin
Comment 32 commit-hook freebsd_committer 2019-05-07 08:29:29 UTC
A commit references this bug:

Author: marius
Date: Tue May  7 08:28:35 UTC 2019
New revision: 347221
URL: https://svnweb.freebsd.org/changeset/base/347221

Log:
  o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
    MSI. Unlike as with iflib_fast_intr_ctx(), the former will also enqueue
    _task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
    TX throughput of EM-class devices on par with the MSI-X case and, thus,
    close to wirespeed/pre-iflib(4) times again. [1]
    Note that independently of the interrupt type, the UDP performance with
    these MACs still is abysmal and nowhere near to where it was before the
    conversion of em(4) to iflib(4).
  o In iflib_init_locked(), announce which free list failed to set up.
  o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
    of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
    as the latter is valid with MSI-X only.
  o Instead of adding the missing - and apparently convoluted enough that a
    DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
    ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
    iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
    make the checks fail gracefully rather than panic. This avoids invoking
    the checks at runtime over and over again in iflib_fast_intr_rxtx() and
    _task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
    these functions more readable.
  o In iflib_rx_structures_setup(), only initialize LRO resources if device
    and driver have LRO capability in order to not waste memory. Also, free
    the LRO resources again if setting them up fails for one of the queues.
    However, don't bother invoking iflib_rx_sds_free() in that case because
    iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
    iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
    of failure via iflib_rx_structures_free(), but there definitely is some
    asymmetry left to be fixed, though).
  o Similarly, free LRO resources again in iflib_rx_structures_free().
  o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
    instead of panicing (but only in case of INVARIANTS). This is a follow-
    up to r344132, as such driver bugs shouldn't be fatal.
  o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
    gracefully, too.
  o Bring yet more sanity to iflib_msix_init():
    - If the device doesn't provide enough MSI-X vectors or not all vectors
      can be allocate so the expected number of queues in addition to admin
      interrupts can't be supported, try MSI next (and then INTx) as proper
      MSI-X vector distribution can't be assured in such cases. In essence,
      this change brings r254008 forward to iflib(4). Also, this is the fix
      alluded to in the commit message of r343934.
    - If the MSI-X allocation has failed, don't prematurely announce MSI is
      going to be used as the latter in fact may not be available either.
    - When falling back to MSI, only release the MSI-X table resource again
      if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
      driver, in the first place.
  o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.

  PR:		235031 (likely) [1]
  Reviewed by:	shurd
  Differential Revision:	https://reviews.freebsd.org/D20175

Changes:
  head/sys/net/iflib.c
Comment 33 Martin Birgmeier 2019-05-07 17:09:45 UTC
Thank you for this interesting commit note.

Seeing that "the UDP performance with these MACs still is abysmal", should I already backport these changes to 12.0 or will they not help in my case of using a lot of NFS?

Should I expect further changes which will also address the UDP performance?

-- Martin
Comment 34 commit-hook freebsd_committer 2019-06-16 15:26:36 UTC
A commit references this bug:

Author: marius
Date: Sun Jun 16 15:25:46 UTC 2019
New revision: 349112
URL: https://svnweb.freebsd.org/changeset/base/349112

Log:
  MFC: r347221, r347245

  o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
    MSI. Unlike as with iflib_fast_intr_ctx(), the former will also enqueue
    _task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
    TX throughput of EM-class devices on par with the MSI-X case and, thus,
    close to wirespeed/pre-iflib(4) times again. [1]
    Note that independently of the interrupt type, the UDP performance with
    these MACs still is abysmal and nowhere near to where it was before the
    conversion of em(4) to iflib(4).
  o In iflib_init_locked(), announce which free list failed to set up.
  o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
    of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
    as the latter is valid with MSI-X only.
  o Instead of adding the missing - and apparently convoluted enough that a
    DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
    ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
    iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
    make the checks fail gracefully rather than panic. This avoids invoking
    the checks at runtime over and over again in iflib_fast_intr_rxtx() and
    _task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
    these functions more readable.
  o In iflib_rx_structures_setup(), only initialize LRO resources if device
    and driver have LRO capability in order to not waste memory. Also, free
    the LRO resources again if setting them up fails for one of the queues.
    However, don't bother invoking iflib_rx_sds_free() in that case because
    iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
    iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
    of failure via iflib_rx_structures_free(), but there definitely is some
    asymmetry left to be fixed, though).
  o Similarly, free LRO resources again in iflib_rx_structures_free().
  o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
    instead of panicing (but only in case of INVARIANTS). This is a follow-
    up to r344132 (MFCed to stable/12 in r344163) as such bugs shouldn't be
    fatal.
  o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
    gracefully, too.
  o Bring yet more sanity to iflib_msix_init():
    - If the device doesn't provide enough MSI-X vectors or not all vectors
      can be allocate so the expected number of queues in addition to admin
      interrupts can't be supported, try MSI next (and then INTx) as proper
      MSI-X vector distribution can't be assured in such cases. In essence,
      this change brings r254008 forward to iflib(4). Also, this is the fix
      alluded to in the commit message of r343934.
    - If the MSI-X allocation has failed, don't prematurely announce MSI is
      going to be used as the latter in fact may not be available either.
    - When falling back to MSI, only release the MSI-X table resource again
      if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
      driver, in the first place.
  o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.

  PR:		235031 (likely) [1]
  Reviewed by:	shurd
  Differential Revision:	https://reviews.freebsd.org/D20175

Changes:
_U  stable/12/
  stable/12/sys/net/iflib.c
Comment 35 Martin Birgmeier 2019-08-15 12:32:02 UTC
Update: Today I tried to revert to if_em from releng/12.0 updated to latest. Result: The performance is still abysmal.

I'll switch to net/intel-em-kmod again.

This is with FreeBSD mizar.xyzzy 12.0-RELEASE-p9 FreeBSD 12.0-RELEASE-p9 #9 r350677M: Wed Aug  7 18:35:55 CEST 2019     root@v908.xyzzy:/.../hal/z/OBJ/FreeBSD/amd64/releng/12.0/amd64.amd64/sys/XYZZY_SMP  amd64.
Comment 36 Martin Birgmeier 2019-08-15 12:34:21 UTC
I just notice that the console and syslog have about 20 messages of

em: frame error: ignored
em: frame error: ignored
em: frame error: ignored
em: frame error: ignored
em: frame error: ignored

Uptime is 2 1/2 hours.
Comment 37 Martin Birgmeier 2019-08-31 13:30:34 UTC
(Adding Bruce's mail here:)

On 31.08.19 13:44, Bruce Evans wrote:
> On Thu, 15 Aug 2019 a bug that doesn't want replies at freebsd.org wrote:
>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235031
>>
>> --- Comment #36 from Martin Birgmeier <d8zNeCFG@aon.at> ---
>> I just notice that the console and syslog have about 20 messages of
>>
>> em: frame error: ignored
>> em: frame error: ignored
>> em: frame error: ignored
>> em: frame error: ignored
>> em: frame error: ignored
>>
>> Uptime is 2 1/2 hours. 
>
> You seem to be using my old patch which is not in -current:
>
> Index: em_txrx.c
> XX ===================================================================
> XX --- em_txrx.c    (revision 348771)
> XX +++ em_txrx.c    (working copy)
> XX @@ -629,9 +629,20 @@
> XX XX          /* Make sure bad packets are discarded */
> XX          if (errors & E1000_RXD_ERR_FRAME_ERR_MASK) {
> XX +#if 0
> XX              adapter->dropped_pkts++;
> XX -            /* XXX fixup if common */
> XX              return (EBADMSG);
> XX +#else
> XX +            /*
> XX +             * XXX the above error handling is worse than none.
> XX +             * First it it drops 'i' packets before the current
> XX +             * one and doesn't count them.  Then it returns an
> XX +             * error.  iflib can't really handle this error.
> XX +             * It just resets, and this usually drops many more
> XX +             * packets (without counting them) and much time.
> XX +             */
> XX +            printf("lem: frame error: ignored\n");
> XX +#endif
> XX          }
> XX XX          ri->iri_frags[i].irf_flid = 0;
> XX @@ -692,8 +703,12 @@
> XX XX          /* Make sure bad packets are discarded */
> XX          if (staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) {
> XX +#if 0
> XX              adapter->dropped_pkts++;
> XX              return EBADMSG;
> XX +#else
> XX +            printf("em: frame error: ignored\n");
> XX +#endif
> XX          }
> XX XX          ri->iri_frags[i].irf_flid = 0;
>
> Without this patch, no message is printed and the device takes a long
> time to recover (when I wrote the patch, recovery was from something
> like a watchdog timeout after many seconds).  With the patch, the recovery
> is good enough for nfs over udp to not lose any data or time out, but I
> don't trust this so I print the message.
>
> Pre-iflib versions of [l]em handled this correctly by dropping a single
> packet, which was easy to do.  Unpatched iflib makes a mess by returning
> with subsequent packets unprocessed.  It apparently just stops receiving
> until kicked by a watchdog.
>
> I don't know what causes this error.  Maybe just a bad cable or switch.
> I don't see it for I218V with the same cable and switch.
>
> Bruce
Comment 38 Martin Birgmeier 2019-08-31 13:35:21 UTC
Your are right, I forgot to remove this patch when updating to the latest 12.0 patchlevel. But I guess that should not adversely affect the driver.

Anyway, with the patch performance was just good enough to let some NFS through.

However, this certainly is no hardware problem because when using the driver from ports (net/intel-em-kmod) performance is perfect.

Something is still very broken with the conversion of this driver to iflib.

I just noticed that comment #34 just shows a merge to stable/12, but I am running releng/12.0. So I should probably merge this stuff and see how that works.

-- Martin