Bug 206932

Summary: Realtek 8111 card stops responding under high load in netmap mode
Product: Base System
Reporter: Olivier - interfaSys sàrl <software-freebsd>
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: Closed Overcome By Events
Severity: Affects Only Me
CC: hippi-viking, vmaffione
Priority: ---
Version: CURRENT
Hardware: Any
OS: Any

Description Olivier - interfaSys sàrl 2016-02-04 21:53:27 UTC
I've filed a bug report with the netmap project, but it seems the FreeBSD project is using a different tree, so I'm reporting it here as well.
I've reproduced the problem with:
* 10.1
* 10.2
* 10.2 with the netmap + re code from 11-CURRENT
* 10.2 with netmap from the official repository (master)

The problem is always the same:

Using pkt-gen, after 20 or so "batches" the card becomes overloaded and stops responding. I've tried various driver settings (polling, fast queue, no MSI, IRQ filtering, etc.), but nothing helped.
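
For reference, a basic pkt-gen transmit run looks something like the line below; the exact invocation used in these tests isn't recorded here, so treat the flags as illustrative:

```
# pkt-gen -i netmap:re0 -f tx -l 60 -R 100000
```

(-f tx selects transmit mode, -l sets the packet length in bytes, and -R caps the rate in packets per second.)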

There is a driver from Realtek, but it doesn't support netmap, so I've tried to patch it; I got exactly the same results as described in other netmap issues. Only one batch makes it through. If I limit the rate, it fails once the cumulative total of the smaller batches matches that of a default batch.

One thing I've noticed in my tests is that the generic software implementation (which works flawlessly, but eats a lot of CPU) has 1024 slots per queue, and when looking at the number of mbufs used with netstat, I can see that 1024 are in use.
In dmesg, I can see that the Realtek driver supports 256 slots per queue, but according to netstat it uses 512 and sometimes even more (erratic changes up to 600+, at which point things fail).
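
For context, those counts come from the standard tools:

```
# netstat -m        # mbuf and cluster usage
# dmesg | grep re0  # driver probe output, including the netmap queues/slots line
```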

Could this be the reason? Is this fixable in netmap, or is it a driver issue that should be reported to the FreeBSD project?

Details about the card:

```
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xe000-0xe0ff mem 0x81300000-0x81300fff,0xa0100000-0xa0103fff irq 17 at device 0.0 on pci2
re0: Using 1 MSI-X message
re0: turning off MSI enable bit.
re0: Chip rev. 0x4c000000
re0: MAC rev. 0x00000000
miibus0: <MII bus> on re0
rgephy0: <RTL8251 1000BASE-T media interface> PHY 1 on miibus0
rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re0: Using defaults for TSO: 65518/35/2048
re0: netmap queues/slots: TX 1/256, RX 1/256

re0@pci0:2:0:0: class=0x020000 card=0x012310ec chip=0x816810ec rev=0x0c hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
    class      = network
    subclass   = ethernet
    bar   [10] = type I/O Port, range 32, base 0xe000, size 256, enabled
    bar   [18] = type Memory, range 64, base 0x81300000, size 4096, enabled
    bar   [20] = type Prefetchable Memory, range 64, base 0xa0100000, size 16384, enabled
```
Comment 1 Olivier - interfaSys sàrl 2016-02-05 02:09:19 UTC
I've just tested on 11-CURRENT and got the same results.
Comment 2 Olivier - interfaSys sàrl 2016-02-05 15:38:00 UTC
With re0 set to use an MTU of 9000, the connection stays alive. Instead of timing out, the packet rate drops drastically once and then things go back to normal.
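
For reference, the workaround amounts to the standard ifconfig invocation:

```
# ifconfig re0 mtu 9000
```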
The main difference in netstat is that the mbuf clusters are split between standard and jumbo frames:

```
768/2787/3555 mbufs in use (current/cache/total)
256/1524/1780/500200 mbuf clusters in use (current/cache/total/max)
256/1515 mbuf+clusters out of packet secondary zone in use (current/cache)
0/46/46/250099 4k (page size) jumbo clusters in use (current/cache/total/max)
256/65/321/74103 9k jumbo clusters in use (current/cache/total/max)
0/0/0/41683 16k jumbo clusters in use (current/cache/total/max)
3008K/4513K/7521K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
```

The re0 interrupt rate in vmstat -i keeps rising, but that doesn't seem to be a problem:

```
interrupt                          total       rate
irq16: sdhci_pci0                      1          0
cpu0:timer                       3008083       1113
irq256: ahci0                      10125          3
irq257: xhci0                      11363          4
irq258: hdac0                          3          0
irq259: re0                     13105929       4850
irq260: re1                       101440         37
cpu2:timer                       1095578        405
cpu1:timer                       1083354        400
cpu3:timer                       1123144        415
Total                           19539020       7231
```
Comment 3 Olivier - interfaSys sàrl 2016-02-05 16:52:56 UTC
I think this is what gets logged when things start failing:
```
231.020147 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff80052005400
235.997171 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff80023ec9c00
240.989245 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff800521ad500
247.887586 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff80023da9b00
253.069781 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff80023ec7700
258.110746 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff800521ade00
263.188076 [2925] netmap_transmit           re0 full hwcur 0 hwtail 0 qlen 255 len 42 m 0xfffff800237d6900
```
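
A note on reading this log: "qlen 255" on a 256-slot ring means the ring is completely full, since in classic circular-ring accounting one slot is kept empty so that a full ring can be distinguished from an empty one. A generic sketch of that bookkeeping (an illustration only, not netmap's actual code):

```
/*
 * Generic circular-ring sketch: with NUM_SLOTS slots, at most
 * NUM_SLOTS - 1 can be occupied, which is why the queue length
 * in the log above tops out at 255.
 */
#include <stdio.h>

#define NUM_SLOTS 256

/* occupied slots between the consumer (head) and the producer (tail) */
static unsigned
ring_len(unsigned head, unsigned tail)
{
	return (tail + NUM_SLOTS - head) % NUM_SLOTS;
}

int
main(void)
{
	unsigned head = 0, tail = NUM_SLOTS - 1;

	/* advancing tail once more would make the ring look empty again */
	printf("qlen %u -> full\n", ring_len(head, tail));
	return (0);
}
```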
Comment 4 Olivier - interfaSys sàrl 2016-05-24 09:43:14 UTC
Using fresh netmap from FreeBSD 11 and a newer pkt-gen, this is what I see.

```
986.519903 [2163] netmap_ioctl              nr_cmd must be 0 not 12
047.486179 [1481] nm_txsync_prologue        fail head < kring->rhead || head > kring->rtail
047.510386 [1511] nm_txsync_prologue        re0 TX0 kring error: head 107 cur 107 tail 106 rhead 52 rcur 52 rtail 106 hwcur 52 hwtail 106
047.534818 [1612] netmap_ring_reinit        called for re0 TX0
051.945718 [1481] nm_txsync_prologue        fail head < kring->rhead || head > kring->rtail
051.990215 [1511] nm_txsync_prologue        re0 TX0 kring error: head 225 cur 225 tail 224 rhead 223 rcur 223 rtail 224 hwcur 223 hwtail 224
052.009143 [1612] netmap_ring_reinit        called for re0 TX0
```

At this point pkt-gen exits with an error.
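
For context, the "kring error" lines come from a sanity check in netmap's TX path: the head index the application hands back must lie inside the circular interval [rhead, rtail] that netmap last exposed to it (the condition quoted in the log, head < kring->rhead || head > kring->rtail, is the non-wrapped case). A simplified sketch of that validation, with the wrapped branch added as my own generalization and indices assumed already reduced modulo the ring size:

```
#include <stdbool.h>
#include <stdio.h>

/* Is head inside the circular interval [rhead, rtail]? */
static bool
head_is_valid(unsigned head, unsigned rhead, unsigned rtail)
{
	if (rhead <= rtail)	/* interval does not wrap around */
		return (head >= rhead && head <= rtail);
	/* wrapped: valid region is [rhead, num_slots) plus [0, rtail] */
	return (head >= rhead || head <= rtail);
}

int
main(void)
{
	/* values from the first kring error above */
	printf("head 107 in [52, 106]: %d\n", head_is_valid(107, 52, 106));
	return (0);
}
```

In that first error line, rhead is 52 and rtail is 106, so head 107 falls just outside the valid interval and netmap reinitializes the ring.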

I've also tried using the netmap software emulation and it crashes even earlier.
Comment 5 Vincenzo Maffione 2019-01-11 09:46:32 UTC
The current netmap code in HEAD, stable/11 and stable/12 is aligned with the GitHub repository (and the code has changed quite a lot since 2016).
I just tried to run pkt-gen (tx or rx) in a VM with an emulated r8169 NIC, and everything seems to work fine for me.

Can you check if the issue is still there?
Comment 6 Istvan Sipos-Szabo 2019-03-20 10:30:02 UTC
(In reply to Vincenzo Maffione from comment #5)
I can still see this issue under load on FreeBSD 11.2-RELEASE-p9-HBSD with an integrated Realtek RTL8111/8168/8411 chip:

```
re0@pci0:1:0:0:	class=0x020000 card=0x11c01734 chip=0x816810ec rev=0x06 hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
```

```
351.711973 [1613] netmap_ring_reinit        called for re0 RX0
002.919057 [1613] netmap_ring_reinit        called for re0 RX0
908.475302 [1512] nm_txsync_prologue        re0 TX1 kring error: head 765 cur 766 tail 765 rhead 766 rcur 766 rtail 765 hwcur 766 hwtail 765
908.488304 [1613] netmap_ring_reinit        called for re0 TX1
177.477234 [1613] netmap_ring_reinit        called for re0 RX0
603.209901 [1613] netmap_ring_reinit        called for re0 RX1
873.183945 [1512] nm_txsync_prologue        re0 TX1 kring error: head 5 cur 6 tail 5 rhead 6 rcur 6 rtail 5 hwcur 6 hwtail 5
873.195708 [1613] netmap_ring_reinit        called for re0 TX1
912.229349 [1613] netmap_ring_reinit        called for re0 RX1
921.732103 [1512] nm_txsync_prologue        re0 TX0 kring error: head 148 cur 150 tail 148 rhead 150 rcur 150 rtail 148 hwcur 150 hwtail 148
921.744464 [1613] netmap_ring_reinit        called for re0 TX0
931.657772 [1613] netmap_ring_reinit        called for re0 RX0
931.727492 [1512] nm_txsync_prologue        re0 TX1 kring error: head 734 cur 735 tail 734 rhead 735 rcur 735 rtail 734 hwcur 735 hwtail 734
931.746382 [1613] netmap_ring_reinit        called for re0 TX1
605.399805 [1613] netmap_ring_reinit        called for re0 RX1
043.944163 [1613] netmap_ring_reinit        called for re0 RX0
942.272725 [1512] nm_txsync_prologue        re0 TX1 kring error: head 771 cur 773 tail 772 rhead 773 rcur 773 rtail 772 hwcur 773 hwtail 772
942.286046 [1613] netmap_ring_reinit        called for re0 TX1
597.788061 [1512] nm_txsync_prologue        re0 TX1 kring error: head 330 cur 331 tail 330 rhead 331 rcur 331 rtail 330 hwcur 331 hwtail 330
597.800517 [1613] netmap_ring_reinit        called for re0 TX1
921.844295 [1512] nm_txsync_prologue        re0 TX1 kring error: head 448 cur 449 tail 448 rhead 449 rcur 449 rtail 448 hwcur 449 hwtail 448
921.857727 [1613] netmap_ring_reinit        called for re0 TX1
929.117950 [1613] netmap_ring_reinit        called for re0 RX0
955.503793 [1512] nm_txsync_prologue        re0 TX1 kring error: head 694 cur 695 tail 694 rhead 695 rcur 695 rtail 694 hwcur 695 hwtail 694
955.517123 [1613] netmap_ring_reinit        called for re0 TX1
054.130772 [1613] netmap_ring_reinit        called for re0 RX0
239.912053 [1613] netmap_ring_reinit        called for re0 RX1
426.000205 [1613] netmap_ring_reinit        called for re0 RX0
431.708618 [1512] nm_txsync_prologue        re0 TX0 kring error: head 482 cur 483 tail 481 rhead 483 rcur 483 rtail 481 hwcur 483 hwtail 481
431.721230 [1613] netmap_ring_reinit        called for re0 TX0
466.558021 [1613] netmap_ring_reinit        called for re0 RX0
505.101756 [1613] netmap_ring_reinit        called for re0 RX0
626.222834 [1613] netmap_ring_reinit        called for re0 RX1
745.757168 [1613] netmap_ring_reinit        called for re0 RX1
128.048022 [1512] nm_txsync_prologue        re0 TX0 kring error: head 126 cur 126 tail 125 rhead 127 rcur 127 rtail 125 hwcur 127 hwtail 125
128.060369 [1613] netmap_ring_reinit        called for re0 TX0
301.754435 [1613] netmap_ring_reinit        called for re0 RX1
305.059117 [1613] netmap_ring_reinit        called for re0 RX0
476.838054 [1512] nm_txsync_prologue        re0 TX1 kring error: head 1023 cur 0 tail 1023 rhead 0 rcur 0 rtail 1023 hwcur 0 hwtail 1023
476.850660 [1613] netmap_ring_reinit        called for re0 TX1
810.251369 [1613] netmap_ring_reinit        called for re0 RX0
973.553584 [1613] netmap_ring_reinit        called for re0 RX1
991.498730 [1613] netmap_ring_reinit        called for re0 RX0
016.098769 [1613] netmap_ring_reinit        called for re0 RX1
068.708621 [1613] netmap_ring_reinit        called for re0 RX1
072.092164 [1613] netmap_ring_reinit        called for re0 RX1
128.526783 [1512] nm_txsync_prologue        re0 TX0 kring error: head 706 cur 707 tail 705 rhead 707 rcur 707 rtail 705 hwcur 707 hwtail 705
128.539348 [1613] netmap_ring_reinit        called for re0 TX0
157.709591 [1613] netmap_ring_reinit        called for re0 RX1
167.110939 [1613] netmap_ring_reinit        called for re0 RX1
176.127982 [1512] nm_txsync_prologue        re0 TX0 kring error: head 155 cur 156 tail 154 rhead 156 rcur 156 rtail 154 hwcur 156 hwtail 154
176.140343 [1613] netmap_ring_reinit        called for re0 TX0
197.908583 [1613] netmap_ring_reinit        called for re0 RX1
246.999784 [1613] netmap_ring_reinit        called for re0 RX0
582.108866 [1613] netmap_ring_reinit        called for re0 RX0
648.049984 [1613] netmap_ring_reinit        called for re0 RX1
782.862717 [1512] nm_txsync_prologue        re0 TX1 kring error: head 154 cur 155 tail 154 rhead 155 rcur 155 rtail 154 hwcur 155 hwtail 154
782.875882 [1613] netmap_ring_reinit        called for re0 TX1
790.772680 [1613] netmap_ring_reinit        called for re0 RX0
794.549076 [1613] netmap_ring_reinit        called for re0 RX0
950.451374 [1512] nm_txsync_prologue        re0 TX1 kring error: head 672 cur 673 tail 672 rhead 673 rcur 673 rtail 672 hwcur 673 hwtail 672
950.464259 [1613] netmap_ring_reinit        called for re0 TX1
965.438616 [1613] netmap_ring_reinit        called for re0 RX1
083.794645 [1512] nm_txsync_prologue        re0 TX0 kring error: head 1 cur 2 tail 0 rhead 2 rcur 2 rtail 0 hwcur 2 hwtail 0
083.805552 [1613] netmap_ring_reinit        called for re0 TX0
085.044778 [1613] netmap_ring_reinit        called for re0 RX0
152.580241 [1512] nm_txsync_prologue        re0 TX0 kring error: head 509 cur 511 tail 510 rhead 511 rcur 511 rtail 510 hwcur 511 hwtail 510
152.592439 [1613] netmap_ring_reinit        called for re0 TX0
387.089610 [1613] netmap_ring_reinit        called for re0 RX0
414.618461 [1512] nm_txsync_prologue        re0 TX1 kring error: head 940 cur 941 tail 940 rhead 941 rcur 941 rtail 940 hwcur 941 hwtail 940
414.631376 [1613] netmap_ring_reinit        called for re0 TX1
504.044908 [1613] netmap_ring_reinit        called for re0 RX0
508.430972 [1613] netmap_ring_reinit        called for re0 RX1
572.384770 [1512] nm_txsync_prologue        re0 TX1 kring error: head 769 cur 770 tail 769 rhead 770 rcur 770 rtail 769 hwcur 770 hwtail 769
572.398158 [1613] netmap_ring_reinit        called for re0 TX1
728.112014 [1613] netmap_ring_reinit        called for re0 RX1
```
Comment 7 Vincenzo Maffione 2019-03-20 11:06:21 UTC
(In reply to hippi-viking from comment #6)

Hi,
 Can you please specify the exact command line (e.g. the pkt-gen invocation, if you are using pkt-gen) that causes the issue?

Also, can you build and run this program and report its output? It will just print the number of rings and slots.

```
#include <stdio.h>
#include <stdint.h>
#include <net/if.h>
#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <assert.h>

int main(int argc, char **argv)
{
	const char *ifname;
	struct nm_desc *nmd;
	char name[100];

	assert(argc >= 2);	/* the interface name is expected as argv[1] */
	ifname = argv[1];
	snprintf(name, sizeof(name), "netmap:%s", ifname);
	nmd = nm_open(name, NULL, 0, NULL);	/* open the port in netmap mode */
	assert(nmd);
	/* ring counts come from the netmap_if; slot counts from ring 0 */
	printf("txr %u rxr %u txd %u rxd %u\n",
		nmd->nifp->ni_tx_rings,
		nmd->nifp->ni_rx_rings,
		NETMAP_TXRING(nmd->nifp, 0)->num_slots,
		NETMAP_RXRING(nmd->nifp, 0)->num_slots);
	nm_close(nmd);

	return 0;
}

```
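
Assuming the source is saved as info.c, it should build with the base compiler alone, since the NETMAP_WITH_LIBS helpers in net/netmap_user.h are header-only:

```
# cc -o info info.c
```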

Command line:
  # ./info re0
Comment 8 Istvan Sipos-Szabo 2019-03-20 11:39:53 UTC
(In reply to Vincenzo Maffione from comment #7)

Hello :)

I suspect that, in my case, this issue is triggered by Suricata running in IPS mode. I don't know the exact command line used, but I will look into it - and run your code as well - when I have some more time.

Some background: this is a Fujitsu Futro S900 thin client running OPNsense, which is based on HardenedBSD - close to vanilla FreeBSD as far as this issue is concerned, but(?) supposedly using the latest official Realtek driver; I don't know exactly which re driver FreeBSD itself is using at the moment. The integrated re0 NIC (an RTL8111E according to dmidecode) is connected to the ISP's cable modem over a 100 Mbps full-duplex link. The upstream (ISP) connection is 10/2 Mbps, and I get the kring errors when downloading at full speed.

Thanks for looking into this!
Cheers,
Istvan
Comment 9 Vincenzo Maffione 2019-03-20 14:55:26 UTC
(In reply to hippi-viking from comment #8)

Sounds good. If you provide your Suricata configuration files, I can try to reproduce your issue.
Also, please check the netmap version that ships with your HardenedBSD, as I don't know how closely that is aligned with FreeBSD.

The fact that you are using the official Realtek driver explains why your re0 appears to have 1024 descriptors (RX and TX), while the driver in FreeBSD only allows for 256 (not configurable).
Comment 10 Istvan Sipos-Szabo 2019-03-26 08:33:58 UTC
Hi Vincenzo,

Sorry for the late reply! I was stress-testing the firewall, and so far it seems that the above was just a one-off issue unrelated to load or Suricata. I continued to see the same errors with very little to no load at all, even with Suricata stopped. After a reboot the card seems stable (no errors), even when I load it as much as I can (given the slow upstream cable connection I have).
I think it should be safe to close this issue.

Thank you for your help!
Cheers,
Istvan