Bug 230465

Summary: ixl: not working in netmap mode
Product: Base System Reporter: Charles Goncalves <halfling>
Component: kern Assignee: freebsd-net (Nobody) <net>
Status: In Progress ---    
Severity: Affects Some People CC: 43381155, franco, jeffrey.e.pieper, krzysztof.galazka, m.muenz, m.muenz, net, ozkan.kirik, sergey, shurd, slw, strongswan, v.maffione, vmaffione
Priority: --- Keywords: IntelNetworking
Version: 12.1-STABLE   
Hardware: amd64   
OS: Any   
See Also: https://github.com/OISF/suricata/pull/3616

Description Charles Goncalves 2018-08-08 17:57:27 UTC
Hello!

I have an ixl NIC (chip=0x15728086 'Ethernet Controller X710 for 10GbE SFP+') and I'm trying to use it with netmap.

When compiling the kernel with ixl support, I got undefined-reference errors for 'ixl_rx_miss', 'ixl_rx_miss_bufs' and 'ixl_crcstrip', so I modified ixl_txrx.c and added the definitions like this:

#ifdef DEV_NETMAP                                         
#include <dev/netmap/if_ixl_netmap.h>                     
int ixl_rx_miss = 0, ixl_rx_miss_bufs = 0, ixl_crcstrip = 1;
#endif /* DEV_NETMAP */  

After this change the kernel compiled successfully, and I now see the ixl interfaces in the "ifconfig" output.

Now I'm trying netmap on them, but it does not seem to work. With my application running on top of netmap, "dmesg" shows messages like this:

Aug  8 14:46:56 rt1 kernel: 415.918289 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug  8 14:46:56 rt1 kernel: 415.973692 [1758] netmap_ring_reinit        called for ixl0 TX3
Aug  8 14:46:56 rt1 kernel: 415.990602 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:46:56 rt1 kernel: 416.006198 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207
Aug  8 14:46:56 rt1 kernel: 416.032990 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug  8 14:46:56 rt1 kernel: 416.088614 [1758] netmap_ring_reinit        called for ixl0 TX3
Aug  8 14:46:56 rt1 kernel: 416.105520 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:46:56 rt1 kernel: 416.121113 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207
Aug  8 14:46:57 rt1 kernel: 417.089185 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug  8 14:46:57 rt1 kernel: 417.144605 [1758] netmap_ring_reinit        called for ixl0 TX3
Aug  8 14:46:57 rt1 kernel: 417.161510 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:46:57 rt1 kernel: 417.177110 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207
Aug  8 14:46:58 rt1 kernel: 418.138193 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug  8 14:46:58 rt1 kernel: 418.193599 [1758] netmap_ring_reinit        called for ixl0 TX3
Aug  8 14:46:58 rt1 kernel: 418.210507 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:46:58 rt1 kernel: 418.226096 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207



Using pkt-gen from the netmap GitHub repository, I can receive packets but cannot transmit them:

command: pkt-gen -i ixl0 -f tx

637.872347 main [2593] interface is ixl0
637.872394 main [2727] running on 1 cpus (have 8)
637.872601 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
637.872618 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
638.046374 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
638.046466 main [2919] Sending 512 packets every  0.000000000 s
638.046507 start_threads [2274] Wait 2 secs for phy reset
640.145075 start_threads [2276] Ready...
640.145254 sender_body [1464] start, fd 3 main_fd 3
640.863306 sender_body [1538] poll error on 3 ring 0-7
641.198102 main_thread [2364] 7.780 Kpps (8.191 Kpkts 3.932 Mbps in 1052845 usec) 511.94 avg_batch 0 min_space
641.372908 main_thread [2391] ouch, thread 0 exited with error
Sent 8191 packets 491460 bytes 16 events 60 bytes each in -1533750640.15 seconds.
Speed: -0.000 pps Bandwidth: -0.003 bps (raw -0.004 bps). Average batch: 511.94 pkts



Then "dmesg" shows:
Aug  8 14:51:48 rt1 kernel: 708.870527 [1637] nm_txsync_prologue        ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -278400513 rh 512 rc 512 rt -278400513 hc 512 ht -278400513
Aug  8 14:51:48 rt1 kernel: 708.927494 [1758] netmap_ring_reinit        called for ixl0 TX1
Aug  8 14:51:49 rt1 kernel: 708.944399 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 708.959993 [1787] netmap_ring_reinit        ixl0 TX1 reinit, cur 0 -> 512 tail -278400513 -> -278400513
Aug  8 14:51:49 rt1 kernel: 708.987295 [1637] nm_txsync_prologue        ixl0 TX2: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -440139265 rh 512 rc 512 rt -440139265 hc 512 ht -440139265
Aug  8 14:51:49 rt1 kernel: 709.044489 [1758] netmap_ring_reinit        called for ixl0 TX2
Aug  8 14:51:49 rt1 kernel: 709.061399 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 709.076993 [1787] netmap_ring_reinit        ixl0 TX2 reinit, cur 0 -> 512 tail -440139265 -> -440139265
Aug  8 14:51:49 rt1 kernel: 709.104291 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || 
Aug  8 14:51:49 rt1 kernel: kring->nr_hwtail >= n' h 0 c 0 t 1455426047 rh 512 rc 512 rt 1455426047 hc 512 ht 1455426047
Aug  8 14:51:49 rt1 kernel: 709.161491 [1758] netmap_ring_reinit        called for ixl0 TX3
Aug  8 14:51:49 rt1 kernel: 709.178394 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 709.193987 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 0 -> 512 tail 1455426047 -> 1455426047
Aug  8 14:51:49 rt1 kernel: 709.221304 [1637] nm_txsync_prologue        ixl0 TX4: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t 1457261055 rh 512 rc 512 rt 1457261055 hc 512 ht 1457261055
Aug  8 14:51:49 rt1 kernel: 709.278488 [1758] netmap_ring_reinit        called for ixl0 TX4
Aug  8 14:51:49 rt1 kernel: 709.295391 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 709.310998 [1787] netmap_ring_reinit        ixl0 TX4 reinit, cur 0 -> 512 tail 1457261055 -> 1457261055
Aug  8 14:51:49 rt1 kernel: 709.338286 [1637] nm_txsync_prologue        ixl0 TX5: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t 1744312831 rh 512 rc 512 rt 1744312831 hc 512 ht 1744312831
Aug  8 14:51:49 rt1 kernel: 709.395485 [1758] netmap_ring_reinit        called for ixl0 TX5
Aug  8 14:51:49 rt1 kernel: 709.412388 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 709.427981 [1787] netmap_ring_reinit        ixl0 TX5 reinit, cur 0 -> 512 tail 1744312831 -> 1744312831
Aug  8 14:51:49 rt1 kernel: 709.455284 [1637] nm_txsync_prologue        ixl0 TX6: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1586220545 rh 512 rc 512 rt -1586220545 hc 512 ht -1586220545
Aug  8 14:51:49 rt1 kernel: 709.513263 [1758] netmap_ring_reinit        called for ixl0 TX6
Aug  8 14:51:49 rt1 kernel: 709.530166 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 709.545759 [1787] netmap_ring_reinit        ixl0 TX6 reinit, cur 0 -> 512 tail -1586220545 -> -1586220545
Aug  8 14:51:49 rt1 kernel: 709.573581 [1637] nm_txsync_prologue        ixl0 TX7: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1429192193 rh 512 rc 512 rt -1429192193 hc 512 ht -1429192193
Aug  8 14:51:49 rt1 kernel: 709.631560 [1758] netmap_ring_reinit        called for ixl0 TX7
Aug  8 14:51:49 rt1 kernel: 709.648463 [1783] netmap_ring_reinit        total 1 errors
Aug  8 14:51:49 rt1 kernel: 709.664056 [1787] netmap_ring_reinit        ixl0 TX7 reinit, cur 0 -> 512 tail -1429192193 -> -1429192193



Am I missing something?

Thanks!
Comment 1 Michael 2018-08-08 19:30:22 UTC
The X710 has a couple of problems with netmap.
I also experienced them with all available firmware versions for the NIC.

I just downgraded to X520 or Chelsio.

For me the NIC just freezes, carrier active and seeing incoming packets via tcpdump but nothing else ...
Comment 2 Charles Goncalves 2018-08-14 18:07:47 UTC
If I run pkt-gen with the rate option, it sends packets.

Like this: pkt-gen -f tx -R 150000

If I raise the rate above 150k, e.g. to 160000, I get the same error:

/pkt-gen -i ixl0 -f tx -R 160000 
812.653021 main [2593] interface is ixl0
812.653078 main [2727] running on 1 cpus (have 8)
812.653279 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
812.653296 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
812.826944 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
812.827043 main [2919] Sending 512 packets every  0.003200000 s
812.827085 start_threads [2274] Wait 2 secs for phy reset
814.836422 start_threads [2276] Ready...
814.836644 sender_body [1464] start, fd 3 main_fd 3
815.838007 main_thread [2364] 0.000 pps (0.000 pkts 0.000 bps in 1001364 usec) 0.00 avg_batch 0 min_space
816.134343 sender_body [1538] poll error on 3 ring 0-7
816.886366 main_thread [2364] 1.954 Kpps (2.048 Kpkts 983.040 Kbps in 1048359 usec) 341.33 avg_batch 99999 min_space
817.061069 main_thread [2391] ouch, thread 0 exited with error
Sent 2048 packets 122880 bytes 6 events 60 bytes each in -1534269816.00 seconds.
Speed: -0.000 pps Bandwidth: -0.001 bps (raw -0.001 bps). Average batch: 341.33 pkts


And kernel log:
Aug 14 15:03:36 rt1 kernel: 816.016016 [1637] nm_txsync_prologue        ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 275 c 275 t -278400513 rh 274 rc 274 rt -278400513 hc 274 ht -278400513
Aug 14 15:03:36 rt1 kernel: 816.074020 [1758] netmap_ring_reinit        called for ixl0 TX1
Aug 14 15:03:36 rt1 kernel: 816.090925 [1783] netmap_ring_reinit        total 1 errors
Aug 14 15:03:36 rt1 kernel: 816.106519 [1787] netmap_ring_reinit        ixl0 TX1 reinit, cur 275 -> 274 tail -278400513 -> -278400513



These errors seem to be related to the synchronization of HEAD/CUR/TAIL in the netmap ring.
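
As a rough illustration of why the kernel rejects these rings, the check quoted in the log can be modelled in user space. This is only a sketch: `struct kring_model` and `kring_indices_invalid` are hypothetical names, the indices are assumed to be unsigned 32-bit values (which would explain the negative numbers in the log as a signed print of a large unsigned value), and the ring is assumed to have 1024 slots.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the kernel's kring state;
 * only the four fields named in the log message are modelled. */
struct kring_model {
	uint32_t nr_hwcur;
	uint32_t rhead;
	uint32_t rtail;
	uint32_t nr_hwtail;
};

/* Returns true when any index lies outside [0, n), i.e. when the
 * condition quoted in the "fail '...'" log line holds. */
static bool
kring_indices_invalid(const struct kring_model *k, uint32_t n)
{
	return k->nr_hwcur >= n || k->rhead >= n ||
	    k->rtail >= n || k->nr_hwtail >= n;
}
```

With n = 1024, a tail of 1387166207 or (uint32_t)-278400513 fails the check, while small in-range indices pass.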
Comment 3 Charles Goncalves 2018-08-22 21:43:06 UTC
I've noticed that, when waiting for POLLIN, this call:
poll(pfd, rxrings, -1);
doesn't work; it needs to be something like:
poll(pfd, rxrings, 1);
But with this I have a latency problem.
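
For reference, the only difference between the two calls is the timeout: -1 blocks until an event arrives, while 1 returns after at most one millisecond. A minimal sketch of a bounded wait that also reports POLLERR (presumably the condition pkt-gen prints as "poll error") could look like this. `wait_readable` is a hypothetical helper, not part of netmap, and it only works around the symptom rather than fixing it.

```c
#include <poll.h>
#include <errno.h>

/* Waits up to timeout_ms for any descriptor to become readable.
 * Returns the number of descriptors with POLLIN set, 0 on timeout,
 * or -1 if poll() failed or a descriptor reported POLLERR. */
static int
wait_readable(struct pollfd *pfd, nfds_t nfds, int timeout_ms)
{
	int ready, n;
	nfds_t i;

	/* Retry if a signal interrupts the wait; note that each retry
	 * restarts the full timeout. */
	do {
		n = poll(pfd, nfds, timeout_ms);
	} while (n < 0 && errno == EINTR);
	if (n < 0)
		return -1;

	ready = 0;
	for (i = 0; i < nfds; i++) {
		if (pfd[i].revents & POLLERR)
			return -1;
		if (pfd[i].revents & POLLIN)
			ready++;
	}
	return ready;
}
```

A larger timeout_ms trades wake-up frequency against the time it takes to notice a ring that never signals readiness.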

I have also noticed that I can only transmit from TX ring 0; if I use any TX ring above 0, it doesn't work. In my test scenario the ixl NIC has 8 rings.

Does anyone understand the reason?

Thanks!
Comment 4 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-10 22:29:36 UTC
The compilation problems have been fixed.
Which FreeBSD version are you using? We need to understand if your ixl driver is backed by iflib or not.
Comment 5 Charles Goncalves 2019-01-11 12:27:41 UTC
(In reply to Vincenzo Maffione from comment #4)
Hello Vincenzo, thank you for your answer!

I was running tests on FreeBSD 11.2 STABLE.

In the coming months I'll upgrade to 12.0 STABLE.
Comment 6 Ozkan KIRIK 2019-01-26 15:03:04 UTC
I'm facing the same problem on FreeBSD 11.2-p8. My NICs are Intel X722.
I'm trying to use suricata over netmap, but no packets are received.
An Intel i350 NIC is installed on the same hardware, and igb is working.
Comment 7 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-26 17:58:14 UTC
Hi,
  From the log it's quite clear that the problem is that netmap TXSYNC is reading a random value for the hw HEAD index, that is the value of the last completed TX descriptor.
Now, in the driver there are two ways to get this index, depending on the value of `hw.ixl.enable_head_writeback`. But the ixl driver seems to be aware of this difference and to prevent the use of netmap if this is not possible.
So I don't quite understand why this is not working.

What is the value of `sysctl hw.ixl.enable_head_writeback` in your setup ?
Also, what does `dmesg | grep "netmap queues"` say ?

In any case, this is affecting 11.x because it does not use iflib yet.
From 12.x on, iflib is used and netmap support is provided through iflib, which means that netmap works on ixl iff the regular network stack works on ixl.
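
To make the head write-back mechanism concrete, here is a simplified user-space model. It assumes that, with write-back enabled, the NIC stores the 32-bit head index in memory immediately after the last TX descriptor, and that each descriptor is 16 bytes. Both points are assumptions about the hardware layout rather than something taken from this report, and `read_wb_head` and `wb_head_valid` are hypothetical names, not driver functions. If that slot is never written, or is read from the wrong location, the value can be arbitrary, which would be consistent with the tail values in the logs.

```c
#include <stdint.h>
#include <string.h>

#define DESC_SIZE 16u	/* assumed size of one TX descriptor, in bytes */

/* Reads the 32-bit head index that the NIC is assumed to write just
 * past the last descriptor of a ring holding num_desc descriptors. */
static uint32_t
read_wb_head(const uint8_t *ring_base, uint32_t num_desc)
{
	uint32_t head;

	memcpy(&head, ring_base + (size_t)num_desc * DESC_SIZE, sizeof(head));
	return head;	/* no byte swap: little-endian host assumed */
}

/* Returns 1 and stores the index if it is a plausible ring position,
 * 0 if the slot holds something that cannot be a head index. */
static int
wb_head_valid(const uint8_t *ring_base, uint32_t num_desc, uint32_t *out)
{
	uint32_t head = read_wb_head(ring_base, num_desc);

	if (head >= num_desc)
		return 0;
	*out = head;
	return 1;
}
```

A range check of this kind would not repair the ring, but it would distinguish a stale or garbage write-back slot from a real completion index.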
Comment 8 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-26 18:19:44 UTC
Btw, I prepared the following clean-up patch for ixl, any testing is welcome:

https://reviews.freebsd.org/D18984
Comment 9 Ozkan KIRIK 2019-01-27 10:29:00 UTC
Hello,

I ran many tests with this patch.

On stable/11 kernel, both with and without this patch, netmap doesn't work. suricata cannot capture any packets. counters are always zero.

On the releng/11.2 kernel (FreeBSD 11.2-RELEASE-p8 #0 r343486), netmap with suricata works. But the ixl patch cannot be applied there.

I also tested release/12.0, releng/12.0 and stable/12. Netmap doesn't work on any of the 12.0 branches. suricata cannot capture any packets. counters are always zero.

I think some commit after 11.2-p8 broke netmap support.
Comment 10 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-28 09:48:08 UTC
(In reply to Ozkan KIRIK from comment #9)

Hi,
  Thanks a lot for testing and for the patience.
Yes, this patch was not meant to fix the issue, but just to clean up a little bit.
It would help to know what is the value of `sysctl hw.ixl.enable_head_writeback` in your setup, and what does `dmesg | grep "netmap queues"` say.

Could you please point me at the exact URL where you got your "releng/11.2 kernel (FreeBSD 11.2-RELEASE-p8 #0 r343486)"? You are saying that ixl in this version works, but r343486 corresponds to HEAD, which you are saying doesn't work...
If ixl works on releng/11.2, then we can check what happened since then.

If you found that ixl/netmap doesn't work on 12.x, it means that the issue there is in iflib, since ixl uses iflib, and iflib provides netmap support for all the drivers.
We probably need to open another ticket for that, since it is a completely different piece of code.
Comment 11 Krzysztof Galazka 2019-01-28 11:50:28 UTC
(In reply to Vincenzo Maffione from comment #10)

If I'm reading the history right there was only one change between 11.2-RELEASE and 11-STABLE in the ixl: https://github.com/freebsd/freebsd/commits/stable/11/sys/dev/ixl
Comment 12 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-28 13:44:34 UTC
(In reply to Krzysztof Galazka from comment #11)

What single change are you talking about exactly?
If you can point me at two git commits (or two svn revisions), I can look at the diff.

For now I suspect this commit
https://github.com/freebsd/freebsd/commit/27d66545b33f8e4f36fdce1003ddbbda40f5a7bb,
which introduces the hw.ixl.enable_head_writeback sysctl.
Comment 13 Krzysztof Galazka 2019-01-28 17:09:01 UTC
(In reply to Vincenzo Maffione from comment #12)

# git log  upstream/releng/11.2..upstream/stable/11 sys/dev/ixl
commit 2889f6fc498ab04853661e2f57d23fbb150128d3
Author: vmaffione <vmaffione@FreeBSD.org>
Date:   Tue Dec 4 17:40:56 2018 +0000

    MFC r339639

    netmap: align codebase to the current upstream (sha 8374e1a7e6941)

[...]

    Approved by:    gnn (mentor)
    Differential Revision:  https://reviews.freebsd.org/D17364

That's the only patch I see in git log which is in 11-STABLE but not in 11.2-RELEASE.
Comment 14 Ozkan KIRIK 2019-01-29 05:32:46 UTC
Hello,

Sorry for the late response. I think my comment was not clear enough. I'm going to explain my tests in detail:

------------------------------------------------------
Tested version : FreeBSD 12.0-STABLE https://svnweb.freebsd.org/base/stable/12/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING

------------------------------------------------------
Tested version : FreeBSD 12.0-p2 https://svnweb.freebsd.org/base/releng/12.0/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING

------------------------------------------------------
Tested version : FreeBSD 12.0-RELEASE https://svnweb.freebsd.org/base/release/12.0.0/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING

------------------------------------------------------
Tested version : FreeBSD 11.2-STABLE https://svnweb.freebsd.org/base/stable/11/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING

------------------------------------------------------
Tested version : FreeBSD 11.2-p8 https://svnweb.freebsd.org/base/releng/11.2/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 0 packets captured 0 dropped
igb0 => 0 packets captured 0 dropped
ix0 => 0 packets captured 0 dropped
ixl0 => 0 packets captured 0 dropped -> NOT WORKING

------------------------------------------------------
Tested version : FreeBSD 11.2-RELEASE https://svnweb.freebsd.org/base/release/11.2.0/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 8 packets captured 0 dropped -> Working
igb0 => 9 packets captured 0 dropped -> Working
ix0 => 6 packets captured 0 dropped -> Working
ixl0 => 0 packets captured 0 dropped -> NOT WORKING

------------------------------------------------------

CONCLUSION:
According to the test results, netmap support is fully broken after FreeBSD 11.2-p8.
The ixl driver never worked.
Comment 15 Ozkan KIRIK 2019-01-29 05:34:52 UTC
(In reply to Ozkan KIRIK from comment #14)

There was a typo in the report. The correct results for 11.2-p8 are below:

------------------------------------------------------
Tested version : FreeBSD 11.2-p8 https://svnweb.freebsd.org/base/releng/11.2/

Test NICs with suricata netmap if0 -> if0+ in IPS mode: 
em0 => 5 packets captured 0 dropped -> Working
igb0 => 6 packets captured 0 dropped -> Working
ix0 => 7 packets captured 0 dropped -> Working
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
Comment 16 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-29 09:28:37 UTC
(In reply to Krzysztof Galazka from comment #13)

Thanks. However, the changes to ixl are just compilation fixes that follow the many updates to the netmap code.
It's also a bit unlikely that a netmap update broke only ixl, and not all the other drivers (e.g. em, igb, vtnet-pci, cxgbe), which are actually working after that change (on both 11.x and 12.x).
So this may be a suricata-specific issue (more in my following answer).
Comment 17 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-29 09:43:53 UTC
(In reply to Ozkan KIRIK from comment #14)

Hi Ozkan,
  Thanks, that's very clear.
Now, I can assure you that stock netmap applications (pkt-gen, bridge, lb, vale-ctl, ...) work fine in stable/11, stable/12, releng/12.0, release/12.0, etc.,
at least when working with virtual interfaces (vale(4), pipes, monitors, ptnet(4)), and drivers such as em, igb and vtnet-pci.

Regarding ixl support, it looks like it is broken in every version, and this is the first problem that we need to address.

Second, if you are reporting that suricata over netmap is not working at all,
it means that there must be some problem specific to suricata.
Can you please point me at which suricata code and configuration you were using, so that I can debug it (e.g. over em or igb interfaces, rather than ixl)? It's better if the configuration is minimal.

Also, it would help if you could provide "dmesg | grep netmap" on a machine where suricata over "em" interfaces does not work.

Thanks
Comment 18 Charles Goncalves 2019-01-29 10:22:51 UTC
(In reply to Vincenzo Maffione from comment #7)
Hello Vincenzo!

# sysctl hw.ixl.enable_head_writeback
hw.ixl.enable_head_writeback: 1

# dmesg | grep "netmap queues"
ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
ixl1: netmap queues/slots: TX 1/1024, RX 1/1024
ixl2: netmap queues/slots: TX 1/1024, RX 1/1024
ixl3: netmap queues/slots: TX 1/1024, RX 1/1024

I have only one queue on this NIC. I usually work with the X405 (82599ES), which has 8 queues bound to 8 CPU cores, or 16 queues on a 16-core machine.

My question is: does this X710 NIC really have only one queue, or is this a build problem?
Comment 19 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-29 10:56:19 UTC
(In reply to Charles Goncalves from comment #18)

Hi,
 Thanks. The hw.ixl.enable_head_writeback set to 1 looks good.
The number of queues depends on configuration, I guess. Did you change
hw.ixl.max_queues parameter dynamically? (or maybe in rc.conf).
Comment 20 Charles Goncalves 2019-01-29 11:25:24 UTC
(In reply to Vincenzo Maffione from comment #19)

Oh yes, now I remember; it was about 5 months ago. I had set max_queues = 1 (in /boot/loader.conf) to test with pkt-gen, because it doesn't work with max_queues = 8.

Output of pkt-gen with hw.ixl.max_queues=1:

# pkt-gen -i ixl0 -f tx
945.763330 main [2593] interface is ixl0
945.763376 main [2727] running on 1 cpus (have 8)
945.763602 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
945.763619 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
945.907652 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 1 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
945.907746 main [2919] Sending 512 packets every  0.000000000 s
945.907791 start_threads [2274] Wait 2 secs for phy reset
947.974562 start_threads [2276] Ready...
947.974737 sender_body [1464] start, fd 3 main_fd 3
947.989037 sender_body [1546] drop copy
949.001809 main_thread [2364] 4.285 Mpps (4.401 Mpkts 2.112 Gbps in 1027068 usec) 343.53 avg_batch 0 min_space
950.016811 main_thread [2364] 4.265 Mpps (4.329 Mpkts 2.078 Gbps in 1015002 usec) 341.05 avg_batch 99999 min_space
951.053574 main_thread [2364] 4.212 Mpps (4.367 Mpkts 2.096 Gbps in 1036763 usec) 341.39 avg_batch 99999 min_space
952.054597 main_thread [2364] 4.262 Mpps (4.266 Mpkts 2.048 Gbps in 1001023 usec) 341.66 avg_batch 99999 min_space



Now with hw.ixl.max_queues=8:

# pkt-gen -i ixl0 -f tx
734.500918 main [2593] interface is ixl0
734.500963 main [2727] running on 1 cpus (have 8)
734.501188 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
734.501205 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
734.651421 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
734.651514 main [2919] Sending 512 packets every  0.000000000 s
734.651558 start_threads [2274] Wait 2 secs for phy reset
736.666615 start_threads [2276] Ready...
736.666799 sender_body [1464] start, fd 3 main_fd 3
737.506822 sender_body [1538] poll error on 3 ring 0-7
737.677616 main_thread [2364] 8.103 Kpps (8.191 Kpkts 3.932 Mbps in 1010813 usec) 511.94 avg_batch 0 min_space
737.856747 main_thread [2391] ouch, thread 0 exited with error
Sent 8191 packets 491460 bytes 16 events 60 bytes each in -1548760736.67 seconds.
Speed: -0.000 pps Bandwidth: -0.003 bps (raw -0.004 bps). Average batch: 511.94 pkts



Then with max_queues=8 I have errors in my /var/log/messages:

Jan 29 09:21:29 rt1 kernel: 888.923727 [1637] nm_txsync_prologue        ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1678351873 rh 512 rc 512 rt -1678351873 hc 512 ht -1678351873
Jan 29 09:21:29 rt1 kernel: 888.981468 [1758] netmap_ring_reinit        called for ixl0 TX1
Jan 29 09:21:29 rt1 kernel: 888.998372 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.013966 [1787] netmap_ring_reinit        ixl0 TX1 reinit, cur 0 -> 512 tail -1678351873 -> -1678351873
Jan 29 09:21:29 rt1 kernel: 889.041778 [1637] nm_txsync_prologue        ixl0 TX2: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1676512769 rh 512 rc 512 rt -1676512769 hc 512 ht -1676512769
Jan 29 09:21:29 rt1 kernel: 889.099769 [1758] netmap_ring_reinit        called for ixl0 TX2
Jan 29 09:21:29 rt1 kernel: 889.116664 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.132258 [1787] netmap_ring_reinit        ixl0 TX2 reinit, cur 0 -> 512 tail -1676512769 -> -1676512769
Jan 29 09:21:29 rt1 kernel: 889.160084 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1674673665 rh 512 rc 512 rt -1674673665 hc 512 ht -1674673665
Jan 29 09:21:29 rt1 kernel: 889.218053 [1758] netmap_ring_reinit        called for ixl0 TX3
Jan 29 09:21:29 rt1 kernel: 889.234957 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.250560 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 0 -> 512 tail -1674673665 -> -1674673665
Jan 29 09:21:29 rt1 kernel: 889.278372 [1637] nm_txsync_prologue        ixl0 TX4: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1672834561 rh 512 rc 512 rt -1672834561 hc 512 ht -1672834561
Jan 29 09:21:29 rt1 kernel: 889.336348 [1758] netmap_ring_reinit        called for ixl0 TX4
Jan 29 09:21:29 rt1 kernel: 889.353258 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.368847 [1787] netmap_ring_reinit        ixl0 TX4 reinit, cur 0 -> 512 tail -1672834561 -> -1672834561
Jan 29 09:21:29 rt1 kernel: 889.396679 [1637] nm_txsync_prologue        ixl0 TX5: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1670995457 rh 512 rc 512 rt -1670995457 hc 512 ht -1670995457
Jan 29 09:21:29 rt1 kernel: 889.454648 [1758] netmap_ring_reinit        called for ixl0 TX5
Jan 29 09:21:29 rt1 kernel: 889.471553 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.487148 [1787] netmap_ring_reinit        ixl0 TX5 reinit, cur 0 -> 512 tail -1670995457 -> -1670995457
Jan 29 09:21:29 rt1 kernel: 889.514967 [1637] nm_txsync_prologue        ixl0 TX6: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || 
Jan 29 09:21:29 rt1 kernel: kring->nr_hwtail >= n' h 0 c 0 t -1669156353 rh 512 rc 512 rt -1669156353 hc 512 ht -1669156353
Jan 29 09:21:29 rt1 kernel: 889.572947 [1758] netmap_ring_reinit        called for ixl0 TX6
Jan 29 09:21:29 rt1 kernel: 889.589850 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.605444 [1787] netmap_ring_reinit        ixl0 TX6 reinit, cur 0 -> 512 tail -1669156353 -> -1669156353
Jan 29 09:21:29 rt1 kernel: 889.633261 [1758] netmap_ring_reinit        called for ixl0 TX7
Jan 29 09:21:29 rt1 kernel: 889.650170 [1783] netmap_ring_reinit        total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.665760 [1787] netmap_ring_reinit        ixl0 TX7 reinit, cur 0 -> 512 tail -1667321345 -> -1667321345


I also tested this in my netmap application, and based on these tests I think the error occurs when an application tries to write to a TX ring above 0, such as TX1 or TX2.
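
One way to narrow this down is to open a single hardware ring at a time. netmap port names are understood to accept a ring suffix, so that "netmap:ixl0-1" refers to ring 1 only. The sketch below just builds such a name; `netmap_ring_port` is a hypothetical helper, not a netmap function.

```c
#include <stdio.h>
#include <string.h>

/* Formats "netmap:<ifname>-<ring>" into buf.
 * Returns 0 on success, -1 if ifname is missing or empty or if the
 * result does not fit into len bytes. */
static int
netmap_ring_port(char *buf, size_t len, const char *ifname, unsigned ring)
{
	int n;

	if (ifname == NULL || strlen(ifname) == 0)
		return -1;
	n = snprintf(buf, len, "netmap:%s-%u", ifname, ring);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string can be passed wherever a port name is expected, for example to pkt-gen -i, to test ring 0 and ring 1 separately.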
Comment 21 Franco Fichtner 2019-01-29 13:22:55 UTC
Suricata doesn't work with newer netmap code beyond 11.2-RELEASE, see https://github.com/OISF/suricata/pull/3616 for a patch which last I heard will likely be included in 4.1.3.


Cheers,
Franco
Comment 22 Michael Muenz 2019-01-29 13:36:00 UTC
Hi,

Not sure if it fits your problem, but I too had crazy behavior with Suricata in netmap and ixl. I was using OPNsense (FreeBSD 11.1) with Intel X710 cards in a lab. When running Suricata in IDS mode everything worked fine; when firing up inline IPS, the NIC suddenly stopped working. Carrier was there, and I was also seeing incoming ARP requests via tcpdump, but nothing more.

I thought it was a problem with X710 as they were also problematic in linux so I switched to X520, but they are using ix and not ixl, so it too might be a problem with ixl and Suricata and netmap.

Michael
Comment 23 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-29 16:31:04 UTC
(In reply to Charles Goncalves from comment #20)

What happens if you change /boot/loader.conf to set the default number of queues to 8, and you never change that?
Same behaviour?
Comment 24 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-29 16:41:18 UTC
(In reply to Franco Fichtner from comment #21)

Thanks for the pointer, I'll follow up on the github to chase the problem.
Comment 25 Charles Goncalves 2019-01-29 17:23:41 UTC
(In reply to Vincenzo Maffione from comment #23)

Without hw.ixl.max_queues in /boot/loader.conf then

# sysctl hw.ixl.max_queues
hw.ixl.max_queues: 0

# dmesg | grep "netmap queues"
ixl0: netmap queues/slots: TX 8/1024, RX 8/1024
ixl1: netmap queues/slots: TX 8/1024, RX 8/1024
ixl2: netmap queues/slots: TX 8/1024, RX 8/1024
ixl3: netmap queues/slots: TX 8/1024, RX 8/1024

# pkt-gen -i ixl0 -f tx
415.475609 main [2593] interface is ixl0
415.475652 main [2727] running on 1 cpus (have 8)
415.475874 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
415.475891 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
415.644068 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
415.644158 main [2919] Sending 512 packets every  0.000000000 s
415.644202 start_threads [2274] Wait 2 secs for phy reset
417.645062 start_threads [2276] Ready...
417.645243 sender_body [1464] start, fd 3 main_fd 3
418.485366 sender_body [1538] poll error on 3 ring 0-7
418.708562 main_thread [2364] 7.702 Kpps (8.190 Kpkts 3.931 Mbps in 1063315 usec) 511.88 avg_batch 0 min_space
418.878363 main_thread [2391] ouch, thread 0 exited with error
Sent 8190 packets 491400 bytes 16 events 60 bytes each in -1548782417.65 seconds.
Speed: -0.000 pps Bandwidth: -0.003 bps (raw -0.004 bps). Average batch: 511.88 pkts


# tail -F /var/log/messages
Jan 29 15:20:17 rt1 kernel: 417.657494 [1637] nm_txsync_prologue        ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1843510785 rh 512 rc 512 rt -1843510785 hc 512 ht -1843510785
Jan 29 15:20:17 rt1 kernel: 417.715239 [1758] netmap_ring_reinit        called for ixl0 TX1
Jan 29 15:20:17 rt1 kernel: 417.732143 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:17 rt1 kernel: 417.747738 [1787] netmap_ring_reinit        ixl0 TX1 reinit, cur 0 -> 512 tail -1843510785 -> -1843510785
Jan 29 15:20:17 rt1 kernel: 417.775556 [1637] nm_txsync_prologue        ixl0 TX2: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1841671681 rh 512 rc 512 rt -1841671681 hc 512 ht -1841671681
Jan 29 15:20:17 rt1 kernel: 417.833536 [1758] netmap_ring_reinit        called for ixl0 TX2
Jan 29 15:20:17 rt1 kernel: 417.850440 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:17 rt1 kernel: 417.866031 [1787] netmap_ring_reinit        ixl0 TX2 reinit, cur 0 -> 512 tail -1841671681 -> -1841671681
Jan 29 15:20:17 rt1 kernel: 417.893854 [1637] nm_txsync_prologue        ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1839832577 rh 512 rc 512 rt -1839832577 hc 512 ht -1839832577
Jan 29 15:20:18 rt1 kernel: 417.951828 [1758] netmap_ring_reinit        called for ixl0 TX3
Jan 29 15:20:18 rt1 kernel: 417.968738 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:18 rt1 kernel: 417.984330 [1787] netmap_ring_reinit        ixl0 TX3 reinit, cur 0 -> 512 tail -1839832577 -> -1839832577
Jan 29 15:20:18 rt1 kernel: 418.012148 [1637] nm_txsync_prologue        ixl0 TX4: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1837993473 rh 512 rc 512 rt -1837993473 hc 512 ht -1837993473
Jan 29 15:20:18 rt1 kernel: 418.070125 [1758] netmap_ring_reinit        called for ixl0 TX4
Jan 29 15:20:18 rt1 kernel: 418.087037 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.102626 [1787] netmap_ring_reinit        ixl0 TX4 reinit, cur 0 -> 512 tail -1837993473 -> -1837993473
Jan 29 15:20:18 rt1 kernel: 418.130451 [1637] nm_txsync_prologue        ixl0 TX5: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || 
Jan 29 15:20:18 rt1 kernel: kring->nr_hwtail >= n' h 0 c 0 t -1836154369 rh 512 rc 512 rt -1836154369 hc 512 ht -1836154369
Jan 29 15:20:18 rt1 kernel: 418.188425 [1758] netmap_ring_reinit        called for ixl0 TX5
Jan 29 15:20:18 rt1 kernel: 418.205330 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.220926 [1787] netmap_ring_reinit        ixl0 TX5 reinit, cur 0 -> 512 tail -1836154369 -> -1836154369
Jan 29 15:20:18 rt1 kernel: 418.248752 [1637] nm_txsync_prologue        ixl0 TX6: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1834315265 rh 512 rc 512 rt -1834315265 hc 512 ht -1834315265
Jan 29 15:20:18 rt1 kernel: 418.306721 [1758] netmap_ring_reinit        called for ixl0 TX6
Jan 29 15:20:18 rt1 kernel: 418.323624 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.339222 [1787] netmap_ring_reinit        ixl0 TX6 reinit, cur 0 -> 512 tail -1834315265 -> -1834315265
Jan 29 15:20:18 rt1 kernel: 418.367043 [1637] nm_txsync_prologue        ixl0 TX7: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1832476161 rh 512 rc 512 rt -1832476161 hc 512 ht -1832476161
Jan 29 15:20:18 rt1 kernel: 418.425019 [1758] netmap_ring_reinit        called for ixl0 TX7
Jan 29 15:20:18 rt1 kernel: 418.441929 [1783] netmap_ring_reinit        total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.457518 [1787] netmap_ring_reinit        ixl0 TX7 reinit, cur 0 -> 512 tail -1832476161 -> -1832476161
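
The failing condition quoted in these log lines compares each ring index against the ring size n. A minimal C sketch of that comparison follows, assuming the indices are stored as unsigned 32-bit values and the ring has 1024 slots, as the "netmap queues/slots" lines report. The helper names ring_index_valid and as_ring_index are hypothetical and are not netmap functions. It illustrates why a tail printed as -1843510785 trips the check:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of netmap: mirrors the shape of the
 * condition quoted in the log ("kring->rtail >= n" and friends).
 * Assumption: ring indices are stored as unsigned 32-bit values.
 */
static bool
ring_index_valid(uint32_t idx, uint32_t num_slots)
{
	return idx < num_slots;
}

/*
 * Reinterpret a value that the log printed as a signed integer as the
 * unsigned index that would actually be compared against n.
 */
static uint32_t
as_ring_index(int32_t printed)
{
	return (uint32_t)printed;
}
```

Under these assumptions, ring_index_valid(as_ring_index(-1843510785), 1024) is false while ring_index_valid(512, 1024) is true. That is consistent with the log: the rh/rc/hc values of 512 are in range, and the t/rt/ht values are the ones out of range.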
Comment 26 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-30 09:52:40 UTC
(In reply to Charles Goncalves from comment #25)
Thanks!
So this means that the issue is not related to the number of queues changing dynamically.
Comment 27 Charles Goncalves 2019-01-30 10:02:14 UTC
(In reply to Vincenzo Maffione from comment #26)

I'll try to test on 12.0 today and give you some feedback.

Thank you!
Comment 28 Charles Goncalves 2019-01-30 15:52:58 UTC
(In reply to Charles Goncalves from comment #27)
Hi!

I tested ixl on 12.0 and it works with pkt-gen:

 # dmesg | grep 'netmap queues'
ixl0: netmap queues/slots: TX 8/1024, RX 8/1024
ixl1: netmap queues/slots: TX 8/1024, RX 8/1024
ixl2: netmap queues/slots: TX 8/1024, RX 8/1024
ixl3: netmap queues/slots: TX 8/1024, RX 8/1024

I can't find hw.ixl.max_queues via sysctl on 12.0.


# ./pkt-gen -i ixl0 -f tx
518.683898 main [2889] interface is ixl0
518.683944 main [3011] using default burst size: 512
518.683954 main [3019] running on 1 cpus (have 8)
518.684152 extract_ip_range [471] range is 10.0.0.1:1234 to 10.0.0.1:1234
518.684169 extract_ip_range [471] range is 10.1.0.1:1234 to 10.1.0.1:1234
518.684235 nm_open [856] overriding ARG1 0
518.684246 nm_open [860] overriding ARG2 0
518.684251 nm_open [864] overriding ARG3 0
518.684254 nm_open [868] overriding RING_CFG
518.684258 nm_open [877] overriding ifname ixl0 ringid 0x0 flags 0x8001
518.853577 main [3117] mapped 334980KB at 0x800e00000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
518.853734 main [3224] Sending 512 packets every  0.000000000 s
518.853833 start_threads [2549] Wait 2 secs for phy reset
520.979092 start_threads [2551] Ready...
520.979263 sender_body [1580] start, fd 3 main_fd 3
520.979543 sender_body [1638] frags 1 frag_size 60
521.010053 sender_body [1676] drop copy
521.981085 main_thread [2639] 15.143 Mpps (15.170 Mpkts 7.282 Gbps in 1001818 usec) 200.31 avg_batch 0 min_space
522.983084 main_thread [2639] 15.503 Mpps (15.534 Mpkts 7.456 Gbps in 1001999 usec) 189.22 avg_batch 99999 min_space
^C523.664717 sigint_h [562] received control-C on thread 0x800747000
523.664771 main_thread [2639] 15.501 Mpps (10.567 Mpkts 5.072 Gbps in 681687 usec) 195.10 avg_batch 99999 min_space
523.664853 sender_body [1718] flush tail 318 head 318 on thread 0x800747500
524.672164 main_thread [2639] 1.525 Kpps (1.536 Kpkts 737.280 Kbps in 1007393 usec) 192.00 avg_batch 99999 min_space
Sent 41272863 packets 2476371780 bytes 212000 events 60 bytes each in 2.69 seconds.
Speed: 15.369 Mpps Bandwidth: 7.377 Gbps (raw 7.377 Gbps). Average batch: 194.68 pkts
Comment 29 Jeff Pieper 2019-01-30 16:09:10 UTC
(In reply to Charles Goncalves from comment #28)

ixl in 12.0 uses iflib to handle queue allocation:

sysctl -d dev.ixl.0.iflib.override_ntxqs
dev.ixl.0.iflib.override_ntxqs: # of txqs to use, 0 => use default #

sysctl -d dev.ixl.0.iflib.override_nrxqs
dev.ixl.0.iflib.override_nrxqs: # of rxqs to use, 0 => use default #
Comment 30 Vincenzo Maffione freebsd_committer freebsd_triage 2019-01-30 16:11:10 UTC
(In reply to Charles Goncalves from comment #28)

Good. I'm not surprised this works, because ixl in FreeBSD 12.x is implemented through iflib, and netmap in this case uses iflib to access the hw (and iflib must work, otherwise you would not be able to use an ixl NIC using the traditional networking tools and applications).

You can change the number of queues, slots and many other things through the generic iflib configuration tools. See iflib(4).

On the other hand, ixl in FreeBSD 11.x is not implemented through iflib, so it is completely separate code. And netmap is broken there, unfortunately.
I notified the maintainers at Intel. They said they would take a look at this.
Comment 31 Charles Goncalves 2019-01-30 16:17:56 UTC
(In reply to Jeff Pieper from comment #29)

Thank you for your reply!

I set dev.ixl.0.iflib.override_ntxqs and dev.ixl.0.iflib.override_nrxqs to "4" but can still see some sysctls like "dev.ixl.0.iflib.txq7.r_drops". Can I change these sysctl values dynamically or only at boot time?
Comment 32 Jeff Pieper 2019-01-30 16:24:01 UTC
(In reply to Charles Goncalves from comment #31)

I believe this requires a driver reset. If you are using a static driver (compiled into the kernel), then yes, it has to be at boot. If you are using a driver module, then with the driver unloaded, you can do:
#kenv dev.ixl.0.iflib.override_ntxqs=<val>
#kenv dev.ixl.0.iflib.override_nrxqs=<val>

Then load the driver.
Comment 33 Charles Goncalves 2019-01-30 17:24:42 UTC
(In reply to Jeff Pieper from comment #32)

Thank you, it works!
Comment 34 Charles Goncalves 2019-01-30 17:28:02 UTC
(In reply to Vincenzo Maffione from comment #30)

So this is only an issue with 11.x.

I'll wait for a fix from the maintainers on 11.2.

Thank you!
Comment 35 Vincenzo Maffione freebsd_committer freebsd_triage 2019-02-03 13:02:42 UTC
(In reply to Franco Fichtner from comment #21)

Hi,
  I tried to install suricata from github sources, on 12.0-RELEASE.
I use the following commands to run suricata over an e1000 interface:

````
sudo ifconfig em1 up -arp promisc -rxcsum -txcsum -rxcsum6 -txcsum6 -tso -tso4 -tso6 -lro -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso
sudo suricata -c /etc/suricata/suricata.yaml --netmap=em1 -v
````

This is the netmap section of my suricata.yaml:
````
netmap:
  - interface: default
    threads: auto
    copy-mode: ips
    disable-promisc: yes #  promiscuous mode
    checksum-checks: auto

      
  - interface: em1
    copy-iface: em1+

  - interface: em1+
    copy-iface: em1

````

and I see packets being captured
````
[100078] 3/2/2019 -- 14:01:59 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'em1':  pkts: 4892, drop: 0 (0.00%), invalid chksum: 0
````

So what is not working exactly? Can anyone describe reproducible steps that I can follow?
Comment 36 Vincenzo Maffione freebsd_committer freebsd_triage 2019-02-03 16:44:06 UTC
Issue confirmed in suricata, let's wait for them to merge the fix.
https://github.com/OISF/suricata/pull/3616
Comment 37 Gong Teng 2019-04-03 17:00:54 UTC
I also hit the same problem: ixl, which uses iflib on 12, still has problems. Bridging more than 1 Gbps of data does not work; with no data load, a ping test shows no problem.
Comment 38 Kubilay Kocak freebsd_committer freebsd_triage 2020-08-15 06:19:25 UTC
^Triage: assign to committer (apparently) resolving

@Vincenzo What are the actual/remaining issues here and the change delta(s), if any, to be made in order to resolve them?

Is https://reviews.freebsd.org/D18984 still relevant (it's closed) and related to this issue?

Is this just a suricata issue?

Is comment 37 relevant, or unrelated?
Comment 39 Vincenzo Maffione freebsd_committer freebsd_triage 2020-08-19 17:28:29 UTC
(In reply to Kubilay Kocak from comment #38)

This is not a suricata issue. There was a suricata issue mentioned in this thread, but it has been fixed upstream (suricata).

Comment #37 seems unrelated, since it mentions netmap with ixl in 12.x, where iflib is in use. There was an iflib/netmap bug (see https://reviews.freebsd.org/D25252) that may explain the problems briefly mentioned in #37. But the fix for that is now in HEAD and stable/12.

This report is about a bug that apparently affects netmap TX over ixl in 11.x (but not in 12.x and ahead).
This change
https://reviews.freebsd.org/D18984
does some cleanup but it does not fix the bug.
As you can see in the discussion, I reported the issue to the Intel developers, but as far as I know there have been no changes on their side (in stable/11).
So I can assume that the bug is still there, and it's something that needs the Intel developers' attention, if someone is still interested in netmap+ixl in 11.x.
Comment 40 Vincenzo Maffione freebsd_committer freebsd_triage 2021-01-09 17:01:37 UTC
Remaining issue is specific to the Intel driver. There's nothing more I can do here.
If there is still interest, someone at Intel should take it.
Comment 41 Charles Goncalves 2021-02-23 14:27:41 UTC
(In reply to Vincenzo Maffione from comment #40)
Hello Vincenzo!

Is this issue present on FreeBSD 12.2 or 13.0? I don't have a test environment right now, but I will upgrade a production router from 12.1 to 12.2 and then to 13.0 in the following months, so I can test it then.

This router has an ixl NIC (chip=0x15838086 Ethernet Controller XL710 for 40GbE QSFP+).
Comment 42 Vincenzo Maffione freebsd_committer freebsd_triage 2021-02-24 20:29:13 UTC
(In reply to Charles Goncalves from comment #41)
I don't have a test environment either.
But since ixl uses iflib on 12.x and 13.x, I expect this issue has gone away.
Comment 43 slw 2021-05-18 14:20:49 UTC
(In reply to Vincenzo Maffione from comment #42)

Looks like netmap doesn't work:

# /usr/obj/usr/src/amd64.amd64/tools/tools/netmap/pkt-gen -i ixl1 -f tx
321.990539 main [2921] interface is ixl1
321.990568 main [3044] using default burst size: 512
321.990573 main [3052] running on 1 cpus (have 24)
321.990640 extract_ip_range [476] range is 10.0.0.1:1234 to 10.0.0.1:1234
321.990645 extract_ip_range [476] range is 10.1.0.1:1234 to 10.1.0.1:1234
Sending on netmap:ixl1: 5 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
322.096770 main [3255] Sending 512 packets every  0.000000000 s
322.096813 start_threads [2580] Wait 2 secs for phy reset
324.222299 start_threads [2582] Ready...
324.222365 sender_body [1599] start, fd 3 main_fd 3
324.222392 sender_body [1657] frags 1 frag_size 60
324.234391 sender_body [1695] drop copy
325.285776 main_thread [2671] 2.794 Mpps (2.971 Mpkts 1.341 Gbps in 1063411 usec) 15.05 avg_batch 0 min_space
326.348859 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063084 usec) 0.00 avg_batch 99999 min_space
326.472386 sender_body [1682] poll error on queue 0: timeout
327.411859 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063000 usec) 0.00 avg_batch 99999 min_space
328.473456 sender_body [1682] poll error on queue 0: timeout
328.474874 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063015 usec) 0.00 avg_batch 99999 min_space
329.537820 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1062945 usec) 0.00 avg_batch 99999 min_space
330.474386 sender_body [1682] poll error on queue 0: timeout
330.600771 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1062951 usec) 0.00 avg_batch 99999 min_space
331.663860 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063090 usec) 0.00 avg_batch 99999 min_space
332.475381 sender_body [1682] poll error on queue 0: timeout
332.726861 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063001 usec) 0.00 avg_batch 99999 min_space
^C333.671467 sigint_h [573] received control-C on thread 0x800a12000
333.671475 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 944614 usec) 0.00 avg_batch 99999 min_space
334.476434 sender_body [1737] flush tail 576 head 576 on thread 0x800a12700
334.734834 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063359 usec) 0.00 avg_batch 99999 min_space
Sent 2971392 packets 178283520 bytes 197414 events 60 bytes each in 10.25 seconds.
Speed: 289.777 Kpps Bandwidth: 139.093 Mbps (raw 139.093 Mbps). Average batch: 15.05 pkts


Additionally, in my application I see logic errors from the kernel:

I send 3 packets on ring 0; c/h/t is 3/3/2047.
I do NIOCTXSYNC; c/h/t is 3/3/0.
I do not send any packets, just do NIOCTXSYNC again; c/h/t is now 3/3/3!
That is, the TX ring looks full and stalled. All transmission stalls after this.

13-stable.
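
The c/h/t sequence above can be read in terms of free TX slots. A minimal C sketch follows, assuming a 2048-slot ring (inferred from the tail value 2047; the report does not state the ring size) and the usual netmap convention that userspace may fill the slots from head up to, but not including, tail. The helper name tx_ring_space is hypothetical:

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of TX slots userspace may still fill,
 * i.e. the slots from head up to (but not including) tail, with
 * wrap-around at num_slots.
 */
static uint32_t
tx_ring_space(uint32_t head, uint32_t tail, uint32_t num_slots)
{
	return (tail + num_slots - head) % num_slots;
}
```

Under that assumption the three reported states give 2044, 2045 and 0 free slots. The second NIOCTXSYNC leaves tail equal to head, which is how a full ring looks, even though nothing was queued in between.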
Comment 44 Vincenzo Maffione freebsd_committer freebsd_triage 2021-05-18 21:26:09 UTC
What if you set
hw.ixl.enable_head_writeback = 0
in /boot/loader.conf and reboot?
Comment 45 Francois ten Krooden 2021-05-21 13:10:37 UTC
(In reply to Vincenzo Maffione from comment #44)

I was testing a different application on a box with an ixl card.
I also noticed the drop in the tx packets.
Setting hw.ixl.enable_head_writeback = 0 in /boot/loader.conf seems to have resolved the issue with the preliminary tests I have done.
Comment 46 Francois ten Krooden 2021-05-24 05:54:19 UTC
(In reply to strongswan from comment #45)
I did some more testing, and even with the setting hw.ixl.enable_head_writeback = 0 I still get into a situation where no packets are transmitted.
However, the interval between occurrences of the issue is much longer.
Comment 47 Vincenzo Maffione freebsd_committer freebsd_triage 2021-05-26 19:36:31 UTC
What is the state of the TX ring (head, cur, tail) when stalling?