Hello! I have an ixl NIC (chip=0x15728086 'Ethernet Controller X710 for 10GbE SFP+') and I'm trying to get it working with netmap. When compiling the kernel with ixl support I got link errors about undefined references to 'ixl_rx_miss', 'ixl_rx_miss_bufs' and 'ixl_crcstrip', so I modified ixl_txrx.c and added the definitions like this:

#ifdef DEV_NETMAP
#include <dev/netmap/if_ixl_netmap.h>
int ixl_rx_miss = 0, ixl_rx_miss_bufs = 0, ixl_crcstrip = 1;
#endif /* DEV_NETMAP */

With that the kernel compiled successfully and I now see the ixl interfaces in "ifconfig". But when I try netmap on them, it doesn't seem to work. While my application on top of netmap runs, "dmesg" shows:

Aug 8 14:46:56 rt1 kernel: 415.918289 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug 8 14:46:56 rt1 kernel: 415.973692 [1758] netmap_ring_reinit called for ixl0 TX3
Aug 8 14:46:56 rt1 kernel: 415.990602 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:46:56 rt1 kernel: 416.006198 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207
Aug 8 14:46:56 rt1 kernel: 416.032990 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug 8 14:46:56 rt1 kernel: 416.088614 [1758] netmap_ring_reinit called for ixl0 TX3
Aug 8 14:46:56 rt1 kernel: 416.105520 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:46:56 rt1 kernel: 416.121113 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207
Aug 8 14:46:57 rt1 kernel: 417.089185 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug 8 14:46:57 rt1 kernel: 417.144605
[1758] netmap_ring_reinit called for ixl0 TX3
Aug 8 14:46:57 rt1 kernel: 417.161510 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:46:57 rt1 kernel: 417.177110 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207
Aug 8 14:46:58 rt1 kernel: 418.138193 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 2 c 2 t 1387166207 rh 1 rc 1 rt 1387166207 hc 1 ht 1387166207
Aug 8 14:46:58 rt1 kernel: 418.193599 [1758] netmap_ring_reinit called for ixl0 TX3
Aug 8 14:46:58 rt1 kernel: 418.210507 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:46:58 rt1 kernel: 418.226096 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 2 -> 1 tail 1387166207 -> 1387166207

Using pkt-gen from the netmap GitHub repository I'm able to receive packets, but not to transmit them. Command: pkt-gen -i ixl0 -f tx

637.872347 main [2593] interface is ixl0
637.872394 main [2727] running on 1 cpus (have 8)
637.872601 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
637.872618 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
638.046374 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
638.046466 main [2919] Sending 512 packets every 0.000000000 s
638.046507 start_threads [2274] Wait 2 secs for phy reset
640.145075 start_threads [2276] Ready...
640.145254 sender_body [1464] start, fd 3 main_fd 3
640.863306 sender_body [1538] poll error on 3 ring 0-7
641.198102 main_thread [2364] 7.780 Kpps (8.191 Kpkts 3.932 Mbps in 1052845 usec) 511.94 avg_batch 0 min_space
641.372908 main_thread [2391] ouch, thread 0 exited with error
Sent 8191 packets 491460 bytes 16 events 60 bytes each in -1533750640.15 seconds.
Speed: -0.000 pps Bandwidth: -0.003 bps (raw -0.004 bps).
Average batch: 511.94 pkts

"dmesg" then shows:

Aug 8 14:51:48 rt1 kernel: 708.870527 [1637] nm_txsync_prologue ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -278400513 rh 512 rc 512 rt -278400513 hc 512 ht -278400513
Aug 8 14:51:48 rt1 kernel: 708.927494 [1758] netmap_ring_reinit called for ixl0 TX1
Aug 8 14:51:49 rt1 kernel: 708.944399 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 708.959993 [1787] netmap_ring_reinit ixl0 TX1 reinit, cur 0 -> 512 tail -278400513 -> -278400513
Aug 8 14:51:49 rt1 kernel: 708.987295 [1637] nm_txsync_prologue ixl0 TX2: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -440139265 rh 512 rc 512 rt -440139265 hc 512 ht -440139265
Aug 8 14:51:49 rt1 kernel: 709.044489 [1758] netmap_ring_reinit called for ixl0 TX2
Aug 8 14:51:49 rt1 kernel: 709.061399 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 709.076993 [1787] netmap_ring_reinit ixl0 TX2 reinit, cur 0 -> 512 tail -440139265 -> -440139265
Aug 8 14:51:49 rt1 kernel: 709.104291 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n ||
Aug 8 14:51:49 rt1 kernel: kring->nr_hwtail >= n' h 0 c 0 t 1455426047 rh 512 rc 512 rt 1455426047 hc 512 ht 1455426047
Aug 8 14:51:49 rt1 kernel: 709.161491 [1758] netmap_ring_reinit called for ixl0 TX3
Aug 8 14:51:49 rt1 kernel: 709.178394 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 709.193987 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 0 -> 512 tail 1455426047 -> 1455426047
Aug 8 14:51:49 rt1 kernel: 709.221304 [1637] nm_txsync_prologue ixl0 TX4: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t 1457261055 rh 512 rc 512 rt 1457261055 hc 512 ht 1457261055
Aug 8 14:51:49 rt1 kernel: 709.278488 [1758] netmap_ring_reinit called for ixl0 TX4
Aug 8 14:51:49 rt1
kernel: 709.295391 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 709.310998 [1787] netmap_ring_reinit ixl0 TX4 reinit, cur 0 -> 512 tail 1457261055 -> 1457261055
Aug 8 14:51:49 rt1 kernel: 709.338286 [1637] nm_txsync_prologue ixl0 TX5: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t 1744312831 rh 512 rc 512 rt 1744312831 hc 512 ht 1744312831
Aug 8 14:51:49 rt1 kernel: 709.395485 [1758] netmap_ring_reinit called for ixl0 TX5
Aug 8 14:51:49 rt1 kernel: 709.412388 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 709.427981 [1787] netmap_ring_reinit ixl0 TX5 reinit, cur 0 -> 512 tail 1744312831 -> 1744312831
Aug 8 14:51:49 rt1 kernel: 709.455284 [1637] nm_txsync_prologue ixl0 TX6: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1586220545 rh 512 rc 512 rt -1586220545 hc 512 ht -1586220545
Aug 8 14:51:49 rt1 kernel: 709.513263 [1758] netmap_ring_reinit called for ixl0 TX6
Aug 8 14:51:49 rt1 kernel: 709.530166 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 709.545759 [1787] netmap_ring_reinit ixl0 TX6 reinit, cur 0 -> 512 tail -1586220545 -> -1586220545
Aug 8 14:51:49 rt1 kernel: 709.573581 [1637] nm_txsync_prologue ixl0 TX7: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1429192193 rh 512 rc 512 rt -1429192193 hc 512 ht -1429192193
Aug 8 14:51:49 rt1 kernel: 709.631560 [1758] netmap_ring_reinit called for ixl0 TX7
Aug 8 14:51:49 rt1 kernel: 709.648463 [1783] netmap_ring_reinit total 1 errors
Aug 8 14:51:49 rt1 kernel: 709.664056 [1787] netmap_ring_reinit ixl0 TX7 reinit, cur 0 -> 512 tail -1429192193 -> -1429192193

Am I missing something? Thanks!
The X710 has a couple of problems with netmap. I experienced them with all available firmware versions for the NIC, so I just downgraded to X520 or Chelsio. For me the NIC just freezes: carrier active and incoming packets visible via tcpdump, but nothing else ...
If I run pkt-gen with the rate option, it does send packets, like this: pkt-gen -f tx -R 150000. But if I use a value greater than 150k, e.g. 160000, I get the same error:

./pkt-gen -i ixl0 -f tx -R 160000
812.653021 main [2593] interface is ixl0
812.653078 main [2727] running on 1 cpus (have 8)
812.653279 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
812.653296 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
812.826944 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
812.827043 main [2919] Sending 512 packets every 0.003200000 s
812.827085 start_threads [2274] Wait 2 secs for phy reset
814.836422 start_threads [2276] Ready...
814.836644 sender_body [1464] start, fd 3 main_fd 3
815.838007 main_thread [2364] 0.000 pps (0.000 pkts 0.000 bps in 1001364 usec) 0.00 avg_batch 0 min_space
816.134343 sender_body [1538] poll error on 3 ring 0-7
816.886366 main_thread [2364] 1.954 Kpps (2.048 Kpkts 983.040 Kbps in 1048359 usec) 341.33 avg_batch 99999 min_space
817.061069 main_thread [2391] ouch, thread 0 exited with error
Sent 2048 packets 122880 bytes 6 events 60 bytes each in -1534269816.00 seconds.
Speed: -0.000 pps Bandwidth: -0.001 bps (raw -0.001 bps).
Average batch: 341.33 pkts

And the kernel log:

Aug 14 15:03:36 rt1 kernel: 816.016016 [1637] nm_txsync_prologue ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 275 c 275 t -278400513 rh 274 rc 274 rt -278400513 hc 274 ht -278400513
Aug 14 15:03:36 rt1 kernel: 816.074020 [1758] netmap_ring_reinit called for ixl0 TX1
Aug 14 15:03:36 rt1 kernel: 816.090925 [1783] netmap_ring_reinit total 1 errors
Aug 14 15:03:36 rt1 kernel: 816.106519 [1787] netmap_ring_reinit ixl0 TX1 reinit, cur 275 -> 274 tail -278400513 -> -278400513

These errors seem to be related to synchronization of HEAD/CUR/TAIL in the netmap ring.
I've noticed that polling for POLLIN with an infinite timeout, i.e. poll(pfd, rxrings, -1), doesn't work; I need something like poll(pfd, rxrings, 1) instead. But with that I have a latency problem. I've also noticed that I can only TX from TX ring 0; using any TX ring greater than 0 doesn't work. In my test scenario my ixl NIC has 8 rings. Can anyone understand the reason? Thanks!
The compilation problems have been fixed. Which FreeBSD version are you using? We need to understand if your ixl driver is backed by iflib or not.
(In reply to Vincenzo Maffione from comment #4) Hello Vincenzo, thank you for your answer! I was running the tests on FreeBSD 11.2-STABLE. In the coming months I'll upgrade to 12.0-STABLE.
I'm facing the same problem on FreeBSD 11.2-p8. My NICs are Intel X722. I'm trying to use Suricata with netmap, but no packets are received. However, an Intel i350 NIC (igb) installed on the same hardware is working.
Hi, from the log it's quite clear that the problem is that the netmap TXSYNC is reading a random value for the hw HEAD index, that is, the index of the last completed TX descriptor. Now, in the driver there are two ways to get this index, depending on the value of `hw.ixl.enable_head_writeback`. But the ixl driver seems to be aware of this difference, and prevents the use of netmap if this is not possible. So I don't quite understand why this is not working. What is the value of `sysctl hw.ixl.enable_head_writeback` in your setup? Also, what does `dmesg | grep "netmap queues"` say? In any case, this affects 11.x, because it does not use iflib yet. From 12.x on, iflib is used and netmap support is provided through iflib, which means that netmap works on ixl iff the regular network stack works on ixl.
Btw, I prepared the following clean-up patch for ixl, any testing is welcome: https://reviews.freebsd.org/D18984
Hello, I have run many tests for this patch. On a stable/11 kernel, both with and without this patch, netmap doesn't work: Suricata cannot capture any packets, and the counters are always zero. On a releng/11.2 kernel (FreeBSD 11.2-RELEASE-p8 #0 r343486), netmap with Suricata works, but the ixl patch cannot be applied there. I also tested release/12.0, releng/12.0 and stable/12. Netmap doesn't work on any of the 12.0 branches: Suricata cannot capture any packets, and the counters are always zero. I think some commit after 11.2-p8 breaks netmap support.
(In reply to Ozkan KIRIK from comment #9) Hi, thanks a lot for testing and for the patience. Yes, this patch was not meant to fix the issue, but just to clean up a little bit. It would help to know the value of `sysctl hw.ixl.enable_head_writeback` in your setup, and what `dmesg | grep "netmap queues"` says. Could you please point me at the URL where you got your "releng/11.2 kernel (FreeBSD 11.2-RELEASE-p8 #0 r343486)" exactly? You are saying that ixl in this version works, but r343486 corresponds to HEAD, which you are saying doesn't work... If ixl works on releng/11.2, then we can check what happened since then. If you found that ixl/netmap doesn't work on 12.x, it means that the issue there is in iflib, since ixl uses iflib, and iflib provides netmap support for all the drivers. We probably need to open another ticket for that, since it is a completely different piece of code.
(In reply to Vincenzo Maffione from comment #10) If I'm reading the history right, there was only one change to ixl between 11.2-RELEASE and 11-STABLE: https://github.com/freebsd/freebsd/commits/stable/11/sys/dev/ixl
(In reply to Krzysztof Galazka from comment #11) Which single change are you talking about exactly? If you can point me at two git commits (or two svn revisions), I can look at the diff. For now I suspect this commit, which introduces the hw.ixl.enable_head_writeback sysctl: https://github.com/freebsd/freebsd/commit/27d66545b33f8e4f36fdce1003ddbbda40f5a7bb
(In reply to Vincenzo Maffione from comment #12)

# git log upstream/releng/11.2..upstream/stable/11 sys/dev/ixl
commit 2889f6fc498ab04853661e2f57d23fbb150128d3
Author: vmaffione <vmaffione@FreeBSD.org>
Date: Tue Dec 4 17:40:56 2018 +0000

    MFC r339639
    netmap: align codebase to the current upstream (sha 8374e1a7e6941)
    [...]
    Approved by: gnn (mentor)
    Differential Revision: https://reviews.freebsd.org/D17364

That's the only patch I see in git log which is in 11-STABLE but not in 11.2-RELEASE.
Hello, sorry for the late response. I think my comment was not clear enough, so I'm going to explain my tests in detail:

------------------------------------------------------
Tested version : FreeBSD 12.0-STABLE
https://svnweb.freebsd.org/base/stable/12/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
------------------------------------------------------
Tested version : FreeBSD 12.0-p2
https://svnweb.freebsd.org/base/releng/12.0/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
------------------------------------------------------
Tested version : FreeBSD 12.0-RELEASE
https://svnweb.freebsd.org/base/release/12.0.0/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
------------------------------------------------------
Tested version : FreeBSD 11.2-STABLE
https://svnweb.freebsd.org/base/stable/11/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 0 packets captured 0 dropped -> NOT WORKING
igb0 => 0 packets captured 0 dropped -> NOT WORKING
ix0 => 0 packets captured 0 dropped -> NOT WORKING
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
------------------------------------------------------
Tested version : FreeBSD 11.2-p8
https://svnweb.freebsd.org/base/releng/11.2/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 0 packets captured 0 dropped
igb0 => 0 packets captured 0 dropped
ix0 => 0 packets captured 0
dropped
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
------------------------------------------------------
Tested version : FreeBSD 11.2-RELEASE
https://svnweb.freebsd.org/base/release/11.2.0/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 8 packets captured 0 dropped -> Working
igb0 => 9 packets captured 0 dropped -> Working
ix0 => 6 packets captured 0 dropped -> Working
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
------------------------------------------------------
CONCLUSION: According to the test results, netmap support is fully broken after FreeBSD 11.2-p8, and the ixl driver has never worked.
(In reply to Ozkan KIRIK from comment #14) There was a typo in my report. The true results for 11.2-p8 are below:

------------------------------------------------------
Tested version : FreeBSD 11.2-p8
https://svnweb.freebsd.org/base/releng/11.2/
Test NICs with suricata netmap if0 -> if0+ in IPS mode:
em0 => 5 packets captured 0 dropped -> Working
igb0 => 6 packets captured 0 dropped -> Working
ix0 => 7 packets captured 0 dropped -> Working
ixl0 => 0 packets captured 0 dropped -> NOT WORKING
(In reply to Krzysztof Galazka from comment #13) Thanks. However, the changes to ixl there are just compilation fixes that follow the many updates to the netmap code. It is also a bit unlikely that a netmap update broke only ixl and not all the other drivers (e.g. em, igb, vtnet-pci, cxgbe), which actually work after that change (on both 11.x and 12.x). So this may be a suricata-specific issue (more on that in my following answer).
(In reply to Ozkan KIRIK from comment #14) Hi Ozkan, thanks, that's very clear. Now, I can assure you that stock netmap applications (pkt-gen, bridge, lb, vale-ctl, ...) work fine on stable/11, stable/12, releng/12.0, release/12.0, etc., at least when working with virtual interfaces (vale(4), pipes, monitors, ptnet(4)) and with drivers such as em, igb and vtnet-pci. Regarding ixl support, it looks like it is broken in every version, and this is the first problem that we need to address. Second, if you are reporting that Suricata over netmap is not working at all, there must be some problem specific to Suricata. Can you please point me at the Suricata code and configuration you were using, so that I can debug it (e.g. over em or igb interfaces, rather than ixl)? It's better if the configuration is minimal. Also, it would help if you could provide "dmesg | grep netmap" from a machine where Suricata over "em" interfaces does not work. Thanks
(In reply to Vincenzo Maffione from comment #7) Hello Vincenzo!

# sysctl hw.ixl.enable_head_writeback
hw.ixl.enable_head_writeback: 1
# dmesg | grep "netmap queues"
ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
ixl1: netmap queues/slots: TX 1/1024, RX 1/1024
ixl2: netmap queues/slots: TX 1/1024, RX 1/1024
ixl3: netmap queues/slots: TX 1/1024, RX 1/1024

I have only one queue on this NIC. I usually work with X405 (82599ES) NICs that have 8 queues bound to 8 CPU cores, or 16 queues when bound to a 16-core machine. My question is: does this X710 NIC really have only one queue, or is my kernel miscompiled?
(In reply to Charles Goncalves from comment #18) Hi, thanks. hw.ixl.enable_head_writeback set to 1 looks good. The number of queues depends on the configuration, I guess. Did you change the hw.ixl.max_queues parameter dynamically (or maybe in rc.conf)?
(In reply to Vincenzo Maffione from comment #19) Oh yes, now I remember; that was about 5 months ago. I had set max_queues = 1 (in /boot/loader.conf) to test with pkt-gen, because with max_queues = 8 it doesn't work.

Output of pkt-gen with hw.ixl.max_queues=1:

# pkt-gen -i ixl0 -f tx
945.763330 main [2593] interface is ixl0
945.763376 main [2727] running on 1 cpus (have 8)
945.763602 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
945.763619 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
945.907652 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 1 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
945.907746 main [2919] Sending 512 packets every 0.000000000 s
945.907791 start_threads [2274] Wait 2 secs for phy reset
947.974562 start_threads [2276] Ready...
947.974737 sender_body [1464] start, fd 3 main_fd 3
947.989037 sender_body [1546] drop copy
949.001809 main_thread [2364] 4.285 Mpps (4.401 Mpkts 2.112 Gbps in 1027068 usec) 343.53 avg_batch 0 min_space
950.016811 main_thread [2364] 4.265 Mpps (4.329 Mpkts 2.078 Gbps in 1015002 usec) 341.05 avg_batch 99999 min_space
951.053574 main_thread [2364] 4.212 Mpps (4.367 Mpkts 2.096 Gbps in 1036763 usec) 341.39 avg_batch 99999 min_space
952.054597 main_thread [2364] 4.262 Mpps (4.266 Mpkts 2.048 Gbps in 1001023 usec) 341.66 avg_batch 99999 min_space

Now with hw.ixl.max_queues=8:

# pkt-gen -i ixl0 -f tx
734.500918 main [2593] interface is ixl0
734.500963 main [2727] running on 1 cpus (have 8)
734.501188 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
734.501205 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
734.651421 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
734.651514 main [2919] Sending 512 packets every 0.000000000 s
734.651558 start_threads [2274] Wait 2 secs for phy reset
736.666615 start_threads [2276] Ready...
736.666799 sender_body [1464] start, fd 3 main_fd 3
737.506822 sender_body [1538] poll error on 3 ring 0-7
737.677616 main_thread [2364] 8.103 Kpps (8.191 Kpkts 3.932 Mbps in 1010813 usec) 511.94 avg_batch 0 min_space
737.856747 main_thread [2391] ouch, thread 0 exited with error
Sent 8191 packets 491460 bytes 16 events 60 bytes each in -1548760736.67 seconds.
Speed: -0.000 pps Bandwidth: -0.003 bps (raw -0.004 bps).
Average batch: 511.94 pkts

With max_queues=8 I then get these errors in /var/log/messages:

Jan 29 09:21:29 rt1 kernel: 888.923727 [1637] nm_txsync_prologue ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1678351873 rh 512 rc 512 rt -1678351873 hc 512 ht -1678351873
Jan 29 09:21:29 rt1 kernel: 888.981468 [1758] netmap_ring_reinit called for ixl0 TX1
Jan 29 09:21:29 rt1 kernel: 888.998372 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.013966 [1787] netmap_ring_reinit ixl0 TX1 reinit, cur 0 -> 512 tail -1678351873 -> -1678351873
Jan 29 09:21:29 rt1 kernel: 889.041778 [1637] nm_txsync_prologue ixl0 TX2: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1676512769 rh 512 rc 512 rt -1676512769 hc 512 ht -1676512769
Jan 29 09:21:29 rt1 kernel: 889.099769 [1758] netmap_ring_reinit called for ixl0 TX2
Jan 29 09:21:29 rt1 kernel: 889.116664 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.132258 [1787] netmap_ring_reinit ixl0 TX2 reinit, cur 0 -> 512 tail -1676512769 -> -1676512769
Jan 29 09:21:29 rt1 kernel: 889.160084 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1674673665 rh
512 rc 512 rt -1674673665 hc 512 ht -1674673665
Jan 29 09:21:29 rt1 kernel: 889.218053 [1758] netmap_ring_reinit called for ixl0 TX3
Jan 29 09:21:29 rt1 kernel: 889.234957 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.250560 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 0 -> 512 tail -1674673665 -> -1674673665
Jan 29 09:21:29 rt1 kernel: 889.278372 [1637] nm_txsync_prologue ixl0 TX4: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1672834561 rh 512 rc 512 rt -1672834561 hc 512 ht -1672834561
Jan 29 09:21:29 rt1 kernel: 889.336348 [1758] netmap_ring_reinit called for ixl0 TX4
Jan 29 09:21:29 rt1 kernel: 889.353258 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.368847 [1787] netmap_ring_reinit ixl0 TX4 reinit, cur 0 -> 512 tail -1672834561 -> -1672834561
Jan 29 09:21:29 rt1 kernel: 889.396679 [1637] nm_txsync_prologue ixl0 TX5: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1670995457 rh 512 rc 512 rt -1670995457 hc 512 ht -1670995457
Jan 29 09:21:29 rt1 kernel: 889.454648 [1758] netmap_ring_reinit called for ixl0 TX5
Jan 29 09:21:29 rt1 kernel: 889.471553 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.487148 [1787] netmap_ring_reinit ixl0 TX5 reinit, cur 0 -> 512 tail -1670995457 -> -1670995457
Jan 29 09:21:29 rt1 kernel: 889.514967 [1637] nm_txsync_prologue ixl0 TX6: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n ||
Jan 29 09:21:29 rt1 kernel: kring->nr_hwtail >= n' h 0 c 0 t -1669156353 rh 512 rc 512 rt -1669156353 hc 512 ht -1669156353
Jan 29 09:21:29 rt1 kernel: 889.572947 [1758] netmap_ring_reinit called for ixl0 TX6
Jan 29 09:21:29 rt1 kernel: 889.589850 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.605444 [1787] netmap_ring_reinit ixl0 TX6 reinit, cur 0 -> 512 tail -1669156353 -> -1669156353
Jan 29 09:21:29 rt1
kernel: 889.633261 [1758] netmap_ring_reinit called for ixl0 TX7
Jan 29 09:21:29 rt1 kernel: 889.650170 [1783] netmap_ring_reinit total 1 errors
Jan 29 09:21:29 rt1 kernel: 889.665760 [1787] netmap_ring_reinit ixl0 TX7 reinit, cur 0 -> 512 tail -1667321345 -> -1667321345

I also tested this in my own netmap application, and based on these tests I think the error occurs whenever an application tries to write to a TX ring greater than 0 (TX1, TX2, and so on).
Suricata doesn't work with newer netmap code beyond 11.2-RELEASE, see https://github.com/OISF/suricata/pull/3616 for a patch which last I heard will likely be included in 4.1.3. Cheers, Franco
Hi, not sure if it fits your problem, but I too had crazy behavior with Suricata, netmap and ixl. I was using OPNsense (FreeBSD 11.1) with Intel X710 cards in a lab. When running Suricata in IDS mode everything worked fine; firing up IPS inline, the NIC suddenly stopped working. Carrier was there, and I was also seeing incoming ARP requests via tcpdump, but nothing more. I thought it was a problem with the X710, as they were also problematic on Linux, so I switched to X520; but those use ix and not ixl, so it too might be a problem with ixl, Suricata and netmap. Michael
(In reply to Charles Goncalves from comment #20) What happens if you change /boot/loader.conf to set the default number of queues to 8, and you never change that? Same behaviour?
(In reply to Franco Fichtner from comment #21) Thanks for the pointer, I'll follow up on the github to chase the problem.
(In reply to Vincenzo Maffione from comment #23) Without hw.ixl.max_queues set in /boot/loader.conf:

# sysctl hw.ixl.max_queues
hw.ixl.max_queues: 0
# dmesg | grep "netmap queues"
ixl0: netmap queues/slots: TX 8/1024, RX 8/1024
ixl1: netmap queues/slots: TX 8/1024, RX 8/1024
ixl2: netmap queues/slots: TX 8/1024, RX 8/1024
ixl3: netmap queues/slots: TX 8/1024, RX 8/1024

# pkt-gen -i ixl0 -f tx
415.475609 main [2593] interface is ixl0
415.475652 main [2727] running on 1 cpus (have 8)
415.475874 extract_ip_range [468] range is 10.0.0.1:1234 to 10.0.0.1:1234
415.475891 extract_ip_range [468] range is 10.1.0.1:1234 to 10.1.0.1:1234
415.644068 main [2822] mapped 294020KB at 0x801600000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
415.644158 main [2919] Sending 512 packets every 0.000000000 s
415.644202 start_threads [2274] Wait 2 secs for phy reset
417.645062 start_threads [2276] Ready...
417.645243 sender_body [1464] start, fd 3 main_fd 3
418.485366 sender_body [1538] poll error on 3 ring 0-7
418.708562 main_thread [2364] 7.702 Kpps (8.190 Kpkts 3.931 Mbps in 1063315 usec) 511.88 avg_batch 0 min_space
418.878363 main_thread [2391] ouch, thread 0 exited with error
Sent 8190 packets 491400 bytes 16 events 60 bytes each in -1548782417.65 seconds.
Speed: -0.000 pps Bandwidth: -0.003 bps (raw -0.004 bps).
Average batch: 511.88 pkts

# tail -F /var/log/messages
Jan 29 15:20:17 rt1 kernel: 417.657494 [1637] nm_txsync_prologue ixl0 TX1: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1843510785 rh 512 rc 512 rt -1843510785 hc 512 ht -1843510785
Jan 29 15:20:17 rt1 kernel: 417.715239 [1758] netmap_ring_reinit called for ixl0 TX1
Jan 29 15:20:17 rt1 kernel: 417.732143 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:17 rt1 kernel: 417.747738 [1787] netmap_ring_reinit ixl0 TX1 reinit, cur 0 -> 512 tail -1843510785 -> -1843510785
Jan 29 15:20:17 rt1 kernel: 417.775556 [1637] nm_txsync_prologue ixl0 TX2: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1841671681 rh 512 rc 512 rt -1841671681 hc 512 ht -1841671681
Jan 29 15:20:17 rt1 kernel: 417.833536 [1758] netmap_ring_reinit called for ixl0 TX2
Jan 29 15:20:17 rt1 kernel: 417.850440 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:17 rt1 kernel: 417.866031 [1787] netmap_ring_reinit ixl0 TX2 reinit, cur 0 -> 512 tail -1841671681 -> -1841671681
Jan 29 15:20:17 rt1 kernel: 417.893854 [1637] nm_txsync_prologue ixl0 TX3: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1839832577 rh 512 rc 512 rt -1839832577 hc 512 ht -1839832577
Jan 29 15:20:18 rt1 kernel: 417.951828 [1758] netmap_ring_reinit called for ixl0 TX3
Jan 29 15:20:18 rt1 kernel: 417.968738 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:18 rt1 kernel: 417.984330 [1787] netmap_ring_reinit ixl0 TX3 reinit, cur 0 -> 512 tail -1839832577 -> -1839832577
Jan 29 15:20:18 rt1 kernel: 418.012148 [1637] nm_txsync_prologue ixl0 TX4: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1837993473 rh 512 rc 512 rt -1837993473 hc 512 ht -1837993473
Jan 29 15:20:18 rt1 kernel: 418.070125 [1758] netmap_ring_reinit called for ixl0 TX4
Jan 29
15:20:18 rt1 kernel: 418.087037 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.102626 [1787] netmap_ring_reinit ixl0 TX4 reinit, cur 0 -> 512 tail -1837993473 -> -1837993473
Jan 29 15:20:18 rt1 kernel: 418.130451 [1637] nm_txsync_prologue ixl0 TX5: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n ||
Jan 29 15:20:18 rt1 kernel: kring->nr_hwtail >= n' h 0 c 0 t -1836154369 rh 512 rc 512 rt -1836154369 hc 512 ht -1836154369
Jan 29 15:20:18 rt1 kernel: 418.188425 [1758] netmap_ring_reinit called for ixl0 TX5
Jan 29 15:20:18 rt1 kernel: 418.205330 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.220926 [1787] netmap_ring_reinit ixl0 TX5 reinit, cur 0 -> 512 tail -1836154369 -> -1836154369
Jan 29 15:20:18 rt1 kernel: 418.248752 [1637] nm_txsync_prologue ixl0 TX6: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1834315265 rh 512 rc 512 rt -1834315265 hc 512 ht -1834315265
Jan 29 15:20:18 rt1 kernel: 418.306721 [1758] netmap_ring_reinit called for ixl0 TX6
Jan 29 15:20:18 rt1 kernel: 418.323624 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.339222 [1787] netmap_ring_reinit ixl0 TX6 reinit, cur 0 -> 512 tail -1834315265 -> -1834315265
Jan 29 15:20:18 rt1 kernel: 418.367043 [1637] nm_txsync_prologue ixl0 TX7: fail 'kring->nr_hwcur >= n || kring->rhead >= n || kring->rtail >= n || kring->nr_hwtail >= n' h 0 c 0 t -1832476161 rh 512 rc 512 rt -1832476161 hc 512 ht -1832476161
Jan 29 15:20:18 rt1 kernel: 418.425019 [1758] netmap_ring_reinit called for ixl0 TX7
Jan 29 15:20:18 rt1 kernel: 418.441929 [1783] netmap_ring_reinit total 1 errors
Jan 29 15:20:18 rt1 kernel: 418.457518 [1787] netmap_ring_reinit ixl0 TX7 reinit, cur 0 -> 512 tail -1832476161 -> -1832476161
(In reply to Charles Goncalves from comment #25) Thanks! So this means that the issue is not related to the number of queues changing dynamically.
(In reply to Vincenzo Maffione from comment #26) I'll try to test on 12.0 today and give you some feedback. Thank you!
(In reply to Charles Goncalves from comment #27) Hi! I tested ixl on top of 12.0 and it works with pkt-gen:

# dmesg | grep 'netmap queues'
ixl0: netmap queues/slots: TX 8/1024, RX 8/1024
ixl1: netmap queues/slots: TX 8/1024, RX 8/1024
ixl2: netmap queues/slots: TX 8/1024, RX 8/1024
ixl3: netmap queues/slots: TX 8/1024, RX 8/1024

I can't find hw.ixl.max_queues via sysctl on 12.0.

# ./pkt-gen -i ixl0 -f tx
518.683898 main [2889] interface is ixl0
518.683944 main [3011] using default burst size: 512
518.683954 main [3019] running on 1 cpus (have 8)
518.684152 extract_ip_range [471] range is 10.0.0.1:1234 to 10.0.0.1:1234
518.684169 extract_ip_range [471] range is 10.1.0.1:1234 to 10.1.0.1:1234
518.684235 nm_open [856] overriding ARG1 0
518.684246 nm_open [860] overriding ARG2 0
518.684251 nm_open [864] overriding ARG3 0
518.684254 nm_open [868] overriding RING_CFG
518.684258 nm_open [877] overriding ifname ixl0 ringid 0x0 flags 0x8001
518.853577 main [3117] mapped 334980KB at 0x800e00000
Sending on netmap:ixl0: 8 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
518.853734 main [3224] Sending 512 packets every 0.000000000 s
518.853833 start_threads [2549] Wait 2 secs for phy reset
520.979092 start_threads [2551] Ready...
520.979263 sender_body [1580] start, fd 3 main_fd 3
520.979543 sender_body [1638] frags 1 frag_size 60
521.010053 sender_body [1676] drop copy
521.981085 main_thread [2639] 15.143 Mpps (15.170 Mpkts 7.282 Gbps in 1001818 usec) 200.31 avg_batch 0 min_space
522.983084 main_thread [2639] 15.503 Mpps (15.534 Mpkts 7.456 Gbps in 1001999 usec) 189.22 avg_batch 99999 min_space
^C523.664717 sigint_h [562] received control-C on thread 0x800747000
523.664771 main_thread [2639] 15.501 Mpps (10.567 Mpkts 5.072 Gbps in 681687 usec) 195.10 avg_batch 99999 min_space
523.664853 sender_body [1718] flush tail 318 head 318 on thread 0x800747500
524.672164 main_thread [2639] 1.525 Kpps (1.536 Kpkts 737.280 Kbps in 1007393 usec) 192.00 avg_batch 99999 min_space
Sent 41272863 packets 2476371780 bytes 212000 events 60 bytes each in 2.69 seconds.
Speed: 15.369 Mpps Bandwidth: 7.377 Gbps (raw 7.377 Gbps).
Average batch: 194.68 pkts
(In reply to Charles Goncalves from comment #28) ixl in 12.0 uses iflib to handle queue allocation:

# sysctl -d dev.ixl.0.iflib.override_ntxqs
dev.ixl.0.iflib.override_ntxqs: # of txqs to use, 0 => use default
# sysctl -d dev.ixl.0.iflib.override_nrxqs
dev.ixl.0.iflib.override_nrxqs: # of rxqs to use, 0 => use default
(In reply to Charles Goncalves from comment #28) Good. I'm not surprised this works, because ixl in FreeBSD 12.x is implemented through iflib, and netmap in this case uses iflib to access the hardware (and iflib must work, otherwise you would not be able to use an ixl NIC with the traditional networking tools and applications). You can change the number of queues, slots and many other things through the generic iflib configuration tools; see iflib(4). On the other hand, ixl in FreeBSD 11.x is not implemented through iflib, so it is completely separate code. And netmap is broken there, unfortunately. I notified the maintainers at Intel; they said they would take a look at it.
(In reply to Jeff Pieper from comment #29) Thank you for your reply! I set dev.ixl.0.iflib.override_ntxqs and dev.ixl.0.iflib.override_nrxqs to "4", but I can still see sysctls like "dev.ixl.0.iflib.txq7.r_drops". Can I change these sysctl values dynamically, or only at boot time?
(In reply to Charles Goncalves from comment #31) I believe this requires a driver reset. If you are using a static driver (compiled into the kernel), then yes, it has to be done at boot. If you are using a driver module, then with the driver unloaded you can do:

# kenv dev.ixl.0.iflib.override_ntxqs=<val>
# kenv dev.ixl.0.iflib.override_nrxqs=<val>

Then load the driver.
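Alternatively (this is an assumption based on how iflib reads boot-time tunables, so please verify on your system), the same kenv variables can be set persistently from /boot/loader.conf, which also covers the static-driver case:

```
# /boot/loader.conf -- unit 0 assumed; adjust device/unit and values as needed
dev.ixl.0.iflib.override_ntxqs="4"
dev.ixl.0.iflib.override_nrxqs="4"
```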
(In reply to Jeff Pieper from comment #32) Thank you, it works!
(In reply to Vincenzo Maffione from comment #30) So this is only an issue on 11.x. I'll wait for a fix from the maintainers for 11.2. Thank you!
(In reply to Franco Fichtner from comment #21) Hi, I tried to install suricata from github sources on 12.0-RELEASE. I use the following commands to run suricata over an e1000 interface:

````
sudo ifconfig em1 up -arp promisc -rxcsum -txcsum -rxcsum6 -txcsum6 -tso -tso4 -tso6 -lro -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso
sudo suricata -c /etc/suricata/suricata.yaml --netmap=em1 -v
````

This is the netmap section of my suricata.yaml:

````
netmap:
  - interface: default
    threads: auto
    copy-mode: ips
    disable-promisc: yes # promiscuous mode
    checksum-checks: auto
  - interface: em1
    copy-iface: em1+
  - interface: em1+
    copy-iface: em1
````

and I see packets being captured:

````
[100078] 3/2/2019 -- 14:01:59 - (util-device.c:329) <Notice> (LiveDeviceListClean) -- Stats for 'em1': pkts: 4892, drop: 0 (0.00%), invalid chksum: 0
````

So what is not working exactly? Can anyone describe reproducible steps that I can follow?
Issue confirmed in suricata, let's wait for them to merge the fix. https://github.com/OISF/suricata/pull/3616
I also hit the same problem: ixl using iflib on 12 still has problems. Bridging more than 1 Gbps of traffic does not work and no data passes, while a ping test shows no problem.
^Triage: assign to committer (apparently) resolving. @Vincenzo What are the actual/remaining issues here, and the change delta(s), if any, to be made in order to resolve? Is https://reviews.freebsd.org/D18984 still relevant (it's closed) and related to this issue? Is this just a suricata issue? Is comment 37 relevant, or unrelated?
(In reply to Kubilay Kocak from comment #38) This is not a suricata issue. There was a suricata issue mentioned in this thread, but it has been fixed upstream (suricata). Comment #37 seems unrelated, since it mentions netmap with ixl in 12.x, where iflib is in use. There was an iflib/netmap bug (see https://reviews.freebsd.org/D25252) that may explain the problems briefly mentioned in #37, but that fix is now in HEAD and stable/12. This report is about a bug that apparently affects netmap TX over ixl in 11.x (but not in 12.x and later). This change https://reviews.freebsd.org/D18984 does some cleanup, but it does not fix the bug. As you can see in the discussion, I reported the issue to the Intel developers, but as far as I know there have been no changes on their side (in stable/11). So I assume the bug is still there, and it needs the Intel developers' attention, if anyone is still interested in netmap+ixl on 11.x.
The remaining issue is specific to the Intel driver. There's nothing more I can do here. If there is still interest, someone at Intel should take it.
(In reply to Vincenzo Maffione from comment #40) Hello Vincenzo! Is this issue present on FreeBSD 12.2 or 13.0? I don't have a test environment right now, but I will upgrade a production router from 12.1 to 12.2 and then to 13.0 in the following months, so I can test it then. This router has an ixl NIC (chip=0x15838086 'Ethernet Controller XL710 for 40GbE QSFP+').
(In reply to Charles Goncalves from comment #41) I don't have a test environment either. But since ixl uses iflib on 12.x and 13.x, I expect this issue has gone away.
(In reply to Vincenzo Maffione from comment #42) Looks like netmap doesn't work:

# /usr/obj/usr/src/amd64.amd64/tools/tools/netmap/pkt-gen -i ixl1 -f tx
321.990539 main [2921] interface is ixl1
321.990568 main [3044] using default burst size: 512
321.990573 main [3052] running on 1 cpus (have 24)
321.990640 extract_ip_range [476] range is 10.0.0.1:1234 to 10.0.0.1:1234
321.990645 extract_ip_range [476] range is 10.1.0.1:1234 to 10.1.0.1:1234
Sending on netmap:ixl1: 5 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
322.096770 main [3255] Sending 512 packets every 0.000000000 s
322.096813 start_threads [2580] Wait 2 secs for phy reset
324.222299 start_threads [2582] Ready...
324.222365 sender_body [1599] start, fd 3 main_fd 3
324.222392 sender_body [1657] frags 1 frag_size 60
324.234391 sender_body [1695] drop copy
325.285776 main_thread [2671] 2.794 Mpps (2.971 Mpkts 1.341 Gbps in 1063411 usec) 15.05 avg_batch 0 min_space
326.348859 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063084 usec) 0.00 avg_batch 99999 min_space
326.472386 sender_body [1682] poll error on queue 0: timeout
327.411859 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063000 usec) 0.00 avg_batch 99999 min_space
328.473456 sender_body [1682] poll error on queue 0: timeout
328.474874 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063015 usec) 0.00 avg_batch 99999 min_space
329.537820 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1062945 usec) 0.00 avg_batch 99999 min_space
330.474386 sender_body [1682] poll error on queue 0: timeout
330.600771 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1062951 usec) 0.00 avg_batch 99999 min_space
331.663860 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063090 usec) 0.00 avg_batch 99999 min_space
332.475381 sender_body [1682] poll error on queue 0: timeout
332.726861 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063001 usec) 0.00 avg_batch 99999 min_space
^C333.671467 sigint_h [573] received control-C on thread 0x800a12000
333.671475 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 944614 usec) 0.00 avg_batch 99999 min_space
334.476434 sender_body [1737] flush tail 576 head 576 on thread 0x800a12700
334.734834 main_thread [2671] 0.000 pps (0.000 pkts 0.000 bps in 1063359 usec) 0.00 avg_batch 99999 min_space
Sent 2971392 packets 178283520 bytes 197414 events 60 bytes each in 10.25 seconds.
Speed: 289.777 Kpps Bandwidth: 139.093 Mbps (raw 139.093 Mbps).
Average batch: 15.05 pkts

Additionally, in my application I see logical errors from the kernel:
- I send 3 packets on ring 0; c/h/t is 3/3/2047.
- I do NIOCTXSYNC; c/h/t is 3/3/0.
- Without sending any more packets, I do NIOCTXSYNC again; c/h/t is 3/3/3 now!
i.e. it looks like the TX ring is full and stalled. All transmission is stalled after this. 13-stable.
What if you set hw.ixl.enable_head_writeback = 0 in /boot/loader.conf and reboot?
(In reply to Vincenzo Maffione from comment #44) I was testing a different application on a box with an ixl card. I also noticed the drop in TX packets. Setting hw.ixl.enable_head_writeback = 0 in /boot/loader.conf seems to have resolved the issue in the preliminary tests I have done.
(In reply to strongswan from comment #45) I did some more testing, and even with hw.ixl.enable_head_writeback = 0 I still get into a situation where no packets are transmitted. However, the interval between occurrences of the issue is much longer.
What is the state of the TX ring (head, cur, tail) when stalling?