Summary: | Traffic shaper unable to perform beyond 1Gbit/s limit | |
---|---|---|---
Product: | Base System | Reporter: | e94pasch
Component: | misc | Assignee: | freebsd-net (Nobody) <net>
Status: | Closed Not Enough Information | |
Severity: | Affects Only Me | CC: | franco, garga, kbowling, kp
Priority: | --- | Keywords: | needs-qa
Version: | Unspecified | |
Hardware: | amd64 | |
OS: | Any | |
See Also: | https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194453 | |
Attachments: |
|
Description
e94pasch
2021-08-21 20:15:52 UTC
You're going to have to share a lot more detail about your configuration before we can possibly help you. Also, this is the wrong place to ask for support for Opnsense.

Can we employ more professionalism here and not close this prematurely while "asking for details"? This pertains to at least FreeBSD 12.1 and I don't know of any changes that would indicate this is any better all the way up to CURRENT, but happy for any type of insightful pointer. Cheers, Franco

(In reply to Franco Fichtner from comment #2) Franco, I can't even tell if this is a dummynet or an ALTQ problem from this report. There's nothing that can be done based on this information. Moreover, again, this is not the place for Opnsense support. Until and unless Opnsense runs on top of unmodified FreeBSD this is not the place to report these problems. Yes, I am very impatient with useless bug reports. We get more problem reports than we can possibly deal with, and keeping un-actionable reports around does no one any favours. If relevant information is eventually provided we can always reopen this.

Kristof, I send people here when we can establish the likelihood that this is an OS area of interest. I don't tell them what to do and I don't tell you what you should do. However, if you close tickets based on your assumption that we send people here all the time, I can tell you that assumption is off, so I will simply say that is neither nice nor productive and that's it. When you assume that OPNsense is not FreeBSD, that assumption is off as well and I think you know that too, but insist otherwise. Yes, this is about dummynet and the reporter will have to add more information about his configuration in any case, but please be aware you do not have to safeguard this bugtracker from reports or reporters who are possibly using FreeBSD or open source software for the first time and don't know what qualifies as a good and informative report yet. We all have to deal with this situation, possibly on a daily basis.
I can tell you that panics exist in the WFQ scheduler that haven't been worked on in years. I can also tell you that the MBPS calculation is off after about 500 MBit/s of specification, where you have to start overcommitting on bandwidth to match actual line speed. This report may be about the latter or something else. The fact that we don't know yet is the only reason this is taking an unprofessional turn after a single day of the report being sent. Cheers, Franco

(In reply to Franco Fichtner from comment #4) I'm sorry, but we don't support anything other than FreeBSD here. A quick diff between FreeBSD and Opnsense (I assume, I have no idea how your branching strategy works) results in: "73 files changed, 5924 insertions(+), 3519 deletions(-)". I'm well aware that there are any number of bugs in dummynet (and ALTQ for that matter). I'm currently arguing with it myself. That does not change the fact that this is FreeBSD's bug tracker, so there's no point in reporting issues here unless they are reproduced on FreeBSD. We don't take Opnsense bug reports here any more than we take Playstation or Netflix OCA bug reports. Aside from the fact that it's a report against the wrong OS, we still don't have any sort of setup information. Absent a reproduction case on FreeBSD I will not engage on this bug any more.

Without evidence it sounds like the user may be running out of CPU budget rather than hitting any particular bug. I agree with Kristof that we need more information, and the vendor/project needs to help qualify what and where issues are first, or this will fatigue devs if it turns into a raw support forum for issues like that. If there is an obvious FreeBSD defect (for instance a panic in unmodified code) I'd send it here right away.
Otherwise I think you should figure out how to help users produce more qualified reports. A debug bundle including things like dmesg, pciconf -lv, relevant sysctls (to include stats ones for the NIC), netstat, etc. is the standard Mellanox wants when I interact with them in my day job; maybe worth consideration here. In this case I'd suggest a more direct approach of some 'top' output while bw limited, and even better some pmcstat for at least instructions retired: https://wiki.freebsd.org/PmcTools/PmcTop.

As far as Kristof's line of argumentation is concerned, I'll take note of the hostility that has (elsewhere) never sparked any useful result in the past. I don't blame him and I know where this is coming from. ;) I do get what you are asking, but for us as a core team, trying to force hours of qualifying user reports for at least 67 upstream projects of varying scope, source code, complexity and rules, without any pledge of hand-over, is a waste of time between the involved projects. That's also assuming we don't spend any time on our own code. It's not a realistic attitude. But as I said, I agree that reporters need to do more here if you let them. The only project that gave such a pledge is Suricata, and they did include all of FreeBSD because of it, making it a tier 1 priority. I've personally seen issues lie dormant for years, patch-ready reports time out, committers nowhere to be found (src and ports alike). If this wasn't the case I wouldn't try to argue here that this is one step below, but it is. Cheers, Franco

Created attachment 227369 [details]
Logs, Info and Screen shots
Hello, please take a look at the information attached.
Thanks!
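For anyone preparing a similar report, the debug bundle suggested earlier in the thread (dmesg, pciconf -lv, sysctls, netstat) could be gathered with something like the following. This is only a sketch: the function names `collect_bundle` and `format_bundle` are made up for illustration, and the flags `netstat -i` and `sysctl -a` are assumptions, since the thread only names "netstat" and "relevant sysctls".

```python
import shutil
import subprocess

# Commands named in the thread. Flags other than "pciconf -lv" are assumptions:
# the thread only says "netstat" and "relevant sysctls".
DEFAULT_COMMANDS = {
    "dmesg": ["dmesg"],
    "pciconf": ["pciconf", "-lv"],
    "sysctl": ["sysctl", "-a"],    # assumption: dump everything as a coarse stand-in
    "netstat": ["netstat", "-i"],  # assumption: per-interface counters
    "kldstat": ["kldstat"],
}


def collect_bundle(commands=None, timeout=30):
    """Run each command that exists on this host; return {label: output or note}."""
    if commands is None:
        commands = DEFAULT_COMMANDS
    bundle = {}
    for label, argv in commands.items():
        # Skip tools that are not installed instead of failing the whole run.
        if shutil.which(argv[0]) is None:
            bundle[label] = f"(skipped: {argv[0]} not found)"
            continue
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            bundle[label] = result.stdout + result.stderr
        except (subprocess.TimeoutExpired, OSError) as exc:
            bundle[label] = f"(failed: {exc})"
    return bundle


def format_bundle(bundle):
    """Join the collected outputs into one text blob with a header per command."""
    sections = [f"===== {label} =====\n{text.rstrip()}" for label, text in bundle.items()]
    return "\n".join(sections) + "\n"
```

Calling `print(format_bundle(collect_bundle()))` as root and attaching the result would give reviewers a single file to look at; the 'top' and pmcstat output requested in the thread would still need to be captured separately while the shaper is active.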
(In reply to e94pasch from comment #8) Thanks, can you do 'top -PSH' instead? I will explain my reasoning so you can follow along. A stream is ordered and traverses the network stack on one core. Therefore it is dependent on single-core performance. Adding more work into the network stack, like shaping, can therefore result in the bw limits you are seeing. So, after looking at the revised top info (PSH will show cores, system, and threads) we can confirm if this is the case.

(In reply to Franco Fichtner from comment #7) We all have to work together somehow. I don't particularly care about the past or downstream project business concerns as long as everyone is doing good work on the mainline tree. I'd like help improving FreeBSD from anyone interested; there's plenty of work to go around. I feel that some of your technical concerns will be taken more seriously when you are on an up to date mainline FreeBSD.

Created attachment 227384 [details]
top with PSH flags shaper on
Created attachment 227385 [details]
top with PSH flags shaper off
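The single-core reasoning given for requesting 'top -PSH' can be illustrated with a toy model. This is a simplified sketch, not how any particular NIC or the FreeBSD stack actually hashes packets: `queue_for_flow` and `queues_used` are hypothetical names, and CRC32 stands in for whatever receive-side hash the hardware uses. The point is only that every packet of one flow maps to the same queue (and thus the same core), while many flows spread out.

```python
import zlib


def queue_for_flow(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Map a flow's 4-tuple to one receive queue (illustrative hash only)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def queues_used(flows, num_queues):
    """Return the set of queues touched by the given flows."""
    return {queue_for_flow(*flow, num_queues) for flow in flows}


# One stream, many packets: the 4-tuple never changes, so only one queue is used.
single_stream = [("10.0.0.2", 50000, "10.0.1.2", 5201)] * 1000

# Many parallel streams that differ only in source port can land on several queues.
parallel_streams = [("10.0.0.2", 50000 + i, "10.0.1.2", 5201) for i in range(64)]
```

Under this model `queues_used(single_stream, 8)` always has exactly one element, which is why a single-stream speed test can be limited by one core even when the machine as a whole looks idle.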
(In reply to e94pasch from comment #12) Thanks, this actually looks ok in both cases and the fact that it is on Chelsio eliminates a lot of other concerns one might have in the data path. I am wondering if this is actually some unexpected interaction on the TCP streams, which opens up the set of problems considerably. Could you try a low RTT workload like iperf3 through the firewall without going over the internet?

Might bug 194453 be at all relevant, or potentially related? Can we also get a copy of a minimal (and the underlying FreeBSD) /etc/rc.conf network configuration?

@Franco, would you be able to help find the equivalent of regular FreeBSD's /etc/rc.conf on opnsense? I also tried to execute a NAT speed test but unfortunately failed. @Franco, do you have any pointers here? I was not able to run iperf through the firewall; something was still blocking. Perhaps in a separate discussion in the opnsense forums? Thanks!

Dunno if helpful, here is the kldstat result:

root@OPNsense:~ # kldstat
Id Refs Address                Size Name
 1   97 0xffffffff80200000  26906e0 kernel
 2    1 0xffffffff82892000    10250 carp.ko
 3    1 0xffffffff828a3000     f998 if_bridge.ko
 4    2 0xffffffff828b3000     72a8 bridgestp.ko
 5    1 0xffffffff828bb000     3e78 if_enc.ko
 6    1 0xffffffff828bf000     b1c0 if_gre.ko
 7    1 0xffffffff828cb000    16008 if_lagg.ko
 8    1 0xffffffff828e2000     8b60 if_tap.ko
 9    3 0xffffffff828eb000    584a0 pf.ko
10    1 0xffffffff82944000     2af8 pflog.ko
11    1 0xffffffff82947000     ec30 pfsync.ko
12    1 0xffffffff82956000    24180 mlx4en.ko
13    3 0xffffffff8297b000    29128 linuxkpi.ko
14    2 0xffffffff829a5000    671b0 mlx4.ko
15    1 0xffffffff82a0d000    cdda0 if_cxgbe.ko
16    1 0xffffffff82c21000     89b8 tmpfs.ko
17    1 0xffffffff82c2a000     2668 intpm.ko
18    1 0xffffffff82c2d000      b50 smbus.ko
19    1 0xffffffff82c2e000     4260 ng_ubt.ko
20    9 0xffffffff82c33000     9e30 netgraph.ko
21    2 0xffffffff82c3d000     91b8 ng_hci.ko
22    3 0xffffffff82c47000      9c0 ng_bluetooth.ko
23    1 0xffffffff82c48000     cad0 ng_l2cap.ko
24    1 0xffffffff82c55000    1ba00 ng_btsocket.ko
25    1 0xffffffff82c71000     21c0 ng_socket.ko
26    1 0xffffffff82c74000     8d80 aesni.ko
27    1 0xffffffff82c7d000     1490 amdtemp.ko
28    1 0xffffffff82c7f000      828 amdsmn.ko
29    1 0xffffffff82c80000     1710 ng_ether.ko
30    1 0xffffffff82c82000     55b8 ng_netflow.ko
31    1 0xffffffff82c88000     2397 ng_ksocket.ko
32    2 0xffffffff82c8b000    25858 ipfw.ko
33    1 0xffffffff82cb1000    11140 dummynet.ko
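A kldstat listing like the one above can be checked mechanically for the modules relevant to shaping. The sketch below is a hypothetical helper (`parse_kldstat` and `SAMPLE` are names invented here), assuming the five-column "Id Refs Address Size Name" layout shown in the thread, with Address and Size in hexadecimal. The sample rows are copied from the reporter's output.

```python
SAMPLE = """\
root@OPNsense:~ # kldstat
Id Refs Address Size Name
9 3 0xffffffff828eb000 584a0 pf.ko
15 1 0xffffffff82a0d000 cdda0 if_cxgbe.ko
32 2 0xffffffff82c8b000 25858 ipfw.ko
33 1 0xffffffff82cb1000 11140 dummynet.ko
"""


def parse_kldstat(text):
    """Parse kldstat output into a list of dicts, one per loaded module."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five fields and start with a numeric Id;
        # this skips the shell prompt line and the column header.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        mod_id, refs, address, size, name = fields
        modules.append(
            {
                "id": int(mod_id),
                "refs": int(refs),
                "address": int(address, 16),  # printed with a 0x prefix
                "size": int(size, 16),        # printed as bare hex
                "name": name,
            }
        )
    return modules


loaded = {m["name"] for m in parse_kldstat(SAMPLE)}
# Both packet filters and the dummynet shaper are present in the reporter's listing.
shaping_related = {"pf.ko", "ipfw.ko", "dummynet.ko"} & loaded
```

Here `shaping_related` contains all three names, which matches the thread's conclusion that the report concerns dummynet (driven through ipfw) on a system that also has pf loaded.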