I have a Dell R630 with ccX interfaces (Chelsio T62100-SO-CR) aggregated into lagg0, with VLANs on top. I moved all the VLANs into a jail (VNET), where bird runs and receives a full feed (900k prefixes). When I set net.route.algo.inet.algo=dxr, the inbound drop counter for ccX keeps incrementing. When I use dpdk_lpm4 or radix4 it does not happen.

graph for net.route.algo.inet.algo=radix4: https://imgur.com/a/LsKBioR
graph for net.route.algo.inet.algo=dxr: https://imgur.com/a/UNsrAk6

# netstat -i
Name    Mtu   Network        Address            Ipkts          Ierrs  Idrop    Opkts          Oerrs Coll
cc0     1500  <Link#1>       00:07:43:64:9e:60  1133593300298  44384  3593534  1133260298112  0     0
cc1     1500  <Link#2>       00:07:43:64:9e:68  0              0      0        0              0     0
bge0*   1500  <Link#3>       14:18:77:3d:10:5d  0              0      0        0              0     0
bge1*   1500  <Link#4>       14:18:77:3d:10:5e  0              0      0        0              0     0
bge2    1500  <Link#5>       14:18:77:3d:10:5b  19362895       0      0        15602306       0     0
bge2    -     10.19.70.0/24  10.19.70.20        19320567       -      -        15559975       -     -
bge3*   1500  <Link#6>       14:18:77:3d:10:5c  0              0      0        0              0     0
cc2     1500  <Link#7>       00:07:43:64:9e:60  1108050218194  120531 1639219  1107797492457  0     0
cc3     1500  <Link#8>       00:07:43:64:9e:18  0              0      0        0              0     0
lo0     16384 <Link#9>       lo0                106            0      0        106            0     0
lo0     -     localhost      localhost          0              -      -        0              -     -
lo0     -     fe80::%lo0/64  fe80::1%lo0        0              -      -        0              -     -
lo0     -     your-net       localhost          106            -      -        106            -     -
lagg0   1500  <Link#10>      00:07:43:64:9e:60  2241643518492  164915 5232753  2241057790569  0     0

dmesg with net.route.algo.debug_level=6:

[fib_algo] inet.0 (dxr#213) handle_fd_callout: running callout type=3
[fib_algo] inet.0 (dxr#213) dxr_change_rib_batch: processing 2 update(s)
[fib_algo] inet.0 (dxr#213) dxr_build: D16X4R, 845809 prefixes, 751 nhops (max)
[fib_algo] inet.0 (dxr#213) dxr_build: 1311.29 KBytes, 1.58 Bytes/prefix
[fib_algo] inet.0 (dxr#213) dxr_build: range table updated in 0.008 ms
[fib_algo] inet.0 (dxr#213) dxr_build: trie updated in 0.001 ms
[fib_algo] inet.0 (dxr#213) dxr_build: snapshot forked in 0.547 ms
[fib_algo] inet.0 (dxr#213) dxr_build: range table: 2%, 26970 chunks, 8 holes
[fib_algo] inet.0 (dxr#213) handle_fd_callout: running callout type=3
[fib_algo] inet.0 (dxr#213) dxr_change_rib_batch: processing 28 update(s)
[fib_algo] inet.0 (dxr#213) dxr_build: D16X4R, 845809 prefixes, 751 nhops (max)
[fib_algo] inet.0 (dxr#213) dxr_build: 1311.29 KBytes, 1.58 Bytes/prefix
[fib_algo] inet.0 (dxr#213) dxr_build: range table updated in 0.110 ms
[fib_algo] inet.0 (dxr#213) dxr_build: trie updated in 4.664 ms
[fib_algo] inet.0 (dxr#213) dxr_build: snapshot forked in 0.462 ms
[fib_algo] inet.0 (dxr#213) dxr_build: range table: 2%, 26970 chunks, 8 holes
[fib_algo] inet.0 (dxr#213) handle_fd_callout: running callout type=3
[fib_algo] inet.0 (dxr#213) dxr_change_rib_batch: processing 2 update(s)
[fib_algo] inet.0 (dxr#213) dxr_build: D16X4R, 845807 prefixes, 751 nhops (max)
[fib_algo] inet.0 (dxr#213) dxr_build: 1311.29 KBytes, 1.58 Bytes/prefix
[fib_algo] inet.0 (dxr#213) dxr_build: range table updated in 0.005 ms
[fib_algo] inet.0 (dxr#213) dxr_build: trie updated in 0.001 ms
[fib_algo] inet.0 (dxr#213) dxr_build: snapshot forked in 0.473 ms
[fib_algo] inet.0 (dxr#213) dxr_build: range table: 2%, 26970 chunks, 8 holes
[fib_algo] inet.0 (dxr#213) handle_fd_callout: running callout type=3
[fib_algo] inet.0 (dxr#213) dxr_change_rib_batch: processing 56 update(s)
[fib_algo] inet.0 (dxr#213) dxr_build: D16X4R, 845807 prefixes, 751 nhops (max)
[fib_algo] inet.0 (dxr#213) dxr_build: 1311.29 KBytes, 1.58 Bytes/prefix
[fib_algo] inet.0 (dxr#213) dxr_build: range table updated in 0.092 ms
[fib_algo] inet.0 (dxr#213) dxr_build: trie updated in 2.845 ms
[fib_algo] inet.0 (dxr#213) dxr_build: snapshot forked in 0.545 ms
[fib_algo] inet.0 (dxr#213) dxr_build: range table: 2%, 26970 chunks, 8 holes
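For reference, switching the lookup algorithm and watching the drop counter amounts to something like this (a sketch; exact invocations may vary):

# switch the IPv4 FIB lookup algorithm at runtime
sysctl net.route.algo.inet.algo=dxr
# per-second input packets/errs/idrops for the lagg
netstat -w 1 -I lagg0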
Created attachment 227381
DXR incremental updating: skip unaffected trie portions
The net.route.algo.debug_level=6 logs which Konrad posted indicate that DXR trie updating takes much longer than range table updating does, while it should be the opposite: range table updating involves costly radix tree walks and a lot of bit shuffling, while trie updating is less complex. In this particular case (a full BGP view with frequent live updates) the observed FIB updating behavior may have been triggered by batched updates spanning a large portion of the entire IP address space, combined with a naive / inefficient trie updating scan in dxr_build(). The attached patch should speed things up by skipping over trie parts which are entirely unaffected by the batched RIB updates.

Nevertheless, even if this patch succeeds in reducing the duration of incremental updates with DXR, it might lessen but cannot solve the original problem Konrad reported, i.e. packet loss somewhere in the forwarding path. Since the radix tree is read-locked over the entire duration of the DXR FIB update process, I can only guess there might be some obscure call path which requires a write lock on the radix tree and thus can't make any forward progress until the DXR rebuild completes, but this is far beyond my grasp of the current state of the forwarding path.
I retract my previous speculation about the possibility of the forwarding path blocking on the RIB write lock, because the problem seems to be more obvious. FIB updating is done under the net epoch, which means that the FIB updating thread can't be preempted until it completes, and with DXR this can last anywhere between a millisecond and a full second, depending on the size / complexity of the RIB and the nature of the updates. If this happens on a CPU which also has to service an RX queue of a busy interface, there's nothing we can do ATM, except watch the RX queue drop counters jumping through the roof. Ideally FIB rebuilds should be done under the RIB read lock but not under the net epoch. I'm not sure whether or how this could be arranged, but perhaps melifaro@ might have an idea...
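In the meantime, a possible (untested) workaround sketch would be to keep the NIC RX queue interrupts away from the CPU doing lengthy FIB rebuilds, assuming the driver exposes per-queue interrupts; the interrupt name and number below are hypothetical:

# list per-queue interrupts of the Chelsio adapter (names depend on the driver)
vmstat -ia | grep -i t6nex
# pin a given RX queue interrupt (e.g. irq 300) to a chosen set of CPUs
cpuset -l 0-7 -x 300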
Hmm, the net epoch appears to be preemptible, so it must be something else... Throwing in the towel on this for today...
Q: is there any chance you have some amount of ARP packets going in, or some interface flapping, or both, during the time you observe the drops?
The interfaces are certainly not flapping; the ccX ports are directly connected to our infrastructure. Three VLANs are related to an exchange point (a shared VLAN), so there are a lot of ARP queries, but at a normal level for such a connection. To be sure, I will start monitoring it. But as I posted, when I set dpdk_lpm46 or radix46 it does not happen. The whole configuration is identical; I only change the algorithm.
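To quantify the ARP level, something as simple as this should do (a sketch):

# system-wide ARP statistics; sample twice and diff to get a rate
netstat -s -p arp
# or count ARP frames seen on the lagg for a short while
tcpdump -c 1000 -ni lagg0 arp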
(In reply to Konrad from comment #6)
Did you have a chance to try the patch I posted last week? You can rebuild only the dxr module, directly in /sys/modules/fib_dxr. I'm curious whether the events where trie updates take much longer than range table updates have vanished with the patch applied. Can you correlate long total DXR update times with your packet loss events? In particular, can "table rebuilt" or "trie rebuilt" reports in /var/log/messages be mapped to packet loss events?
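For reference, rebuilding and reloading just the module should amount to roughly this (a sketch; assumes the patch is already applied to the src tree and the default module install path of /boot/kernel):

# switch to another algo first so fib_dxr.ko can be unloaded
sysctl net.route.algo.inet.algo=radix4
cd /sys/modules/fib_dxr
make clean && make && make install
kldunload fib_dxr && kldload fib_dxr
sysctl net.route.algo.inet.algo=dxr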
Created attachment 227589
lagg0 drops
Created attachment 227590
dxr rebuilt
I attached lagg0 drops and dxr rebuilds, logged with timestamps. I do not see a relationship between them.
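The drop counter was sampled once per second with a loop along these lines (a sketch; the actual script may have differed):

# timestamped samples of the lagg0 Idrop counter, one per second
while :; do
        printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(netstat -in | awk '$1 == "lagg0" && $3 ~ /Link/ { print $7 }')"
        sleep 1
done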
(In reply to Konrad from comment #10)
It looks like full table rebuilds are being triggered far too often, most probably for no good reason. Try bumping sysctl net.route.algo.dxr.max_range_holes to somewhere between 50 and 100, and observe whether full rebuilds occur less frequently (see the sketch below). This sysctl aims to limit the range table fragmentation which accumulates with incremental updates and may start to hurt both cache effectiveness during lookups and the speed of incremental updates, but the default value might be too strict for full-view BGP workloads.

Could you post more complete logs with net.route.algo.debug_level=6, like in your initial report? I'm curious about the effectiveness of my Aug 23 patch WRT incremental trie updates.

Moreover, my reading of the logs is that full rebuilds may indeed be related to packet loss events, for example:

Aug 31 13:33:01 Storm kernel: [fib_algo] inet.0 (dxr#155) dxr_build: range table rebuilt in 706.353 ms
Aug 31 13:33:01 Storm kernel: [fib_algo] inet.0 (dxr#155) dxr_build: trie rebuilt in 8.615 ms

[2021-08-31 13:33:00] 534246
[2021-08-31 13:33:01] 550441
[2021-08-31 13:33:02] 551616

or:

Aug 31 13:45:22 Storm kernel: [fib_algo] inet.0 (dxr#155) dxr_build: range table rebuilt in 701.239 ms
Aug 31 13:45:22 Storm kernel: [fib_algo] inet.0 (dxr#155) dxr_build: trie rebuilt in 8.632 ms

[2021-08-31 13:45:21] 551616
[2021-08-31 13:45:22] 563845
[2021-08-31 13:45:23] 571588

etc. Perhaps yielding the CPU during lengthy table rebuilds could help; I'll post a follow-up patch when ready...
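That is, the tuning step is just this (a sketch; 100 is an arbitrary starting value):

# relax the limit on accumulated range table fragmentation
sysctl net.route.algo.dxr.max_range_holes=100
# keep verbose FIB_ALGO logging on so rebuild frequency stays visible
sysctl net.route.algo.debug_level=6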
There is another vector to consider here. Currently, the check of whether an ARP entry is eligible for insertion performs a lookup under the RIB rlock for no apparent reason. Thus, with a constant flow of ARP messages, every rebuild gets RX queue handling stuck, because the processing of an ARP message waits for the RIB lock.

There are 2 ways of addressing it:
1) fixing the arp code check
2) deferring such packets to a separate netisr, so we generally become less sensitive to complex processing of control plane traffic.

I'll post a patch to address (1) in a day or two.
(In reply to Marko Zec from comment #11)
Here https://files.fm/u/ntg6q8uxv are logs with net.route.algo.debug_level=6 after applying your patch; this covers the period during which I was collecting the trie rebuild and lagg0 drop data.
Created attachment 227603
Yield CPU during long DXR updates, add more diagnostic counters
(In reply to Konrad from comment #13)
Looking at the last net.route.algo.debug_level=6 log you posted, I wonder whether the patch was ineffective or whether you kldloaded a wrong (old) .ko module, since trie update times still exceed range update times by far, and that should not be happening. Pls. try out the revised patch, which also attempts to yield the CPU during lengthy updates; maybe this could improve the chances for the NIC RX thread to gain access to the CPU. The patch also extends the output of net.route.algo.debug_level=6 logs, which should look like this:

[fib_algo] inet.0 (dxr#117) dxr_change_rib_batch: processing 1 update(s)
[fib_algo] inet.0 (dxr#117) dxr_change_rib_batch: plen min 8 max 8 avg 8
[fib_algo] inet.0 (dxr#117) dxr_build: D11X9R, 4 prefixes, 5 nhops (max)
[fib_algo] inet.0 (dxr#117) dxr_build: 10.02 KBytes, 2567.00 Bytes/prefix
[fib_algo] inet.0 (dxr#117) dxr_build: 4096 range updates, 0 refs, 0 unrefs
[fib_algo] inet.0 (dxr#117) dxr_build: range table updated in 1.105 ms
[fib_algo] inet.0 (dxr#117) dxr_build: 8 trie updates
[fib_algo] inet.0 (dxr#117) dxr_build: trie updated in 0.038 ms
[fib_algo] inet.0 (dxr#117) dxr_build: snapshot forked in 0.006 ms
[fib_algo] inet.0 (dxr#117) dxr_build: range table: 0%, 2 chunks, 0 holes
Created attachment 227604
Yield CPU during long DXR updates, add more diagnostic counters

Uploaded a wrong patch last time, fixing...
(In reply to Alexander V. Chernikov from comment #12)
https://reviews.freebsd.org/D31824 avoids grabbing the RIB lock on the fast path. It's probably worth testing Marko's and my changes separately, to ensure that each of them actually addresses the problem :-) Sorry for making it a bit complex.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=936f4a42fa2a23d21f8f14a8c33627a8207b4b3b

commit 936f4a42fa2a23d21f8f14a8c33627a8207b4b3b
Author:     Alexander V. Chernikov <melifaro@FreeBSD.org>
AuthorDate: 2021-09-03 11:48:36 +0000
Commit:     Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2021-09-06 21:03:22 +0000

    lltable: do not require prefix lookup when checking lle allocation rules.

    With the new FIB_ALGO infrastructure, nearly all subsystems use the
    fib[46]_lookup() functions, which provide lockless lookups. A number
    of places remain that use old-style lookup functions, which still
    require the RIB read lock to return a result. One such place is the
    arp processing code.

    The FIB_ALGO implementation makes some tradeoffs, resulting in
    (relatively) prolonged periods of holding the RIB_WLOCK. If the lock
    is held and the datapath competes for it, the RX ring may get blocked,
    ending in traffic delays and losses. As arp processing is currently
    performed directly in the interrupt handler, handling ARP replies
    triggers the problem described above when the amount of ARP replies
    is high.

    To be more specific: prior to creating a new ARP entry, a routing
    lookup for the entry address in the interface fib is executed.
    The following conditions are then verified:
    1. If the lookup returns an empty result, or the resulting prefix is
       non-directly-reachable, failure is returned. The only exception
       are host routes w/ gateway==address.
    2. If the routing lookup returns a different interface and a non-host
       route, we want to support the use case of having multiple
       interfaces with the same prefix.

    In fact, the current code just checks if the returned prefix covers
    the target address (always true) and effectively allows allocating
    ARP entries for any directly-reachable prefix, regardless of its
    interface.

    Change the code to perform the following:
    1) use fib4_lookup() to get the nexthop, instead of requesting the
       exact prefix.
    2) Rewrite the first condition check using nexthop flags (1:1 match)
    3) Rewrite the second condition to check for interface addresses
       matching the target address on the input interface.

    Differential Revision: https://reviews.freebsd.org/D31824
    Reviewed by: ae
    MFC after: 1 week
    PR: 257965

 sys/netinet/in.c | 73 ++++++++++++++++++--------------------------------------
 1 file changed, 23 insertions(+), 50 deletions(-)
(In reply to Alexander V. Chernikov from comment #17)
I built the latest 13/STABLE kernel + your patch, and the drops no longer appear! Marko, do you still need feedback on your changes in this case? I can build fib_dxr.ko and send the dxr messages.
(In reply to Konrad from comment #19)
Great news! Re testing Marko's changes: I'd love to see the results as well. Currently some dxr rebuilds take up to 800 ms, and I'd really love to see the new numbers. With the current framework logic, 800 ms may be a bit long. Depending on the testing results I may need to change some of the logic to handle such scenarios better.
Created attachment 227786
dxr messages

I built and loaded a new fib_dxr.ko with Marko's patch. The DXR messages are in the attachment.
(In reply to Konrad from comment #21)
The debugging output does not correspond to the fib_dxr patch posted here, since the following lines are missing from the logs:

dxr_change_rib_batch: plen min 8 max 8 avg 8
dxr_build: 4096 range updates, 0 refs, 0 unrefs
dxr_build: 8 trie updates

Please rebuild and reload the correct .ko module, and post the new debug_level=6 output. Ideally, apply the fresh patch from today, which attempts to provide more insight into range table fragmentation patterns while removing the sched_yield() hack, which is apparently completely useless now that melifaro@ has nailed down the blocking call path in the ARP code.

The problem that DXR triggers full table rebuilds far too often (every 5 minutes) remains to be properly addressed, and extended debugging logs from your real-world BGP environment could be extremely helpful towards that goal. I'll try to come up with a policy which strikes a better balance between keeping table fragmentation low and triggering rebuilds only when it really makes sense. Under normal circumstances the lookup tables should be only incrementally updated.
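A quick way to verify that the running module is the freshly built one (a sketch; assumes the default KMODDIR of /boot/kernel):

# the installed copy and the just-built copy should be identical
sha256 /boot/kernel/fib_dxr.ko /sys/modules/fib_dxr/fib_dxr.ko
kldstat | grep fib_dxr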
Created attachment 227830
Track range table fragmentation
(In reply to Marko Zec from comment #22)
My latest attachment contains the additional log lines you wrote about:

Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) handle_fd_callout: running callout type=3
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: processing 2 update(s)
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: plen min 24 max 24 avg 24
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: D16X4R, 846135 prefixes, 743 nhops (max)
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1320.55 KBytes, 1.59 Bytes/prefix
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1 range updates, 1 refs, 1 unrefs
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table updated in 0.011 ms
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1 trie updates
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: trie updated in 0.000 ms
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: snapshot forked in 0.486 ms
Sep 9 16:10:24 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table: 2%, 27369 chunks, 2 holes
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) handle_fd_callout: running callout type=3
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: processing 2 update(s)
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: plen min 24 max 24 avg 24
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: D16X4R, 846135 prefixes, 743 nhops (max)
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1320.55 KBytes, 1.59 Bytes/prefix
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1 range updates, 1 refs, 1 unrefs
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table updated in 0.035 ms
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1 trie updates
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: trie updated in 0.001 ms
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: snapshot forked in 0.475 ms
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table: 2%, 27369 chunks, 2 holes
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) handle_fd_callout: running callout type=3
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: processing 8 update(s)
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: plen min 22 max 22 avg 22
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: D16X4R, 846135 prefixes, 743 nhops (max)
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1320.55 KBytes, 1.59 Bytes/prefix
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1 range updates, 1 refs, 1 unrefs
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table updated in 0.027 ms
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1 trie updates
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: trie updated in 0.001 ms
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: snapshot forked in 0.498 ms
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table: 2%, 27369 chunks, 2 holes
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) handle_fd_callout: running callout type=3
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: processing 4 update(s)
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_change_rib_batch: plen min 15 max 15 avg 15
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: D16X4R, 846135 prefixes, 743 nhops (max)
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 1320.55 KBytes, 1.59 Bytes/prefix
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: 32 range updates, 0 refs, 0 unrefs
Sep 9 16:10:25 Storm kernel: [fib_algo] inet.0 (dxr#87) dxr_build: range table updated in 0.020 ms

whole logs - https://files.fm/u/wjz565tsg

After the weekend I will build the last uploaded patch.
Created attachment 227882
Trie updating optimization, hopefully proper fix
Thanks for posting the log. It revealed a clumsy bug in the previous patch, pls. try out the new one (dxr_trie_update_fix.diff).
I applied the latest patch. DXR logs: https://files.fm/u/76uhvx9sp
A few minutes ago, a "rebuilt" appeared:

Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 12 update(s)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 13 max 24 avg 20
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 194 range updates, 35 refs, 36 unrefs
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.480 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 163840 trie updates
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie rebuilt in 15.813 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.517 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 12 update(s)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 11 max 14 avg 12
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 960 range updates, 244 refs, 244 unrefs
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 2.486 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 60 trie updates
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.016 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.545 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 2 update(s)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 22 max 22 avg 22
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1 range updates, 1 refs, 1 unrefs
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.013 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1 trie updates
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.001 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.468 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 2 update(s)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 15 max 15 avg 15
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 32 range updates, 0 refs, 0 unrefs
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.020 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 2 trie updates
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.004 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.503 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 10 update(s)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 11 max 24 avg 17
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 801 range updates, 211 refs, 211 unrefs
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 1.879 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 52 trie updates
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.117 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.575 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 2 update(s)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 15 max 15 avg 15
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 32 range updates, 0 refs, 0 unrefs
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.017 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 2 trie updates
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.001 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.505 ms
Sep 14 20:59:16 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 4 update(s)
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 24 max 24 avg 24
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1 range updates, 1 refs, 1 unrefs
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.017 ms
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1 trie updates
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.001 ms
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.519 ms
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 94 holes
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 22 update(s)
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 11 max 24 avg 18
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1318.10 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1011 range updates, 188 refs, 281 unrefs
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 2.594 ms
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 70 trie updates
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.187 ms
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.523 ms
Sep 14 20:59:17 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27315 chunks, 121 holes
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 14 update(s)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 20 max 24 avg 22
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: frags 2:25 3:48 4:32 5:9 6:5 7:1 8:1
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846579 prefixes, 750 nhops (max)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1316.22 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1048576 range updates, 66537 refs, 0 unrefs
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table rebuilt in 1077.913 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 163840 trie updates
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie rebuilt in 14.782 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.557 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27194 chunks, 0 holes
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 20 update(s)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 24 max 24 avg 24
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846581 prefixes, 750 nhops (max)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1316.22 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 6 range updates, 5 refs, 4 unrefs
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.086 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 11 trie updates
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.057 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.586 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27194 chunks, 1 holes
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 2 update(s)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 24 max 24 avg 24
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846581 prefixes, 750 nhops (max)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1316.22 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1 range updates, 1 refs, 1 unrefs
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.041 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1 trie updates
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.001 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.609 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27194 chunks, 1 holes
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 6 update(s)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 24 max 24 avg 24
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846581 prefixes, 750 nhops (max)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1316.22 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 2 range updates, 1 refs, 1 unrefs
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.016 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 3 trie updates
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.009 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.554 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27194 chunks, 1 holes
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) handle_fd_callout: running callout type=3
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: processing 2 update(s)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_change_rib_batch: plen min 15 max 15 avg 15
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: D16X4R, 846581 prefixes, 750 nhops (max)
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 1316.22 KBytes, 1.59 Bytes/prefix
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 32 range updates, 0 refs, 0 unrefs
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table updated in 0.022 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: 2 trie updates
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: trie updated in 0.001 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: snapshot forked in 0.442 ms
Sep 14 20:59:19 Storm kernel: [fib_algo] inet.0 (dxr#109) dxr_build: range table: 2%, 27194 chunks, 1 holes
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b51f8bae570b4e908191a1dae9da38aacf8c0fab

commit b51f8bae570b4e908191a1dae9da38aacf8c0fab
Author:     Marko Zec <zec@FreeBSD.org>
AuthorDate: 2021-09-15 20:36:59 +0000
Commit:     Marko Zec <zec@FreeBSD.org>
CommitDate: 2021-09-15 20:42:49 +0000

    [fib algo][dxr] Optimize trie updating.

    Don't rebuild in vain trie parts unaffected by accumulated incremental
    RIB updates.

    PR: 257965
    Tested by: Konrad Kreciwilk
    MFC after: 3 days

 sys/netinet/in_fib_dxr.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
(In reply to Konrad from comment #28)
Occasional trie rebuilds can be considered normal and harmless: their purpose is to defragment the trie, and they are relatively quick (in the 10 - 20 ms range with a full BGP view). Range table rebuilds can be two orders of magnitude slower, but again, if the table gets hopelessly fragmented over time it may be reasonable to trigger a rebuild. I will try to produce a more reasonable rebuild triggering strategy and post a follow-up patch over the weekend...
(In reply to commit-hook from comment #29)
I've tried it on STABLE with a somewhat complex jail setup, and if I enable dxr from sysctl.conf during jail boot I get:

[....]
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 10; apic id = 0a
instruction pointer  = 0x20:0xffffffff83214ae0
stack pointer        = 0x28:0xfffffe021ac095b0
frame pointer        = 0x28:0xfffffe021ac096c0
code segment         = base rx0, limit 0xfffff, type 0x1b
                     = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags     = interrupt enabled, resume, IOPL = 0
current process      = 3890 (sysctl)
trap number          = 18
panic: integer divide fault
cpuid = 10
time = 1631746352
KDB: stack backtrace:
#0 0xffffffff80c73525 at kdb_backtrace+0x65
#1 0xffffffff80c25677 at vpanic+0x187
#2 0xffffffff80c254e3 at panic+0x43
#3 0xffffffff810b3587 at trap_fatal+0x387
#4 0xffffffff810b2a7b at trap+0x8b
#5 0xffffffff81089fd8 at calltrap+0x8
#6 0xffffffff83213282 at dxr_dump_end+0x12
#7 0xffffffff80d6b43f at sync_algo_end_cb+0xbf
#8 0xffffffff80d691f6 at setup_fd_instance+0x406
#9 0xffffffff80d68b17 at set_fib_algo+0x207
#10 0xffffffff80c35a81 at sysctl_root_handler_locked+0x91
#11 0xffffffff80c34fbc at sysctl_root+0x24c
#12 0xffffffff80c35543 at userland_sysctl+0x173
#13 0xffffffff80c3538c at sys___sysctl+0x5c
#14 0xffffffff810b3e7c at amd64_syscall+0x10c
#15 0xffffffff8108a8eb at fast_syscall_common+0xf8
Uptime: 2m21s
Dumping 2755 out of 65405 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
[....]

The same build runs happily with the auto-selected radix4_lockless.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=eb3148cc4d256c20b5c7c9052539139b6f57f58b

commit eb3148cc4d256c20b5c7c9052539139b6f57f58b
Author:     Marko Zec <zec@FreeBSD.org>
AuthorDate: 2021-09-16 14:34:05 +0000
Commit:     Marko Zec <zec@FreeBSD.org>
CommitDate: 2021-09-16 14:34:05 +0000

    [fib algo][dxr] Fix division by zero.

    A division by zero would occur if DXR would be activated on a vnet
    with no IP addresses configured on any interfaces.

    PR: 257965
    MFC after: 3 days
    Reported by: Raul Munoz

 sys/netinet/in_fib_dxr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
(In reply to Raúl from comment #31) Thanks for the report! I could reproduce the problem, and the patch I just committed to main should fix the panic. Will MFC it with other fixes in a few days.
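For reference, the trigger condition described in the commit message suggests a minimal reproduction along these lines (an untested sketch; the jail name is hypothetical):

# a vnet jail with no IP addresses configured on any interface
jail -c name=dxrtest vnet persist
# activating DXR inside it hit the division by zero before this fix
jexec dxrtest sysctl net.route.algo.inet.algo=dxr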
(In reply to Marko Zec from comment #33) It does, fixed it, panic averted. Thanks a lot Marko.
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ad2cca48ed53e3282e9bc490074e75ccb50bffb9

commit ad2cca48ed53e3282e9bc490074e75ccb50bffb9
Author:     Marko Zec <zec@FreeBSD.org>
AuthorDate: 2021-09-15 20:36:59 +0000
Commit:     Marko Zec <zec@FreeBSD.org>
CommitDate: 2021-09-18 17:37:35 +0000

    [fib algo][dxr] Optimize trie updating.

    Don't rebuild in vain trie parts unaffected by accumulated incremental
    RIB updates.

    PR: 257965
    Tested by: Konrad Kreciwilk
    MFC after: 3 days

    (cherry picked from commit b51f8bae570b4e908191a1dae9da38aacf8c0fab)

 sys/netinet/in_fib_dxr.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ec47ee78b461f5c03c11fa44ad77f695371b7d13

commit ec47ee78b461f5c03c11fa44ad77f695371b7d13
Author:     Marko Zec <zec@FreeBSD.org>
AuthorDate: 2021-09-16 14:34:05 +0000
Commit:     Marko Zec <zec@FreeBSD.org>
CommitDate: 2021-09-18 17:38:09 +0000

    [fib algo][dxr] Fix division by zero.

    A division by zero would occur if DXR would be activated on a vnet
    with no IP addresses configured on any interfaces.

    PR: 257965
    MFC after: 3 days
    Reported by: Raul Munoz

    (cherry picked from commit eb3148cc4d256c20b5c7c9052539139b6f57f58b)

 sys/netinet/in_fib_dxr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1549575f22d14b3ac89a73627618a63132217460

commit 1549575f22d14b3ac89a73627618a63132217460
Author:     Marko Zec <zec@FreeBSD.org>
AuthorDate: 2021-10-09 11:22:27 +0000
Commit:     Marko Zec <zec@FreeBSD.org>
CommitDate: 2021-10-09 11:22:27 +0000

    [fib_algo][dxr] Improve incremental updating strategy

    Tracking the number of unused holes in the trie and the range table
    was a bad metric based on which full trie and / or range rebuilds were
    triggered, which would happen in vain by far too frequently,
    particularly with live BGP feeds. Instead, track the total unused
    space inside the trie and range table structures, and trigger rebuilds
    if the percentage of unused space exceeds a sysctl-tunable threshold.

    MFC after: 3 days
    PR: 257965

 sys/netinet/in_fib_dxr.c | 103 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 84 insertions(+), 19 deletions(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0eeef61aec4b996e88547f31a8e7fef677180e98

commit 0eeef61aec4b996e88547f31a8e7fef677180e98
Author:     Marko Zec <zec@FreeBSD.org>
AuthorDate: 2021-10-09 11:22:27 +0000
Commit:     Marko Zec <zec@FreeBSD.org>
CommitDate: 2021-10-13 20:06:10 +0000

    [fib_algo][dxr] Improve incremental updating strategy

    Tracking the number of unused holes in the trie and the range table
    was a bad metric based on which full trie and / or range rebuilds were
    triggered, which would happen in vain by far too frequently,
    particularly with live BGP feeds. Instead, track the total unused
    space inside the trie and range table structures, and trigger rebuilds
    if the percentage of unused space exceeds a sysctl-tunable threshold.

    MFC after: 3 days
    PR: 257965

 sys/netinet/in_fib_dxr.c | 103 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 84 insertions(+), 19 deletions(-)
(In reply to Konrad from comment #28)
Could you try out the fresh dxr version in stable/13 and let us know how often range / trie rebuilds happen now with a live feed? As always, a snippet from a debug log could reveal whether there's more that needs to be tweaked, in particular how much lookup structure fragmentation accumulates after N hours of incremental updating...
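E.g., a rough per-hour rebuild count can be pulled from the log like this (a sketch):

# count range table / trie rebuild events per hour
grep 'rebuilt in' /var/log/messages |
    awk '{ print $1, $2, substr($3, 1, 2) }' | uniq -c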
Marko, I will check your changes in the next few weeks.
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=e72b873b7c3ba3a6d8f54d58503fdd3454bb5be9 commit e72b873b7c3ba3a6d8f54d58503fdd3454bb5be9 Author: Alexander V. Chernikov <melifaro@FreeBSD.org> AuthorDate: 2021-09-03 11:48:36 +0000 Commit: Alexander V. Chernikov <melifaro@FreeBSD.org> CommitDate: 2021-12-04 19:02:23 +0000 lltable: do not require prefix lookup when checking lle allocation rules. With the new FIB_ALGO infrastructure, nearly all subsystems use fib[46]_lookup() functions, which provides lockless lookups. A number of places remains that uses old-style lookup functions, that still requires RIB read lock to return the result. One of such places is arp processing code. FIB_ALGO implementation makes some tradeoffs, resulting in (relatively) prolonged periods of holding RIB_WLOCK. If the lock is held and datapath competes for it, the RX ring may get blocked, ending in traffic delays and losses. As currently arp processing is performed directly in the interrupt handler, handling ARP replies triggers the problem descibed above when the amount of ARP replies is high. To be more specific, prior to creating new ARP entry, routing lookup for the entry address in interface fib is executed. The following conditions are the verified: 1. If lookup returns an empty result, or the resulting prefix is non-directly-reachable, failure is returned. The only exception are host routes w/ gateway==address. 2. If the routing lookup returns different interface and non-host route, we want to support the use case of having multiple interfaces with the same prefix. In fact, the current code just checks if the returned prefix covers target address (always true) and effectively allow allocating ARP entries for any directly-reachable prefix, regardless of its interface. Change the code to perform the following: 1) use fib4_lookup() to get the nexthop, instead of requesting exact prefix. 2) Rewrite first condition check using nexthop flags (1:1 match) 3) Rewrite second condition to check for interface addresses matching target address on the input interface. Differential Revision: https://reviews.freebsd.org/D31824 Reviewed by: ae MFC after: 1 week PR: 257965 (cherry picked from commit 936f4a42fa2a23d21f8f14a8c33627a8207b4b3b) sys/netinet/in.c | 73 ++++++++++++++++++-------------------------------------- 1 file changed, 23 insertions(+), 50 deletions(-)