Bug 276870 - mbuf cluster leak on pf+bird2 bgp routers
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-pf (Nobody)
Reported: 2024-02-07 15:25 UTC by Thomas Steen Rasmussen / Tykling
Modified: 2024-03-17 13:47 UTC

Attachments
Screenshot of mbuf cluster total use as reported by netstat -m over time (280.00 KB, image/png)
2024-02-07 15:25 UTC, Thomas Steen Rasmussen / Tykling

Description Thomas Steen Rasmussen / Tykling 2024-02-07 15:25:11 UTC
Created attachment 248234
Screenshot of mbuf cluster total use as reported by netstat -m over time

Hello 🙂

Last month one of my FreeBSD routers stopped forwarding (it stopped responding on the network entirely and I had to IPMI in) because it ran out of mbuf clusters. It usually operates far from the limit, but something is (was) leaking mbuf clusters badly, and I suspect it might be bird2, or a combination of bird2 and a FreeBSD kernel bug.

----

Some background:

The boxes in question are BGP routers for a small network; they run bird2 and only get a default route from upstream BGP, not a full table.

Due to a missing/misconfigured kernel export filter, bird was repeatedly trying to export routes to the kernel that the kernel already knew (from statically configured blackhole routes). These errors have been repeating in the logs for more than a year, so they were not a problem in themselves:

Jan 11 19:09:04 dgncr2a bird[30963]: KRT: Error sending route 2a09:94c0::/29 to kernel: File exists
Jan 11 19:10:04 dgncr2a syslogd: last message repeated 1 times
Jan 11 19:10:04 dgncr2a bird[30963]: KRT: Error sending route 85.209.116.0/22 to kernel: File exists
Jan 11 19:11:04 dgncr2a syslogd: last message repeated 1 times

Over the holidays I upgraded from bird 2.0.9 to bird 2.14, as well as upgrading FreeBSD from 13-STABLE-384a885111ad to 13-STABLE-2cbd132986a7. I suspect one of these two changes made this problem appear. I made no changes to bird or router config other than the upgrades.

----

The mbuf cluster leak was pretty bad: roughly 8-10 clusters per second at a steady rate. The kern.ipc.nmbclusters limit on my routers was around 2 million; I have since raised it to 4 million.
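
To put that rate in perspective, here is a rough back-of-the-envelope sketch. It assumes a constant leak rate and no other change in cluster usage, which is a simplification. The function name is made up for illustration, and the 749044 figure is simply the cluster_current value from the exporter output in comment 1, used as a sample starting point.

```shell
#!/bin/sh
# Hypothetical helper: seconds until mbuf clusters run out, assuming a
# constant leak rate (an assumption; a real leak need not be this steady).
# Arguments: cluster limit, clusters currently in use, clusters leaked per second.
seconds_to_exhaustion() {
    limit=$1
    in_use=$2
    rate=$3
    # Integer division is enough for a rough estimate.
    echo $(( (limit - in_use) / rate ))
}

# Example: limit of 4 million, 749044 clusters in use, leaking 9 clusters/s.
secs=$(seconds_to_exhaustion 4000000 749044 9)
echo "about $(( secs / 3600 )) hours until exhaustion"
```

With these sample numbers the estimate comes out at roughly four days, so at this rate even a doubled limit only buys a few days.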

Since I had no idea what was causing the leak and I was desperate for a fix, I at one point tried adding the missing kernel export filter (so as to at least silence the noisy warnings in the logs), and imagine my surprise when the mbuf cluster leak stopped.

I tried removing the filters again: the leak resumed, and it stopped again when I re-added them. It appears that some combination of bird 2.14 and exporting routes the kernel already has leaks mbuf clusters like crazy.

I have no idea whether this is a bird or a FreeBSD problem. I reported the issue to the bird-users@ list (http://trubka.network.cz/pipermail/bird-users/2024-January/017314.html) and was encouraged in that thread to open this PR as well.

The attached Grafana screenshot shows the per-second rate of increase (measured over 5 minutes) of the "total" number in the "mbuf clusters in use" line of the `netstat -m` output for both routers. The green line is the active router and the yellow line is the passive one.
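
The query behind the graph is not shown here, but the quantity it plots can be sketched by hand: take two samples of the "total" cluster count a known number of seconds apart and divide the difference by the interval. The function name and the sample values below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: average per-second increase between two samples of
# the "total" mbuf cluster count taken a given number of seconds apart.
# Arguments: older sample, newer sample, interval in seconds.
cluster_rate() {
    # awk is used because sh arithmetic is integer-only.
    awk -v o="$1" -v n="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (n - o) / s }'
}

# Made-up samples taken 300 seconds (5 minutes) apart.
cluster_rate 752602 755302 300
```

A steadily positive value on an otherwise idle counter is what points to a leak rather than a traffic burst.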

The drop in the green line and the following spike towards the end (2000-2100ish) is me filtering the blackhole routes from the bird kernel export, removing the filter to confirm, and re-adding it.

I can to some extent test stuff, but the routers are in production so nothing too wild.
Comment 1 Thomas Steen Rasmussen / Tykling 2024-02-07 15:31:17 UTC
PS: The very rudimentary netstat -m exporter is below; it needs jq and sponge installed:

[tykling@dgncr2a ~]$ cat /etc/cron.d/netstat_mbuf_exporter
# Run netstat_mbuf_exporter.sh every minute and put the output in prometheus textfile collector directory
* * * * * root /usr/local/bin/netstat_mbuf_exporter.sh | /usr/local/bin/sponge /var/tmp/node_exporter/netstat-mbuf.prom

[tykling@dgncr2a ~]$ cat /usr/local/bin/netstat_mbuf_exporter.sh
#!/bin/sh
/usr/bin/netstat -m --libxo json | /usr/local/bin/jq -r '."mbuf-statistics" | keys_unsorted[] as $k | "\($k) \(.[$k])"' | /usr/bin/tr "-" "_" | /usr/bin/sed "s/^/freebsd_netstat_mbuf_/g"

[tykling@dgncr2a ~]$ head -5 /var/tmp/node_exporter/netstat-mbuf.prom
freebsd_netstat_mbuf_mbuf_current 1495568
freebsd_netstat_mbuf_mbuf_cache 3547
freebsd_netstat_mbuf_mbuf_total 1499115
freebsd_netstat_mbuf_cluster_current 749044
freebsd_netstat_mbuf_cluster_cache 3558
[tykling@dgncr2a ~]$
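
A file in this format is also easy to check by hand. The sketch below pulls one metric out of such a file and expresses it as a percentage of the cluster limit. Both helper names are made up for illustration, and the 4000000 stands in for the live limit, which on the router would come from `sysctl -n kern.ipc.nmbclusters`.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one metric from a textfile
# collector file (lines of the form "name value").
# Arguments: file path, metric name.
prom_value() {
    awk -v m="$2" '$1 == m { print $2 }' "$1"
}

# Hypothetical helper: integer percentage of the cluster limit in use.
# Arguments: clusters in use, cluster limit.
usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Demo on a temporary file holding two lines copied from the output above.
tmp=$(mktemp)
printf '%s\n' 'freebsd_netstat_mbuf_cluster_current 749044' \
              'freebsd_netstat_mbuf_cluster_cache 3558' > "$tmp"
current=$(prom_value "$tmp" freebsd_netstat_mbuf_cluster_current)
# 4000000 stands in for the output of: sysctl -n kern.ipc.nmbclusters
echo "clusters in use: $(usage_percent "$current" 4000000)% of limit"
rm -f "$tmp"
```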
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2024-02-09 19:43:52 UTC
Gleb, based on the report this sounds more like a leak in the routing socket code, no?  There's no mention of pf except in the bug title.

> I at one point tried adding the missing kernel export filter (so as to at least silence the noisy warnings in the logs), and imagine my surprise when the mbuf cluster leak stopped.

I'm not too familiar with how this works - does this basically install a bunch of routes in the kernel, so most likely you're hitting an mbuf leak in the routing socket code?  This may be fixed in 14.0 by virtue of having reimplemented parts of that interface using netlink.
Comment 3 Gleb Smirnoff freebsd_committer freebsd_triage 2024-02-10 03:56:18 UTC
On Wed Feb  7 15:25:11  2024 UTC, thomas@gibfest.dk wrote:
> Over the holidays I upgraded from bird 2.0.9 to bird 2.14, as well as upgrading
> FreeBSD from 13-STABLE-384a885111ad to 13-STABLE-2cbd132986a7. I suspect one of
> these two changes made this problem appear. I made no changes to bird or router
> config other than the upgrades.

What I would suspect here is NETLINK. Lots of stuff was merged between
384a885111ad and 2cbd132986a7.

Thomas, is it possible for you to work more on isolating the regression?

Things to check:
1) Did the bird upgrade from 2.0.9 to 2.14 switch bird to use NETLINK
   instead of the route socket?

If 1) is false, there are two options: 2.0.9 and 2.14 both used NETLINK
or both used the route socket. If the latter, then my guess is totally wrong
and Mark's guess is much better. If the former, then we need to bisect
between 384a885111ad and 2cbd132986a7.

2) If 1) is true, then please compile 2.14 with NETLINK disabled and
   check whether the leak is gone.

If 1) and 2) are both true, the problem may have been present in
384a885111ad as well, but you were not using NETLINK then.

3) Check whether running with NETLINK on 384a885111ad reproduces the
   leak. (Be careful, as lots of bugs were fixed after 384a885111ad.)

Depending on 3) we may need to run a bisection.

Anyway, please keep us updated as you get more info, starting with 1).
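
The first part of this checklist boils down to a small decision table. Here is a minimal sketch, assuming the backend each bird build used is already known. The function name and the `rtsock`/`netlink` labels are made up for illustration; this is not a tool from the bird or FreeBSD trees.

```shell
#!/bin/sh
# Hypothetical helper: map the route backend used by bird 2.0.9 (first
# argument) and by bird 2.14 (second argument) to the next step above.
next_step() {
    case "$1:$2" in
        rtsock:netlink)
            echo "rebuild 2.14 without NETLINK and check whether the leak is gone" ;;
        netlink:netlink)
            echo "bisect between 384a885111ad and 2cbd132986a7" ;;
        rtsock:rtsock)
            echo "NETLINK is not involved; look at the routing socket code" ;;
        *)
            echo "unexpected combination: $1 -> $2" ;;
    esac
}

# Example: the upgrade switched bird from the route socket to NETLINK.
next_step rtsock netlink
```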
Comment 4 Marek Zarychta 2024-02-10 07:08:09 UTC
Now that only the FreeBSD 13, 14 and CURRENT branches are supported, and all of them have the reworked routing stack with NETLINK support included, bird2-netlink is better suited to run on FreeBSD and should probably become the default flavor of the net/bird2 port. The transition is important to avoid such situations in the future.

The netlink flavor supports ECMP, its memory footprint is much lower than that of the rtsock version, and it runs with the same config file, though small config changes are recommended. The user experience with bird2-netlink is better since it can run undisturbed for months on FreeBSD 13.2+ without any observable drawbacks.