Hello all,

Last week we upgraded one of our servers to FreeBSD 13.0-STABLE. The server runs a nagios/nrpe sensor that checks the current states, rules and NAT entries in PF. After 12-14 hours of uptime the server hangs. There are no messages in the system log, debug log and so on.

To confirm our suspicion, we ran the sensor script responsible for the pf state check in an infinite loop. The server hung within the next 10-15 minutes. This is 100% reproducible.

The main part of the script is:

    pfctl -sr
    pfctl -sn
    pfctl -ss

The return value of each command is compared with established limits. We are at your service if you need more information or details.

Cheers
Rumen Palov
When the server hangs while we run the script manually, the error we see is: pfctl: DIOCGETRULE: Cannot allocate memory
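For readers who want to picture the kind of check described above, here is a minimal sketch. It is not the reporter's script (that one is written in perl and was not posted). It assumes the sensor counts the lines each pfctl listing prints and compares the count with a limit. The helper name check_count and the limits in the usage comments are hypothetical.

```sh
# Hypothetical helper: read a pfctl listing on stdin, count its lines and
# compare the count with a limit. Uses the usual Nagios plugin exit codes
# (0 = OK, 2 = CRITICAL).
check_count() {
    label=$1
    limit=$2
    # BSD wc pads its output with spaces, so strip them.
    count=$(wc -l | tr -d ' ')
    if [ "$count" -gt "$limit" ]; then
        echo "CRITICAL ${label}=${count} limit=${limit}"
        return 2
    fi
    echo "OK ${label}=${count} limit=${limit}"
    return 0
}

# Usage (needs root and pf enabled; the limits are placeholders):
#   pfctl -sr | check_count rules 500
#   pfctl -sn | check_count nat 100
#   pfctl -ss | check_count states 200000
```

Each run of such a sensor issues the three pfctl listings once, which is the access pattern the rest of this report is about.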
(In reply to Rumen Palov from comment #1) That's ... very strange. That should be ENOMEM, but DIOCGETRULE doesn't actually ever return that error code. Are you monitoring memory use on that machine? What does truss show for the command that returns the 'Cannot allocate memory' error?
Hello Kristof, Yes, we have memory monitoring. The amount of free RAM before the server freezes is around 350MB out of 16G total RAM. Swap usage is around 10-15MB. How do you recommend running the truss command? When I ran it with -o -m -f to catch the forks, the output file grew to around 16-17G between the start of the command and the server freeze. Cheers Rumen Palov
(In reply to Rumen Palov from comment #3) That's also odd, but possibly useful. A 16GB truss log is very unexpected, so I'd speculate that it's somehow ending up in an infinite loop. Can you post the first 5 or 10 MB of the log somewhere? I'd be very surprised if there's anything other than endless repetition of basically the same information after that (but do take a look near the end to confirm that).
Created attachment 225124 [details] Truss output, first 200k rows, 12 MB, bzipped
truss is running one perl script which executes pfctl -sr, pfctl -sn and pfctl -ss in an endless loop until the server hangs, so yes, the file is full of repeating content. Running this endless loop is how we force the server into the hang state without waiting 12 or 14 hours. It happens 15-30 minutes after we start the script. I guess that nrpe executes the script about as many times in 12 or 14 hours as the endless loop does in 30 minutes. I have attached a bzipped file with the first 200k rows, 12 MB.
(In reply to Rumen Palov from comment #6) Ah, I misunderstood your test setup. In that case the last few thousand lines might be more interesting. Depending on how the system stops working, of course. When you say it hangs, how are you diagnosing this? Loss of SSH access, or do you not get any response from the console any more? (It's more likely for pf issues to break networking than it is for it to totally freeze the system.) I'm going to assume that comment #1 means you can access the console when it's in the bad state. If the console is still responsive a forced core dump (sysctl debug.kdb.panic=1) is likely to be useful. A truss run over a single pfctl that produces "pfctl: DIOCGETRULE: Cannot allocate memory" should also be useful.
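One way to capture a truss log for just the failing invocation, rather than for the whole loop, is sketched below. The helper name run_until_enomem, the iteration cap and the output path are assumptions for illustration. The only things taken from this thread are the pfctl command and the error text.

```sh
# Hypothetical helper: run the given command up to $1 times and stop at the
# first run whose stderr contains "Cannot allocate memory". Prints the
# iteration number of that run; returns 1 if the error never shows up.
run_until_enomem() {
    max=$1
    shift
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Send stderr into the pipe and discard stdout.
        if "$@" 2>&1 >/dev/null | grep -q 'Cannot allocate memory'; then
            echo "$i"
            return 0
        fi
    done
    return 1
}

# Usage (as root): truss rewrites the -o file on every run, so once the loop
# stops the file holds the trace of the failing pfctl only.
#   run_until_enomem 100000 truss -f -o /tmp/pfctl-sr.truss pfctl -sr
```

Because the machine may become unresponsive before the error appears, it is probably worth keeping the iteration cap low enough that the console stays usable.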
When the server is in the hang state, we can enter the login username on the system console and press enter, but we never receive the password prompt. The machine still responds to ICMP, but we cannot log in to it via ssh or the system console. It looks like a heavy swapping situation. Also we do not have this error: DIOCGETRULE. In some runs of the script that forces the hang we do not get any logs: nothing on the running ssh console and nothing in the system logs. Sometimes we have: pid was killed: out of swap space or swap_pager: cannot allocate bio. I will attach a txt file with the last 200K rows.
Created attachment 225125 [details] Truss output, LAST 200K rows, 12 MB, bzipped
(In reply to Rumen Palov from comment #8) I don't understand where comment #1 came from then. When did that happen?
(In reply to Kristof Provost from comment #10) Sorry, I missed one part of the sentence: We do not have this error each time: DIOCGETRULE.
Hello Kristof, I have truss files for each type of pfctl execution. I attach them here. As an experiment, we disabled the pf state nagios sensor, and since then the server has been stable, with no freezes or hangs.
Created attachment 225147 [details] Truss output of pfctl -ss , pfctl -sn , pfctl -sr, tar.bz2, 13M
Okay, I'm starting to make sense of this. pfctl -sr leaks memory on every invocation. I was confused by DIOCGETRULE, because it's not that call. It's DIOCGETRULENV, but the pfctl error message is still DIOCGETRULE. That also means that 13.0 is not affected (yay!). DIOCGETRULENV is relatively new code, but I don't see an obvious leak path right now. At least it's trivial to reproduce, so we'll get there in the next couple of days. You'll probably want to stop calling `pfctl -sr` or `pfctl -sa` for now. That'll mitigate your freezing issue until we can figure the leak out and fix it.
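A rough way to watch a leak like this from userland is to compare kernel malloc statistics before and after a pfctl call. The sketch below assumes that the kernel nvlist allocations are accounted under a malloc type named "nvlist" in `vmstat -m`, and that InUse is the second column. Both are assumptions to check against the actual output on the affected machine. The helper name nvlist_inuse is hypothetical.

```sh
# Hypothetical helper: read `vmstat -m` output on stdin and print the InUse
# column of the row whose type is "nvlist" (assumed name), or 0 if there is
# no such row.
nvlist_inuse() {
    awk '$1 == "nvlist" { print $2; found = 1 } END { if (!found) print 0 }'
}

# Usage (as root, on the affected machine):
#   before=$(vmstat -m | nvlist_inuse)
#   pfctl -sr > /dev/null
#   after=$(vmstat -m | nvlist_inuse)
#   echo "nvlist allocations in use: before=$before after=$after"
```

If the count keeps climbing across repeated invocations and never drops back, that would be consistent with a per-call leak as described above.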
(In reply to Kristof Provost from comment #14) Listing the states leaks too. I've identified the source and expect to have a patch on Monday. In the meantime: either don't poll, or build or download a releng/13.0 pfctl and use that (it'll use the pre-nvlist ioctls, and those don't leak).
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4483fb47735c29408c72045469c9c4b3e549668b

commit 4483fb47735c29408c72045469c9c4b3e549668b
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-05-24 06:32:16 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-05-24 13:56:24 +0000

    pf: fix ioctl() memory leak

    When we create an nvlist and insert it into another nvlist we must
    remember to destroy it. The nvlist_add_nvlist() function makes a copy,
    just like nvlist_add_string() makes a copy of the string.

    If we don't we're leaking memory on every (nvlist-based) ioctl() call.

    While here remove two redundant 'break' statements.

    PR:             255971
    MFC after:      3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

 sys/netpfil/pf/pf_ioctl.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c3e4b38f4932d0ce457508b3893324a520e0dc30

commit c3e4b38f4932d0ce457508b3893324a520e0dc30
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-05-24 06:32:16 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-05-27 10:09:04 +0000

    pf: fix ioctl() memory leak

    When we create an nvlist and insert it into another nvlist we must
    remember to destroy it. The nvlist_add_nvlist() function makes a copy,
    just like nvlist_add_string() makes a copy of the string.

    If we don't we're leaking memory on every (nvlist-based) ioctl() call.

    While here remove two redundant 'break' statements.

    PR:             255971
    MFC after:      3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 4483fb47735c29408c72045469c9c4b3e549668b)

 sys/netpfil/pf/pf_ioctl.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ff4447ac31ca1ee54ac7e2a01ba11c3bc8cafdca

commit ff4447ac31ca1ee54ac7e2a01ba11c3bc8cafdca
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-05-24 06:32:16 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-05-27 07:12:03 +0000

    pf: fix ioctl() memory leak

    When we create an nvlist and insert it into another nvlist we must
    remember to destroy it. The nvlist_add_nvlist() function makes a copy,
    just like nvlist_add_string() makes a copy of the string.

    If we don't we're leaking memory on every (nvlist-based) ioctl() call.

    While here remove two redundant 'break' statements.

    PR:             255971
    MFC after:      3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 4483fb47735c29408c72045469c9c4b3e549668b)

 sys/netpfil/pf/pf_ioctl.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
Hello Kristof, We updated the server with the fix referenced here. Our state table has around 150k states, and pfctl -ss never finishes and never returns to the prompt. I cannot even kill it. Cheers
(In reply to Rumen Palov from comment #19) Yeah, something is weird:

    # /usr/bin/time pfctl -ss|wc -l
           53.97 real         0.10 user        53.83 sys
        2276

Kernel and world are: stable/13-n245797-cfeeb57166d: Sat May 29 20:58:25 CEST 2021
Ours is: 13.0-STABLE FreeBSD 13.0-STABLE #6 r3823049: Mon May 31 19:08:28 EEST 2021