Bug 255971

Summary: FreeBSD 13.0 Stable r3822450 hangs with error pfctl: DIOCGETRULE: Cannot allocate memory
Product: Base System Reporter: Rumen Palov <rpalov>
Component: kernAssignee: Kristof Provost <kp>
Status: Closed FIXED    
Severity: Affects Some People CC: flo, kp
Priority: --- Keywords: regression
Version: 13.0-STABLE   
Hardware: amd64   
OS: Any   
Attachments:
Description Flags
Truss otput first 200l rows , 12 MB, bziped
none
Truss otput LAST 200K rows , 12 MB, bziped
none
Truss output of pfctl -ss , pfctl -sn , pfctl -sr, tar.bz2, 13M none

Description Rumen Palov 2021-05-18 10:05:44 UTC
Hello all,

We have upgraded one of our servers to FreeBSD 13.0 Stable last week. 

The server have nagios/nrpe sensor, checking the current states, rules and nat entries in PF namespace. 

After an 12-14 hours uptime server hangs. 

No messages in system log, debug log and so on. 

To prove our suspicions, we start the sensor script which is responsible for the pf states check, in infinite loop. The server hangs in next 10-15 minutes. This is 100% reproducible. 

The main part of the scirpt is :

pfctl -sr
pfctl -sn
pfctl -ss

Each command return value is compared with established limits. 

We are at your service if you need more information or details.

Cheers
Rumen Palov
Comment 1 Rumen Palov 2021-05-18 10:08:11 UTC
When the server hangs with manually script execute the error which we face is: 

pfctl: DIOCGETRULE: Cannot allocate memory
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2021-05-18 21:02:13 UTC
(In reply to Rumen Palov from comment #1)
That's ... very strange. That should be ENOMEM, but DIOCGETRULE doesn't actually ever return that error code.

Are you monitoring memory use on that machine?

What does truss show for the command that returns the 'Cannot allocate memory' error?
Comment 3 Rumen Palov 2021-05-20 13:37:07 UTC
Hello Kristof ,

Yes we have memory monitoring. 

The amount of free RAM before server freeze is aroud 350MB from 16G total RAM. 

The swap is around 10-15MB usage. 

How do you recommend to execute truss command ? When I ran it with -o -m -f to catch the forks the output file become around 16-17G from command execution until Server Freeze. 

Cheers
Rumen Palov
Comment 4 Kristof Provost freebsd_committer freebsd_triage 2021-05-20 14:09:03 UTC
(In reply to Rumen Palov from comment #3)
That's also odd, but possibly useful. A 16GB truss log is very unexpected, so I'd speculate that it's somehow ending up in an infinite loop.

Can you post the first 5 or 10 MB of the log somewhere? I'd be very surprised if there's anything other than endless repetition of basically the same information after that (but do take a look near the end to confirm that).
Comment 5 Rumen Palov 2021-05-20 14:20:31 UTC
Created attachment 225124 [details]
Truss otput first 200l rows , 12 MB, bziped
Comment 6 Rumen Palov 2021-05-20 14:24:00 UTC
The truss is running one perl script which execute 

pfctl -sr
pfctl -sn
pfctl -ss

in endless loop until the Server Hangs, so yes the file is full with repeating content. 

This endless loop execution is the way we force the Server to become in hang state, without to wait 12 or 14 hours. It happens for 15-30 minutes, after we ran the script. I guess that the same count of this script is executed by the nrpe for 12 or 14 hours, which we approach in 30 minutes with endless execution. 

I have attached one bziped file with first 200k rows, 12 MB
Comment 7 Kristof Provost freebsd_committer freebsd_triage 2021-05-20 14:29:22 UTC
(In reply to Rumen Palov from comment #6)
Ah, I misunderstood your test setup. In that case the last few thousand lines might be more interesting. Depending on how the system stops working, of course.

When you say it hangs, how are you diagnosing this? Loss of SSH access, or do you not get any response from the console any more? (It's more likely for pf issues to break networking than it is for it to totally freeze the system.)

I'm going to assume that comment #1 means you can access the console when it's in the bad state.
If the console is still responsive a forced core dump (sysctl debug.kdb.panic=1) is likely to be useful.

A truss run over a single pfctl that produces "pfctl: DIOCGETRULE: Cannot allocate memory" should also be useful.
Comment 8 Rumen Palov 2021-05-20 14:38:58 UTC
When the server is in the hang state, we can enter login username in system console, press enter and never receive password prompt. 

The machine is still having ICMP working, but we can not login to it via ssh or system console. 

It look like heavy swap situation.

Also we do not have this error: DIOCGETRULE .

In some cases of running the hang force script we do not have any logs: no one one the running ssh console and nothing in system logs. 

Sometimes we have:

pid was killed: out of swap space

or
 
swap_pager: cannot allocate bio

I will attach txt file with last 200K rows
Comment 9 Rumen Palov 2021-05-20 14:39:38 UTC
Created attachment 225125 [details]
Truss otput LAST 200K rows , 12 MB, bziped
Comment 10 Kristof Provost freebsd_committer freebsd_triage 2021-05-20 14:39:53 UTC
(In reply to Rumen Palov from comment #8)
I don't understand where comment #1 came from then. When did that happen?
Comment 11 Rumen Palov 2021-05-20 14:42:37 UTC
(In reply to Kristof Provost from comment #10)
Sorry, I missed one part of the sentence: 

We do not have this error each time: DIOCGETRULE.
Comment 12 Rumen Palov 2021-05-21 11:36:15 UTC
Hello Kristof,

I have truss files for each type of pfctl execution. 

I attach them here. 

For experiment, we disable pf state nagios sensor and since then the server is stable. No one freeze / hang.
Comment 13 Rumen Palov 2021-05-21 11:37:31 UTC
Created attachment 225147 [details]
Truss output of  pfctl -ss , pfctl -sn , pfctl -sr, tar.bz2, 13M
Comment 14 Kristof Provost freebsd_committer freebsd_triage 2021-05-22 09:42:54 UTC
Okay, I'm starting to make sense of this.

pfctl -sr leaks memory on every invocation. I was confused by DIOCGETRULE, because it's not that call. It's DIOCGETRULENV, but the pfctl error message is still DIOCGETRULE.

That also means that 13.0 is not affected (yay!). DIOCGETRULENV is relatively new code, but I don't see an obvious leak path right now.
At least it's trivial to reproduce, so we'll get there in the next couple of days.

You'll probably want to stop calling `pfctl -sr` or `pfctl -sa` for now. That'll mitigate your freezing issue until we can figure the leak out and fix it.
Comment 15 Kristof Provost freebsd_committer freebsd_triage 2021-05-22 10:37:00 UTC
(In reply to Kristof Provost from comment #14)
Listing the states leaks too. I've identified the source and expect to have a patch on Monday.

In the mean time: either don't poll, or build or download a releng/13.0 pfctl and use that (it'll use the pre-nvlist ioctls, and those don't leak).
Comment 16 commit-hook freebsd_committer freebsd_triage 2021-05-24 16:46:03 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4483fb47735c29408c72045469c9c4b3e549668b

commit 4483fb47735c29408c72045469c9c4b3e549668b
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-05-24 06:32:16 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-05-24 13:56:24 +0000

    pf: fix ioctl() memory leak

    When we create an nvlist and insert it into another nvlist we must
    remember to destroy it. The nvlist_add_nvlist() function makes a copy,
    just like nvlist_add_string() makes a copy of the string. If we don't
    we're leaking memory on every (nvlist-based) ioctl() call.

    While here remove two redundant 'break' statements.

    PR:             255971
    MFC after:      3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

 sys/netpfil/pf/pf_ioctl.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
Comment 17 commit-hook freebsd_committer freebsd_triage 2021-05-27 10:21:22 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c3e4b38f4932d0ce457508b3893324a520e0dc30

commit c3e4b38f4932d0ce457508b3893324a520e0dc30
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-05-24 06:32:16 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-05-27 10:09:04 +0000

    pf: fix ioctl() memory leak

    When we create an nvlist and insert it into another nvlist we must
    remember to destroy it. The nvlist_add_nvlist() function makes a copy,
    just like nvlist_add_string() makes a copy of the string. If we don't
    we're leaking memory on every (nvlist-based) ioctl() call.

    While here remove two redundant 'break' statements.

    PR:             255971
    MFC after:      3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 4483fb47735c29408c72045469c9c4b3e549668b)

 sys/netpfil/pf/pf_ioctl.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
Comment 18 commit-hook freebsd_committer freebsd_triage 2021-05-27 10:21:23 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ff4447ac31ca1ee54ac7e2a01ba11c3bc8cafdca

commit ff4447ac31ca1ee54ac7e2a01ba11c3bc8cafdca
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-05-24 06:32:16 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-05-27 07:12:03 +0000

    pf: fix ioctl() memory leak

    When we create an nvlist and insert it into another nvlist we must
    remember to destroy it. The nvlist_add_nvlist() function makes a copy,
    just like nvlist_add_string() makes a copy of the string. If we don't
    we're leaking memory on every (nvlist-based) ioctl() call.

    While here remove two redundant 'break' statements.

    PR:             255971
    MFC after:      3 days
    Sponsored by:   Rubicon Communications, LLC ("Netgate")

    (cherry picked from commit 4483fb47735c29408c72045469c9c4b3e549668b)

 sys/netpfil/pf/pf_ioctl.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
Comment 19 Rumen Palov 2021-06-01 08:10:44 UTC
Hello Kristof,

We update the server with issue pointed here 

In our state table we have around 150k states and the pfctl -ss never ends and never return prompt. 

I can not kill it event. 

Cheers
Comment 20 Florian Smeets freebsd_committer freebsd_triage 2021-06-01 14:56:47 UTC
(In reply to Rumen Palov from comment #19)

Yeah, something is weird:

# /usr/bin/time pfctl -ss|wc -l
       53.97 real         0.10 user        53.83 sys
    2276

Kernel and world are:

stable/13-n245797-cfeeb57166d: Sat May 29 20:58:25 CEST 2021
Comment 21 Rumen Palov 2021-06-01 15:15:58 UTC
Our is:

13.0-STABLE FreeBSD 13.0-STABLE #6 r3823049: Mon May 31 19:08:28 EEST 2021