Bug 240532 - pf stops purging IPv6 FIN_WAIT_2 states?
Status: Closed Not A Bug
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-pf (Nobody)
URL:
Keywords:
Duplicates: 240533
Depends on:
Blocks:
 
Reported: 2019-09-12 11:23 UTC by Peter Eriksson
Modified: 2019-09-16 09:46 UTC (History)

See Also:


Attachments
pf.conf (2.55 KB, text/plain)
2019-09-12 11:23 UTC, Peter Eriksson

Description Peter Eriksson 2019-09-12 11:23:23 UTC
Created attachment 207418
pf.conf

I just noticed that our production servers seem to be accumulating FIN_WAIT_2 state entries in PF (at least IPv6 ones).

# pfctl -ss | egrep FIN_WAIT_2 | egrep -v 2001 | wc -l
     386
# pfctl -ss | egrep FIN_WAIT_2 | egrep  2001 | wc -l
   23141
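
Matching on the string 2001 depends on the site's IPv6 prefix, and it would also hit any IPv4 address or port that happens to contain 2001. A slightly more general sketch follows, assuming that pfctl -ss prints IPv6 endpoints as address[port] and IPv4 endpoints as address:port. The function name count_fin_wait_2 is made up for illustration; it only reads state lines from stdin.

```shell
# Count FIN_WAIT_2 states per address family from "pfctl -ss" output on stdin.
# Assumption: IPv6 endpoints are printed as addr[port], IPv4 as addr:port.
count_fin_wait_2() {
    awk '
        /FIN_WAIT_2/ {
            if ($0 ~ /\[[0-9]+\]/) v6++    # bracketed port => IPv6 state
            else v4++                      # otherwise treat it as IPv4
        }
        END { printf "ipv4=%d ipv6=%d\n", v4, v6 }
    '
}
```

Usage would be: pfctl -ss | count_fin_wait_2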

Using the workaround from bug 222126 seems to wake up the "pf purge" kernel thread again (at least for a while):

# echo "set timeout interval 5" | pfctl -mf -

FreeBSD 11.2-RELEASE-p10
Dell PowerEdge R730xd
256GB RAM
(NFS & SMB fileservers)
Comment 1 Peter Eriksson 2019-09-12 11:56:26 UTC
*** Bug 240533 has been marked as a duplicate of this bug. ***
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2019-09-15 08:32:16 UTC
Can you repeat the tests from #222126? 
Specifically the dtrace script in comment #1 and the procstat in comment #6?
Comment 3 Peter Eriksson 2019-09-16 06:13:57 UTC
I ran the tests you mentioned, plus a few others, and it looks like the "accumulation" was due to Linux (Ubuntu 18.04, for what it's worth) NFS clients bombarding our servers with new TCP connections (which they tore down just as quickly): roughly 200-400 new connections/s, each with a unique client source port number. No surprise those states accumulated quickly.
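
As a rough sanity check (a back-of-the-envelope sketch, not a measurement): if closed connections linger in the state table for a fixed timeout, the steady-state count is about connection rate times timeout. The 45 seconds used below is pf's documented default for tcp.finwait and is an assumption here, since the timeouts actually in effect on these servers are not shown in this report. The helper name expected_states is made up for illustration.

```shell
# Steady-state estimate: states ~= new connections per second * state timeout.
# Assumption: the tcp.finwait timeout is pf's default of 45 seconds.
expected_states() {
    rate=$1       # new connections per second
    timeout=$2    # seconds a closed connection's state is kept
    echo $((rate * timeout))
}

expected_states 200 45    # prints 9000
expected_states 400 45    # prints 18000
```

That is the same order of magnitude as the 23141 IPv6 FIN_WAIT_2 states counted above, so the purge thread does not need to be stuck to explain the numbers.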

Exactly why those Linux clients are doing this is a bit unclear, but it looks like it has something to do with users having their home directory mounted via NFSv4 with sec=krb5 and then their Kerberos tickets expiring (on the client), possibly while they were running "evolution", which has a number of files/databases open in the user's home directory. This seems to cause "rpc.gssd" (on the client) to go into a spin (100% CPU) and somehow causes this endless stream of new TCP connections...

The stream of IP packets we are seeing looks like this:

0.001280 Client -> Server SYN
0.001289 Server -> Client SYN+ACK
0.001516 Client -> Server ACK
0.003609 Client -> Server FIN+ACK
0.003615 Server -> Client ACK
0.003620 Server -> Client FIN+ACK
0.003841 Client -> Server ACK
<repeat 400 times/s>

Anyway, I don't think this is a problem in FreeBSD/pf, so we can close this bug. It looks more like (yet another) Linux bug.

(I wonder if it would be possible to throttle misbehaving clients like these somehow, perhaps some rate-limiting in PF could do the trick? Hmm...)
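
One possible direction, as an untested sketch only: pf's stateful tracking options include max-src-conn-rate, which together with overload can move a source that opens connections too quickly into a table that is then blocked. The table name nfs_flood, the helper name nfs_ratelimit_rules, the 100 connections per 10 seconds default and the assumption that NFS is on TCP port 2049 are all illustrative; none of them comes from the attached pf.conf. Note that this cuts the offending client off from NFS completely until its table entry is removed, which may be more drastic than throttling.

```shell
# Print a pf.conf fragment that limits new TCP connections per source address.
# All names and numbers are illustrative assumptions, not taken from the
# attached pf.conf.
nfs_ratelimit_rules() {
    conns=${1:-100}    # connections allowed per interval
    secs=${2:-10}      # interval length in seconds
    cat <<EOF
table <nfs_flood> persist
block in quick from <nfs_flood>
pass in proto tcp to port 2049 keep state (max-src-conn-rate ${conns}/${secs}, overload <nfs_flood> flush global)
EOF
}
```

The output could be syntax-checked without loading it via: nfs_ratelimit_rules 100 10 | pfctl -nf -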
Comment 4 Kristof Provost freebsd_committer freebsd_triage 2019-09-16 09:46:28 UTC
Not a FreeBSD/pf issue, as per comment #3.