A user reports that, in a large NAT setup with pf, the system will from time to time stop passing traffic. This resolves itself after a few minutes. Afterwards the pf counters show an abnormally large number of searches.
Created attachment 199955: Test case demonstrating the problem. The test case sets up NAT with only two usable ports, then creates three connections.
The system loses network connectivity when pf_get_sport() cannot find a free source port. It keeps calling pf_map_addr() to get a new IP address to check for available ports. I believe this problem was introduced by the patch in PR# 184003. Note that we're running NAT with PF_POOL_STICKYADDR, so pf_map_addr() finds a src_node and takes the early return. This means it always returns the same IP address, and pf_get_sport() loops through the available ports again and again. The loop continues until a state times out and a port becomes free.
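To make the loop easier to see, here is a toy Python model of it. This is only a sketch and not the kernel code: the function names loosely mirror pf_get_sport() and pf_map_addr(), while `NAT_IP`, `PORTS` and the `max_rounds` cap are made up for the illustration (the loop described above has no cap, which is exactly the problem).

```python
NAT_IP = "198.51.100.1"   # hypothetical translation address
PORTS = (1024, 1025)      # only two usable ports, as in the attached test case


def map_addr_sticky(sticky_ip):
    """Model of the early return: a src_node exists, so the same IP comes back.

    Returns (error, ip); error 0 means "here is an address to try".
    """
    return 0, sticky_ip


def get_sport_buggy(in_use, max_rounds=10_000):
    """Search for a free (ip, port) pair; give up only at the artificial cap.

    Returns (pair_or_None, rounds). The cap exists only so that this model
    terminates; it is an assumption of the sketch, not part of pf.
    """
    ip = NAT_IP
    for rounds in range(1, max_rounds + 1):
        for port in PORTS:
            if (ip, port) not in in_use:
                return (ip, port), rounds
        # Every port on this IP is taken: ask for another address.
        error, ip = map_addr_sticky(NAT_IP)
        if error:
            return None, rounds
    return None, max_rounds


in_use = {(NAT_IP, 1024), (NAT_IP, 1025)}  # two existing connections
print(get_sport_buggy(in_use))             # third connection spins until the cap
in_use.discard((NAT_IP, 1025))             # a state times out
print(get_sport_buggy(in_use))             # a free port is found on the first pass
```

In the model, as in the report, the search only ends once an entry leaves `in_use`, because the address lookup never reports an error.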
Proposed fix in: https://reviews.freebsd.org/D18483
A commit references this bug:

Author: kp
Date: Wed Dec 12 20:15:06 UTC 2018
New revision: 341998
URL: https://svnweb.freebsd.org/changeset/base/341998

Log:
  pf: Fix endless loop on NAT exhaustion with sticky-address

  When we try to find a source port in pf_get_sport() it's possible that all available source ports will be in use. In that case we call pf_map_addr() to try to find a new source IP to try from. If there are no more available source IPs pf_map_addr() will return 1 and we stop trying.

  However, if sticky-address is set we'll always return the same IP address, even if we've already tried that one. We need to check the supplied address, because if that's the one we'd set it means pf_get_sport() has already tried it, and we should error out rather than keep trying.

  PR: 233867
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D18483

Changes:
  head/sys/netpfil/pf/pf.c
  head/sys/netpfil/pf/pf_lb.c
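The idea in the log message can be sketched with a small Python model. This is an illustration under assumptions, not the committed C code: the function names only echo pf_map_addr() and pf_get_sport(), and `NAT_IP` and `PORTS` are invented values. The one point it tries to capture is that the address lookup compares the sticky address with the address the caller supplied, and returns an error when they match.

```python
NAT_IP = "198.51.100.1"   # hypothetical translation address
PORTS = (1024, 1025)      # two usable ports


def map_addr_sticky_fixed(sticky_ip, supplied_ip):
    """Return (error, ip).

    If the sticky address is the one the caller just tried, report an
    error (1) instead of handing the same address back again.
    """
    if supplied_ip == sticky_ip:
        return 1, supplied_ip
    return 0, sticky_ip


def get_sport_fixed(in_use, ports, sticky_ip):
    """Return a free (ip, port) pair, or None once every address was tried."""
    ip = None  # no address tried yet
    while True:
        error, ip = map_addr_sticky_fixed(sticky_ip, ip)
        if error:
            return None  # no new address to try: fail instead of looping
        for port in ports:
            if (ip, port) not in in_use:
                return ip, port


# Two ports, three connection attempts.
in_use = set()
results = []
for _ in range(3):
    pair = get_sport_fixed(in_use, PORTS, NAT_IP)
    if pair is not None:
        in_use.add(pair)
    results.append(pair)
print(results)
```

With two ports available, the first two attempts each get a port and the third returns `None` right away rather than waiting for a state to expire.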
A commit references this bug:

Author: kp
Date: Wed Dec 12 20:19:19 UTC 2018
New revision: 341999
URL: https://svnweb.freebsd.org/changeset/base/341999

Log:
  pf tests: NAT exhaustion test

  It's been reported that pf doesn't handle running out of available ports for NAT correctly. It freezes until a state expires and it can find a free port.

  Test for this, by setting up a situation where only two ports are available for NAT and then attempting to create three connections.

  If successful the third connection will fail immediately. In an incorrect case the connection attempt will freeze, also freezing all interaction with pf through pfctl and trigger timeout.

  PR: 233867
  MFC after: 2 weeks

Changes:
  head/tests/sys/netpfil/pf/Makefile
  head/tests/sys/netpfil/pf/nat.sh
A commit references this bug:

Author: kp
Date: Wed Dec 26 12:54:25 UTC 2018
New revision: 342542
URL: https://svnweb.freebsd.org/changeset/base/342542

Log:
  MFC r341998: pf: Fix endless loop on NAT exhaustion with sticky-address

  When we try to find a source port in pf_get_sport() it's possible that all available source ports will be in use. In that case we call pf_map_addr() to try to find a new source IP to try from. If there are no more available source IPs pf_map_addr() will return 1 and we stop trying.

  However, if sticky-address is set we'll always return the same IP address, even if we've already tried that one. We need to check the supplied address, because if that's the one we'd set it means pf_get_sport() has already tried it, and we should error out rather than keep trying.

  PR: 233867

Changes:
  _U stable/12/
  stable/12/sys/netpfil/pf/pf.c
  stable/12/sys/netpfil/pf/pf_lb.c
A commit references this bug:

Author: kp
Date: Wed Dec 26 12:54:28 UTC 2018
New revision: 342543
URL: https://svnweb.freebsd.org/changeset/base/342543

Log:
  MFC r341998: pf: Fix endless loop on NAT exhaustion with sticky-address

  When we try to find a source port in pf_get_sport() it's possible that all available source ports will be in use. In that case we call pf_map_addr() to try to find a new source IP to try from. If there are no more available source IPs pf_map_addr() will return 1 and we stop trying.

  However, if sticky-address is set we'll always return the same IP address, even if we've already tried that one. We need to check the supplied address, because if that's the one we'd set it means pf_get_sport() has already tried it, and we should error out rather than keep trying.

  PR: 233867

Changes:
  _U stable/11/
  stable/11/sys/netpfil/pf/pf.c
  stable/11/sys/netpfil/pf/pf_lb.c
A commit references this bug:

Author: kp
Date: Wed Dec 26 12:55:36 UTC 2018
New revision: 342544
URL: https://svnweb.freebsd.org/changeset/base/342544

Log:
  MFC r341999: pf tests: NAT exhaustion test

  It's been reported that pf doesn't handle running out of available ports for NAT correctly. It freezes until a state expires and it can find a free port.

  Test for this, by setting up a situation where only two ports are available for NAT and then attempting to create three connections.

  If successful the third connection will fail immediately. In an incorrect case the connection attempt will freeze, also freezing all interaction with pf through pfctl and trigger timeout.

  PR: 233867

Changes:
  _U stable/12/
  stable/12/tests/sys/netpfil/pf/Makefile
  stable/12/tests/sys/netpfil/pf/nat.sh