Bug 233867 - pf: Long freezes on NAT port exhaustion
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Kristof Provost
Reported: 2018-12-08 14:50 UTC by Kristof Provost
Modified: 2018-12-26 12:57 UTC

Flags:
kp: mfc-stable11+
kp: mfc-stable12+


Attachments
Test case demonstrating the problem (2.49 KB, patch)
2018-12-08 14:52 UTC, Kristof Provost

Description Kristof Provost freebsd_committer 2018-12-08 14:50:56 UTC
A user reports that, in a large NAT setup with pf, the system occasionally stops passing traffic. The freeze clears on its own after a few minutes.
Afterwards, the pf counters show an abnormally large number of searches.
Comment 1 Kristof Provost freebsd_committer 2018-12-08 14:52:44 UTC
Created attachment 199955
Test case demonstrating the problem

This test case provokes the problem.
It sets up NAT with only two usable ports, then creates three connections.
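
The scenario can be modelled outside the kernel. What follows is a minimal sketch in C, not the attached test and not pf code: the struct, the function names, and the port range 30000-30001 are assumptions chosen purely for illustration. It shows the behaviour the test expects, namely that a third allocation from a two-port pool fails at once instead of blocking.

```c
/* Hypothetical toy allocator; none of these names come from pf. */
#define PORT_LOW  30000                     /* assumed range, two ports wide */
#define PORT_HIGH 30001
#define NPORTS    (PORT_HIGH - PORT_LOW + 1)

struct nat_ports {
    unsigned char busy[NPORTS];             /* 1 if the port is in use */
};

/* Return a free port in [PORT_LOW, PORT_HIGH], or -1 if all are taken. */
int nat_port_alloc(struct nat_ports *np)
{
    for (int i = 0; i < NPORTS; i++) {
        if (!np->busy[i]) {
            np->busy[i] = 1;
            return PORT_LOW + i;
        }
    }
    return -1;                              /* exhausted: fail, do not wait */
}

/* Release a port, standing in for a state that has expired. */
void nat_port_free(struct nat_ports *np, int port)
{
    if (port >= PORT_LOW && port <= PORT_HIGH)
        np->busy[port - PORT_LOW] = 0;
}
```

With a zero-initialised pool, three successive calls to nat_port_alloc() yield 30000, 30001 and -1. A port only becomes available again after nat_port_free(), which mirrors the report that traffic resumes once a state times out.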
Comment 2 Kristof Provost freebsd_committer 2018-12-08 14:57:10 UTC
The system loses network connectivity when it can't find a free source port in pf_get_sport(). It keeps calling pf_map_addr(), trying to get a new IP to check for available ports.

I believe this problem was introduced by the patch in PR# 184003.

Note that we're running NAT with PF_POOL_STICKYADDR, so pf_map_addr() finds a src_node and takes its early return. As a result it always returns the same IP, and pf_get_sport() loops through the same ports again and again. The loop only ends when a state times out and a port becomes free.
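
The interaction between the two functions can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in pf_lb.c: all toy_* names, the -1 "no address yet" sentinel, and the MAX_ROUNDS cap are hypothetical, and the cap exists only so that the looping variant terminates in user space. The check_supplied flag stands in for the idea of comparing the address the caller already holds against the sticky one.

```c
#include <stdbool.h>

/* Toy model, not kernel code: every name below is hypothetical. */
#define NPORTS     2        /* ports per address */
#define NADDRS     1        /* a single translation address in the pool */
#define MAX_ROUNDS 1000     /* cap so the looping variant terminates here */

struct toy_pool {
    bool in_use[NADDRS][NPORTS];    /* which (address, port) pairs are taken */
    int  sticky_addr;               /* address remembered for this source */
};

/*
 * Model of address selection. Returns 0 and stores an address in *addr,
 * or 1 when no further address is available.
 * check_supplied == false: the sticky path always hands back the same address.
 * check_supplied == true:  fail if the caller already holds that address.
 */
int toy_map_addr(const struct toy_pool *p, int *addr, bool check_supplied)
{
    if (check_supplied && *addr == p->sticky_addr)
        return 1;                   /* caller already tried it: give up */
    *addr = p->sticky_addr;         /* early return for the sticky source */
    return 0;
}

/*
 * Model of the port search. Returns how many passes over the port range
 * were made; stores the chosen port index in *port, or -1 on failure.
 */
int toy_get_sport(struct toy_pool *p, bool check_supplied, int *port)
{
    int addr = -1;                  /* sentinel: no address tried yet */
    int rounds = 0;

    *port = -1;
    if (toy_map_addr(p, &addr, check_supplied) != 0)
        return rounds;
    while (rounds < MAX_ROUNDS) {
        rounds++;
        for (int i = 0; i < NPORTS; i++) {
            if (!p->in_use[addr][i]) {
                p->in_use[addr][i] = true;
                *port = i;
                return rounds;
            }
        }
        /* Every port on this address is busy: ask for another address. */
        if (toy_map_addr(p, &addr, check_supplied) != 0)
            break;
    }
    return rounds;
}
```

Once both ports are taken, a call with check_supplied set to false keeps retrying the same address until it hits MAX_ROUNDS, whereas a call with it set to true gives up after a single pass. In the kernel there is no such cap, which is consistent with the freeze lasting until a state expires.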
Comment 3 Kristof Provost freebsd_committer 2018-12-08 15:06:45 UTC
Proposed fix in: https://reviews.freebsd.org/D18483
Comment 4 commit-hook freebsd_committer 2018-12-12 20:15:36 UTC
A commit references this bug:

Author: kp
Date: Wed Dec 12 20:15:06 UTC 2018
New revision: 341998
URL: https://svnweb.freebsd.org/changeset/base/341998

Log:
  pf: Fix endless loop on NAT exhaustion with sticky-address

  When we try to find a source port in pf_get_sport() it's possible that
  all available source ports will be in use. In that case we call
  pf_map_addr() to try to find a new source IP to try from. If there are
  no more available source IPs pf_map_addr() will return 1 and we stop
  trying.

  However, if sticky-address is set we'll always return the same IP
  address, even if we've already tried that one.
  We need to check the supplied address, because if that's the one we'd
  set it means pf_get_sport() has already tried it, and we should error
  out rather than keep trying.

  PR:		233867
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D18483

Changes:
  head/sys/netpfil/pf/pf.c
  head/sys/netpfil/pf/pf_lb.c
Comment 5 commit-hook freebsd_committer 2018-12-12 20:19:41 UTC
A commit references this bug:

Author: kp
Date: Wed Dec 12 20:19:19 UTC 2018
New revision: 341999
URL: https://svnweb.freebsd.org/changeset/base/341999

Log:
  pf tests: NAT exhaustion test

  It's been reported that pf doesn't handle running out of available ports
  for NAT correctly. It freezes until a state expires and it can find a
  free port.
  Test for this, by setting up a situation where only two ports are
  available for NAT and then attempting to create three connections.

  If successful the third connection will fail immediately. In an
  incorrect case the connection attempt will freeze, also freezing all
  interaction with pf through pfctl and trigger timeout.

  PR:		233867
  MFC after:	2 weeks

Changes:
  head/tests/sys/netpfil/pf/Makefile
  head/tests/sys/netpfil/pf/nat.sh
Comment 6 commit-hook freebsd_committer 2018-12-26 12:54:56 UTC
A commit references this bug:

Author: kp
Date: Wed Dec 26 12:54:25 UTC 2018
New revision: 342542
URL: https://svnweb.freebsd.org/changeset/base/342542

Log:
  MFC r341998:

  pf: Fix endless loop on NAT exhaustion with sticky-address

  When we try to find a source port in pf_get_sport() it's possible that
  all available source ports will be in use. In that case we call
  pf_map_addr() to try to find a new source IP to try from. If there are
  no more available source IPs pf_map_addr() will return 1 and we stop
  trying.

  However, if sticky-address is set we'll always return the same IP
  address, even if we've already tried that one.
  We need to check the supplied address, because if that's the one we'd
  set it means pf_get_sport() has already tried it, and we should error
  out rather than keep trying.

  PR:		233867

Changes:
_U  stable/12/
  stable/12/sys/netpfil/pf/pf.c
  stable/12/sys/netpfil/pf/pf_lb.c
Comment 7 commit-hook freebsd_committer 2018-12-26 12:54:58 UTC
A commit references this bug:

Author: kp
Date: Wed Dec 26 12:54:28 UTC 2018
New revision: 342543
URL: https://svnweb.freebsd.org/changeset/base/342543

Log:
  MFC r341998:

  pf: Fix endless loop on NAT exhaustion with sticky-address

  When we try to find a source port in pf_get_sport() it's possible that
  all available source ports will be in use. In that case we call
  pf_map_addr() to try to find a new source IP to try from. If there are
  no more available source IPs pf_map_addr() will return 1 and we stop
  trying.

  However, if sticky-address is set we'll always return the same IP
  address, even if we've already tried that one.
  We need to check the supplied address, because if that's the one we'd
  set it means pf_get_sport() has already tried it, and we should error
  out rather than keep trying.

  PR:		233867

Changes:
_U  stable/11/
  stable/11/sys/netpfil/pf/pf.c
  stable/11/sys/netpfil/pf/pf_lb.c
Comment 8 commit-hook freebsd_committer 2018-12-26 12:56:01 UTC
A commit references this bug:

Author: kp
Date: Wed Dec 26 12:55:36 UTC 2018
New revision: 342544
URL: https://svnweb.freebsd.org/changeset/base/342544

Log:
  MFC r341999:

  pf tests: NAT exhaustion test

  It's been reported that pf doesn't handle running out of available ports
  for NAT correctly. It freezes until a state expires and it can find a
  free port.
  Test for this, by setting up a situation where only two ports are
  available for NAT and then attempting to create three connections.

  If successful the third connection will fail immediately. In an
  incorrect case the connection attempt will freeze, also freezing all
  interaction with pf through pfctl and trigger timeout.

  PR:		233867

Changes:
_U  stable/12/
  stable/12/tests/sys/netpfil/pf/Makefile
  stable/12/tests/sys/netpfil/pf/nat.sh