Bug 217997 - [pf] orphaned entries in src-track
Summary: [pf] orphaned entries in src-track
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 10.3-RELEASE
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-pf mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-22 10:12 UTC by Robert Schulze
Modified: 2017-05-11 14:38 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Robert Schulze 2017-03-22 10:12:58 UTC
There seems to be a flaw in the src-track cleanup code. This has been discovered since 10.3.

pf.conf is (stripped down to the relevant parts):

-- 8< ------------------------

rdr proto tcp from any to $public-ip port http -> <www-pool> port http \
        round-robin sticky-address

table <www-pool> persist  { $www-addr, $www-addr2, $www-addr3 }

block in all
block out all

pass quick proto tcp from any to <www-pool> port 80 \
        keep state \
        ( source-track rule, max 8192, max-src-states 96, \
          tcp.closing 20, tcp.finwait 15 )

-- 8< ------------------------

The problem is, that src-track table grows until no more entries can be inserted. Although there are no states from a sample ip-address in the state table, there are still references in the src-track table:

# pfctl -vsS | grep -A1 $example-address
$example-address -> $www-addr ( states 4, connections 0, rate 0.0/0s )
   age 01:47:25, 4808 pkts, 1713437 bytes, rdr rule 0

# pfctl -sS | grep $example-address
(nothing shown)

To circumvent this problem, we have to parse pfctl output and kill over-aged source tracking entries manually.

Maybe someone can track this down.

Regards,
Robert Schulze
Comment 1 Max 2017-03-24 06:38:51 UTC
(In reply to Robert Schulze from comment #0)
Hello, Robert.

>The problem is, that src-track table grows until no more entries can be
> inserted. Although there are no states from a sample ip-address in the state 
>table, there are still references in the src-track table:
>
># pfctl -vsS | grep -A1 $example-address
>$example-address -> $www-addr ( states 4, connections 0, rate 0.0/0s )
>   age 01:47:25, 4808 pkts, 1713437 bytes, rdr rule 0
>
># pfctl -sS | grep $example-address
>(nothing shown)

"rdr rule 0". I think it is something related to "sticky-address". Do you have any kernel messages?

man pf.conf states:
"Note that by default these associations are destroyed as soon as there	are no longer states which refer to them; in order to make the mappings last beyond the lifetime of the states, increase the global options with set timeout src.track."

And do you have "expires in" counter in "pfctl -vsS" output?
Comment 2 Robert Schulze 2017-03-27 08:15:22 UTC
There are no pf-related kernel messages.

# pfctl -st | grep src.track
src.track                     0s

So source tracking entries should expire, as soon there are no more referenced states. The "expires in" counters from pfctl -vsS are always "00:00:00" or not shown.

regards,
Robert Schulze
Comment 3 Max 2017-03-28 06:08:42 UTC
(In reply to Robert Schulze from comment #2)

I'm unable to reproduce the described behaviour on 10.3 release with generic kernel and your rules. It works just fine. Maybe the problem is related to filter rule's limits... I'll try to produce more test traffic later to reach them.
I have two source tracking records for each address: one for rdr rule and one for filter rule. Can you confirm that?
Comment 4 Robert Schulze 2017-03-28 08:02:48 UTC
(In reply to Max from comment #3)
Those old-aged src track entries are only on rdr rule:

# pfctl -vsS | grep -A1 $client
$client -> $www_host ( states 4, connections 0, rate 0.0/0s )
   age 02:39:54, 20643 pkts, 23362337 bytes, rdr rule 0

# pfctl -vss | grep -A1 $client
(nothing shown)


One question is: why is there states=4 in source track, but no states in state table and how could that happen?

regards,
Robert Schulze
Comment 5 Max 2017-03-28 19:36:11 UTC
Well, I can reproduce the problem.
I have 3 hosts with 10.3 release (generic kernel). "Server", "client" and "firewall".
Complete pf.conf of "firewall" host:

set skip on {lo, em2}

table <www-pool> persist { 192.168.0.10, 192.168.0.20, 192.168.0.30 }


rdr proto tcp from any to 192.168.2.1 port http -> <www-pool> port http \
        round-robin sticky-address

block in all
block out all

pass quick proto tcp from any to <www-pool> port 80 \
        keep state \
        (source-track rule, max 120, max-src-states 96, \
         tcp.closing 20, tcp.finwait 15, tcp.closed 10)



It works as expected until we hit the "max states per rule" limit. For example (just counters):

# pfctl -vsi
Status: Enabled for 0 days 00:17:46           Debug: Urgent

State Table                          Total             Rate
  current entries                       20
  searches                             345            0.3/s
  inserts                               40            0.0/s
  removals                              20            0.0/s
Source Tracking Table
  current entries                       20
  searches                              80            0.1/s
  inserts                               40            0.0/s
  removals                              20            0.0/s

# pfctl -vsi
Status: Enabled for 0 days 00:18:05           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                             345            0.3/s
  inserts                               40            0.0/s
  removals                              40            0.0/s
Source Tracking Table
  current entries                       20
  searches                              80            0.1/s
  inserts                               40            0.0/s
  removals                              20            0.0/s

# pfctl -vsi
Status: Enabled for 0 days 00:18:16           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                             345            0.3/s
  inserts                               40            0.0/s
  removals                              40            0.0/s
Source Tracking Table
  current entries                        0
  searches                              80            0.1/s
  inserts                               40            0.0/s
  removals                              40            0.0/s


But when I reach the limit:

# pfctl -vsi
Status: Enabled for 0 days 00:04:46           Debug: Urgent

State Table                          Total             Rate
  current entries                        1
  searches                            1627            5.7/s
  inserts                              203            0.7/s
  removals                             202            0.7/s
Source Tracking Table
  current entries                       10
  searches                             333            1.2/s
  inserts                               40            0.1/s
  removals                              30            0.1/s
Limit Counters
  max states per rule                    9            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                           0            0.0/s
  max-src-conn-rate                      0            0.0/s
  overload table insertion               0            0.0/s
  overload flush states                  0            0.0/s

# pfctl -ss
all tcp 192.168.0.10:80 (192.168.2.1:80) <- 192.168.2.14:15122       CLOSED:SYN_SENT

# pfctl -sS
192.168.2.17 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
192.168.2.15 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
192.168.2.14 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
192.168.2.14 -> 0.0.0.0 ( states 1, connections 0, rate 0.0/0s )
192.168.2.13 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
192.168.2.11 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
192.168.2.12 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
192.168.2.16 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
192.168.2.18 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
192.168.2.10 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )


# pfctl -vsi
Status: Enabled for 0 days 00:08:19           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                            1627            3.3/s
  inserts                              203            0.4/s
  removals                             203            0.4/s
Source Tracking Table
  current entries                        8
  searches                             333            0.7/s
  inserts                               40            0.1/s
  removals                              32            0.1/s
Limit Counters
  max states per rule                    9            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                           0            0.0/s
  max-src-conn-rate                      0            0.0/s
  overload table insertion               0            0.0/s
  overload flush states                  0            0.0/s

# pfctl -vsS
192.168.2.17 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.15 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.13 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.11 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.12 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.16 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.18 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.10 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
Comment 6 Robert Schulze 2017-03-29 08:14:45 UTC
(In reply to Max from comment #5)
Thank you for your efforts.
Comment 7 Max 2017-03-29 20:29:47 UTC
A bit more info...

Before reaching the limit:

Status: Enabled for 0 days 04:08:59           Debug: Urgent
State Table                          Total             Rate
  current entries                      120
  searches                            7976            0.5/s
  inserts                              997            0.1/s
  removals                             877            0.1/s
Source Tracking Table
  current entries                        0
  searches                            1623            0.1/s
  inserts                              236            0.0/s
  removals                             216            0.0/s
Limit Counters
  max states per rule                    2            0.0/s
  max-src-states                         4            0.0/s
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                40,      0,       0,       0,       0,   0,   0
pf states:              296,  10010,     120,      62,     997,   0,   0
pf state keys:           88,      0,     184,     221,    1506,   0,   0
pf source nodes:        136,  10005,      20,     125,     236,   0,   0
pf table entries:       160, 200000,       3,      72,       3,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               120,      0,       0,       0,       0,   0,   0
pf frag entries:         40,   5000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0
192.168.2.10 -> 192.168.0.20 ( states 6, connections 0, rate 0.0/0s )


After (two seconds later):

Status: Enabled for 0 days 04:09:01           Debug: Urgent
State Table                          Total             Rate
  current entries                      120
  searches                            7977            0.5/s
  inserts                              997            0.1/s
  removals                             877            0.1/s
Source Tracking Table
  current entries                        0
  searches                            1624            0.1/s
  inserts                              236            0.0/s
  removals                             216            0.0/s
Limit Counters
  max states per rule                    3            0.0/s
  max-src-states                         4            0.0/s
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                40,      0,       0,       0,       0,   0,   0
pf states:              296,  10010,     120,      62,     997,   0,   0
pf state keys:           88,      0,     186,     219,    1508,   0,   0
pf source nodes:        136,  10005,      20,     125,     236,   0,   0
pf table entries:       160, 200000,       3,      72,       3,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               120,      0,       0,       0,       0,   0,   0
pf frag entries:         40,   5000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0
192.168.2.10 -> 192.168.0.20 ( states 7, connections 0, rate 0.0/0s )

So, we have one serach in state table, one search in source tracking table and increased states counter in source entry (other not included here).
We increase state counter of source node in pf_find_src_node(). But the problem is not so easy as it seems.


By the way, what about "pf state keys"? We have no states, but I see 6 state keys:

Status: Enabled for 0 days 04:09:15           Debug: Urgent
State Table                          Total             Rate
  current entries                        0
  searches                            7977            0.5/s
  inserts                              997            0.1/s
  removals                             997            0.1/s
Source Tracking Table
  current entries                        1
  searches                            1624            0.1/s
  inserts                              236            0.0/s
  removals                             235            0.0/s
Limit Counters
  max states per rule                    3            0.0/s
  max-src-states                         4            0.0/s
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                40,      0,       0,       0,       0,   0,   0
pf states:              296,  10010,       0,     182,     997,   0,   0
pf state keys:           88,      0,       6,     399,    1508,   0,   0
pf source nodes:        136,  10005,       1,     144,     236,   0,   0
pf table entries:       160, 200000,       3,      72,       3,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               120,      0,       0,       0,       0,   0,   0
pf frag entries:         40,   5000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0
192.168.2.10 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
Comment 8 Max 2017-03-30 04:38:33 UTC
I think the problem is in pf_create_state():

	/* check maximums */
	if (r->max_states &&
	    (counter_u64_fetch(r->states_cur) >= r->max_states)) {
		counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1);
		REASON_SET(&reason, PFRES_MAXSTATES);
		return (PF_DROP);
	}

We can't just return here. Arguably we should "goto csfailed;" instead.
Comment 9 Max 2017-03-30 08:36:44 UTC
It seems that I'm right. The problem has gone. Hope that someone more experienced will review this fix.
Comment 10 Kristof Provost freebsd_committer 2017-03-30 18:18:24 UTC
(In reply to Max from comment #9)
It looks like you've certainly found a bug. It's clearly wrong to not free sk and nk in that error path. I've had a quick look at the OpenBSD history and they also fixed this (though slightly differently) a while back.
Good catch.

Robert, can you confirm this fixes your problem?
Comment 11 Max 2017-03-30 18:59:45 UTC
(In reply to Kristof Provost from comment #10)

I've done basic tests. It works. I can post some counters here. Perhaps we should test it without rdr/nat rule.

By the way, "pfctl -Fi" (flush info) does not clear "Limit Counters" in "pfctl -vsi" output.
Comment 12 Robert Schulze 2017-03-31 07:57:32 UTC
Since the affected machine is critical and cannot be rebooted without care, I cannot confirm the fix right now. My primary intention was to report the misbehaviour.
If a planned reboot takes place, I will patch the kernel beforehand and then talk back to you. 

Nevertheless, I think if two people agree on a simple fix, and the same bug was present in OpenBSD's pf, this should be the solution.

Thank you very much.

Robert Schulze
Comment 13 Robert Schulze 2017-03-31 09:57:14 UTC
By the way: could someone please attach a patch based on the assumptions you made?

regards,
Robert Schulze
Comment 14 Max 2017-03-31 11:51:25 UTC
--- sys/netpfil/pf/pf.c.orig    2017-03-30 09:54:22.056490000 +0000
+++ sys/netpfil/pf/pf.c 2017-03-30 09:55:10.735221000 +0000
@@ -3508,7 +3508,7 @@
            (counter_u64_fetch(r->states_cur) >= r->max_states)) {
                counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1);
                REASON_SET(&reason, PFRES_MAXSTATES);
-               return (PF_DROP);
+               goto csfailed;
        }
        /* src node for filter rule */
        if ((r->rule_flag & PFRULE_SRCTRACK ||
Comment 15 commit-hook freebsd_committer 2017-04-01 12:23:02 UTC
A commit references this bug:

Author: kp
Date: Sat Apr  1 12:22:34 UTC 2017
New revision: 316355
URL: https://svnweb.freebsd.org/changeset/base/316355

Log:
  pf: Fix leak of pf_state_keys

  If we hit the state limit we returned from pf_create_state() without cleaning
  up.

  PR:		217997
  Submitted by:	Max <maximos@als.nnov.ru>
  MFC after:	1 week

Changes:
  head/sys/netpfil/pf/pf.c
Comment 16 commit-hook freebsd_committer 2017-04-08 09:48:56 UTC
A commit references this bug:

Author: kp
Date: Sat Apr  8 09:48:21 UTC 2017
New revision: 316640
URL: https://svnweb.freebsd.org/changeset/base/316640

Log:
  MFC r316355

  pf: Fix leak of pf_state_keys

  If we hit the state limit we returned from pf_create_state() without cleaning
  up.

  PR:		217997
  Submitted by:	Max <maximos@als.nnov.ru>

Changes:
_U  stable/11/
  stable/11/sys/netpfil/pf/pf.c
Comment 17 commit-hook freebsd_committer 2017-04-08 09:50:01 UTC
A commit references this bug:

Author: kp
Date: Sat Apr  8 09:49:21 UTC 2017
New revision: 316641
URL: https://svnweb.freebsd.org/changeset/base/316641

Log:
  MFC r316355

  pf: Fix leak of pf_state_keys

  If we hit the state limit we returned from pf_create_state() without cleaning
  up.

  PR:		217997
  Submitted by:	Max <maximos@als.nnov.ru>

Changes:
_U  stable/10/
  stable/10/sys/netpfil/pf/pf.c