There seems to be a flaw in the src-track cleanup code. We have been
seeing this since 10.3. pf.conf is (stripped down to the relevant
parts):

-- 8< ------------------------
rdr proto tcp from any to $public-ip port http -> <www-pool> port http \
        round-robin sticky-address

table <www-pool> persist { $www-addr, $www-addr2, $www-addr3 }

block in all
block out all

pass quick proto tcp from any to <www-pool> port 80 \
        keep state \
        ( source-track rule, max 8192, max-src-states 96, \
          tcp.closing 20, tcp.finwait 15 )
-- 8< ------------------------

The problem is that the src-track table grows until no more entries can
be inserted. Although there are no states from a sample IP address in
the state table, there are still references in the src-track table:

# pfctl -vsS | grep -A1 $example-address
$example-address -> $www-addr ( states 4, connections 0, rate 0.0/0s )
   age 01:47:25, 4808 pkts, 1713437 bytes, rdr rule 0

# pfctl -ss | grep $example-address
(nothing shown)

To work around this problem, we have to parse the pfctl output and kill
over-aged source tracking entries manually.

Maybe someone can track this down.

Regards,
Robert Schulze
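For reference, the workaround looks roughly like this. This is a
simplified sketch, not our production script: it assumes the first
field of each entry line in "pfctl -vsS" is the source address, leaves
out the age check we actually do, and relies on "pfctl -K host" killing
the source tracking entries for that host as documented in pfctl(8):

#!/bin/sh
# Sketch: kill source tracking entries whose source address
# no longer has any state in the state table.
pfctl -vsS | awk '/ -> / { print $1 }' | sort -u | while read src; do
        # no state references this source any more -> drop its entries
        if ! pfctl -ss | grep -q "$src"; then
                pfctl -K "$src"
        fi
done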
(In reply to Robert Schulze from comment #0)

Hello, Robert.

> The problem is that the src-track table grows until no more entries
> can be inserted. Although there are no states from a sample IP address
> in the state table, there are still references in the src-track table:
>
> # pfctl -vsS | grep -A1 $example-address
> $example-address -> $www-addr ( states 4, connections 0, rate 0.0/0s )
>    age 01:47:25, 4808 pkts, 1713437 bytes, rdr rule 0
>
> # pfctl -ss | grep $example-address
> (nothing shown)

"rdr rule 0" -- I think this is related to "sticky-address". Do you
have any kernel messages?

man pf.conf states:

    "Note that by default these associations are destroyed as soon as
    there are no longer states which refer to them; in order to make
    the mappings last beyond the lifetime of the states, increase the
    global options with set timeout src.track."

And do you have an "expires in" counter in the "pfctl -vsS" output?
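For example, something like the following in pf.conf would keep the
sticky mappings alive for a while after their last state is removed (60
seconds is an arbitrary value, just to illustrate the knob):

# keep src-track entries 60s after the last referencing state is gone
set timeout src.track 60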
There are no pf-related kernel messages.

# pfctl -st | grep src.track
src.track                    0s

So source tracking entries should expire as soon as there are no more
referencing states.

The "expires in" counters from pfctl -vsS are always "00:00:00" or not
shown.

regards,
Robert Schulze
(In reply to Robert Schulze from comment #2)

I'm unable to reproduce the described behaviour on 10.3-RELEASE with a
GENERIC kernel and your rules. It works just fine. Maybe the problem is
related to the filter rule's limits... I'll try to generate more test
traffic later to reach them.

I see two source tracking records for each address: one for the rdr
rule and one for the filter rule. Can you confirm that?
(In reply to Max from comment #3)

Those old-aged src track entries exist only for the rdr rule:

# pfctl -vsS | grep -A1 $client
$client -> $www_host ( states 4, connections 0, rate 0.0/0s )
   age 02:39:54, 20643 pkts, 23362337 bytes, rdr rule 0

# pfctl -vss | grep -A1 $client
(nothing shown)

One question is: why does the source track entry say "states 4" when
there are no states in the state table, and how could that happen?

regards,
Robert Schulze
Well, I can reproduce the problem. I have three hosts running
10.3-RELEASE (GENERIC kernel): "server", "client" and "firewall".
Complete pf.conf of the "firewall" host:

set skip on {lo, em2}

table <www-pool> persist { 192.168.0.10, 192.168.0.20, 192.168.0.30 }

rdr proto tcp from any to 192.168.2.1 port http -> <www-pool> port http \
        round-robin sticky-address

block in all
block out all

pass quick proto tcp from any to <www-pool> port 80 \
        keep state \
        (source-track rule, max 120, max-src-states 96, \
         tcp.closing 20, tcp.finwait 15, tcp.closed 10)

It works as expected until we hit the "max states per rule" limit. For
example (just the counters):

# pfctl -vsi
Status: Enabled for 0 days 00:17:46           Debug: Urgent

State Table                          Total             Rate
  current entries                       20
  searches                             345            0.3/s
  inserts                               40            0.0/s
  removals                              20            0.0/s
Source Tracking Table
  current entries                       20
  searches                              80            0.1/s
  inserts                               40            0.0/s
  removals                              20            0.0/s

# pfctl -vsi
Status: Enabled for 0 days 00:18:05           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                             345            0.3/s
  inserts                               40            0.0/s
  removals                              40            0.0/s
Source Tracking Table
  current entries                       20
  searches                              80            0.1/s
  inserts                               40            0.0/s
  removals                              20            0.0/s

# pfctl -vsi
Status: Enabled for 0 days 00:18:16           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                             345            0.3/s
  inserts                               40            0.0/s
  removals                              40            0.0/s
Source Tracking Table
  current entries                        0
  searches                              80            0.1/s
  inserts                               40            0.0/s
  removals                              40            0.0/s

But when I reach the limit:

# pfctl -vsi
Status: Enabled for 0 days 00:04:46           Debug: Urgent

State Table                          Total             Rate
  current entries                        1
  searches                            1627            5.7/s
  inserts                              203            0.7/s
  removals                             202            0.7/s
Source Tracking Table
  current entries                       10
  searches                             333            1.2/s
  inserts                               40            0.1/s
  removals                              30            0.1/s
Limit Counters
  max states per rule                    9            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                           0            0.0/s
  max-src-conn-rate                      0            0.0/s
  overload table insertion               0            0.0/s
  overload flush states                  0            0.0/s

# pfctl -ss
all tcp 192.168.0.10:80 (192.168.2.1:80) <- 192.168.2.14:15122       CLOSED:SYN_SENT

# pfctl -sS
192.168.2.17 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
192.168.2.15 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
192.168.2.14 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
192.168.2.14 -> 0.0.0.0 ( states 1, connections 0, rate 0.0/0s )
192.168.2.13 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
192.168.2.11 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
192.168.2.12 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
192.168.2.16 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
192.168.2.18 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
192.168.2.10 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )

# pfctl -vsi
Status: Enabled for 0 days 00:08:19           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                            1627            3.3/s
  inserts                              203            0.4/s
  removals                             203            0.4/s
Source Tracking Table
  current entries                        8
  searches                             333            0.7/s
  inserts                               40            0.1/s
  removals                              32            0.1/s
Limit Counters
  max states per rule                    9            0.0/s
  max-src-states                         0            0.0/s
  max-src-nodes                          0            0.0/s
  max-src-conn                           0            0.0/s
  max-src-conn-rate                      0            0.0/s
  overload table insertion               0            0.0/s
  overload flush states                  0            0.0/s

# pfctl -vsS
192.168.2.17 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.15 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.13 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.11 -> 192.168.0.10 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.12 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.16 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.18 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
192.168.2.10 -> 192.168.0.30 ( states 1, connections 0, rate 0.0/0s )
   age 00:04:40, 72 pkts, 4050 bytes, rdr rule 0
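For what it's worth, reaching the "max 120" limit needs nothing fancier
than a burst of parallel connections from the client hosts. A rough
sketch of one way to generate such traffic (assuming nc(1) is available
on the clients; this is illustrative, not the exact method used):

#!/bin/sh
# Open ~200 short-lived connections through the firewall so that
# the pass rule's "max 120" state limit is exceeded.
i=0
while [ $i -lt 200 ]; do
        nc -w 1 192.168.2.1 80 < /dev/null &
        i=$((i + 1))
done
wait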
(In reply to Max from comment #5)

Thank you for your efforts.
A bit more info...

Before reaching the limit:

Status: Enabled for 0 days 04:08:59           Debug: Urgent

State Table                          Total             Rate
  current entries                      120
  searches                            7976            0.5/s
  inserts                              997            0.1/s
  removals                             877            0.1/s
Source Tracking Table
  current entries                        0
  searches                            1623            0.1/s
  inserts                              236            0.0/s
  removals                             216            0.0/s
Limit Counters
  max states per rule                    2            0.0/s
  max-src-states                         4            0.0/s

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                40,      0,       0,       0,       0,   0,   0
pf states:              296,  10010,     120,      62,     997,   0,   0
pf state keys:           88,      0,     184,     221,    1506,   0,   0
pf source nodes:        136,  10005,      20,     125,     236,   0,   0
pf table entries:       160, 200000,       3,      72,       3,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               120,      0,       0,       0,       0,   0,   0
pf frag entries:         40,   5000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0

192.168.2.10 -> 192.168.0.20 ( states 6, connections 0, rate 0.0/0s )

After (two seconds later):

Status: Enabled for 0 days 04:09:01           Debug: Urgent

State Table                          Total             Rate
  current entries                      120
  searches                            7977            0.5/s
  inserts                              997            0.1/s
  removals                             877            0.1/s
Source Tracking Table
  current entries                        0
  searches                            1624            0.1/s
  inserts                              236            0.0/s
  removals                             216            0.0/s
Limit Counters
  max states per rule                    3            0.0/s
  max-src-states                         4            0.0/s

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                40,      0,       0,       0,       0,   0,   0
pf states:              296,  10010,     120,      62,     997,   0,   0
pf state keys:           88,      0,     186,     219,    1508,   0,   0
pf source nodes:        136,  10005,      20,     125,     236,   0,   0
pf table entries:       160, 200000,       3,      72,       3,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               120,      0,       0,       0,       0,   0,   0
pf frag entries:         40,   5000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0

192.168.2.10 -> 192.168.0.20 ( states 7, connections 0, rate 0.0/0s )

So, we have one search in the state table, one search in the source
tracking table, and an increased states counter in the source entry
(others not included here). We increase the state counter of the source
node in pf_find_src_node(). But the problem is not as easy as it seems.

By the way, what about "pf state keys"? We have no states, but I see 6
state keys:

Status: Enabled for 0 days 04:09:15           Debug: Urgent

State Table                          Total             Rate
  current entries                        0
  searches                            7977            0.5/s
  inserts                              997            0.1/s
  removals                             997            0.1/s
Source Tracking Table
  current entries                        1
  searches                            1624            0.1/s
  inserts                              236            0.0/s
  removals                             235            0.0/s
Limit Counters
  max states per rule                    3            0.0/s
  max-src-states                         4            0.0/s

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
pf mtags:                40,      0,       0,       0,       0,   0,   0
pf states:              296,  10010,       0,     182,     997,   0,   0
pf state keys:           88,      0,       6,     399,    1508,   0,   0
pf source nodes:        136,  10005,       1,     144,     236,   0,   0
pf table entries:       160, 200000,       3,      72,       3,   0,   0
pf table counters:       64,      0,       0,       0,       0,   0,   0
pf frags:               120,      0,       0,       0,       0,   0,   0
pf frag entries:         40,   5000,       0,       0,       0,   0,   0
pf state scrubs:         40,      0,       0,       0,       0,   0,   0

192.168.2.10 -> 192.168.0.20 ( states 1, connections 0, rate 0.0/0s )
I think the problem is in pf_create_state():

	/* check maximums */
	if (r->max_states &&
	    (counter_u64_fetch(r->states_cur) >= r->max_states)) {
		counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1);
		REASON_SET(&reason, PFRES_MAXSTATES);
		return (PF_DROP);
	}

We can't just return here. Arguably we should "goto csfailed;" instead.
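To illustrate the pattern in isolation (a self-contained sketch with
made-up names, not the real pf code): by the time the max-states check
runs, allocations have already been made, so an early return skips the
common cleanup label and leaks them, while "goto" routes the error
through it:

#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch of the leak pattern (hypothetical names). The two
 * allocations stand in for the state keys sk/nk that
 * pf_create_state() has already set up.
 */
static int
create_state_sketch(int limit_reached)
{
	char *sk = malloc(64);
	char *nk = malloc(64);
	int error = 0;

	if (sk == NULL || nk == NULL) {
		error = 1;
		goto csfailed;
	}

	/* check maximums -- this is where the buggy path returned */
	if (limit_reached) {
		error = 2;
		/* Bug: "return (error);" here would leak sk and nk. */
		goto csfailed;	/* Fix: fall into the common cleanup. */
	}

	printf("state created\n");
	/* In pf, sk/nk would now be linked into the new state; the
	 * sketch just frees them again. */
	free(sk);
	free(nk);
	return (0);

csfailed:
	free(sk);	/* free(NULL) is a no-op, so this is safe */
	free(nk);
	return (error);
}

int
main(void)
{
	/* exercise both the limit path and the success path */
	return (create_state_sketch(1) == 2 &&
	    create_state_sketch(0) == 0) ? 0 : 1;
}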
It seems I was right: the problem is gone. I hope that someone more
experienced will review this fix.
(In reply to Max from comment #9)

It looks like you've certainly found a bug. It's clearly wrong not to
free sk and nk in that error path.

I've had a quick look at the OpenBSD history, and they also fixed this
(though slightly differently) a while back.

Good catch.

Robert, can you confirm this fixes your problem?
(In reply to Kristof Provost from comment #10)

I've done basic tests, and it works. I can post some counters here.
Perhaps we should also test it without an rdr/nat rule.

By the way, "pfctl -Fi" (flush info) does not clear the "Limit
Counters" in the "pfctl -vsi" output.
Since the affected machine is critical and cannot be rebooted without
care, I cannot confirm the fix right now. My primary intention was to
report the misbehaviour.

When a planned reboot takes place, I will patch the kernel beforehand
and then get back to you.

Nevertheless, I think that if two people agree on a simple fix, and the
same bug was present in OpenBSD's pf, this should be the solution.

Thank you very much.
Robert Schulze
By the way: could someone please attach a patch based on the
assumptions you made?

regards,
Robert Schulze
--- sys/netpfil/pf/pf.c.orig	2017-03-30 09:54:22.056490000 +0000
+++ sys/netpfil/pf/pf.c	2017-03-30 09:55:10.735221000 +0000
@@ -3508,7 +3508,7 @@
 	    (counter_u64_fetch(r->states_cur) >= r->max_states)) {
 		counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1);
 		REASON_SET(&reason, PFRES_MAXSTATES);
-		return (PF_DROP);
+		goto csfailed;
 	}
 	/* src node for filter rule */
 	if ((r->rule_flag & PFRULE_SRCTRACK ||
A commit references this bug:

Author: kp
Date: Sat Apr 1 12:22:34 UTC 2017
New revision: 316355
URL: https://svnweb.freebsd.org/changeset/base/316355

Log:
  pf: Fix leak of pf_state_keys

  If we hit the state limit we returned from pf_create_state() without
  cleaning up.

  PR:           217997
  Submitted by: Max <maximos@als.nnov.ru>
  MFC after:    1 week

Changes:
  head/sys/netpfil/pf/pf.c
A commit references this bug:

Author: kp
Date: Sat Apr 8 09:48:21 UTC 2017
New revision: 316640
URL: https://svnweb.freebsd.org/changeset/base/316640

Log:
  MFC r316355

  pf: Fix leak of pf_state_keys

  If we hit the state limit we returned from pf_create_state() without
  cleaning up.

  PR:           217997
  Submitted by: Max <maximos@als.nnov.ru>

Changes:
  _U  stable/11/
  stable/11/sys/netpfil/pf/pf.c
A commit references this bug:

Author: kp
Date: Sat Apr 8 09:49:21 UTC 2017
New revision: 316641
URL: https://svnweb.freebsd.org/changeset/base/316641

Log:
  MFC r316355

  pf: Fix leak of pf_state_keys

  If we hit the state limit we returned from pf_create_state() without
  cleaning up.

  PR:           217997
  Submitted by: Max <maximos@als.nnov.ru>

Changes:
  _U  stable/10/
  stable/10/sys/netpfil/pf/pf.c