Bug 163208 - [pf] PF state key linking mismatch
Summary: [pf] PF state key linking mismatch
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: unspecified
Hardware: Any Any
Importance: Normal Affects Some People
Assignee: freebsd-pf
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-12 16:40 UTC by mlager
Modified: 2016-10-20 14:15 UTC (History)
3 users (show)

See Also:


Description mlager 2011-12-12 16:40:07 UTC
With a raw IP-IP GIF tunnel set up between an 8.2-RELEASE system and a 9.0-RC3 system, the tunnel functions properly and each side can reach the other's network. However, the 9.0-RC3 system reports numerous PF state key linking mismatch errors, even for successful connections. They look like this:

pf: state key linking mismatch! dir=OUT, if=re1, stored af=2, a0: B.B.B.B, a1: A.A.A.A, proto=4, found af=2, a0: 172.16.1.2:80, a1: 172.16.2.1:52102, proto=6.
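
For reference when reading the message: "af" is the address family and "proto" is the IP protocol number. In the line above, the stored key is the outer IP-in-IP packet between the tunnel endpoints (proto=4), while the found key is the inner TCP connection (proto=6). A minimal decoding sketch follows, assuming FreeBSD's address-family values and the IANA protocol numbers; af_name and proto_name are hypothetical helper names, not part of pf:

```sh
# Map the numeric "af=" field to a name (values as defined on FreeBSD).
af_name() {
    case "$1" in
        2)  echo "AF_INET (IPv4)" ;;
        28) echo "AF_INET6 (IPv6)" ;;
        *)  echo "address family $1" ;;
    esac
}

# Map the numeric "proto=" field to a name (IANA protocol numbers).
proto_name() {
    case "$1" in
        4)  echo "IP-in-IP" ;;
        6)  echo "TCP" ;;
        17) echo "UDP" ;;
        47) echo "GRE" ;;
        50) echo "ESP" ;;
        *)  echo "protocol $1" ;;
    esac
}

# Decode the fields of the message quoted above.
echo "stored: $(af_name 2), $(proto_name 4)"
echo "found:  $(af_name 2), $(proto_name 6)"
```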

I don't see these errors on the 8.2-RELEASE endpoint, and the error seems to disrupt network performance. Here is my configuration on each endpoint; I've masked the public IP addresses as A.A.A.A and B.B.B.B:

ENDPOINT 1:

/etc/rc.conf:
gif_interfaces="gif0"
gifconfig_gif0="A.A.A.A B.B.B.B"
ifconfig_gif0="inet 172.16.1.1 172.16.2.1 netmask 255.255.255.0"
static_routes="tslbell"
route_tslbell="-net 172.16.2.0/24 172.16.2.1"

/etc/pf.conf:
# MACROS
ext_if="re0"
int_if="re1"
internal_net="172.16.1.0/24"

# NORMALIZATION
scrub in all

# NETWORK ADDRESS TRANSLATION
nat on $ext_if from $internal_net to any -> ($ext_if)

# FILTERING
set skip on gif0

pass in all
pass out all

block in log all
pass quick on lo0 all
pass quick on $int_if all

# ENABLE INBOUND ICMP
pass in on $ext_if proto icmp all keep state

pass out on $ext_if proto { tcp, udp, icmp } all keep state

---------------------------

ENDPOINT 2:

/etc/rc.conf:
gifconfig_gif0="B.B.B.B A.A.A.A"
ifconfig_gif0="inet 172.16.2.1 172.16.1.1 netmask 255.255.255.0"
static_routes="belltsl"
route_belltsl="-net 172.16.1.0/24 172.16.1.1"


/etc/pf.conf:
# MACROS
ext_if="lagg0"
int_if="bge0"
internal_net="172.16.2.0/24"

# NORMALIZATION
scrub in all

# NETWORK ADDRESS TRANSLATION
nat on $ext_if from $internal_net to any -> ($ext_if)

# FILTERING
set skip on gif0

pass in all
pass out all

block in log all
pass quick on lo0 all
pass quick on $int_if all

# ENABLE INBOUND ICMP
pass in on $ext_if proto icmp all keep state

pass out on $ext_if proto { tcp, udp, icmp } all keep state

Fix: 

None found as of now.
How-To-Repeat: Set up an IP-IP tunnel on FreeBSD 9.0-RC3, enable PF, and look for state mismatch error messages.
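
One way to look for the messages and see which interfaces they are logged on is sketched below. It assumes the kernel messages end up in a syslog file (for example /var/log/messages; the actual path depends on the syslog configuration), and count_mismatch_by_if is a hypothetical helper name:

```sh
# Count "state key linking mismatch" messages per interface.
# Reads log text on stdin and prints "<interface> <count>", sorted by name.
count_mismatch_by_if() {
    awk '
    /pf: state key linking mismatch!/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^if=/) {
                name = $i
                sub(/^if=/, "", name)   # drop the "if=" prefix
                sub(/,$/, "", name)     # drop the trailing comma
                count[name]++
            }
        }
    }
    END { for (name in count) print name, count[name] }
    ' | sort
}

# Demo on the message from this report.
count_mismatch_by_if <<'EOF'
pf: state key linking mismatch! dir=OUT, if=re1, stored af=2, a0: B.B.B.B, a1: A.A.A.A, proto=4, found af=2, a0: 172.16.1.2:80, a1: 172.16.2.1:52102, proto=6.
EOF
```

On a live system the function would be fed the log file instead, for example: count_mismatch_by_if < /var/log/messages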
Comment 1 Mark Linimon freebsd_committer 2011-12-19 08:06:19 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-pf

Over to maintainer(s).
Comment 2 matt 2012-01-12 20:58:31 UTC
This problem persists after updating to 9.0-RELEASE.

Comment 3 Tilman Keskinoz freebsd_committer 2012-01-21 19:40:36 UTC
Same here.

Also Fabian Keil reported this in
http://lists.freebsd.org/pipermail/freebsd-current/2011-July/025696.html

Any ideas?
Comment 4 freebsd-listen 2012-01-21 20:01:18 UTC
Tilman Keskinöz <arved@FreeBSD.org> wrote:

> Same here.
> 
> Also Fabian Keil reported this in
> http://lists.freebsd.org/pipermail/freebsd-current/2011-July/025696.html


This has been fixed in CURRENT shortly thereafter:
http://lists.freebsd.org/pipermail/freebsd-pf/2011-July/006199.html

Maybe the fix hasn't been MFC'd.

Fabian
Comment 5 Tilman Keskinoz freebsd_committer 2012-01-21 20:52:09 UTC
On Jan 21, 2012, at 21:01 , Fabian Keil wrote:

> Tilman Keskinöz <arved@FreeBSD.org> wrote:
>
>> Same here.
>>
>> Also Fabian Keil reported this in
>> http://lists.freebsd.org/pipermail/freebsd-current/2011-July/025696.html
>
> This has been fixed in CURRENT shortly thereafter:
> http://lists.freebsd.org/pipermail/freebsd-pf/2011-July/006199.html
>
> Maybe the fix hasn't been MFC'd.

Hm, r223765 happened before the RELENG_9 branch point.
So maybe the fix was not complete?
Comment 6 bzeeb-lists 2012-01-21 21:01:41 UTC
On 21. Jan 2012, at 20:52 , Tilman Keskinöz wrote:

>
> On Jan 21, 2012, at 21:01 , Fabian Keil wrote:
>
>> Tilman Keskinöz <arved@FreeBSD.org> wrote:
>>
>>> Same here.
>>>
>>> Also Fabian Keil reported this in
>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-July/025696.html
>>
>> This has been fixed in CURRENT shortly thereafter:
>> http://lists.freebsd.org/pipermail/freebsd-pf/2011-July/006199.html
>>
>> Maybe the fix hasn't been MFC'd.
>
> Hm, r223765 happened before the RELENG_9 branch point.
> So maybe the fix was not complete?

See the thread from earlier this month on freebsd-pf.

-- 
Bjoern A. Zeeb                                 You have to have visions!
   It does not matter how good you are. It matters what good you do!
Comment 7 Tilman Keskinoz freebsd_committer 2012-01-22 10:41:12 UTC
* Bjoern A. Zeeb [Sat, 21 Jan 2012 21:01:41 +0000]:
> 
> On 21. Jan 2012, at 20:52 , Tilman Keskinöz wrote:
> 
>>
>> On Jan 21, 2012, at 21:01 , Fabian Keil wrote:
>>
>>> Tilman Keskinöz <arved@FreeBSD.org> wrote:
>>>
>>>> Same here.
>>>>
>>>> Also Fabian Keil reported this in
>>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-July/025696.html
>>>
>>> This has been fixed in CURRENT shortly thereafter:
>>> http://lists.freebsd.org/pipermail/freebsd-pf/2011-July/006199.html
>>>
>>> Maybe the fix hasn't been MFC'd.
>>
>> Hm, r223765 happened before the RELENG_9 branch point.
>> So maybe the fix was not complete?
> 
> See thread from earlier this month on freebsd-pf
> 

The thread suggests:

* Matt Lager [Thu, 12 Jan 2012 15:48:23 -0700]:
> So it looks like I can comment out this code in
> /usr/src/sys/contrib/pf/net/pf.c:
>
>                 /* mismatch. must not happen. */
>                 printf("pf: state key linking mismatch! dir=%s, "
>                     "if=%s, stored af=%u, a0: ",
>                     dir == PF_OUT ? "OUT" : "IN", kif->pfik_name, a->af);
>
> When this error occurs, I guess for valid reasons, does PF drop packets
> or do something else with them, or is this purely an informational notice?

I can confirm that removing this printf brings back the performance for me.

Please fix :)
Comment 8 Ermal Luçi freebsd_committer 2012-01-23 11:16:38 UTC
On Sun, Jan 22, 2012 at 11:41 AM, Tilman Keskinöz <arved@freebsd.org> wrote:

> * Bjoern A. Zeeb [Sat, 21 Jan 2012 21:01:41 +0000]:
> >
> > On 21. Jan 2012, at 20:52 , Tilman Keskinöz wrote:
> >
> >>
> >> On Jan 21, 2012, at 21:01 , Fabian Keil wrote:
> >>
> >>> Tilman Keskinöz <arved@FreeBSD.org> wrote:
> >>>
> >>>> Same here.
> >>>>
> >>>> Also Fabian Keil reported this in
> >>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-July/025696.html
> >>>
> >>> This has been fixed in CURRENT shortly thereafter:
> >>> http://lists.freebsd.org/pipermail/freebsd-pf/2011-July/006199.html
> >>>
> >>> Maybe the fix hasn't been MFC'd.
> >>
> >> Hm, r223765 happened before the RELENG_9 branch point.
> >> So maybe the fix was not complete?
> >
> > See thread from earlier this month on freebsd-pf
> >
>
> The thread suggests:
>
> * Matt Lager [Thu, 12 Jan 2012 15:48:23 -0700]:
> > So it looks like I can comment out this code in
> > /usr/src/sys/contrib/pf/net/pf.c:
> >
> >                 /* mismatch. must not happen. */
> >                 printf("pf: state key linking mismatch! dir=%s, "
> >                     "if=%s, stored af=%u, a0: ",
> >                     dir == PF_OUT ? "OUT" : "IN", kif->pfik_name, a->af);
> >
> > When this error occurs, I guess for valid reasons, does PF drop packets
> > or do something else with them, or is this purely an informational notice?
>
> I can confirm that removing this printf brings back the performance for me.
>
>


Probably a sysctl to disable this should be provided.
There might be unexpected consequences from this and the better fix is to
find the section where the mbuf is being reused.


> Please fix :)
>




-- 
Ermal
Comment 9 Tilman Keskinoz freebsd_committer 2012-01-23 12:13:55 UTC
* Ermal Luçi [Mon, 23 Jan 2012 11:50:07 GMT]:

>  
>  Probably a sysctl to disable this should be provided.
>  There might be unexpected consequences from this and the better fix is to
>  find the section where the mbuf is being reused.

What consequences?

Is there anything that can be done to debug where the mbuf is reused?

>  
>  
>  > Please fix :)
Comment 10 Ermal Luçi freebsd_committer 2012-01-23 16:21:21 UTC
On Mon, Jan 23, 2012 at 1:13 PM, Tilman Keskinöz <arved@freebsd.org> wrote:
>
> * Ermal Luçi [Mon, 23 Jan 2012 11:50:07 GMT]:
>
> >
> >  Probably a sysctl to disable this should be provided.
> >  There might be unexpected consequences from this and the better fix is to
> >  find the section where the mbuf is being reused.
>
> What consequences?
>
> Is there anything that can be done, to debug where the mbuf is reused?
>
>

You have to find the subsystem that does the reuse.
Start from the pf state, see whether it is udp/tcp/..., and then try to
find the specific part that would trigger this,
such as TCP after an RST or some such.

> >
> >
> >  > Please fix :)




--
Ermal
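
Following the suggestion to start from the protocol of the affected state, a first step could be tallying which stored/found protocol pairs show up in the logged messages. This is only a rough sketch: it assumes the message format quoted in this report and that each message sits on a single log line, and count_proto_pairs is a hypothetical helper name:

```sh
# Tally "stored proto -> found proto" pairs from mismatch messages.
# Reads log text on stdin and prints "stored=<n> found=<n> <count>".
count_proto_pairs() {
    sed -n 's/.*state key linking mismatch!.*proto=\([0-9][0-9]*\), found .*proto=\([0-9][0-9]*\).*/stored=\1 found=\2/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Demo on the message from the original report (IP-in-IP vs. TCP).
count_proto_pairs <<'EOF'
pf: state key linking mismatch! dir=OUT, if=re1, stored af=2, a0: B.B.B.B, a1: A.A.A.A, proto=4, found af=2, a0: 172.16.1.2:80, a1: 172.16.2.1:52102, proto=6.
EOF
```

A pair with different protocol numbers points at encapsulation (tunnel or IPsec traffic), while identical numbers with swapped addresses point at address translation.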
Comment 11 mike.jakubik 2012-10-30 17:14:06 UTC
A year has gone by and my router is still flooded with these. Some users
complain that the VPN (mpd) is very slow; indeed, this only comes up when users are
connected via VPN.

Who is responsible for this code? Is anyone willing to fix this?
Comment 12 mlager 2012-10-30 17:21:50 UTC
I ended up just commenting out the code that displays this message in the
source and recompiling.

On 10/30/12 10:14 AM, Mike Jakubik wrote:
> A year has gone by and my router is still flooded with these, some users
> complain that VPN (mpd) is very slow, indeed this only comes up when users are
> connected via VPN.
>
> Who is responsible for this code, is anyone willing to fix this?
>
>


Comment 13 gkontos.mail 2013-02-02 18:26:21 UTC
Same problem here. IPv6 tunnel encapsulating IPv4 packets:

kernel: pf: state key linking mismatch! dir=OUT, if=re0, stored af=28,
a0: xxxx:xxxx:1001:5f00::86, a1: xxxx:xxx:8f00:2c00::2093, proto=50,
found af=2, a0: 10.30.1.140:53444, a1: 10.1.1.3:22, proto=6.

Any solution to that?

-- 
George Kontostanos
---
http://www.aisecure.net
Comment 14 nrh 2013-11-06 23:08:23 UTC
Similar problem with L2TP over IPsec (via mpd5), with the nasty additional surprise that pf appears not to correctly process packets that come in on the resulting ng0 interface when the pf rules refer to that ng interface. That is, this statement:

pass in log quick on ng0 proto tcp to port 25

produces no output when I look at a tcpdump of pflog0, even though my traffic arrives on the ng0 interface and I can telnet to port 25 somewhere. Redirects and such also fail.

Oddly, similar rules succeed when we use mpd5 to do PPTP, rather than L2TP/IPSEC.

And of course, we get a zillion error messages.

pf: state key linking mismatch! dir=OUT, if=enc0, stored af=2, a0: [concealed ip address]:443, a1: 10.119.24.2:52893, proto=6, found af=2, a0:[concealed ip address]:51375, a1: [concealed ip address]:1701, proto=17.
pf: state key linking mismatch! dir=OUT, if=enc0, stored af=2, a0: [concealed ip address]:443, a1: 10.119.24.2:52893, proto=6, found af=2, a0: [concealed ip address]:51375, a1: [concealed ip address]:1701, proto=17.


I've replaced some IP addresses with "[concealed ip address]".

Comment 15 nrh 2013-11-06 23:39:38 UTC
I should have mentioned that this was 9.2 release, recompiled to include IPSEC.
Comment 16 dan 2014-04-27 23:09:11 UTC
FYI, this problem persists with FreeBSD 9.2-RELEASE-p4

-- 
Dan Langille - http://langille.org
Comment 17 Dan Langille freebsd_committer 2014-09-08 15:40:41 UTC
This situation persists in FreeBSD 9.3-RELEASE
Comment 18 Alexey Pereklad 2014-12-24 16:43:53 UTC
Got the same problem with PPTP through NAT. When a user tries to connect to an external server with PPTP, we see the following in the log file (some digits of the IP address replaced with the string "OUR.NAT"):

==================================================================
Dec 24 18:19:21 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=vlan434, stored af=2, a0: 10.12.1.0:57782, a1: 78.29.24.10:1723, proto=6, found af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57782, proto=6.
Dec 24 18:19:21 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=vlan434, stored af=2, a0: 10.12.1.0:57782, a1: 78.29.24.10:1723, proto=6, found af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57782, proto=6.
Dec 24 18:19:21 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=ng264, stored af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57782, proto=6, found af=2, a0: 10.12.1.0:57782, a1: 78.29.24.10:1723, proto=6.
Dec 24 18:19:21 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=ng264, stored af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57782, proto=6, found af=2, a0: 10.12.1.0:57782, a1: 78.29.24.10:1723, proto=6.
Dec 24 18:20:24 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=ng264, stored af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57939, proto=6, found af=2, a0: 10.12.1.0:57939, a1: 78.29.24.10:1723, proto=6.
Dec 24 18:20:24 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=vlan434, stored af=2, a0: 10.12.1.0:57939, a1: 78.29.24.10:1723, proto=6, found af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57939, proto=6.
Dec 24 18:20:24 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=vlan434, stored af=2, a0: 10.12.1.0:57939, a1: 78.29.24.10:1723, proto=6, found af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57939, proto=6.
Dec 24 18:20:25 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=ng264, stored af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57939, proto=6, found af=2, a0: 10.12.1.0:57939, a1: 78.29.24.10:1723, proto=6.
Dec 24 18:20:25 vpn1-spb kernel: pf: state key linking mismatch! dir=OUT, if=ng264, stored af=2, a0: 78.29.24.10:1723, a1: OUR.NAT.185.52:57939, proto=6, found af=2, a0: 10.12.1.0:57939, a1: 78.29.24.10:1723, proto=6.
==================================================================

Some info about our configuration:
# uname -a
FreeBSD bras.office.ru 9.3-RELEASE-p6 FreeBSD 9.3-RELEASE-p6 #0 r275674: Wed Dec 10 17:25:20 MSK 2014     root@bras.office.ru:/usr/obj/usr/src/sys/GENERIC  amd64


pf config:
==================================================================
dolg_server="192.168.177.135"
nat_ip="OUR.NAT.185.52"

table <clients> persist { !10.12.0.1, 10.12/16, 10.13/16 }
table <spam> persist

set limit states 200000
set block-policy drop

nat on vlan434 from <clients> to any -> $nat_ip
no nat on vlan434 proto gre all
no nat on vlan434 proto tcp from <clients> to any port 1723
no nat on vlan434 proto tcp from any port 1723 to any

pass in all
pass out all

pass in inet proto tcp from <clients> to any port 25 keep state ( max-src-conn-rate 5/30, overload <spam> flush global )
block in inet proto tcp from <spam> to any port 25

block in quick inet from <clients> to <clients>
==================================================================


As pf can't do NAT for PPTP, I disabled NAT for PPTP and tcp port 1723 connections in pf.conf. We use ipfw to NAT PPTP connections:
==================================================================
#!/bin/sh

cmd="/sbin/ipfw -q"

nat_ip="OUR.NAT.185.52"
nat_if="vlan434"

clients="10.12.0.0/16"

${cmd} -f flush

${cmd} add nat 1 log gre from any to any via ${nat_if}
${cmd} add nat 1 log tcp from ${clients} to any dst-port 1723 out via ${nat_if}
${cmd} add nat 1 log tcp from any 1723 to any in via ${nat_if}

${cmd} nat 1 config ip ${nat_ip} unreg_only same_ports

${cmd} add 65534 allow all from any to any
==================================================================
Comment 19 kuPyxa 2016-10-20 14:15:37 UTC
Hi, FreeBSD 9.3 amd64 has this problem with PF too. MPD5 connects over a VPN to an external server to provide secure Internet access, and NAT is configured on ng0 for clients on the internal local networks.