Hi,

We have an IPsec/IKEv2 server running on pfSense 2.4.4-RELEASE-p3 (amd64). The VPN server serves an average of 40 concurrent mobile clients. Each phase 1 tunnel created has three phase 2 tunnels.

When the "reqid" variable reaches the value "16384", the "trap not found" error shown in the logs below occurs: users can still connect but cannot pass traffic over the VPN. In my environment this value is reached approximately every 30 days. To resolve the issue, I need to stop the VPN service and start it again so that the variable is reset.

Log samples:

Aug 18 20:12:10 vpn2 charon: 02[KNL] creating acquire job for policy serverIP/32|/0 === clientIP/32|/0 with reqid {16384}
Aug 18 20:12:10 vpn2 charon: 13[CFG] trap not found, unable to acquire reqid 16384
Dec 11 11:34:34 vpn2 charon: 14[KNL] creating acquire job for policy serverIP/32|/0 === clientIP/32|/0 with reqid {16384}
Dec 11 11:34:34 vpn2 charon: 01[CFG] trap not found, unable to acquire reqid 16384

strongSwan developer response:

"That's because of IPSEC_MANUAL_REQID_MAX (0x3fff == 16383), file "include/uapi/linux/ipsec.h". Which is a strangely low limit (at least for keying daemons like strongSwan that manage reqids themselves) since reqids are 32-bit numbers. reqids are currently allocated sequentially using a static counter (source:src/libcharon/kernel/kernel_interface.c#L328). The code that allocates them does not know anything about the limit above (it doesn't even know or care that it runs on a FreeBSD kernel)."

My report:
https://forum.netgate.com/topic/148857/ipsec-ikev2-error-trap-not-found-unable-to-acquire-reqid

Other reports:
https://wiki.strongswan.org/issues/2315
https://lists.strongswan.org/pipermail/dev/2018-August/001929.html
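To illustrate the behaviour described in the developer response, here is a minimal C sketch of a sequential allocator running into the limit. It is not the strongSwan code: the function names are hypothetical, and only the constant IPSEC_MANUAL_REQID_MAX (0x3fff) is taken from the report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit quoted in the report (0x3fff == 16383). */
#define IPSEC_MANUAL_REQID_MAX 0x3fff

/*
 * Hypothetical stand-in for a sequential allocator: a static counter
 * that only ever grows and knows nothing about the kernel limit.
 */
static uint32_t next_reqid_sequential(void)
{
    static uint32_t counter = 0;
    return ++counter;
}

/* True when a reqid lies above the range the kernel accepts from userland. */
static bool reqid_out_of_range(uint32_t reqid)
{
    return reqid > IPSEC_MANUAL_REQID_MAX;
}

/*
 * Count how many reqids can be handed out before the first one that
 * exceeds the limit. Released reqids are never reused, so the count
 * depends only on how many were ever allocated since the daemon started.
 */
static uint32_t allocations_until_overflow(void)
{
    uint32_t in_range = 0;
    while (!reqid_out_of_range(next_reqid_sequential()))
        in_range++;
    return in_range;
}
```

Under these assumptions the first out-of-range value is 16384, the value seen in the logs, and restarting the daemon helps only because it resets the counter.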
(In reply to Geovane from comment #0) FreeBSD already contains a suitable allocator in "sys/kern/subr_unit.c".
Andrey, can you comment on this?
(In reply to crest from comment #1) Hi, The StrongSwan team answered: "Which doesn't seem related to the issue. Probably someone replied to the wrong email thread." Thanks
IPSEC_MANUAL_REQID_MAX is not FreeBSD-specific; it is also 0x3fff on Linux.
Anyway, the comment in the header is clear enough: REQIDs over 0x3fff are reserved for the kernel. Linux uses this range for the kernel as well (see net/key/af_key.c#L1915, gen_reqid()). They simply ignore bogus user requests for higher numbers:

https://github.com/torvalds/linux/blob/master/net/key/af_key.c#L1959

    if (t->reqid > IPSEC_MANUAL_REQID_MAX)
        t->reqid = 0;
In fact, FreeBSD does something similar, but produces a warning first (ipseclog LOG_DEBUG, "reqid=%d range violation, updated by kernel"). That code has been present since 2002. I can't tell whether libcharon is also broken on Linux and the problem simply isn't observed there, or whether it's just poorly designed. I don't know if pfSense has any modifications to FreeBSD in this area that might be relevant. Can you reproduce the problem on FreeBSD, or just on pfSense?
(In reply to Conrad Meyer from comment #6) Hi Conrad, Unfortunately, in our environment we have only one pfSense VPN server with enough demand to reach the 16k limit of the "reqid" variable. It seems the strongSwan team is working on a reqid reuse solution after my report: https://wiki.strongswan.org/issues/2315 Thank you. Geovane
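For readers wondering what reusing reqids could look like, here is a minimal userland C sketch of a lowest-free allocator over the range 1..0x3fff. This is an assumption-laden illustration, not the strongSwan change tracked in issue 2315 and not the FreeBSD subr_unit.c allocator mentioned in comment #1; the names reqid_alloc and reqid_release are hypothetical.

```c
#include <stdint.h>

/* Upper bound taken from the thread (IPSEC_MANUAL_REQID_MAX). */
#define REQID_MAX 0x3fff

/* One bit per reqid; a set bit means "in use". reqid 0 is never handed out. */
static uint8_t reqid_used[(REQID_MAX + 8) / 8];

/*
 * Hypothetical allocator: return the lowest free reqid in 1..REQID_MAX,
 * or 0 if every reqid in the range is currently in use.
 */
static uint32_t reqid_alloc(void)
{
    for (uint32_t id = 1; id <= REQID_MAX; id++) {
        uint8_t mask = (uint8_t)(1u << (id % 8));
        if (!(reqid_used[id / 8] & mask)) {
            reqid_used[id / 8] |= mask;
            return id;
        }
    }
    return 0;
}

/* Hypothetical release: mark a reqid as free so a later allocation can reuse it. */
static void reqid_release(uint32_t id)
{
    if (id >= 1 && id <= REQID_MAX)
        reqid_used[id / 8] &= (uint8_t)~(1u << (id % 8));
}
```

With this scheme the range is exhausted only when 16383 reqids are in use at the same time, rather than after 16383 allocations in total. The linear scan keeps the sketch short; a real allocator would likely track free ranges instead.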
It seems the problem has been fixed since strongSwan 5.8.3.
Fixed in StrongSwan, see https://wiki.strongswan.org/issues/2315