Summary: | ipsec / cesa memory issue | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Graham Collinson <graham_freebsd> | ||||
Component: | arm | Assignee: | freebsd-arm (Nobody) <freebsd-arm> | ||||
Status: | Closed Overcome By Events | ||||||
Severity: | Affects Some People | CC: | markj, mjg, zbb | ||||
Priority: | --- | ||||||
Version: | 11.2-RELEASE | ||||||
Hardware: | arm | ||||||
OS: | Any | ||||||
Attachments: |
|
Description
Graham Collinson
2020-06-11 11:02:51 UTC
I'm having trouble seeing how cesa_newsession() could raise an error here. The only place where it returns a non-zero error number is if cesa_prep_aes_key() returns an error, which only happens if the IPSec code passes an invalid key length.

(In reply to Mark Johnston from comment #1)
I've probably got something wrong in the way I'm interpreting the source code. The error value appears to be checked straight after the xform_init call, which then logs the message I've seen:

    if (error) {
        ipseclog((LOG_DEBUG, "%s: unable to initialize SA type %u.\n",
            __func__, mhp->msg->sadb_msg_satype));
        goto fail;
    }

We're still running fine since taking cesa out. Perhaps it wasn't cesa directly causing our issues but something to do with using a hardware driver.

Looking at what I think is the correct version of the cesa code now (https://github.com/freebsd/freebsd/blob/release/11.2.0/sys/dev/cesa/cesa.c), I see there was a point where it would return ENOMEM:

    cs = cesa_alloc_session(sc);
    if (!cs)
        return (ENOMEM);

This code has changed since the release we're on. It appears to be this commit: https://github.com/freebsd/freebsd/commit/99ba792d73cb1765bd7271160d3d81500308a2c6 so this is probably not a problem in later versions.

Checking through git, it looks as though that commit made its way into the 12 release. The 11 releases still use the old memory management and likely still have the issue we're seeing.

It looks like ENOMEM is returned when cesa has run out of available sessions, the number of which is fixed at 64 (CESA_SESSIONS in https://github.com/freebsd/freebsd/blob/release/11.2.0/sys/dev/cesa/cesa.h). Perhaps there's a session leak somehow, or we just get to a point where our system is demanding more than 64 sessions at a time? There doesn't seem to be a way I can track the allocation of sessions.
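To illustrate the failure mode described above, here is a minimal userspace sketch of a pre-allocated session pool that returns ENOMEM once every slot is taken. This is not the cesa driver code: `struct pool`, `pool_alloc()`, `pool_free()`, `pool_newsession()` and the `active` counter are hypothetical names made up for the example, and only the limit of 64 comes from the CESA_SESSIONS value quoted in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Same value as the CESA_SESSIONS limit discussed in this report. */
#define POOL_SESSIONS 64

/* Hypothetical session slot; a real driver would keep key material here. */
struct session {
	int in_use;
};

/* Hypothetical fixed-size pool, allocated once up front. */
struct pool {
	struct session slots[POOL_SESSIONS];
	unsigned active;	/* illustrative counter of slots in use */
};

/* Return a free slot, or NULL if all POOL_SESSIONS slots are taken. */
static struct session *
pool_alloc(struct pool *p)
{
	for (size_t i = 0; i < POOL_SESSIONS; i++) {
		if (!p->slots[i].in_use) {
			p->slots[i].in_use = 1;
			p->active++;
			return (&p->slots[i]);
		}
	}
	return (NULL);
}

/* Give a slot back to the pool; a leak is a missing call to this. */
static void
pool_free(struct pool *p, struct session *s)
{
	if (s != NULL && s->in_use) {
		s->in_use = 0;
		p->active--;
	}
}

/* Follows the shape of the quoted check: no free slot means ENOMEM. */
static int
pool_newsession(struct pool *p, struct session **out)
{
	struct session *s = pool_alloc(p);

	if (s == NULL)
		return (ENOMEM);
	*out = s;
	return (0);
}
```

With a zero-initialised `struct pool`, the 65th call to `pool_newsession()` fails with ENOMEM unless an earlier session was released with `pool_free()`. A counter like `active` is the kind of thing that would make it possible to tell a leak apart from genuine demand for more than 64 sessions.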
Perhaps there could be a way for crypto_select_driver in https://github.com/freebsd/freebsd/blob/release/11.2.0/sys/opencrypto/crypto.c to know that a device has hit its maximum number of sessions and not select it? Or a way to fall back to software if a CRYPTODEV_NEWSESSION call fails on a hardware device?

The same restriction of 64 sessions in cesa appears to still be in place in 11.4.

Created attachment 216367
cesa session limit patch (untested)

(In reply to Graham Collinson from comment #5)
Nice catch, I forgot that this had been refactored since FreeBSD 11. It looks like the driver imposes a session limit only because it pre-allocates session structures. That limit is gone now in head, where session management has been factored out of the original drivers. I see no such limit in 11 in the software crypto driver.

I would guess that your workload simply requires more than 64 crypto sessions. It would be interesting to see the output of "vmstat -m | grep crypto" from a system that has been up for a while.

I wrote an untested patch that bumps the session limit in cesa and makes it configurable at boot time. I'll let the pfsense folks know about it; I'm not sure if the issue you're seeing has been observed elsewhere.

Should this get closed? The session limit disappeared in stable/12 with:

    commit 1b0909d51a8aa8b5ec5a61c2dc1a69642976a732
    Author: Conrad Meyer <cem@FreeBSD.org>
    Date: Wed Jul 18 00:56:25 2018 +0000

    OpenCrypto: Convert sessions to opaque handles instead of integers

Yes, I don't think this will be addressed in FreeBSD since 11 is EOL.
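For reference, the software-fallback idea raised earlier in this thread (try another driver when a hardware newsession call fails) could look roughly like the sketch below. It is a userspace illustration only and does not use the opencrypto(9) interfaces: `struct driver`, `newsession_with_fallback()`, `hw_newsession()` and `sw_newsession()` are hypothetical names, and the hardware stub simply mimics a driver with a fixed session limit.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver descriptor; not the opencrypto(9) API. */
struct driver {
	const char *name;
	int (*newsession)(void *ctx);	/* returns 0 or an errno value */
	void *ctx;
};

/* Hypothetical hardware state with a fixed number of session slots. */
struct hw_state {
	unsigned used;
	unsigned limit;
};

/* Stub hardware driver: fails with ENOMEM once the limit is reached. */
static int
hw_newsession(void *ctx)
{
	struct hw_state *hw = ctx;

	if (hw->used >= hw->limit)
		return (ENOMEM);
	hw->used++;
	return (0);
}

/* Stub software driver: assumed to have no session limit. */
static int
sw_newsession(void *ctx)
{
	(void)ctx;
	return (0);
}

/*
 * Try the drivers in order of preference and stop at the first one that
 * accepts the session. Returns 0 and sets *chosen on success; otherwise
 * returns the last error seen (ENODEV if the list is empty).
 */
static int
newsession_with_fallback(struct driver *drivers, size_t n,
    struct driver **chosen)
{
	int error = ENODEV;

	for (size_t i = 0; i < n; i++) {
		error = drivers[i].newsession(drivers[i].ctx);
		if (error == 0) {
			*chosen = &drivers[i];
			return (0);
		}
	}
	return (error);
}
```

Listing the hardware driver first and the software driver second means new sessions land on hardware until it reports an error, then quietly move to software instead of failing the SA setup. Whether silently switching drivers is acceptable is a policy question the sketch does not try to answer.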