Bug 258849 - IPSec may generate duplicate SPIs
Summary: IPSec may generate duplicate SPIs
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Mateusz Guzik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-10-01 22:29 UTC by Herbie.Robinson
Modified: 2022-06-22 15:22 UTC (History)
6 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Herbie.Robinson 2021-10-01 22:29:31 UTC
In key_do_getnewspi in key.c:

		/* when requesting to allocate spi ranged */
		while (count--) {
			/* generate pseudo-random SPI value ranged. */
			newspi = min + (key_random() % (max - min + 1));
			if (!key_checkspidup(htonl(newspi)))
				break;
		}

		if (count == 0 || newspi == 0) {
			ipseclog((LOG_DEBUG,
			    "%s: failed to allocate SPI.\n", __func__));
			return 0;
		}

If I am not mistaken, the loop exit in the failure case will leave "count" at -1 (it's post-decrement), not zero.
Comment 1 Herbie.Robinson 2021-10-01 22:58:05 UTC
Looked at this a little bit more.  Can't duplicate SPI also be allocated because the SAH lock isn't held over the window between when the SPI is allocated and the sav gets threaded?
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2021-10-02 16:57:38 UTC
(In reply to Herbie.Robinson from comment #1)
I think you are right about both bugs.
Comment 3 Mateusz Guzik freebsd_committer freebsd_triage 2021-10-05 08:57:02 UTC
That sounds right. I'm going to sort this out probably this week.

Thank you for the report.
Comment 4 Mateusz Guzik freebsd_committer freebsd_triage 2021-10-07 08:41:26 UTC
So I don't think this is fixable in a clean manner without significant 
refactoring in the area.

I think the pragmatic thing to do here is to try to hoist the sahtree lock out of key_do_getnewspi. The problem is that there are several callers and callees to be adjusted which may end up being rather hairy.

As a cop out one can slap an additional lock around all of this, in this manner:
diff --git a/sys/netipsec/key.c b/sys/netipsec/key.c
index 72c598586d8e..f5da63d7b8f1 100644
--- a/sys/netipsec/key.c
+++ b/sys/netipsec/key.c
@@ -5628,6 +5628,7 @@ key_add(struct socket *so, struct mbuf *m, const struct sadb_msghdr *mhp)
         * secasindex.
         * XXXAE: IPComp seems also doesn't use SPI.
         */
+       SPI_ALLOC_LOCK();
        if (proto == IPPROTO_TCP) {
                sav = key_getsav_tcpmd5(&saidx, &spi);
                if (sav == NULL && spi == 0) {
@@ -5648,6 +5649,7 @@ key_add(struct socket *so, struct mbuf *m, const struct sadb_msghdr *mhp)
        }
 
        sav = key_newsav(mhp, &saidx, spi, &error);
+       SPI_ALLOC_UNLOCK();
        if (sav == NULL)
                return key_senderror(so, m, error);
        KEYDBG(KEY_STAMP,

Note this may be turn out to not be a big problem as key_newsav takes rm lock for writing which already comes with drastic overhead.

Any thoughts on this, ae?
Comment 5 Mateusz Guzik freebsd_committer freebsd_triage 2021-11-02 17:29:21 UTC
ae, ping?
Comment 6 Andrey V. Elsukov freebsd_committer freebsd_triage 2021-11-02 20:22:42 UTC
(In reply to Mateusz Guzik from comment #5)

I'll take a look
Comment 7 Andrey V. Elsukov freebsd_committer freebsd_triage 2021-11-03 12:06:09 UTC
(In reply to Mateusz Guzik from comment #4)

I'm agree with your solution, will you publish the full diff?
Comment 8 Mateusz Guzik freebsd_committer freebsd_triage 2021-11-03 12:48:20 UTC
Well I'm going to have to write one first. ;)

I'll try to do it today.
Comment 9 Mateusz Guzik freebsd_committer freebsd_triage 2021-11-03 16:27:04 UTC
https://reviews.freebsd.org/D32826
Comment 10 commit-hook freebsd_committer freebsd_triage 2021-11-03 19:52:25 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=626bd0970abfdfba596bced3bc8a47adaf11a46d

commit 626bd0970abfdfba596bced3bc8a47adaf11a46d
Author:     Mateusz Guzik <mjg@FreeBSD.org>
AuthorDate: 2021-11-03 14:00:53 +0000
Commit:     Mateusz Guzik <mjg@FreeBSD.org>
CommitDate: 2021-11-03 19:51:40 +0000

    ipsec: fix edge case detection in key_do_getnewspi

    The 'count' variable would end up being -1 post loop, while the
    following condition would check for 0 instead.

    PR:             258849
    Reported by:    Herbie.Robinson@stratus.com
    Reviewed by:    ae
    Sponsored by:   Rubicon Communications, LLC ("Netgate")
    Differential Revision:  https://reviews.freebsd.org/D32826

 sys/netipsec/key.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
Comment 11 commit-hook freebsd_committer freebsd_triage 2021-11-03 19:52:27 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=10ea195fa269888d362f548279f3d3fa252662d7

commit 10ea195fa269888d362f548279f3d3fa252662d7
Author:     Mateusz Guzik <mjg@FreeBSD.org>
AuthorDate: 2021-11-03 13:10:34 +0000
Commit:     Mateusz Guzik <mjg@FreeBSD.org>
CommitDate: 2021-11-03 19:51:40 +0000

    ipsec: add a lock encompassing SPI allocation

    SPIs get allocated and inserted in separate steps. Prior to the change
    there was nothing preventing 2 differnet threads from ending up with the
    same one.

    PR:             258849
    Reported by:    Herbie.Robinson@stratus.com
    Reviewed by:    ae
    Sponsored by:   Rubicon Communications, LLC ("Netgate")
    Differential Revision:  https://reviews.freebsd.org/D32826

 sys/netipsec/key.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
Comment 12 Franco Fichtner 2022-06-22 10:16:39 UTC
Any chance this will hit stable branches? We've got a report on 13-STABLE (which would mean at least 13.1 is affected if not 13.0 already) about SPIs filling up the system, which slows its performance down eventually.

I've also noticed that "setkey -D" reliably stalls reading on specific entries and never exits.


Cheers,
Franco
Comment 13 Andrey V. Elsukov freebsd_committer freebsd_triage 2022-06-22 10:56:36 UTC
(In reply to Franco Fichtner from comment #12)
I think when `setkey -D` fails with large number of SAs is probably related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=88336.
Comment 14 Franco Fichtner 2022-06-22 11:43:50 UTC
I saw this issue, but it wasn't failing... it was stalling after a specific entry and if you call it multiple times it would always hang after the same entry which was odd to see. From what we could tell it would never exit or progress afterwards.
Comment 15 Mateusz Guzik freebsd_committer freebsd_triage 2022-06-22 15:22:47 UTC
I see there have been other fixes in the area. someone(tm) should probably prepare a full sync with main, unless there is a showstopper there