Bug 274007 - IPSec asymmetric crypto broken
Summary: IPSec asymmetric crypto broken
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2023-09-21 20:24 UTC by Timothy Pearson
Modified: 2023-10-12 17:49 UTC

Description Timothy Pearson 2023-09-21 20:24:48 UTC
After upgrading from FreeBSD 11 to FreeBSD 13, I noticed that the IPsec asynchronous crypto option (net.inet.ipsec.async_crypto=1) no longer functions correctly.

On FreeBSD 11, enabling this option pushed the bandwidth of an accelerated (AES-NI) AES 256 GCM tunnel from ~500Mbit/s to ~800Mbit/s with no packet loss, but on FreeBSD 13 it causes massive packet loss inside the tunnel, well over 20%.

The hardware is AMD Opteron CPUs with Intel X520 10Gb NICs.  MTU on the underlying link is set to 2000, with MTU inside the tunnel at the standard 1500.
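
As a rough sanity check on the MTU figures above, the on-wire size of a full 1500-byte inner packet can be estimated as follows. This is only a sketch under stated assumptions that the report does not confirm: an IPv4 outer header with no options, no UDP encapsulation (NAT-T), the 8-byte IV that AES-GCM uses in ESP, and a 16-byte ICV. The function name esp_tunnel_packet_size is made up for illustration.

```python
def esp_tunnel_packet_size(inner_len, iv_len=8, icv_len=16, outer_ip_len=20):
    """Estimate the size in bytes of an ESP tunnel-mode packet.

    Assumptions (not taken from the report): IPv4 outer header without
    options, no NAT-T, 8-byte IV and 16-byte ICV for AES-GCM.
    """
    esp_header = 8   # SPI (4 bytes) + sequence number (4 bytes)
    esp_trailer = 2  # pad length (1 byte) + next header (1 byte)
    # ESP pads the inner packet plus trailer up to a 4-byte boundary.
    padding = -(inner_len + esp_trailer) % 4
    return (outer_ip_len + esp_header + iv_len
            + inner_len + padding + esp_trailer + icv_len)


size = esp_tunnel_packet_size(1500)
print(size, size <= 2000)  # prints "1556 True"
```

Under those assumptions a full-size inner packet stays well below the 2000-byte link MTU, so fragmentation of the encrypted packets would not be expected to explain the loss.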
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2023-09-22 08:21:58 UTC
Could you please briefly describe your setup?
Comment 2 Timothy Pearson 2023-09-22 16:56:53 UTC
(In reply to Zhenlei Huang from comment #1)

What would you like to know in particular?

The hardware is fairly straightforward on both test boxes: we are using Opteron CPUs with igb Ethernet cards and the aforementioned Intel X520 card.  The X520 cards in the two boxes are directly connected to each other, with the IPsec link running across them, and plain-text packets are being forwarded from the igb interfaces across the tunnel in both directions.

On the strongSwan / IPsec side, the P2 tunnel is established in AES256-GCM mode with no separate integrity hash, using the in-kernel AES-NI acceleration.

This setup works perfectly as long as async_crypto=0. As soon as async_crypto is set to 1 on the FreeBSD 13 system, packets start being dropped as they transit the IPsec tunnel.  Setting async_crypto back to 0 immediately stops the packet loss.  Reverting to FreeBSD 11 with otherwise the same setup completely "resolves" the issue, but that is obviously not a viable solution.
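
For anyone trying to reproduce this, the on/off comparison described above could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used in this report: ab_compare, set_async_crypto and loss_percent are hypothetical helper names, and measure is a caller-supplied callback that returns (packets sent, packets received) for one test run through the tunnel. The only piece taken from the report is the sysctl name.

```python
import subprocess

SYSCTL_NAME = "net.inet.ipsec.async_crypto"  # sysctl named in this report


def sysctl_cmd(value):
    """Build the sysctl(8) command line that sets async_crypto to 0 or 1."""
    if value not in (0, 1):
        raise ValueError("async_crypto must be 0 or 1")
    return ["sysctl", f"{SYSCTL_NAME}={value}"]


def set_async_crypto(value):
    """Apply the setting on the local host (needs root on FreeBSD)."""
    subprocess.run(sysctl_cmd(value), check=True)


def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def ab_compare(measure, set_value=set_async_crypto):
    """Measure loss with async_crypto=0 and then =1, restoring 0 afterwards.

    measure is a hypothetical callback returning (sent, received).
    """
    results = {}
    try:
        for value in (0, 1):
            set_value(value)
            sent, received = measure()
            results[value] = loss_percent(sent, received)
    finally:
        set_value(0)  # leave the host in the known-good state
    return results
```

Passing set_value explicitly allows a dry run without touching the system, for example ab_compare(my_measure, set_value=print).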
Comment 3 Shawn Anastasio 2023-10-09 21:07:57 UTC
I am able to reproduce this on -CURRENT on powerpc64le. With a debug kernel build, I'm hitting the following assertion when flooding an IPsec link between two VMs using iperf3 with the net.inet.ipsec.async_crypto tunable set to 1:

panic: vtnet_txq_encap: no mbuf packet header!
cpuid = 13
time = 1696530952
KDB: stack backtrace:
0xc00800006f554300: at kdb_backtrace+0x60
0xc00800006f554410: at vpanic+0x1b8
0xc00800006f5544c0: at panic+0x44
0xc00800006f5544f0: at vtnet_txq_encap+0x3c8
0xc00800006f5545d0: at vtnet_txq_mq_start_locked+0x17c
0xc00800006f554690: at vtnet_txq_tq_deferred+0x6c
0xc00800006f5546d0: at taskqueue_run_locked+0x100
0xc00800006f5547d0: at taskqueue_thread_loop+0x144
0xc00800006f554820: at fork_exit+0xc4
0xc00800006f5548c0: at fork_trampoline+0x18
0xc00800006f5548f0: at -0x4
KDB: enter: panic

Not being intimately familiar with the FreeBSD network stack, it looks to me like there might be a use-after-free on the mbuf with the tunable enabled.
Comment 4 Shawn Anastasio 2023-10-12 17:49:06 UTC
On further inspection, it appears the failure I observed is caused by an unrelated bug in the virtio network driver rather than IPSec (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268699).