Bug 201447 - aes-gcm corrupted packets with ipsec
Summary: aes-gcm corrupted packets with ipsec
Status: Closed DUPLICATE of bug 228094
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: George V. Neville-Neil
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-07-09 23:49 UTC by olivier
Modified: 2018-07-21 19:28 UTC

Description olivier 2015-07-09 23:49:38 UTC
With a simple static ipsec setup, packets are corrupted (during encryption or decryption):

[root@ENCryptor]~# cat /etc/setkey.conf
flush;
spdflush;
spdadd 1.0.0.0/8 3.0.0.0/8 any -P out ipsec esp/tunnel/2.2.2.2-2.2.2.3/require;
spdadd 3.0.0.0/8 1.0.0.0/8 any -P in ipsec esp/tunnel/2.2.2.3-2.2.2.2/require;
add 2.2.2.2 2.2.2.3 esp 0x1000 -E aes-gcm-16 0x3ffe05014819ffff3ffe05014819ffff;
add 2.2.2.3 2.2.2.2 esp 0x1001 -E aes-gcm-16 0x3ffe05014819ffff3ffe05014819ffff;


[root@DECryptor]~# cat /etc/setkey.conf
flush;
spdflush;
spdadd 1.0.0.0/8 3.0.0.0/8 any -P in ipsec esp/tunnel/2.2.2.2-2.2.2.3/require;
spdadd 3.0.0.0/8 1.0.0.0/8 any -P out ipsec esp/tunnel/2.2.2.3-2.2.2.2/require;
add 2.2.2.2 2.2.2.3 esp 0x1000 -E aes-gcm-16 0x3ffe05014819ffff3ffe05014819ffff;
add 2.2.2.3 2.2.2.2 esp 0x1001 -E aes-gcm-16 0x3ffe05014819ffff3ffe05014819ffff;


Packets are generated, but this is the result on the DECryptor side:

[root@DECryptor]~# netstat -ssp esp
esp:
        3527445 packets dropped; bad encryption detected
        3581287 packets in
        1933894980 bytes in
        ESP output histogram:
                aes-gcm-16: 3581287


Pcap file available here:

http://dev.bsdrp.net/r285336-aes-gcm-16.pcap
Comment 1 George V. Neville-Neil freebsd_committer freebsd_triage 2015-07-10 02:33:23 UTC
Testing this on a two-host system with a similar test from the netperf test suite, I do not see this problem.  We'll have to dig further.  Can you tell me the hardware configuration (CPU and NIC type) used in this test?  Also, can you try turning off any special features (TCP offload etc.) that might be enabled on the NIC?
Comment 2 olivier 2015-07-10 09:46:39 UTC
Here is more information:
- FreeBSD: 11.0-CURRENT #3 r285336M: Fri Jul 10 02:02:25 CEST 2015
- IBM System x3550 M3: Intel Xeon CPU L5630 @ 2.13GHz (2133.46-MHz K8-class CPU)
- NIC: Intel 82580 (class=0x020000 card=0x12a28086 chip=0x150e8086)
- TSO/RSO disabled on interfaces.
- aesni module NOT loaded.

Notice the not-so-random IV value in the previous pcap too.
I don't know whether the problem comes from the encrypter or the decrypter side.

Encrypter side stats:
[root@ENcrypter]~# netstat -ss -p esp
esp:
        3452430 packets out
        1822883040 bytes out
        ESP output histogram:
                aes-gcm-16: 3452430

Decrypter side stats:
[root@DECrypter]~# netstat -ss -p esp
esp:
        1600986 packets dropped; bad encryption detected
        1621300 packets in
        875502000 bytes in
        ESP output histogram:
                aes-gcm-16: 1621300
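
For reference, the share of dropped packets can be worked out from the decrypter counters above. This is only a minimal sketch: it assumes that "packets in" counts every received ESP packet, including the ones later dropped, and the helper names are made up for illustration.

```python
import re

# Decrypter-side counters copied from the netstat output above.
NETSTAT_ESP = """\
esp:
        1600986 packets dropped; bad encryption detected
        1621300 packets in
        875502000 bytes in
"""


def esp_counters(text):
    """Map each counter description to its integer value."""
    counters = {}
    for line in text.splitlines():
        # Counter lines are indented and start with a number.
        match = re.match(r"\s+(\d+) (.+)", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def bad_encryption_ratio(text):
    """Fraction of received packets dropped as 'bad encryption detected'.

    Assumption: 'packets in' includes the dropped packets.
    """
    counters = esp_counters(text)
    dropped = counters["packets dropped; bad encryption detected"]
    return dropped / counters["packets in"]


if __name__ == "__main__":
    print(f"{bad_encryption_ratio(NETSTAT_ESP):.2%}")
```

Under that assumption, almost 99% of the packets seen by the decrypter fail the check.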
Comment 3 George V. Neville-Neil freebsd_committer freebsd_triage 2015-07-26 18:16:52 UTC
I am now looking into this.
Comment 4 George V. Neville-Neil freebsd_committer freebsd_triage 2015-08-05 09:42:41 UTC
Can you please re-test with a HEAD at or after r286292?

https://svnweb.freebsd.org/changeset/base/286292

Thanks
Comment 5 olivier 2015-08-05 22:20:52 UTC
Tested on r286308.

With the exact same configuration file I've got:
Invalid key length at [0x3ffe05014819ffff3ffe05014819ffff]

But once the key size is increased, it works.

Thanks
Comment 6 George V. Neville-Neil freebsd_committer freebsd_triage 2015-08-05 22:51:21 UTC
Actually, specifying those bytes when using setkey is required; they're the nonce, in the language of the RFC.  Thanks for testing and the update.
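
To illustrate the key length requirement, here is a small sketch. It assumes the RFC meant here is RFC 4106, where the keying material for AES-GCM in ESP is the AES key (16, 24 or 32 bytes) followed by a 4-byte salt. The function name is hypothetical, and this is not the actual check performed by setkey or the kernel.

```python
# AES key sizes in bytes, plus the 4-byte salt that RFC 4106 appends to the key.
AES_KEY_BYTES = (16, 24, 32)
SALT_BYTES = 4


def gcm_keymat_ok(hex_key):
    """Return True if a setkey-style hex key is an AES key plus a 4-byte salt."""
    digits = hex_key[2:] if hex_key.lower().startswith("0x") else hex_key
    raw = bytes.fromhex(digits)
    return len(raw) - SALT_BYTES in AES_KEY_BYTES


if __name__ == "__main__":
    # Key from the original report: 16 bytes, so there is no room for the salt.
    print(gcm_keymat_ok("0x3ffe05014819ffff3ffe05014819ffff"))
    # A 20-byte value: 16-byte AES key followed by the 4-byte salt.
    print(gcm_keymat_ok("0x1122334455667788990011223344556677889900"))
```

With this rule the 16-byte key from the description is rejected, which matches the "Invalid key length" message reported in comment 5.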
Comment 7 olivier 2015-08-06 13:59:38 UTC
Oops, more information: it doesn't work in ALL cases:

encryptor    |  decryptor   | result
----------------------------------
aesni loaded | aesni loaded | OK
aesni loaded | No aesni     | OK
No aesni     | aesni loaded | Failed
No aesni     | No aesni     | Failed

Encrypting with aes-gcm seems to work only with AESNI loaded.
Comment 8 George V. Neville-Neil freebsd_committer freebsd_triage 2015-09-01 15:23:19 UTC
I have just tested this with HEAD (Fri Aug 21 06:54:12 EDT 2015 vintage) and do not see a problem.  I am using slightly modified configs from here:

https://github.com/gvnn3/netperf/tree/master/IPSEC/Configs

Can you tell me if this is still a problem on HEAD as well as which configs are giving the problem?

1. Created a copy of a VM that supports AESNI
2. Adjusted keys in aes-gcm.conf files to have sufficient key length.
3. Set up em1 on each VM in the 172.16.0.0/24 space.
4. Set devbox as "source" and aesnitest as "destination"
5. Used ping to test between hosts
6. Saw encrypted packets on devbox and pings were answered
7. ssh'd from aesnitest to devbox
8. kldloaded and unloaded aesni module on both ends, no change
Comment 9 olivier 2015-09-03 20:12:51 UTC
(In reply to George V. Neville-Neil from comment #8)

Ok, new test on head (r287407).
Full setup detailed here:
http://bsdrp.net/documentation/examples/ipsec_performance_lab_of_an_ibm_system_x3550_m3_with_intel_82580

aesni NOT loaded on both DUT (encryptor and decryptor):

[root@encrypt]~# kldstat -v | grep aesni
[root@encrypt]~#


[root@decrypt]~# kldstat -v | grep aesni
[root@decrypt]~#

setkey.conf uses the same key as yours:

[root@encrypt]~# cat /etc/setkey.conf
flush;
spdflush;
spdadd 198.18.0.0/16 198.19.0.0/16 any -P out ipsec esp/tunnel/198.18.3.2-198.18.3.3/require;
spdadd 198.19.0.0/16 198.18.0.0/16 any -P in ipsec esp/tunnel/198.18.3.3-198.18.3.2/require;
add 198.18.3.2 198.18.3.3 esp 0x1000 -E aes-gcm-16 0x1122334455667788990011223344556677889900;
add 198.18.3.3 198.18.3.2 esp 0x1001 -E aes-gcm-16 0x1122334455667788990011223344556677889900;

[root@decrypt]~# cat /etc/setkey.conf
flush;
spdflush;
spdadd 198.18.0.0/16 198.19.0.0/16 any -P in ipsec esp/tunnel/198.18.3.2-198.18.3.3/require;
spdadd 198.19.0.0/16 198.18.0.0/16 any -P out ipsec esp/tunnel/198.18.3.3-198.18.3.2/require;
add 198.18.3.2 198.18.3.3 esp 0x1000 -E aes-gcm-16 0x1122334455667788990011223344556677889900;
add 198.18.3.3 198.18.3.2 esp 0x1001 -E aes-gcm-16 0x1122334455667788990011223344556677889900;

Now I'm generating load and getting lots of corrupted packets:

[root@encrypt]~# netstat -ssp esp
esp:
        3500850 packets out
        1848448800 bytes out
        ESP output histogram:
                aes-gcm-16: 3500850

[root@decrypt]~# netstat -ssp esp
esp:
        52 packets dropped; no TDB
        1586565 packets dropped; bad encryption detected
        1649468 packets in
        877489312 bytes in
        ESP output histogram:
                aes-gcm-16: 1649416


To solve this problem I just need to load aesni on the encrypter side:

[root@encrypt]~# kldload aesni
[root@encrypt]~# service ipsec restart

and there are no more problems on the receiver:

[root@decrypt]~# netstat -ssp esp
esp:
        2412033 packets in
        1283201556 bytes in
        ESP output histogram:
                aes-gcm-16: 2412036
Comment 10 olivier 2015-09-03 23:26:27 UTC
As instructed:

1. Replaced netmap's pkt-gen with iperf3 for a one-flow TCP test:
And there is no problem with this test!

2. Then I used netblast/netreceive for a one-flow UDP test:
And there is no problem with this test either!

3. Then I redid my test with netmap's pkt-gen (2000 flows, UDP):
And there is lots of corruption again.

For the record, the pkt-gen command used:
pkt-gen -U -f tx -i igb2 -l 542 -R 5000 -n 50000 -d 198.19.10.1:2000-198.19.10.100 -D 00:1b:21:d3:8f:3e -s 198.18.10.1:2000-198.18.10.20

Notice the option "-U": it comes from a patch that adds a software UDP checksum, because the Intel card seems to disable the hardware UDP checksum feature in netmap mode.

So the difference between netblast and netmap's pkt-gen is:
- Lots of different source/destination IPs for pkt-gen, a single source/destination pair for netblast
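
As a quick sanity check of the flow count, the address ranges from the pkt-gen command above can be multiplied out. This sketch assumes that every source address is paired with every destination address and that the port stays fixed at 2000. The function names are hypothetical and this is not pkt-gen code.

```python
import ipaddress


def range_size(first, last):
    """Number of addresses in the inclusive range first..last."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def flow_count(src_range, dst_range):
    """Distinct (source, destination) pairs, assuming all combinations occur."""
    return range_size(*src_range) * range_size(*dst_range)


if __name__ == "__main__":
    # Ranges taken from the -s and -d arguments of the pkt-gen command.
    sources = ("198.18.10.1", "198.18.10.20")
    destinations = ("198.19.10.1", "198.19.10.100")
    print(flow_count(sources, destinations))
```

That gives 20 sources times 100 destinations, which is consistent with the 2000 flows mentioned in step 3.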
Comment 11 George V. Neville-Neil freebsd_committer freebsd_triage 2015-09-04 01:30:06 UTC
Thanks for all of this.  I'm going to code up a test for this in my own lab and see if I can reproduce it.  I've run several tests with pkt-gen but only with single stream, thus far.  It's likely an issue with the many different IPs, as you indicate.
Comment 12 Olivier Cochard freebsd_committer freebsd_triage 2017-10-30 09:55:27 UTC
I'm still hitting this bug: Can you try head, or even 11.1-RELEASE, with IPsec aes-gcm WITHOUT the aesni module loaded?

With the aesni module I don't have any problem, but without it I still have 99.99% packet corruption.
Comment 13 Alan Somers freebsd_committer freebsd_triage 2018-07-21 18:27:49 UTC
The test that Olivier wrote is now passing in CI.  Could somebody please investigate to see if this bug is really fixed?

https://ci.freebsd.org/job/FreeBSD-head-amd64-test/8297/testReport/sys.netipsec.tunnel/aesni_aes_gcm_128/v6/
Comment 14 Andrey V. Elsukov freebsd_committer freebsd_triage 2018-07-21 18:47:45 UTC
I think it was fixed in PR 228094
Comment 15 Alan Somers freebsd_committer freebsd_triage 2018-07-21 19:24:31 UTC
That looks right, ae.  They began passing immediately after that bug was fixed.
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/8028/
I'm closing this bug as a duplicate.

*** This bug has been marked as a duplicate of bug 228094 ***
Comment 16 commit-hook freebsd_committer freebsd_triage 2018-07-21 19:28:43 UTC
A commit references this bug:

Author: asomers
Date: Sat Jul 21 19:28:08 UTC 2018
New revision: 336586
URL: https://svnweb.freebsd.org/changeset/base/336586

Log:
  Clear expected failures for aesni_aes_gcm tests

  These tests were fixed by r335584

  PR:		228094
  PR:		201447
  MFC after:	2 weeks
  X-MFC-With:	335584

Changes:
  head/tests/sys/netipsec/tunnel/aesni_aes_gcm_128.sh
  head/tests/sys/netipsec/tunnel/aesni_aes_gcm_256.sh