Bug 219356 - Using AES-GCM with IPSEC with aesni module loaded panics FreeBSD 11 stable
Summary: Using AES-GCM with IPSEC with aesni module loaded panics FreeBSD 11 stable
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.0-STABLE
Hardware: amd64 Any
: --- Affects Some People
Assignee: Andrey V. Elsukov
URL:
Keywords: patch, regression
Depends on:
Blocks:
 
Reported: 2017-05-17 14:39 UTC by lab
Modified: 2017-06-02 10:06 UTC (History)
4 users (show)

See Also:


Attachments
Core text file from panic (159.86 KB, text/plain)
2017-05-17 14:39 UTC, lab
no flags Details
Core.txt file from FreeBSD 12 (r318563). (82.36 KB, text/plain)
2017-05-21 05:53 UTC, lab
no flags Details
Proposed patch (untested) (1.53 KB, patch)
2017-05-22 08:51 UTC, Andrey V. Elsukov
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description lab 2017-05-17 14:39:36 UTC
Created attachment 182666 [details]
Core text file from panic

Using iperf to pass data between two hosts behind two FreeBSD gateways that have an IPSec tunnel between them will panic gateway. The gateway that panics os the one doing most of the decryption (gateway in front of iperf running in server mode). I used iperf in UDP mode. Not sure if that is needed. If I use 11.0-RELEASE-p9 I do not see this issue. 

I used strongswan to create IPSec tunnel between gateways. If duplicating, make sure GCM option is turned on for strongswan.

Setkey -D shows:
172.16.72.71 172.16.73.67
        esp mode=tunnel spi=3420721730(0xcbe41242) reqid=1(0x00000001)
        E: aes-gcm-16  83cc9338 e415ad69 340ecec3 1e698f52 c2b2dc8e 19687c70 192200ca 9c7564a8
 27bba7d2
        seq=0x00000001 replay=0 flags=0x00000000 state=mature
        created: May 17 10:37:56 2017   current: May 17 10:38:01 2017
        diff: 5(s)      hard: 3600(s)   soft: 2935(s)
        last: May 17 10:37:57 2017      hard: 0(s)      soft: 0(s)
        current: 140(bytes)     hard: 0(bytes)  soft: 0(bytes)
        allocated: 1    hard: 0 soft: 0
        sadb_seq=1 pid=808 refcnt=1
172.16.73.67 172.16.72.71
        esp mode=tunnel spi=3464455471(0xce7f652f) reqid=1(0x00000001)
        E: aes-gcm-16  032a2b86 1f878f00 b7b09d0e f95233e1 14af88a4 f5e3ad11 380a9fa7 8afc3a01
 c72438bc
        seq=0x00000000 replay=4 flags=0x00000000 state=mature
        created: May 17 10:37:56 2017   current: May 17 10:38:01 2017
        diff: 5(s)      hard: 3600(s)   soft: 2530(s)
        last: May 17 10:37:57 2017      hard: 0(s)      soft: 0(s)
        current: 84(bytes)      hard: 0(bytes)  soft: 0(bytes)
        allocated: 1    hard: 0 soft: 0
        sadb_seq=0 pid=808 refcnt=1
Comment 1 Andrey V. Elsukov freebsd_committer 2017-05-17 14:56:22 UTC
(In reply to lab from comment #0)
> Created attachment 182666 [details]
> Core text file from panic
> 
> Using iperf to pass data between two hosts behind two FreeBSD gateways that
> have an IPSec tunnel between them will panic gateway. The gateway that
> panics os the one doing most of the decryption (gateway in front of iperf
> running in server mode). I used iperf in UDP mode. Not sure if that is
> needed. If I use 11.0-RELEASE-p9 I do not see this issue. 
> 
> I used strongswan to create IPSec tunnel between gateways. If duplicating,
> make sure GCM option is turned on for strongswan.

What revision do you use to build FreeBSD 11.0-STABLE and what strongswan version do you use? Does it panics immediately after starting iperf test or it needs some time?
Comment 2 lab 2017-05-17 15:19:53 UTC
uname -a
FreeBSD ext11-1.gta.com 11.0-STABLE FreeBSD 11.0-STABLE #0 r318331: Tue May 16 12:28:34 EDT 2017     lab@ext11-1.gta.com:/usr/src/sys/amd64/compile/GENERIC  amd64

Strong version 5.5.2_1.

Panic happens within a few minutes. It panics only if AES-GCM is used. It will not panic if standard AES256 is used.
Comment 3 lab 2017-05-21 05:53:09 UTC
Created attachment 182770 [details]
Core.txt file from FreeBSD 12 (r318563).

I upgraded the panicing gateway to FreeBSD current (r318563). FreeBSD 12 panics as well. Panic happened quickly (maybe 10 seconds).
Comment 4 Andrey V. Elsukov freebsd_committer 2017-05-22 08:51:07 UTC
Created attachment 182793 [details]
Proposed patch (untested)

Thanks. The last trace is very helpful. Can you try this patch? It should be applicable to both stable/11 and head/.

The problem triggered by error in crypto code. When AES-GCM fails authenticate decrypted data, it returns EBADMSG error code. This error code was handled incorrectly in new IPsec code and double free occurred for SA reference. This patch removed extra free().
Comment 5 lab 2017-05-22 11:15:50 UTC
Your patch seems to have fixed this issue. I have been running iperf for 15 minutes. It has never lasted this long before. I will continue running to be sure. 

It is surprising how many errors are generated during decryption. Lots of "esp_input_cb: crypto error 89" messages on screen. I don't think these show up if using AES-256. Out of curiosity, I am going to run same test later with AES-256 instaed of AES-GCM.
Comment 6 lab 2017-05-23 04:03:03 UTC
Since applying the proposed patch, I been unable to panic my FreeBSD 11 or FreeBSD 12 gateways while using AES-GCM.
Comment 7 commit-hook freebsd_committer 2017-05-23 09:02:41 UTC
A commit references this bug:

Author: ae
Date: Tue May 23 09:01:49 UTC 2017
New revision: 318734
URL: https://svnweb.freebsd.org/changeset/base/318734

Log:
  Fix possible double releasing for SA reference.

  There are two possible ways how crypto callback are called: directly from
  caller and deffered from crypto thread.

  For inbound packets the direct call chain is the following:
   IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() ->
   -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
   -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().

  The SA reference is held while crypto processing is not finished.
  The error handling code wrongly expected that crypto callback always called
  from the crypto thread context, and it did SA reference releasing in
  xform_input_cb(). But when the crypto callback called directly, in case of
  error (e.g. data authentification failed) the error handling in
  ipsec_common_input() also did SA reference releasing.

  To fix this, remove error handling from ipsec_common_input() and do it
  in xform_input() before crypto_dispatch().

  PR:		219356
  MFC after:	10 days

Changes:
  head/sys/netipsec/ipsec_input.c
  head/sys/netipsec/xform_ah.c
  head/sys/netipsec/xform_esp.c
  head/sys/netipsec/xform_ipcomp.c
Comment 8 commit-hook freebsd_committer 2017-06-02 09:55:43 UTC
A commit references this bug:

Author: ae
Date: Fri Jun  2 09:54:42 UTC 2017
New revision: 319492
URL: https://svnweb.freebsd.org/changeset/base/319492

Log:
  MFC r318734:
    Fix possible double releasing for SA reference.

    There are two possible ways how crypto callback are called: directly from
    caller and deffered from crypto thread.

    For inbound packets the direct call chain is the following:
     IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() ->
     -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
     -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().

    The SA reference is held while crypto processing is not finished.
    The error handling code wrongly expected that crypto callback always called
    from the crypto thread context, and it did SA reference releasing in
    xform_input_cb(). But when the crypto callback called directly, in case of
    error (e.g. data authentification failed) the error handling in
    ipsec_common_input() also did SA reference releasing.

    To fix this, remove error handling from ipsec_common_input() and do it
    in xform_input() before crypto_dispatch().

    PR:		219356

  MFC r318738:
    Fix possible double releasing for SA and SP references.

    There are two possible ways how crypto callback are called: directly from
    caller and deffered from crypto thread.

    For outbound packets the direct call chain is the following:
     IPSEC_OUTPUT() method -> ipsec[46]_common_output() ->
     -> ipsec[46]_perform_request() -> xform_output() ->
     -> crypto_dispatch() -> crypto_invoke() -> crypto_done() ->
     -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output().

    The SA and SP references are held while crypto processing is not finished.
    The error handling code wrongly expected that crypto callback always called
    from the crypto thread context, and it did references releasing in
    xform_output_cb(). But when the crypto callback called directly, in case of
    error the error handling code in ipsec[46]_perform_request() also did
    references releasing.

    To fix this, remove error handling from ipsec[46]_perform_request() and do it
    in xform_output() before crypto_dispatch().

  Approved by:	re (kib)

Changes:
_U  stable/11/
  stable/11/sys/netipsec/ipsec_input.c
  stable/11/sys/netipsec/ipsec_output.c
  stable/11/sys/netipsec/xform_ah.c
  stable/11/sys/netipsec/xform_esp.c
  stable/11/sys/netipsec/xform_ipcomp.c