Bug 56233 - IPsec tunnel (ESP) over IPv6: MTU computation is wrong
Summary: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 4.9-PRERELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-net (Nobody)
 
Reported: 2003-08-31 13:40 UTC by Thomas Pornin
Modified: 2018-05-28 19:45 UTC (History)

Description Thomas Pornin 2003-08-31 13:40:14 UTC
The maximum ESP header size computation, function esp_hdrsiz(), in
sys/netinet6/esp_output.c, is wrong when used with a 16-byte block
cipher such as Rijndael. It is also wrong with an 8-byte block cipher,
but in the other direction (the overhead is overestimated), hence
inducing no particular problem except a possible slight performance hit.

Fix: 

The immediate work-around is to use an 8-byte block cipher such as
Blowfish. With such a block size, the MTU becomes 1422. In my setting, O
sends ICMP packets requesting an MTU of 1415, which is wrong again, but
in the safe direction: T sends packets shorter than needed, but data
flows.

A quick fix in the source code would be to change the code of
esp_hdrsiz() in sys/netinet6/esp_output.c, lines 139 and 153. This
function uses an estimate along the lines of:

ESP header length + IV length + 9 + Authlen

where the ESP header length is 8 bytes, the IV length is equal to the
cipher block length, and Authlen is the authentication algorithm
output length, here 12 for HMAC-SHA-1-96. The "9" is described in
a comment on lines 149 and 150: it is the maximum padding length,
including the Pad-Length and Next-Header fields. This value is
correct for 8-byte block ciphers such as Blowfish and 3DES, but
should be "17" for 16-byte block ciphers.

So a quick fix would be to replace the "9" values in both lines 139
and 153 by "17".
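
The effect of that change can be illustrated with a few lines of Python. This is only a sketch of the arithmetic described above, not the kernel code: the helper names `estimated_esp_overhead` and `announced_mtu` are made up for the example, and the 1492-byte link MTU and 40-byte outer IPv6 header are the values from the setup described in this report.

```python
ESP_HEADER_LEN = 8    # ESP header before the encrypted part
OUTER_IPV6_LEN = 40   # outer IPv6 header of the tunnel


def estimated_esp_overhead(iv_len, auth_len, pad_allowance):
    """Overhead estimate in the style described above (hypothetical helper)."""
    return ESP_HEADER_LEN + iv_len + pad_allowance + auth_len


def announced_mtu(link_mtu, iv_len, auth_len, pad_allowance):
    """Inner MTU a router would announce if it relied on that estimate."""
    overhead = estimated_esp_overhead(iv_len, auth_len, pad_allowance)
    return link_mtu - OUTER_IPV6_LEN - overhead


# Rijndael-CBC (16-byte IV) with HMAC-SHA-1-96 (12 bytes) over a 1492-byte link.
current = announced_mtu(1492, iv_len=16, auth_len=12, pad_allowance=9)
patched = announced_mtu(1492, iv_len=16, auth_len=12, pad_allowance=17)
print(current, patched)
```

With a padding allowance of 9 the sketch gives 1407, the value seen in the ICMP messages; with 17 it gives 1399, which is lower than strictly necessary but on the safe side.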

A complete fix would require a more exact computation of the header
length, but it depends on the outer MTU and the cipher block length,
and I don't know to what extent that data is available at that
place in the kernel code.
How-To-Repeat: 
My office network and my own home network use the same ISP, which
provides IPv4 and IPv6 connectivity through ADSL links. Both networks
have a router/firewall running FreeBSD (4-STABLE from the end of August
2003).

I set up an IPsec tunnel between the two routers; the setkey
configuration file is simple:

spdadd [home-network-IP]/48 [office-network]/48 any -P out ipsec esp/tunnel/[home-router-IP]-[office-router-IP]/require ;
spdadd [office-network-IP]/48 [home-network]/48 any -P in ipsec esp/tunnel/[office-router-IP]-[home-router-IP]/require ;
add [home-router-IP] [office-router-IP] esp 0x10001 -m tunnel -E rijndael-cbc [symmetric-cipher-key] -A hmac-sha1 [authentication-key] ;
add [office-router-IP] [home-router-IP] esp 0x10002 -m tunnel -E rijndael-cbc [symmetric-cipher-key] -A hmac-sha1 [authentication-key] ;

(This is the configuration file on my home router; on the office router,
a similar configuration file is used.)

If I call N a machine of my home network, H my home router, O the office
router, and T a machine on the office network, I can send IPv6 packets
from N to T and back, and some tcpdumps show that those packets are duly
encrypted and authenticated between H and O.

Problems come when I try to send long packets. Typically, I establish
a TCP connection (through rlogin or ssh) and begin viewing a text with
an editor. The machine T wants to send a fairly large packet to N. T has a
local MTU with O of 1500 (ethernet link) but the MTU is smaller between
O and H, so O sends to T an ICMP packet "too big" stating that T should
lower its MTU for this connection to 1407. Which T does. And there is
the true problem: 1407 is too big, the maximum being 1406. T and O enter
a loop where T repeatedly sends its 1407-byte packet, and O repeatedly
rejects the packets and sends ICMPs stating that the MTU is 1407.
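
The loop can be reproduced with a toy model of path MTU discovery. This is a simplified sketch, not a description of the real TCP or ICMPv6 code: the function `path_mtu_discovery` and its parameters are hypothetical, and it assumes the sender simply retries at the size announced by the router.

```python
def path_mtu_discovery(first_size, announced_mtu, true_max, max_rounds=5):
    """Return (sizes tried by the sender, whether a packet got through)."""
    size = first_size
    tried = []
    for _ in range(max_rounds):
        tried.append(size)
        if size <= true_max:
            return tried, True  # the packet fits in the tunnel
        # The router drops the packet and answers "packet too big",
        # carrying the MTU it believes the tunnel supports.
        size = min(size, announced_mtu)
    return tried, False


# Router announces 1407 while only 1406 bytes actually fit.
tried, delivered = path_mtu_discovery(1500, announced_mtu=1407, true_max=1406)
print(tried, delivered)
```

In this model the sender keeps retrying at 1407 bytes and never gets a packet through, whereas any announced value of 1406 or less would let the second attempt succeed.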


Details on the MTU computation:
External connectivity uses PPPoE and implies an MTU of 1492 (this is
important! With an MTU of 1500, the problem is much less of a nuisance).
The external IPv6 header uses 40 bytes. The ESP header uses 8 bytes
before the encrypted part, and 12 after (for the HMAC-SHA1 truncated
to 96 bits, as described in RFC 2404). So there remains 1432 bytes for
the encrypted part. The first 16 bytes are for the CBC initial value,
and there are 1416 bytes for data. However, since Rijndael uses 16-byte
blocks, the encrypted part must have a length that is a multiple of 16.
So the real maximum encrypted data size is 1408 bytes. Since the
Pad-Length and Next-Header fields are mandatory, only 1406 bytes of data
are available.

Hence, the packets T sends to N through O and H must not exceed 1406
bytes, including their own IPv6 header.
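
The computation above can be written out step by step. The following Python is a sketch of that arithmetic only; the function name `max_inner_packet` and its parameters are assumptions made for the example and do not correspond to anything in the kernel sources.

```python
def max_inner_packet(link_mtu, block_len, auth_len,
                     outer_hdr_len=40, esp_hdr_len=8):
    """Largest inner packet that fits in one ESP tunnel packet (CBC mode)."""
    # Room left for the IV and the encrypted part.
    room = link_mtu - outer_hdr_len - esp_hdr_len - auth_len
    # The CBC initial value takes one cipher block.
    room -= block_len
    # The encrypted data must be a whole number of cipher blocks.
    room -= room % block_len
    # Pad-Length and Next-Header are mandatory and use two bytes.
    return room - 2


rijndael = max_inner_packet(1492, block_len=16, auth_len=12)
blowfish = max_inner_packet(1492, block_len=8, auth_len=12)
print(rijndael, blowfish)
```

This gives 1406 for Rijndael and 1422 for Blowfish over the 1492-byte PPPoE link. By the same arithmetic, a 1500-byte link leaves 1422 bytes for Rijndael, which is above the 1415 that the "9"-based estimate would announce there; that is consistent with the problem being much less visible at that MTU.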
Comment 1 Bruce M Simpson freebsd_committer freebsd_triage 2004-06-16 10:12:31 UTC
Responsible Changed
From-To: freebsd-bugs->bms

I'll take this
Comment 2 Bruce M Simpson freebsd_committer freebsd_triage 2004-06-16 10:12:45 UTC
State Changed
From-To: open->feedback

May have been fixed by recent commits in this area, can you test them?
Comment 3 Thomas Pornin 2004-06-16 11:35:42 UTC
On Wed, Jun 16, 2004 at 09:13:11AM +0000, Bruce M Simpson wrote:
> May have been fixed by recent commits in this area, can you test them?

Not immediately: one of the machines involved is a very small Alpha and
it takes three days to recompile the world.

I now begin the update procedure; I may be able to test things next
Sunday, or at some point next week (if the update is successful, that
is).


	--Thomas
Comment 4 Thomas Pornin 2004-06-30 11:13:59 UTC
On Wed, Jun 16, 2004 at 09:13:11AM +0000, Bruce M Simpson wrote:
> Synopsis: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
> 
> State-Changed-From-To: open->feedback
> State-Changed-By: bms
> State-Changed-When: Wed Jun 16 09:12:45 GMT 2004
> State-Changed-Why: 
> May have been fixed by recent commits in this area, can you test them?
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=56233

I have finally been able to test it. The Alpha machine took 42 hours
to recompile the world, and I had to do it twice because at some point
the compile stopped, apparently due to some bug in the filesystem code
(all processes entering a specific directory blocked forever).

In brief: I am sorry to report that the problem is still there.


With details: the two routers involved are, respectively:
1. a PC, running -STABLE from Jun 16 10:59:00 GMT 2004
2. an Alpha, running -STABLE from Jun 20 09:58:00 GMT 2004

An IPv6 ESP tunnel is established between those two routers, using
Rijndael as symmetric encryption cipher (with a 128-bit key) and
HMAC/SHA-1 for integrity check. That tunnel is established over a link
which provides an MTU of 1492 (ADSL link with PPPoE at both ends).
In this configuration, it can be computed that encapsulated traffic
(packets which go through the tunnel) can be at most 1406 bytes long
(accounting for the outer IPv6 header, including ESP/AH headers).

I am transferring a big file with rcp from a third machine (a PC, which
uses router 1) to the Alpha (router 2). The machine 3 is connected to
router 1 through a standard ethernet link with MTU 1500. When the file
is sent (after initial rcp work), the machine 3 first attempts to send
a 1500-byte packet to router 1. The packet is too big to go through
the tunnel. As is mandated by IPv6, router 1 does _not_ fragment the
packet, but instead reports the problem to machine 3 with an ICMPv6
packet "too big". That packet should contain the maximum MTU of 1406 (or
some slightly lower number). It contains 1407, which is wrong; machine
3 then sends a 1407-byte packet, which is rejected with the same ICMPv6
packet, and this process loops forever.


I am willing to make other tests if needed, but:
-- my Alpha machine is really slow to recompile things;
-- I will be on vacation for the next three weeks.

I think that the problem could be exhibited with a simpler setup
(ethernet LAN) by arbitrarily reducing the MTU of some of the interfaces
("ifconfig xl0 mtu 1492"). I have not tried, nor will I for the next
three weeks.


	--Thomas Pornin
Comment 5 Bruce M Simpson freebsd_committer freebsd_triage 2006-09-23 17:28:40 UTC
Responsible Changed
From-To: bms->freebsd-net

I must focus on more specific areas.
Comment 6 Bruce M Simpson freebsd_committer freebsd_triage 2006-09-24 09:57:37 UTC
Responsible Changed
From-To: freebsd-net->gnn

by request
Comment 7 Mark Linimon freebsd_committer freebsd_triage 2008-03-01 20:12:50 UTC
State Changed
From-To: feedback->analyzed

Feedback was received, stating that the problem still exists.
Comment 8 George V. Neville-Neil freebsd_committer freebsd_triage 2010-06-15 18:47:41 UTC
Responsible Changed
From-To: gnn->freebsd-net

I'm not working on IPSec at the moment, handing this one back.
Comment 9 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:45:43 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.