253464 – [tcp] Fast Recovery not entered, when SACK negotiated, but no SACK blocks received in DupAcks

Bug 253464 - [tcp] Fast Recovery not entered, when SACK negotiated, but no SACK blocks received in DupAcks

Summary: [tcp] Fast Recovery not entered, when SACK negotiated, but no SACK blocks rec...

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern (show other bugs)
Version:	CURRENT
Hardware:	Any Any

Importance:	--- Affects Only Me
Assignee:	Richard Scheffenegger

URL:
Keywords:

Depends on:
Blocks:

Reported:	2021-02-12 17:44 UTC by Richard Scheffenegger
Modified:	2021-07-01 23:08 UTC (History)
CC List:	1 user (show)

See Also:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Richard Scheffenegger freebsd_committer

2021-02-12 17:44:22 UTC

Just to document a peculiar effect.

Using packetdrill to set up a session, with fbsd being the client (actively establishing the session), SACK was negotiated. However, the script itself
would only send duplicate ACKs without SACK blocks.

In this instance, Fastrecovery doesn't appear to get triggered - even after a high number of duplicate ACKs. Only RTO resume forward progress eventually.

Note that this is an edge case: hosts negotiating SACK are expected to provide the information which segment triggered the duplicate ACK with the SACK option, thus impact is small.

But curiously enough, this effect does NOT happen, when fbsd is the server side (listening and accepting the session) - even when the remainder of the two sripts is absolutely identical.

```
// Testing ECN Packet markings
// Active Client

--tolerance_usecs=50000

// Flush Hostcache
0.0 `sysctl net.inet.tcp.cc.algorithm=newreno`
+0.01 `sysctl net.inet.tcp.ecn.enable=1`
+0.01 `sysctl net.inet.tcp.initcwnd_segments=10`
+0.01 `sysctl net.inet.tcp.hostcache.purgenow=1`

// Create a TCP socket.
+0.50 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0.01 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0.01 setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.01 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
+0.01 setsockopt(4, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0

// Establish a TCP connection with ECN

+0.04 connect(4, ..., ...) = -1 // EINPROGRESS

+0 >[noecn] SEW 0:0(0) win 65535 <mss 1460, nop,wscale 6,sackOK,TS val ... ecr 0>
+0 <[noecn] SE. 0:0(0) ack 1 win 65535 <mss 1000, nop, wscale 10, sackOK, eol, nop>
+0 >[noecn] .   1:1(0) ack 1 <...>

+0.01 getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
+0.01 fcntl(4, F_SETFL, O_RDWR) = 0   // set back to blocking

// Send IW plus 1 segment, check ECN bits
1.00 write(4, ..., 11000) = 11000
+0    >[ect0]  .      1:1001(1000) ack 1
+0    <[noecn] E.     1:1(0) ack 1001 win 65535
+0    >[ect0]  .   1001:2001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 2001 win 65535
+0    >[ect0]  .   2001:3001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 3001 win 65535
+0    >[ect0]  .   3001:4001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 4001 win 65535
+0    >[ect0]  .   4001:5001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 5001 win 65535
+0    >[ect0]  .   5001:6001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 6001 win 65535
+0    >[ect0]  .   6001:7001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 7001 win 65535
+0    >[ect0]  .   7001:8001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 8001 win 65535
+0    >[ect0]  .   8001:9001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 9001 win 65535
+0    >[ect0]  .   9001:10001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 10001 win 65535
+0    >[ect0]  PW. 10001:11001(1000) ack 1


// Send few more "loss" packets
+0.01 write(4, ..., 5000) = 5000

+0    >[ect0]  .  11001:12001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 10001 win 65535  // 1st dupack
+0    >[ect0]  .  12001:13001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 10001 win 65535  // 2nd dupack
+0    >[ect0]  .  13001:14001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 10001 win 65535  // 3rd dupack
+0    >[ect0]  .  14001:15001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 10001 win 65535  // 4th dupack

+0    >[ect0]  P. 15001:16001(1000) ack 1         // not sent, RTO for the missing segment instead
+0    <[noecn] .      1:1(0) ack 10001 win 65535
/*
// Retransmission without ECT
+0    >[noecn] .  10001:11001(1000) ack 1

+0    <[noecn] .      1:1(0) ack 11001 win 65535
+0    >[noecn] .  11001:12001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 12001 win 65535
+0    >[noecn] .  12001:13001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 16001 win 65535

//More data should be ECT marked again, and
// have the CWR flag set now.
+0.01 write(4, ..., 1000) = 1000
+0    >[ect0]  PW. 16001:17001(1000) ack 1
+0    <[noecn] .       1:1(0) ack 17001 win 65535

+0.00 close(4) = 0
+0.01 >[noecn] F.  17001:17001(0) ack 1
+0.03 <[noecn] F.      1:1(0) ack 17002 win 65535
+0.01 >[noecn] .   17002:17002(0) ack 2

```

Comment 1 Richard Scheffenegger freebsd_committer

2021-02-13 15:11:31 UTC

Looked into this issue. The discrepancy between Client and Server side turned out to be a typo in the testscript. Provided that SACK is negotiated for in the 3WHS, any subsequent duplicate ACKs without SACK blocks are effectively ignored for purposes of recovering from loss.

https://reviews.freebsd.org/D28634 

When duplicate ACKs arrive after SACK was negotiated, make sure that SACK processing only happens with actual SACK blocks are also present in the TCP option of the ACK. Otherwise, the normal newreno processing path would be taken.

And as always, the retransmission timer remains as ultimate fallback, thus any problematic TCP sessions would only exhibit poor performance under loss, but no catastrophic failure (stall / closing of the session).

Comment 2 Richard Scheffenegger freebsd_committer

2021-02-26 10:21:43 UTC

Fix under discussion: https://reviews.freebsd.org/D28634

Comment 3 Michael Tuexen freebsd_committer

2021-07-01 23:08:51 UTC

The fix under discussion has been committed.