Bug 286631 - TCP SACK: CWND set to 2 MSS after successful SACK recovery
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.3-STABLE
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-05-06 14:54 UTC by JY
Modified: 2025-05-06 16:59 UTC

See Also:


Description JY 2025-05-06 14:54:00 UTC
We have observed multiple occurrences of CWND being set to 2 MSS after a successful SACK recovery under the following conditions:

 - a small number (2 to 4) of segments are lost and recovered successfully with SACK
 - at the time of the loss, CWND is limited by the receiver window
 - congestion-control algorithms: cubic, newreno


After digging into this, we found the following potential issues that lead to the CWND problem.

1. Pipe shrunk too much

The issue is observed with this patch: https://github.com/freebsd/freebsd-src/commit/0b3f9e435f2bde9e5be27030d9f574a977a1ad47

With that patch, post_recovery is called after snd_una has been updated to the latest ACK, which is very close to snd_max.
As a result the pipe shrinks too much, because there is little difference between snd_una and snd_max.

(Note that CWND cannot grow further during recovery because of the receiver-window limit, which makes this pipe-shrink effect more dramatic.)
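
A minimal sketch of why a small pipe can end up as a 2 MSS window. It assumes the post-recovery step picks max(pipe, MSS) + MSS when the pipe is below ssthresh and ssthresh otherwise; the function name post_recovery_cwnd and its parameters are hypothetical and the code is a simplified model, not the kernel source.

```c
#include <stdint.h>

/*
 * Hypothetical model of the post-recovery window choice (an assumption,
 * not copied from the tree): if the pipe is below ssthresh, cwnd becomes
 * max(pipe, mss) + mss; otherwise cwnd becomes ssthresh.
 */
static uint32_t
post_recovery_cwnd(uint32_t pipe, uint32_t ssthresh, uint32_t mss)
{
	if (pipe < ssthresh) {
		/* Never go below one MSS, then allow one more MSS on top. */
		uint32_t base = (pipe > mss) ? pipe : mss;
		return (base + mss);
	}
	return (ssthresh);
}
```

Under this model, any pipe of one MSS or less yields exactly 2 MSS, which is what happens when snd_una has already advanced to just below snd_max before the pipe is computed.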

I understand the necessity of the patch, but I think this corner case should be addressed. During the investigation I also found a couple more interesting observations:

2. Negative value from the pipe computation

When calculating the pipe, tcp_compute_pipe takes the total SACK-retransmitted bytes minus the total SACKed bytes.
To calculate the SACKed bytes, tcp_sack_doack tallies up all bytes acknowledged by the receiver without checking whether they were SACK-recovery retransmissions or just regular segments being delivered.
When a small number of segments are lost while many bytes are in flight, the right edge of the SACK block keeps advancing and makes the SACKed bytes larger than the SACK-retransmitted bytes, so tcp_compute_pipe returns a negative value.
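
A minimal sketch of how the result can go negative. It assumes the pipe is computed as outstanding bytes (snd_max - snd_una) plus retransmitted bytes minus SACKed bytes; model_pipe and its parameters are hypothetical names for illustration, and the numbers in the usage note are made up.

```c
#include <stdint.h>

/*
 * Hypothetical model of the pipe computation (an assumption):
 *   pipe = (snd_max - snd_una) + rexmit_bytes - sacked_bytes
 * The result is signed, so it goes negative once sacked_bytes exceeds
 * the sum of the other two terms.
 */
static int32_t
model_pipe(uint32_t snd_una, uint32_t snd_max,
    int32_t rexmit_bytes, int32_t sacked_bytes)
{
	int32_t outstanding = (int32_t)(snd_max - snd_una);

	return (outstanding + rexmit_bytes - sacked_bytes);
}
```

For example, with one MSS (1460 bytes) still outstanding, two retransmitted segments (2920 bytes) and a stale tally of 58400 SACKed bytes, model_pipe(1000000, 1001460, 2920, 58400) evaluates to 1460 + 2920 - 58400 = -54020.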

3. The negative pipe value above affects post-recovery, where cubic and newreno compare the pipe (signed) with ssthresh (unsigned).
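
A minimal sketch of the signed/unsigned pitfall, assuming a 32-bit int pipe and a uint32_t ssthresh. The helper names are hypothetical. The first helper spells out with an explicit cast the conversion that C's usual arithmetic conversions apply implicitly to `pipe < ssthresh` on such platforms; the second widens both operands so the sign survives.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * What a mixed int/uint32_t comparison effectively does: the signed
 * operand is converted to unsigned, so a negative pipe turns into a
 * very large positive value.
 */
static bool
pipe_below_ssthresh_mixed(int32_t pipe, uint32_t ssthresh)
{
	return ((uint32_t)pipe < ssthresh);
}

/* Sign-preserving alternative: widen both operands to 64-bit signed. */
static bool
pipe_below_ssthresh_signed(int32_t pipe, uint32_t ssthresh)
{
	return ((int64_t)pipe < (int64_t)ssthresh);
}
```

With pipe = -54020 and ssthresh = 93440, the mixed form reports that the pipe is not below ssthresh, while the sign-preserving form reports that it is, so the branch taken in post-recovery depends on the conversion rather than on the real pipe.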


4. Unstable sacked-bytes calculation in tcp_sack_doack
 - when accounting for DSACK blocks while moving snd_fack forward, especially when this happens on the last packet of recovery