| Summary: | [netinet] [patch] tcp_output() might generate invalid TSO frames when len > TCP_MAXWIN - hdrlen - optlen | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Base System | Reporter: | Renaud Lienhart <renaud> | ||||
| Component: | kern | Assignee: | Andre Oppermann <andre> | ||||
| Status: | Closed FIXED | ||||||
| Severity: | Affects Only Me | ||||||
| Priority: | Normal | ||||||
| Version: | Unspecified | ||||||
| Hardware: | Any | ||||||
| OS: | Any | ||||||
| Attachments: |
|
||||||
Responsible Changed From-To: freebsd-bugs->freebsd-net Over to maintainer(s). This may be the cause of some of the other TSO issues that have been spotted recently. Responsible Changed From-To: freebsd-net->andre Take over. Author: andre Date: Sat Aug 14 21:41:33 2010 New Revision: 211317 URL: http://svn.freebsd.org/changeset/base/211317 Log: When using TSO and sending more than TCP_MAXWIN sendalot is set and we loop back to 'again'. If the remainder is less or equal to one full segment, the TSO flag was not cleared even though it isn't necessary anymore. Enabling the TSO flag on a segment that doesn't require any offloaded segmentation by the NIC may cause confusion in the driver or hardware. Reset the internal tso flag in tcp_output() on every iteration of sendalot. PR: kern/132832 Submitted by: Renaud Lienhart <renaud-at-vmware com> MFC after: 1 week Modified: head/sys/netinet/tcp_output.c Modified: head/sys/netinet/tcp_output.c ============================================================================== --- head/sys/netinet/tcp_output.c Sat Aug 14 21:04:27 2010 (r211316) +++ head/sys/netinet/tcp_output.c Sat Aug 14 21:41:33 2010 (r211317) @@ -153,7 +153,7 @@ tcp_output(struct tcpcb *tp) int idle, sendalot; int sack_rxmit, sack_bytes_rxmt; struct sackhole *p; - int tso = 0; + int tso; struct tcpopt to; #if 0 int maxburst = TCP_MAXBURST; @@ -211,6 +211,7 @@ again: SEQ_LT(tp->snd_nxt, tp->snd_max)) tcp_sack_adjust(tp); sendalot = 0; + tso = 0; off = tp->snd_nxt - tp->snd_una; sendwin = min(tp->snd_wnd, tp->snd_cwnd); sendwin = min(sendwin, tp->snd_bwnd); @@ -490,9 +491,9 @@ after_sack_rexmit: } else { len = tp->t_maxseg; sendalot = 1; - tso = 0; } } + if (sack_rxmit) { if (SEQ_LT(p->rxmit + len, tp->snd_una + so->so_snd.sb_cc)) flags &= ~TH_FIN; @@ -1051,6 +1052,8 @@ send: * XXX: Fixme: This is currently not the case for IPv6. */ if (tso) { + KASSERT(len > tp->t_maxopd - optlen, + ("%s: len <= tso_segsz", __func__)); m->m_pkthdr.csum_flags |= CSUM_TSO; m->m_pkthdr.tso_segsz = tp->t_maxopd - optlen; } _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" State Changed From-To: open->patched State Changed From-To: patched->closed All MFC's done. |
The tcp_output() routine has an issue when the send window exceeds TCP_MAXWIN and the underlying interface supports TSO. There is a check to ensure the data being pushed doesn't exceed the maximum TSO packet size. If this is the case, "len" is trimmed and "sendalot" is set: --- 8< --- if (tso) { if (len > TCP_MAXWIN - hdrlen - optlen) { len = TCP_MAXWIN - hdrlen - optlen; len = len - (len % (tp->t_maxopd - optlen)); sendalot = 1; --- >8 --- Everything's hunky-dory, until the next "sendalot" iteration; If the remaining window does not require TSO (i.e. len <= tp->t_maxseg), this following piece of code fails to properly reset "tso": --- 8< --- if (len > tp->t_maxseg) { if (<tso_conds>) tso = 1; } else { len = tp->t_maxseg; sendalot = 1; tso = 0; } } --- >8 --- "tso" remains set to 1 and the resulting packet is tagged with CSUM_TSO although it doesn't require TSO. This causes problems with a large number of nics which refuse to send the resulting frame and, in some case, wedge. Fix: The solution is to always reset "tso" at the beginning of the "sendalot" loop in order to ensure it is not stale. In my patch I also added a KASSERT() which directly catches any problematic frame before it reaches any other layer. Patch attached with submission follows: How-To-Repeat: Using netperf (or any TCP workload) with a large socket buffer size exposes the issue in a matter of seconds.