Bug 252449 - [tcp] TCP/IP regression or incompatibility since r368181
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Michael Tuexen
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2021-01-05 20:46 UTC by Marek Zarychta
Modified: 2021-01-14 21:24 UTC
CC List: 5 users

See Also:
tuexen: mfc-stable12?


Attachments
Traffic dumps (11.08 KB, text/plain)
2021-01-05 20:46 UTC, Marek Zarychta

Description Marek Zarychta 2021-01-05 20:46:57 UTC
Created attachment 221307
Traffic dumps

I have recently upgraded a few systems running 12.2-STABLE and after upgrading one by one I am losing the ability to connect with some Alcatel-Lucent switches (Omniswitch 63xx and 64xx models). Connection with the host seems to be established, but I am not able to get the login prompt:
telnet 172.x.x.x
Trying 172.x.x.x...
Connected to 172.x.x.x.
Escape character is '^]'.

^]
This state lasts forever, and the session can only be terminated.

Telnetting to other devices, from other vendors or even to some different ALU switches, still works fine.

Since I was upgrading systems one by one and losing connectivity, I took some tcpdumps to document this oddity.

Perhaps the culprit is the TCP stack or its settings, which are the same on all machines:

net.inet.tcp.ecn.enable=1
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.mssdflt=1448
net.inet.tcp.syncache.rexmtlimit=1 
net.inet.tcp.drop_synfin=1
net.inet.tcp.rfc6675_pipe=1
net.inet.tcp.abc_l_var=44
net.inet.tcp.functions_default=rack
net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.cc.htcp.rtt_scaling=1
net.inet.tcp.cc.htcp.adaptive_backoff=1
net.inet.tcp.fastopen.server_enable=1

When the system is booted from kernel.old, telnet from the new userland, with the above TCP stack settings, is able to connect again.
Comment 1 Marek Zarychta 2021-01-05 22:17:57 UTC
I have done a few tests from a recent 13-CURRENT running with default TCP/IP stack settings. Recent 13.0-CURRENT main-c255445-gb6d54565c2 also seems to be affected and unable to connect to these devices. Moreover, an SSH connection with these devices can't be established either, and it's not a problem of ciphers but of connection negotiation.

12.2 hosts running kernels from late November 2020 are able to connect with both protocols: SSH and Telnet.
Comment 2 Marek Zarychta 2021-01-06 00:45:26 UTC
Reverting suspicious r368181 aka 455a97e447557c8c92d81de9356d44d109ac4e10 solves the issue.[1]

It probably implies that either the TCP/IP stack of these Alcatel-Lucent switches is broken or our TCP/IP stack was abnormally fixed with this MFC.[2]


[1] https://svnweb.freebsd.org/base?view=revision&revision=368181
[2] https://reviews.freebsd.org/D27148
Comment 3 Michael Tuexen freebsd_committer 2021-01-06 09:12:33 UTC
(In reply to Marek Zarychta from comment #2)
Is there any middlebox between the FreeBSD host and the switches?

The FreeBSD host and the switches successfully negotiate TCP timestamp support: the switch responds in the SYN-ACK with a TCP timestamp option.

The problem is that the switch later sends segments without the TCP timestamp option.
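
To make "segments without the TCP timestamp option" concrete, here is a minimal sketch in C of how one might check captured TCP option bytes for a TSopt. The helper name has_timestamp_option is hypothetical and is not taken from the FreeBSD sources; the option kind (8) and length (10) are the values defined for the timestamp option in RFC 7323.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TCP option kinds relevant here. */
#define OPT_EOL            0   /* end of option list */
#define OPT_NOP            1   /* one-byte padding */
#define OPT_TIMESTAMP      8   /* TSopt (RFC 7323) */
#define OPT_TIMESTAMP_LEN 10   /* kind + length + TSval + TSecr */

/*
 * Hypothetical helper (not from the FreeBSD sources): return true if the
 * option bytes following the fixed 20-byte TCP header contain a
 * well-formed timestamp option.
 */
static bool
has_timestamp_option(const uint8_t *opts, size_t len)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];

		if (kind == OPT_EOL)
			break;          /* nothing follows the end marker */
		if (kind == OPT_NOP) {
			i++;            /* single-byte option, no length field */
			continue;
		}
		if (i + 1 >= len)
			break;          /* truncated: length byte is missing */

		uint8_t optlen = opts[i + 1];

		if (optlen < 2 || i + optlen > len)
			break;          /* malformed or truncated option */
		if (kind == OPT_TIMESTAMP && optlen == OPT_TIMESTAMP_LEN)
			return true;
		i += optlen;        /* skip to the next option */
	}
	return false;
}
```

Applied to the traffic dumps, the SYN-ACK from the switch would pass this check, while the later data segments from the switch would not.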
The current specification says in https://tools.ietf.org/html/rfc7323#section-3.2:

   Once TSopt has been successfully negotiated, that is both <SYN> and
   <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
   segment for the duration of the connection, and SHOULD be sent in an
   <RST> segment (see Section 5.2 for details).  The TCP SHOULD remember
   this state by setting a flag, referred to as Snd.TS.OK, to one.  If a
   non-<RST> segment is received without a TSopt, a TCP SHOULD silently
   drop the segment.  A TCP MUST NOT abort a TCP connection because any
   segment lacks an expected TSopt.

The patch you are referring to enforces the following sentence of the above paragraph:

   If a non-<RST> segment is received without a TSopt, a TCP SHOULD silently
   drop the segment.

So the FreeBSD behaviour is what I would expect. To get immediate connectivity again,
you can:
(a) Disable TCP timestamps on the FreeBSD host by using net.inet.tcp.rfc1323=0.
(b) Disable TCP timestamps on the switches.

I can add a sysctl variable which would allow accepting TCP segments without timestamps, even if timestamp support was negotiated. The default would be off, to honour the above "SHOULD", but it would allow you to talk to the (in my view) broken switches again without globally disabling TCP timestamps on the FreeBSD hosts.
Does that sound acceptable to you?

Thanks for reporting the issue!
Comment 4 Marek Zarychta 2021-01-06 09:25:19 UTC
I was just about to write that it looks like the negotiated timestamps are missing on their (ALU) side and this stalls the connection, but I might be wrong.

Thanks for the feedback Michael.

Could we be less strict only in this case?
Comment 5 Marek Zarychta 2021-01-06 10:49:40 UTC
(In reply to Michael Tuexen from comment #3)

Thank you for the clear explanation.

An additional sysctl knob, for example net.inet.tcp.rfc7323, switched on by default, would be nice, but I am not the one to judge here or to burden anyone with duties.
Comment 6 Marek Zarychta 2021-01-06 11:08:53 UTC
(In reply to Michael Tuexen from comment #3)
I am sorry - at a glance, I ignored the question about the middlebox. This happens regardless of whether a middlebox is present, so both routed and direct connections are affected.

This is probably an ALU firmware bug, but some models of these switches are still sold. I will try to let them know, but I don't know if they would care.
Comment 7 Michael Tuexen freebsd_committer 2021-01-06 12:50:25 UTC
(In reply to Marek Zarychta from comment #5)
I will add some knob and will let you know. Then you can test it.
Comment 8 Michael Tuexen freebsd_committer 2021-01-06 12:51:47 UTC
(In reply to Marek Zarychta from comment #6)
Thanks for the information on the middleboxes. Then this is definitely an issue with the switches. I wanted to make sure that we are not looking at a middlebox issue.
Comment 9 Michael Tuexen freebsd_committer 2021-01-13 22:20:50 UTC
A patch is under review: D28142.
Comment 10 commit-hook freebsd_committer 2021-01-14 19:40:03 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d2b3ceddccac60b563f642898e3a314647666a10

commit d2b3ceddccac60b563f642898e3a314647666a10
Author:     Michael Tuexen <tuexen@FreeBSD.org>
AuthorDate: 2021-01-13 21:48:17 +0000
Commit:     Michael Tuexen <tuexen@FreeBSD.org>
CommitDate: 2021-01-14 18:28:25 +0000

    tcp: add sysctl to tolerate TCP segments missing timestamps

    When timestamp support has been negotiated, TCP segements received
    without a timestamp should be discarded. However, there are broken
    TCP implementations (for example, stacks used by Omniswitch 63xx and
    64xx models), which send TCP segments without timestamps although
    they negotiated timestamp support.
    This patch adds a sysctl variable which tolerates such TCP segments
    and allows to interoperate with broken stacks.

    Reviewed by:            jtl@, rscheff@
    Differential Revision:  https://reviews.freebsd.org/D28142
    Sponsored by:           Netflix, Inc.
    PR:                     252449
    MFC after:              1 week

 share/man/man4/tcp.4          | 23 ++++++++++++++++++++---
 sys/netinet/tcp_input.c       |  5 +++--
 sys/netinet/tcp_stacks/bbr.c  |  5 +++--
 sys/netinet/tcp_stacks/rack.c |  5 +++--
 sys/netinet/tcp_subr.c        |  5 +++++
 sys/netinet/tcp_syncache.c    | 26 +++++++++++++++++++-------
 sys/netinet/tcp_timewait.c    |  5 +++--
 sys/netinet/tcp_var.h         |  2 ++
 8 files changed, 58 insertions(+), 18 deletions(-)
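
The receive-side rule discussed in this thread, together with the tolerance switch described in the commit message, can be sketched roughly as follows. This is an illustrative model only, not the code from the commit: the function name check_timestamp and the parameter tolerate_missing_ts are assumptions made for this example (the thread does not name the sysctl variable), and the real stacks perform many further checks on a segment.

```c
#include <stdbool.h>

enum seg_action { SEG_ACCEPT, SEG_DROP };

/*
 * Illustrative model of the RFC 7323, section 3.2 presence check.
 * tolerate_missing_ts is a hypothetical stand-in for the new sysctl
 * (off by default, per comment #3).
 */
static enum seg_action
check_timestamp(bool ts_negotiated, bool seg_has_tsopt, bool seg_is_rst,
    bool tolerate_missing_ts)
{
	if (!ts_negotiated)
		return SEG_ACCEPT;   /* timestamps not in use: nothing to enforce */
	if (seg_has_tsopt)
		return SEG_ACCEPT;   /* peer behaves as negotiated */
	if (seg_is_rst)
		return SEG_ACCEPT;   /* the drop rule covers non-<RST> segments only */
	if (tolerate_missing_ts)
		return SEG_ACCEPT;   /* administrator opted in to broken peers */
	return SEG_DROP;         /* silent drop; the connection is not aborted */
}
```

With the switch off, a data segment from an affected switch is dropped, which matches the stalled telnet session in the description; with it on, the same segment is accepted while timestamps stay enabled for all other peers.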