Bug 251540 - [tcp] regression: no FIN on TCP connections when shutting down
Summary: [tcp] regression: no FIN on TCP connections when shutting down
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Cy Schubert
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2020-12-02 18:27 UTC by Martin Birgmeier
Modified: 2020-12-08 03:35 UTC
CC List: 4 users

See Also:
koobs: mfc-stable12+
koobs: mfc-stable11-


Attachments
Bring down only clone interfaces and much later. (2.00 KB, patch)
2020-12-04 14:27 UTC, Cy Schubert

Description Martin Birgmeier 2020-12-02 18:27:34 UTC
Scenario:
- FreeBSD 13.0-CURRENT #0 r367245M: Sun Nov  1 15:44:09 CET 2020
- running in bhyve or natively on armv6 (RPI-B+)
- TCP connections are open, for example an xload to a remote display, or an iSCSI connection to a remote iSCSI target

Result:
- The "xload" on the remote display remains frozen until its TCP connection times out.
- The iSCSI server logs "WARNING: 192.168.1.195 (iqn.1995-06.xyzzy.v903:iscsid): no ping reply (NOP-Out) after 30 seconds; dropping connection".
- Similar behavior exists for other active TCP connections.
- The likely cause is that, when the system shuts down, no FIN is sent on active TCP connections; they seem to be torn down immediately.

Expected result:
- Until some time before r367245, TCP connections were torn down correctly when the kernel shut down.
- A FIN should be sent on each TCP connection before the kernel finally shuts down.
- The "xload" window would be removed from the display immediately when the kernel shuts down.
- The iSCSI server would not complain about a NOP-Out.

Note:
- Linux seems to have behaved as described under "Result" for a long time, but FreeBSD has not. Until recently, FreeBSD head did send FINs on TCP connections when the system went down. Even if delivery of the FIN cannot be guaranteed, at least a one-time effort should be made.

-- Martin
Comment 1 Michael Tuexen freebsd_committer 2020-12-03 10:23:20 UTC
I just tested running a server

nc -l 8080

on FreeBSD head (r367530) against a client running nc 192.168.1.60 8080 and then shutting down the host on which the server is running.

I'm also running tcpdump on the machine on which the client runs. I do see a FIN segment sent by the server stack. When running the client using truss nc 192.168.1.60 8080, I can also see that the incoming FIN is processed, because the reading end of the socket is shut down.

Can you provide a reproducible setup where no FIN is sent?
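
For anyone trying to reproduce this without tcpdump or truss, the following is a minimal sketch of a client that reports how its connection ends. It is not part of the original test; the function name wait_for_fin and the timeout value are made up for illustration. The idea is the same as the nc test above: an orderly close by the peer shows up as recv() returning an empty byte string, whereas a connection that is silently dropped simply produces no data until the timeout expires.

```python
import socket


def wait_for_fin(host, port, timeout=5.0):
    """Connect to host:port and report how the connection ends.

    Returns "fin" if the peer closed the connection in an orderly way
    (recv() returned b""), "reset" if the connection was reset, and
    "timeout" if no data and no close arrived within `timeout` seconds.
    The name and return values are hypothetical, chosen for this sketch.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    # Empty read: the peer's FIN was received and processed.
                    return "fin"
                # Otherwise discard the payload and keep waiting.
        except socket.timeout:
            # Nothing arrived at all; this is what a frozen peer looks like.
            return "timeout"
        except ConnectionResetError:
            return "reset"
```

Run against the nc -l 8080 server from this comment, for example as wait_for_fin("192.168.1.60", 8080, timeout=60), a result of "fin" shortly after the server host starts shutting down would match the behaviour I observed, while "timeout" would match what the report describes.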
Comment 2 Michael Tuexen freebsd_committer 2020-12-03 20:12:32 UTC
A fix is under review: https://reviews.freebsd.org/D27464
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2020-12-04 03:16:25 UTC
^Triage:

- Assign to committer apparently resolving (please reassign if someone else)
- Assuming this is CURRENT (regression) only, if needs merging, set mfc-stable* flags to (?) accordingly, until merged
Comment 4 Michael Tuexen freebsd_committer 2020-12-04 12:07:49 UTC
(In reply to Michael Tuexen from comment #2)
The review has been abandoned. Just for tracking: This issue is caused by base r366857.
Comment 5 Cy Schubert freebsd_committer 2020-12-04 14:27:07 UTC
Created attachment 220247
Bring down only clone interfaces and much later.

This patch should resolve your problem.
Comment 6 commit-hook freebsd_committer 2020-12-04 19:31:25 UTC
A commit references this bug:

Author: cy
Date: Fri Dec  4 19:31:16 UTC 2020
New revision: 368345
URL: https://svnweb.freebsd.org/changeset/base/368345

Log:
  Revert r366857.

  r366857 created a number of problems, tearing down interfaces too
  early in shutdown. This resulted in:

  - hung ssh sessions when shutting down or rebooting remotely using
    shutdown (I've used exec shutdown, for years, as opposed to simply
    shutdown).

  - NFS mounted filesystems "disappear" prior to unmount.

  - dhclient attached to a VLAN on an interface whose parent interface
    has already shut down prints errors.

  The path forward is to teach lagg(4) and vlan(4) about WOL.

  PR:		251531, 251540
  PR:		158734, 109980 are broken again
  Reported by:	jhb, emaste, jtl, Helge Oldach<freebsd_oldach.net>
  		Martin Birgmeier <d8zNeCFG_aon.at>
  MFC after:      Immediately
  Discussion at:	https://reviews.freebsd.org/D27459

Changes:
  head/libexec/rc/rc.d/netif
Comment 7 commit-hook freebsd_committer 2020-12-04 19:36:27 UTC
A commit references this bug:

Author: cy
Date: Fri Dec  4 19:35:44 UTC 2020
New revision: 368346
URL: https://svnweb.freebsd.org/changeset/base/368346

Log:
  Revert r366857.

  r366857 created a number of problems, tearing down interfaces too
  early in shutdown. This resulted in:

  - hung ssh sessions when shutting down or rebooting remotely using
    shutdown (I've used exec shutdown, for years, as opposed to simply
    shutdown).

  - NFS mounted filesystems "disappear" prior to unmount.

  - dhclient attached to a VLAN on an interface whose parent interface
    has already shut down prints errors.

  The path forward is to teach lagg(4) and vlan(4) about WOL.

  PR:		251531, 251540
  PR:		158734, 109980 are broken again
  Reported by:	jhb, emaste, jtl, Helge Oldach<freebsd_oldach.net>
  		Martin Birgmeier <d8zNeCFG_aon.at>
  Discussion at:	https://reviews.freebsd.org/D27459

Changes:
_U  stable/12/
  stable/12/libexec/rc/rc.d/netif
Comment 8 Kubilay Kocak freebsd_committer freebsd_triage 2020-12-08 01:38:26 UTC
^Triage: Track merge to stable
Comment 9 Cy Schubert freebsd_committer 2020-12-08 03:35:17 UTC
Reverted. 

Functionality to resolve the original PRs will be moved into the kernel.