Bug 196755 - SCTP aborts connections when primary is affected by packetloss but secondary path is clean
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 9.3-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Michael Tuexen
Reported: 2015-01-15 12:08 UTC by Frans Slothouber
Modified: 2015-03-26 22:17 UTC

Attachments
server used for testing (4.15 KB, text/x-csrc)
2015-03-26 19:25 UTC, Michael Tuexen
client used for testing (3.93 KB, text/x-csrc)
2015-03-26 19:26 UTC, Michael Tuexen

Description Frans Slothouber 2015-01-15 12:08:18 UTC
I have two machines, each with two network interfaces.
The two machines are connected by two separate IP networks.

A client/server application runs on these machines and sets up an SCTP
association between them.

Both the client and the server bind to both network interfaces (that is, to
the IP addresses assigned to them).
The client connects to the server, thus setting up two SCTP paths.

Every second both client and server send each other a message.

If I now introduce packet loss on the network for the primary SCTP path, I
observe the following behavior:

  - With low levels of packet loss (<30%) or high levels of packet loss (>85%),
    the association stays intact.
  - With medium levels of packet loss, the association is aborted.

Looking at the packet dump, I see that the FreeBSD SCTP stack insists on
sending the SACK packets over the primary path. This causes the other side
to abort the connection due to excessive retransmissions.


These experiments have been carried out with the following change
in the default SCTP settings:

    sysctl -w net.inet.sctp.heartbeat_interval=500
    sysctl -w net.inet.sctp.rto_initial=300
    sysctl -w net.inet.sctp.rto_min=100
    sysctl -w net.inet.sctp.rto_max=500
    sysctl -w net.inet.sctp.path_rtx_max=2
    sysctl -w net.inet.sctp.assoc_rtx_max=5


The experiments have been conducted with a Linux - FreeBSD combination and a
Linux - Linux combination.  (With the FreeBSD machine being the server.)

The Linux - Linux combination does not show this behaviour.


Some background:  this behaviour was found while carrying out tests
to see if SCTP can be used for a train-signalling network.
Comment 1 Michael Tuexen freebsd_committer freebsd_triage 2015-03-07 18:24:21 UTC
Hi Frans,

thank you very much for the report. I have a clarifying question:
When you introduced the packet loss, did it affect packets in only
one direction, or did it apply to packets in both directions?

Best regards
Michael
Comment 2 Frans Slothouber 2015-03-08 09:06:12 UTC
(In reply to Michael Tuexen from comment #1)

Hi Michael,

Packets are affected in both directions (but this happens on only one of the paths).


Best regards,
Frans
Comment 3 Michael Tuexen freebsd_committer freebsd_triage 2015-03-08 09:51:33 UTC
OK. And each side sends a message once a second, independently? That is, it is not the case that one side sends a packet every second and the other reflects it on reception?

Best regards
Michael
Comment 4 Frans Slothouber 2015-03-08 10:08:46 UTC
(In reply to Michael Tuexen from comment #3)

Correct; they both send messages independently of each other.  The only
thing each side does with the message is log that it was received and when.

Each side sends a message every second, with a variation of a few milliseconds.
Comment 5 Michael Tuexen freebsd_committer freebsd_triage 2015-03-08 11:35:18 UTC
Ahh, OK. That should allow me to reproduce and analyse the problem. Might take some time, but I'll come back.
Thanks for your help so far.

Best regards
Michael
Comment 6 Michael Tuexen freebsd_committer freebsd_triage 2015-03-26 19:25:47 UTC
Created attachment 154841
server used for testing
Comment 7 Michael Tuexen freebsd_committer freebsd_triage 2015-03-26 19:26:37 UTC
Created attachment 154842
client used for testing
Comment 8 Michael Tuexen freebsd_committer freebsd_triage 2015-03-26 22:17:06 UTC
OK, the logic to choose the destination address for SACK chunks was sub-optimal. Randy and I worked on a fix, which is checked in as
https://svnweb.freebsd.org/changeset/base/280714

Thanks for reporting the issue!

Best regards
Michael