Bug 196755

Summary: SCTP aborts connections when primary is affected by packetloss but secondary path is clean
Product: Base System
Reporter: Frans Slothouber <frans.slothouber>
Component: kern
Assignee: Michael Tuexen <tuexen>
Status: Closed FIXED
Severity: Affects Only Me
CC: tuexen
Priority: ---
Version: 9.3-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
Description              Flags
server used for testing  none
client used for testing  none

Description Frans Slothouber 2015-01-15 12:08:18 UTC
I have two machines, each with two network interfaces.
The two machines are connected by two separate IP networks.

A client/server application runs on these machines and sets up an SCTP
association between them.

Both the client and the server bind to the two network interfaces (that is,
to the IP addresses assigned to them).
The client connects to the server, thus setting up two SCTP paths.

Every second both client and server send each other a message.

If I now introduce packet loss on the network for the primary SCTP path, I
observe the following behavior:

  - With low levels of packet loss (<30%) or high levels of packet loss (>85%),
    the association stays intact.
  - With medium levels of packet loss, the association is aborted.

Looking at the packet dump, I see that the FreeBSD SCTP stack insists on
sending the SACK packets over the primary path. This causes the other side
to abort the association due to excessive retransmissions.
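
To give a feel for the numbers, here is a small back-of-the-envelope sketch. It assumes that losses are independent and hit both directions of the primary path at the same rate, so a DATA/SACK exchange only succeeds if both packets survive. The function names are made up for illustration. The model deliberately ignores path failover, so it does not by itself reproduce the observation that very high loss rates leave the association intact.

```python
def exchange_failure_probability(loss):
    """Probability that a DATA/SACK exchange fails when DATA and SACK
    are each dropped independently with probability `loss`."""
    return 1.0 - (1.0 - loss) ** 2


def consecutive_failures(loss, attempts):
    """Probability that `attempts` exchanges in a row all fail
    (independent losses assumed)."""
    return exchange_failure_probability(loss) ** attempts


if __name__ == "__main__":
    # assoc_rtx_max=5 means the sixth consecutive failure exceeds the limit.
    for loss in (0.30, 0.50, 0.85):
        p_fail = exchange_failure_probability(loss)
        p_six = consecutive_failures(loss, 6)
        print(f"loss={loss:.2f}  exchange fails={p_fail:.4f}  six in a row={p_six:.4f}")
```

Even at 50% loss, three out of four exchanges fail in this model, which is why a SACK that is always sent over the lossy path so often fails to arrive.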


These experiments have been carried out with the following changes
to the default SCTP settings:

    sysctl -w net.inet.sctp.heartbeat_interval=500
    sysctl -w net.inet.sctp.rto_initial=300
    sysctl -w net.inet.sctp.rto_min=100
    sysctl -w net.inet.sctp.rto_max=500
    sysctl -w net.inet.sctp.path_rtx_max=2
    sysctl -w net.inet.sctp.assoc_rtx_max=5
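
As a rough aid to reading these values, the sketch below estimates how long a run of consecutive timeouts takes to push an error counter past path_rtx_max or assoc_rtx_max. It assumes that the values are in milliseconds, that every transmission is lost, and that the RTO doubles after each timeout up to rto_max (the backoff described in RFC 4960). The function name and the simplified model are illustrative assumptions and are not taken from the FreeBSD stack.

```python
def time_until_exceeded(start_rto_ms, rto_max_ms, max_retrans):
    """Milliseconds until an error counter exceeds `max_retrans`,
    assuming every transmission times out and the RTO doubles after
    each timeout, capped at `rto_max_ms`."""
    elapsed = 0
    rto = min(start_rto_ms, rto_max_ms)
    # The counter exceeds the limit on timeout number max_retrans + 1.
    for _ in range(max_retrans + 1):
        elapsed += rto
        rto = min(rto * 2, rto_max_ms)
    return elapsed


if __name__ == "__main__":
    rto_max = 500
    # Try both rto_initial (300) and rto_min (100) as the starting RTO.
    for start in (300, 100):
        path = time_until_exceeded(start, rto_max, max_retrans=2)
        assoc = time_until_exceeded(start, rto_max, max_retrans=5)
        print(f"start RTO {start} ms: path limit after {path} ms, "
              f"association limit after {assoc} ms")
```

With these settings the association limit is reached within a few seconds of uninterrupted timeouts, so the tuning leaves little room for acknowledgements that keep getting lost.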


The experiments have been conducted with a Linux - FreeBSD combination and a
Linux - Linux combination.  (With the FreeBSD machine being the server.)

The Linux - Linux combination does not show this behaviour.


Some background:  this behaviour was found while carrying out tests
to see if SCTP can be used for a train-signalling network.
Comment 1 Michael Tuexen freebsd_committer freebsd_triage 2015-03-07 18:24:21 UTC
Hi Frans,

thank you very much for the report. I have a clarification question:
When you introduced the packet loss, did it only affect the packets
in one direction, or did it apply to the packets in both directions?

Best regards
Michael
Comment 2 Frans Slothouber 2015-03-08 09:06:12 UTC
(In reply to Michael Tuexen from comment #1)

Hoi Michael,

Packets are affected in both directions (but this happens on only one of the paths).


Best regards,
Frans
Comment 3 Michael Tuexen freebsd_committer freebsd_triage 2015-03-08 09:51:33 UTC
OK. And each side sends a message once a second, independently? That is, it is not one side sending a packet every second and the other reflecting it on reception?

Best regards
Michael
Comment 4 Frans Slothouber 2015-03-08 10:08:46 UTC
(In reply to Michael Tuexen from comment #3)

Correct; they both send messages independently of each other.  The only
thing each side does with the message is log that it was received and when.

Each side sends a message every second, with a variation of a few milliseconds.
Comment 5 Michael Tuexen freebsd_committer freebsd_triage 2015-03-08 11:35:18 UTC
Ahh, OK. That should allow me to reproduce and analyse the problem. Might take some time, but I'll come back.
Thanks for your help so far.

Best regards
Michael
Comment 6 Michael Tuexen freebsd_committer freebsd_triage 2015-03-26 19:25:47 UTC
Created attachment 154841
server used for testing
Comment 7 Michael Tuexen freebsd_committer freebsd_triage 2015-03-26 19:26:37 UTC
Created attachment 154842
client used for testing
Comment 8 Michael Tuexen freebsd_committer freebsd_triage 2015-03-26 22:17:06 UTC
OK, the logic to choose the destination address for SACK chunks was sub-optimal. Randy and I worked on a fix, which is checked in as
https://svnweb.freebsd.org/changeset/base/280714
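
For readers who want a picture of what "choosing the destination address for SACK chunks" involves, here is a minimal sketch. It is not the code from the changeset above. It only illustrates the general guidance in RFC 4960 that a reply should normally go to the address the DATA chunk came from, and that it may help to vary the destination when a duplicate DATA chunk arrives. The function name, its parameters, and the example addresses are hypothetical.

```python
def choose_sack_destination(data_source, is_duplicate, peer_addresses, primary):
    """Pick the peer address to send a SACK to (illustrative policy only).

    data_source    -- peer address the DATA chunk was received from
    is_duplicate   -- True if the DATA chunk had already been received
    peer_addresses -- list of the peer's known addresses
    primary        -- the configured primary path address
    """
    if data_source not in peer_addresses:
        # Unknown source: fall back to the primary path.
        return primary
    if is_duplicate and len(peer_addresses) > 1:
        # A duplicate hints that an earlier SACK was lost on the way back,
        # so try the next known address instead of the DATA source.
        index = peer_addresses.index(data_source)
        return peer_addresses[(index + 1) % len(peer_addresses)]
    # Normal case: reply to where the DATA came from.
    return data_source


if __name__ == "__main__":
    peers = ["192.0.2.1", "198.51.100.1"]  # example addresses only
    # DATA retransmitted over the clean secondary path is acknowledged there.
    print(choose_sack_destination("198.51.100.1", False, peers, primary="192.0.2.1"))
```

Under such a policy, DATA that the peer retransmits over the clean secondary path is acknowledged over that same path rather than over the lossy primary.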

Thanks for reporting the issue!

Best regards
Michael