200379 – SCTP stack is not FIB aware

Summary: SCTP stack is not FIB aware

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	CURRENT
Hardware:	Any Any

Importance:	--- Affects Only Me
Assignee:	Michael Tuexen

URL:
Keywords:

Depends on:
Blocks:

Reported:	2015-05-22 01:34 UTC by Craig Rodrigues
Modified:	2015-12-09 23:01 UTC
CC List:	4 users

See Also:

Attachments
server.c (2.09 KB, text/plain) 2015-05-22 01:44 UTC, Craig Rodrigues
client.c (2.32 KB, text/plain) 2015-05-22 01:44 UTC, Craig Rodrigues

Description Craig Rodrigues freebsd_committer

2015-05-22 01:34:33 UTC

If I use a system with multiple routing FIBs, I cannot send SCTP traffic
to an address which uses an alternate FIB.

To reproduce:

1.  Put the following in /boot/loader.conf to create 5 FIBs

    net.fibs=5

2.  Reboot

3.  Create a VLAN on an alternate FIB:

    For instance, if I have em0:

   ifconfig em0.3275 create name craig0 fib 2
   ifconfig craig0 inet6 ifdisabled fib 2
   ifconfig craig0 inet 172.8.1.3/16 up fib 2
   route add 127.0.0.0/8 -interface lo0 -fib 2

4.  Run the server program, which binds to 172.8.1.3 address and
    uses fib 2:

   ./server 172.8.1.3 2

5.  Repeat step 3 on a separate machine or VM, using 172.8.1.4 as
    the address.
    Run the client program, which binds to the 172.8.1.4 address, sends
    to the 172.8.1.3 address, and uses fib 2:

   ./client 172.8.1.4 172.8.1.3 2

The data does not get sent from client to server.
If I use addresses which are not associated with an alternate FIB then
the data gets sent from the client to the server.

Comment 1 Craig Rodrigues freebsd_committer

2015-05-22 01:44:17 UTC

Created attachment 157024
server.c

Comment 2 Craig Rodrigues freebsd_committer

2015-05-22 01:44:42 UTC

Created attachment 157025
client.c

Comment 3 Craig Rodrigues freebsd_committer

2015-05-22 01:49:52 UTC

In sys/netinet/sctp_os_bsd.h, the SCTP_RTALLOC macro calls the
rtalloc_ign() function, which ignores FIBs.

It should probably be changed to call rtalloc_ign_fib().

In addition, it may be necessary to store the fib_num in the
inp and inherit it when accepting or peeling off.
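
The difference between a FIB-blind and a FIB-aware lookup can be illustrated with a small userland model. This is only a sketch and not the kernel code: the toy_* names, the table contents (loosely based on the routing tables shown later in this report) and the assumption that a FIB-blind lookup ends up consulting FIB 0 are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_FIBS 5  /* mirrors net.fibs=5 from the report */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* One routing entry: prefix/mask and the outgoing interface name. */
struct toy_route {
    uint32_t    prefix;
    uint32_t    mask;
    const char *ifname;
};

/* Hypothetical FIB 0: a default route plus the 10.47/16 network. */
static const struct toy_route toy_fib0[] = {
    { IP4(0, 0, 0, 0),   IP4(0, 0, 0, 0),     "vtnet2" },
    { IP4(10, 47, 0, 0), IP4(255, 255, 0, 0), "vtnet2" },
};

/* Hypothetical FIB 2: loopback plus the 172.8/16 VLAN network. */
static const struct toy_route toy_fib2[] = {
    { IP4(127, 0, 0, 0), IP4(255, 0, 0, 0),   "lo0" },
    { IP4(172, 8, 0, 0), IP4(255, 255, 0, 0), "craig0" },
};

/* All FIBs; the ones not listed here stay empty. */
static const struct {
    const struct toy_route *routes;
    size_t                  n;
} toy_fibs[TOY_NUM_FIBS] = {
    [0] = { toy_fib0, sizeof(toy_fib0) / sizeof(toy_fib0[0]) },
    [2] = { toy_fib2, sizeof(toy_fib2) / sizeof(toy_fib2[0]) },
};

/* FIB-aware lookup: longest-prefix match inside one routing table. */
const char *
toy_lookup_fib(unsigned fibnum, uint32_t dst)
{
    const struct toy_route *best = NULL;

    if (fibnum >= TOY_NUM_FIBS)
        return NULL;
    for (size_t i = 0; i < toy_fibs[fibnum].n; i++) {
        const struct toy_route *r = &toy_fibs[fibnum].routes[i];

        if ((dst & r->mask) == r->prefix &&
            (best == NULL || r->mask > best->mask))
            best = r;
    }
    return best != NULL ? best->ifname : NULL;
}

/* FIB-blind lookup: assumed here to always consult FIB 0. */
const char *
toy_lookup_default(uint32_t dst)
{
    return toy_lookup_fib(0, dst);
}
```

In this model, looking up 172.8.1.3 in FIB 2 selects craig0, while the FIB-blind variant falls through to the default route of FIB 0 and picks vtnet2, which matches the symptom that data never reaches the peer on the VLAN.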

Comment 4 Michael Tuexen freebsd_committer

2015-05-22 07:46:01 UTC

Hi Craig,

thank you very much for reporting the issue and providing steps to reproduce the problem. I'll look into it.

Best regards
Michael

Comment 5 Michael Tuexen freebsd_committer

2015-06-15 11:14:09 UTC

Hi Craig,

when setting up two VMs as suggested, they can just reach each other.
Even ping 172.8.1.4 works, I don't need setfib 2 ping 172.8.1.4.

What config do I need to test the fib stuff? I don't think storing it
in the inp is the way, since this value wouldn't be updated if a
setsockopt() operation is performed. So I think just using the fibnum
from the socket in the SCTP_RTALLOC macro is the way to go. But I want
to test it before committing it.

Best regards
Michael
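
The concern about a copy in the inp going stale can be shown with a toy model. The struct and function names below are hypothetical stand-ins, not the real kernel structures; the only point is that a value cached at creation time does not follow a later change on the socket, while a read through the socket does.

```c
/* Toy stand-in for a socket that carries its FIB number. */
struct toy_socket {
    int so_fibnum;
};

/* Toy stand-in for a protocol control block attached to a socket. */
struct toy_inp {
    struct toy_socket *socket;
    int                cached_fib;  /* copy taken when the inp is set up */
};

/* Attach the inp to a socket and snapshot the socket's FIB. */
void
toy_inp_init(struct toy_inp *inp, struct toy_socket *so)
{
    inp->socket = so;
    inp->cached_fib = so->so_fibnum;
}

/* Models a later setsockopt() that changes the FIB on the socket only. */
void
toy_set_socket_fib(struct toy_socket *so, int fib)
{
    so->so_fibnum = fib;
}

/* Option 1: use the snapshot stored in the inp (can be stale). */
int
toy_fib_from_cache(const struct toy_inp *inp)
{
    return inp->cached_fib;
}

/* Option 2: read the FIB from the socket at lookup time. */
int
toy_fib_from_socket(const struct toy_inp *inp)
{
    return inp->socket->so_fibnum;
}
```

If the socket starts in FIB 0 and is later moved to FIB 2, the cached variant still reports 0 while the socket-based variant reports 2.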

Comment 6 Craig Rodrigues freebsd_committer

2015-06-15 15:11:27 UTC

You need to set up a default IP address and routing table
on em0 that is *not* a 172 address.

That way, if you do:

netstat -r

you will see the default routing table,
and if you do

setfib 2 netstat -r

you will see the routing table for the 172 addresses

Comment 7 Michael Tuexen freebsd_committer

2015-06-15 16:06:19 UTC

Here is what I do and what happens to the routing table. As you see, a route gets added to fib 0. Is this expected? Intended?

> ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
	ether 00:0c:29:8a:89:10
	inet 192.168.115.171 netmask 0xffffff00 broadcast 192.168.115.255 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
> netstat -nrfinet
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.115.2      UGS         em0
127.0.0.1          link#2             UH          lo0
192.168.115.0/24   link#1             U           em0
192.168.115.171    link#1             UHS         lo0
> sudo ifconfig em0.3275 create name craig0 fib 2
Password:
> netstat -nrfinet
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.115.2      UGS         em0
127.0.0.1          link#2             UH          lo0
192.168.115.0/24   link#1             U           em0
192.168.115.171    link#1             UHS         lo0
> sudo ifconfig craig0 inet6 ifdisabled fib 2
> netstat -nrfinet
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.115.2      UGS         em0
127.0.0.1          link#2             UH          lo0
192.168.115.0/24   link#1             U           em0
192.168.115.171    link#1             UHS         lo0
> sudo ifconfig craig0 inet 172.8.1.3/16 up fib 2
> sudo ifconfig craig0 
craig0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=3<RXCSUM,TXCSUM>
	ether 00:0c:29:8a:89:10
	inet 172.8.1.3 netmask 0xffff0000 broadcast 172.8.255.255 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	fib: 2
	vlan: 3275 parent interface: em0
	groups: vlan 
> netstat -nrfinet
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.115.2      UGS         em0
127.0.0.1          link#2             UH          lo0
172.8.0.0/16       link#3             U        craig0
192.168.115.0/24   link#1             U           em0
192.168.115.171    link#1             UHS         lo0
> setfib 2 netstat -nrfinet
Routing tables (fib: 2)

Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.1          link#2             UH          lo0
172.8.0.0/16       link#3             U        craig0
172.8.1.3          link#3             UHS         lo0
192.168.115.0/24   link#1             U           em0
> sudo route add 127.0.0.0/8 -interface lo0 -fib 2
add net 127.0.0.0: gateway lo0 fib 2
> setfib 2 netstat -nrfinet
Routing tables (fib: 2)

Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.0/8        lo0                US          lo0
127.0.0.1          link#2             UH          lo0
172.8.0.0/16       link#3             U        craig0
172.8.1.3          link#3             UHS         lo0
192.168.115.0/24   link#1             U           em0
> netstat -nrfinet
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.115.2      UGS         em0
127.0.0.1          link#2             UH          lo0
172.8.0.0/16       link#3             U        craig0
192.168.115.0/24   link#1             U           em0
192.168.115.171    link#1             UHS         lo0

Comment 8 Michael Tuexen freebsd_committer

2015-06-15 19:46:41 UTC

OK, I need 
sysctl -w net.add_addr_allfibs=0
to reproduce your problem.

Best regards
Michael
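
Why this sysctl matters for the reproduction can be sketched with a toy model of where the connected route of a newly configured address ends up. This is an illustration based on the netstat output in comment 7, not the kernel logic; the names are hypothetical.

```c
#include <stdbool.h>

#define TOY_NUM_FIBS 5  /* mirrors net.fibs=5 from the report */

/* has_route[f] is true when FIB f holds the interface's connected route. */
struct toy_fib_state {
    bool has_route[TOY_NUM_FIBS];
};

/*
 * Models configuring an address on an interface that belongs to ifp_fib.
 * With add_addr_allfibs set, the connected route is installed in every
 * FIB; otherwise only in the interface's own FIB.
 */
void
toy_add_interface_addr(struct toy_fib_state *st, unsigned ifp_fib,
    bool add_addr_allfibs)
{
    if (ifp_fib >= TOY_NUM_FIBS)
        return;
    for (unsigned f = 0; f < TOY_NUM_FIBS; f++) {
        if (add_addr_allfibs || f == ifp_fib)
            st->has_route[f] = true;
    }
}
```

With the flag set, FIB 0 also gets the 172.8.0.0/16 route, so a lookup in the default table still finds the VLAN interface and the bug stays hidden. With the flag cleared, only FIB 2 has the route and the problem shows up.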

Comment 9 Alan Somers freebsd_committer

2015-06-15 19:56:31 UTC

BTW, I wrote some FIB-related tests in tests/sys/netinet.  They use tap(4) interfaces so they don't need two machines.  If you look at the udp_dontroute test, you can see an example that passes traffic from one to the other.

Comment 10 Craig Rodrigues freebsd_committer

2015-06-16 03:51:08 UTC

(In reply to Michael Tuexen from comment #8)
Do you now have an environment in which you can reproduce the problem?

My routing table looks like this:

FIB 0
=====
netstat -nr

default            10.47.1.1          UGS      vtnet2
10.47.0.0/16       link#3             U        vtnet2
10.47.250.26       link#3             UHS         lo0
127.0.0.1          link#5             UH          lo0



FIB 2
=====
setfib 2 netstat -nr

Routing tables (fib: 2)

Internet:
Destination        Gateway            Flags      Netif Expire
127.0.0.0/8        lo0                US          lo0
172.8.0.0/16       link#6             U        craig0
172.8.1.3          link#6             UHS         lo0

Comment 11 Michael Tuexen freebsd_committer

2015-06-16 06:39:21 UTC

Yes, I do. Using net.add_addr_allfibs=0, I get routing tables like you have. However, I had to call setfib() before socket() in your examples. With that I can reproduce the problem.
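
A minimal sketch of the "setfib() before socket()" ordering is given below. It is not taken from the attached server.c or client.c. The helper name socket_in_fib is hypothetical, and the code assumes FreeBSD's setfib(2), which changes the default FIB of the calling process for sockets created afterwards. On other systems the helper simply reports ENOTSUP.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <errno.h>

/*
 * Create a socket after selecting the routing table for the process.
 * Returns the new descriptor, or -1 with errno set.
 * Note: on FreeBSD this changes the process-wide default FIB, so every
 * socket created later by the same process uses it as well.
 */
int
socket_in_fib(int domain, int type, int protocol, int fib)
{
    if (fib < 0) {
        errno = EINVAL;
        return -1;
    }
#ifdef __FreeBSD__
    /* Select the FIB first; the socket inherits it when created. */
    if (setfib(fib) == -1)
        return -1;
    return socket(domain, type, protocol);
#else
    /* setfib(2) is FreeBSD specific; nothing sensible to do elsewhere. */
    (void)domain;
    (void)type;
    (void)protocol;
    errno = ENOTSUP;
    return -1;
#endif
}
```

For the scenario in this report, a caller would presumably use something like socket_in_fib(AF_INET, SOCK_STREAM, IPPROTO_SCTP, 2), with IPPROTO_SCTP coming from <netinet/in.h>, and then bind() to the 172.8.1.x address as before.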

Comment 12 Michael Tuexen freebsd_committer

2015-06-16 20:54:34 UTC

Hi Craig,

I'm in the process of understanding how fibs work and thinking about how they should work for SCTP. One possibility is that a socket uses a single fib, so all paths of an SCTP association use the same fib. Another possibility is to allow a fib per path. In particular, SCTP would "learn" a fib from incoming packets. Do you have an opinion on which one is more appropriate? Any reasons to share?

Best regards
Michael

Comment 13 Alan Somers freebsd_committer

2015-06-16 21:07:32 UTC

FIBs are used to have different routing policies for different kinds of traffic.  In general, you can't correctly learn which fib you ought to use based on any feature of a received packet, because different applications can use different FIBs at the same time.  One application can even use more than one FIB.

I don't know much about SCTP, but I think that there should be a FIB per socket.  That's what the socket API currently allows, and it makes intuitive sense.  Would you ever want to have multiple paths of the same socket get routed out different interfaces or to different gateways?

Comment 14 Michael Tuexen freebsd_committer

2015-06-17 07:58:12 UTC

(In reply to Alan Somers from comment #13)
I think interfaces can assign fibs to packets, it is a field in the mbuf packet header. It makes sense to use this information in case you have no socket to get the fib from (for example when receiving a TCP SYN and you have no listening socket).

An SCTP end-point can have multiple IP addresses. When using multihoming you use multiple local and remote IP addresses to provide network fault tolerance. So you use multiple local interfaces and route traffic on all of them to be able to fail over in case of network problems. Of course you can set this up in a single routing table and have a socket in a single fib. I'm tending to implement it this way. This also means that response packets (like acks for data) use the socket's fib, not the one from the incoming packet. At least this is conceptually simpler. Codewise it doesn't make much of a difference.

Thanks for your feedback.

Best regards
Michael
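
The selection rule described in this comment can be written down as a tiny model: when a socket exists, its fib wins, and only when there is no socket is the fib attached to the incoming packet used. The names below are hypothetical, and the packet fib stands in for the field in the mbuf packet header mentioned above. This is a sketch of the idea, not the committed change.

```c
#include <stddef.h>

/* Toy stand-in for a socket that carries its FIB number. */
struct toy_socket {
    int so_fibnum;
};

/*
 * Pick the FIB for an outgoing packet.
 *   so          the socket the packet belongs to, or NULL if there is none
 *               (for example a reply to a packet matching no socket)
 *   packet_fib  the FIB recorded on the incoming packet by the interface
 */
int
toy_fib_for_output(const struct toy_socket *so, int packet_fib)
{
    if (so != NULL)
        return so->so_fibnum;   /* one fib per socket, for all paths */
    return packet_fib;          /* no socket: fall back to the packet's fib */
}
```

An ack sent on a socket in fib 2 therefore goes out via fib 2 even if the data arrived on an interface assigned to fib 0.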

Comment 15 Alan Somers freebsd_committer

2015-06-17 14:47:41 UTC

(In reply to Michael Tuexen from comment #14)

You're right about the interface FIB.  It will take incoming packets with a certain FIB.  But it's not completely general; it's possible to have outbound traffic use multiple FIBs on a single interface.

The part about multihoming is more interesting.  Can you use SCTP to failover from one ISP to another?  Different ISPs require different gateways, and hence different routing tables.  In that case, a single fib per SCTP socket wouldn't be sufficient.  We would need to set the FIB separately for each local IP address of the SCTP socket.

Comment 16 Michael Tuexen freebsd_committer

2015-06-17 17:38:05 UTC

(In reply to Alan Somers from comment #15)
Yes, you can fail over from one ISP to another. Currently this is done by having corresponding entries in a single routing table for the multiple peer addresses.
I have checked in FIB support in
https://svnweb.freebsd.org/changeset/base/284515
This is a single fib per socket. This way you can have multiple applications on
a single host using SCTP, and each can have its own setup. That is better than
the current situation.

Comment 17 Craig Rodrigues freebsd_committer

2015-06-17 18:44:31 UTC

(In reply to Michael Tuexen from comment #16)

Thanks for working on this.  Next time you commit a fix for this PR via MFC,
remember to put in the following in the commit log message, so that the
commit scripts can auto-update the PR:

PR: 200379

Comment 18 Michael Tuexen freebsd_committer

2015-06-17 18:57:50 UTC

(In reply to Craig Rodrigues from comment #17)
Ahh. Thanks for the hint. Will do.

Best regards
Michael

Comment 19 commit-hook freebsd_committer

2015-06-20 08:26:30 UTC

A commit references this bug:

Author: tuexen
Date: Sat Jun 20 08:25:31 UTC 2015
New revision: 284633
URL: https://svnweb.freebsd.org/changeset/base/284633

Log:
  MFC r284515:
  Add FIB support for SCTP.
  This fixes https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200379

  PR:	200379

Changes:
_U  stable/10/
  stable/10/sys/netinet/sctp_asconf.c
  stable/10/sys/netinet/sctp_input.c
  stable/10/sys/netinet/sctp_input.h
  stable/10/sys/netinet/sctp_os_bsd.h
  stable/10/sys/netinet/sctp_output.c
  stable/10/sys/netinet/sctp_output.h
  stable/10/sys/netinet/sctp_pcb.c
  stable/10/sys/netinet/sctp_pcb.h
  stable/10/sys/netinet/sctp_usrreq.c
  stable/10/sys/netinet/sctputil.c
  stable/10/sys/netinet/sctputil.h
  stable/10/sys/netinet6/sctp6_usrreq.c

Comment 20 Craig Rodrigues freebsd_committer

2015-12-09 23:01:28 UTC

Sorry for the late feedback.  I can confirm that you fixed the problem.
This was a very weird and convoluted corner case.  Thank you so
much for digging into this and fixing it!!