Bug 264858 - IPv6: 'route add' sets wrong netif in fib when iface has different fib
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 13.1-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Alexander V. Chernikov
Reported: 2022-06-23 19:52 UTC by Peter Much
Modified: 2022-07-25 22:28 UTC

Description Peter Much 2022-06-23 19:52:47 UTC
When adding a route into a fib, the route may get bound to lo0 instead of the desired interface, and traffic then does not flow.

Example:
--------

# ifconfig tun6
tun6: flags=8010<POINTOPOINT,MULTICAST> metric 0 mtu 1371
        options=80000<LINKSTATE>
        groups: tun
        nd6 options=1<PERFORMNUD>
# setfib 4 netstat -rn6
Routing tables (fib: 4)

Internet6:
Destination                       Gateway                       Flags     Netif Expire
::/96                             ::1                           UGRS        lo0
::1                               link#1                        UHS         lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
fe80::/10                         ::1                           UGRS        lo0
ff02::/16                         ::1                           UGRS        lo0

# ifconfig tun6 inet6 fe80::1%tun6 prefixlen 124
# route -6 add -net fe80::1%tun6/124 -iface tun6 -fib 4
add net fe80::1%tun6/124: gateway tun6 fib 4
# route -6 add -net default fe80::2%tun6 -fib 4
add net default: gateway fe80::2%tun6 fib 4
# setfib 4 netstat -rn6
Routing tables (fib: 4)

Internet6:
Destination                       Gateway                       Flags     Netif Expire
::/96                             ::1                           UGRS        lo0
default                           fe80::2%tun6                  UGS         lo0 <<
::1                               link#1                        UHS         lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%tun6/124                   link#4                        US         tun6
ff02::/16                         ::1                           UGRS        lo0


But then, when the interface itself is assigned to the fib, it suddenly works as desired:

# route -6 delete -net default fe80::2%tun6 -fib 4
delete net default: gateway fe80::2%tun6 fib 4
# ifconfig tun6 inet fib 4
# route -6 add -net default fe80::2%tun6 -fib 4
add net default: gateway fe80::2%tun6 fib 4
# setfib 4 netstat -rn6
Routing tables (fib: 4)

Internet6:
Destination                       Gateway                       Flags     Netif Expire
::/96                             ::1                           UGRS        lo0
default                           fe80::2%tun6                  UGS        tun6 <<
::1                               link#1                        UHS         lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%tun6/124                   link#4                        US         tun6
ff02::/16                         ::1                           UGRS        lo0


I don't think this is the correct behaviour, because the fib on the interface should concern incoming traffic, while the route table concerns outgoing traffic, and one might want this to be separate concerns.
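
The symptom in the two routing tables above (a gateway scoped to tun6 whose Netif column says lo0) can be checked mechanically. The following is only a minimal sketch: the function name check_scoped_netif is hypothetical, and it assumes the plain `netstat -rn6` column layout shown in this report (Destination, Gateway, Flags, Netif), not the wider `-W` layout.

```shell
# check_scoped_netif (hypothetical helper): read `netstat -rn6` output on
# stdin and print every route whose gateway carries a %ifname scope that
# differs from the Netif column. Assumes Netif is the 4th column.
check_scoped_netif() {
    awk '
        # Only consider rows whose gateway field has an interface scope.
        NF >= 4 && $2 ~ /%/ {
            scope = $2
            sub(/^.*%/, "", scope)      # keep the text after "%"
            if (scope != $4)
                printf "MISMATCH %s via %s: scope=%s netif=%s\n", $1, $2, scope, $4
        }
    '
}
```

It could be used as `setfib 4 netstat -rn6 | check_scoped_netif`; no output would mean that no scoped gateway disagrees with its Netif.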
Comment 1 Alexander V. Chernikov freebsd_committer freebsd_triage 2022-07-11 13:38:36 UTC
Thank you for the report!

Indeed, there are a number of bugs/issues in this use case.

Let me start with some background.

Each route needs to have a "preferred source" address to provide input to source address selection (SAS). When adding a route, during the initial "enrichment" stage, the kernel tries to determine this source address and the transmit interface. The business logic for that is pretty complicated, as it has to support a wide variety of (mostly legacy) ways to specify the gateway/interface. Additionally, it focuses first on determining the source address and then derives the transmit interface from it (IPv4 legacy).

So, what happens here is the following:

# Find route nexthop index:
13:21 [1] m@devel0 setfib 1 netstat -rnW6
Destination                       Gateway                       Flags   Nhop#    Mtu    Netif Expire
default                           fe80::2%tun6                  UGS         4  16384      lo0
# Look into nhop
13:22 [1] m@devel0 setfib 1 netstat -onW6
Idx   Type         IFA                           Gateway                        Flags      Use Mtu       Netif   Addrif Refcnt Prepend
4            v6/gw ::1                           fe80::2%tun6                  GS            0  16384      lo0             2

The source address that got selected is "::1", because there are no other source addresses in this fib to pick from.
As net.add_addr_allfibs is currently set to 0 by default, the address from tun6 was not added to fib 4.
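
To see which of the two behaviours a given machine is in, the sysctl named above can be read directly. This is only a sketch: the helper name describe_allfibs and its wording are hypothetical, and the two descriptions merely restate the comment above rather than the kernel's exact semantics.

```shell
# describe_allfibs (hypothetical helper): explain a net.add_addr_allfibs value.
# With no argument the value is read from the running kernel (FreeBSD only).
describe_allfibs() {
    val=${1:-$(sysctl -n net.add_addr_allfibs 2>/dev/null)}
    case "$val" in
        0) echo "0: interface addresses are added only to the fib of the interface" ;;
        1) echo "1: interface addresses are added to all fibs" ;;
        *) echo "net.add_addr_allfibs is not available here"; return 1 ;;
    esac
}
```

Calling `describe_allfibs` without arguments on the affected host should print the first line if the default described above is in effect.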

The problems I see are the following:
1) transmit interface selection should be decoupled from the source address selection
2) the logic for "sharing" interface addresses across fibs should be finer-grained than "all or nothing".

I have some ideas about addressing (2), but it obviously depends on what people do with multi-fib configurations. I'd love to hear a bit more about your use case so I can add it to the list of cases the solution needs to address.
Comment 2 Peter Much 2022-07-25 22:28:54 UTC
@Alexander, sorry for the late reply, and thank you for the explanation.
I found a workaround: even the default route can be specified with -iface (at least for tun devices), and then it appears on the proper interface:
   route add -6 -fib 4 -net -iface default tun6  # -> works for now

In any case, fib in Rel. 13 works better than in Rel. 12, and I probably would not be able to configure my setup in Rel. 12.
As I understand your comment now, I simply failed to add some of the prerequisite routes for a default route. Sorry for that.

My use case is a couple of openvpn tunnels. Each has its own fib and some routes to its other end, probably also a default route, and ipfw stateful flows then choose the fib so that reply traffic is sent back into the same tunnel the originating request came from (no matter the source or destination address).
This connects a few sites transparently for systems management etc. (it was IPv4 first, then I tried to make it transport IPv6 in the same way), and routes should be added only after the respective tunnel comes up.
So the bottom line is: I can certainly live with all the routes being loaded explicitly into the fib; this is fine for me. I need to script all that to run from the openvpn startup script anyway, as only openvpn creates the tun device.
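
The explicit route loading described above could be scripted roughly as follows. This is a sketch, not a tested openvpn hook: the function name fib_route_cmds is hypothetical, and it only prints the route(8) commands, in the same forms that appear earlier in this report, instead of executing them.

```shell
# fib_route_cmds IFACE FIB PREFIX (hypothetical helper): print the route(8)
# commands that load an on-link prefix and a default route for IFACE into FIB.
# Nothing is executed; the caller decides whether to run the output.
fib_route_cmds() {
    ifname=$1 fib=$2 prefix=$3
    # On-link prefix bound to the interface, in the form used in the description.
    echo "route -6 add -net ${prefix} -iface ${ifname} -fib ${fib}"
    # Default route bound to the interface, in the form of the workaround above.
    echo "route add -6 -fib ${fib} -net -iface default ${ifname}"
}
```

A startup script might review the output of `fib_route_cmds tun6 4 'fe80::1%tun6/124'` first and then pipe it to `sh` once the tun device exists.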