Bug 256681

Summary: [route] Incorrect loopback route for aliases IP addresses
Product: Base System
Component: kern
Status: Closed Works As Intended
Severity: Affects Some People
Priority: ---
Version: 13.0-STABLE
Hardware: Any
OS: Any
Reporter: Zhenlei Huang <zlei>
Assignee: Alexander V. Chernikov <melifaro>
CC: cryx-ports, melifaro, net, zarychtam
URL: https://reviews.freebsd.org/D30811
See Also: https://reviews.freebsd.org/D28246

Description Zhenlei Huang freebsd_committer freebsd_triage 2021-06-18 02:52:58 UTC
Observed this regression on stable/13 and current/14.

Steps to repeat:

# ifconfig tap0 create inet 192.0.2.1/24
# ifconfig tap0 inet 192.0.2.2/32 alias

To verify the route table:

# netstat -4rnW
---------------------------------------
Destination        Gateway            Flags   Nhop#    Mtu      Netif Expire
...
192.0.2.0/24       link#4             U           5   1500       tap0
192.0.2.1          link#4             UHS         6  16384        lo0
192.0.2.2          link#4             UH          7   1500       tap0
...
---------------------------------------

Note that the loopback route for the alias IP address 192.0.2.2 is incorrect.

To verify the impact:
# ping 192.0.2.2
PING 192.0.2.2 (192.0.2.2): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host


Expected route table:
192.0.2.2          link#4             UHS         7  16384       lo0

The Mtu should be 16384 and the Netif should be lo0. The Flags should contain `S`, i.e. the route should look the same as the one for 192.0.2.1.
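
The check described above can be sketched as a small shell helper. This is only an illustration: the function name check_loopback_route is hypothetical, and it assumes the column layout shown in the netstat output of this report (Destination, Gateway, Flags, Nhop#, Mtu, Netif). It treats a host route as a loopback route when the Flags contain `S` and the Netif is lo0.

```shell
#!/bin/sh
# Hypothetical helper: classify the host route for one address.
# Reads `netstat -4rnW` style output on stdin; assumes the column order
# Destination Gateway Flags Nhop# Mtu Netif, as shown in this report.
check_loopback_route() {
    awk -v addr="$1" '
        $1 == addr {
            found = 1
            # Per the report, the expected route has the S flag and uses lo0.
            if ($3 ~ /S/ && $6 == "lo0")
                print addr ": loopback route present"
            else
                print addr ": no loopback route (flags=" $3 ", netif=" $6 ")"
        }
        END { if (!found) print addr ": no host route" }
    '
}
```

On an affected system it could be used as `netstat -4rnW | check_loopback_route 192.0.2.2`.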

Expected behavior:
Pinging the alias IP address succeeds.
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2021-06-18 02:57:05 UTC
It seems https://reviews.freebsd.org/D28246 introduced this side effect.
Comment 2 Zhenlei Huang freebsd_committer freebsd_triage 2021-06-18 08:04:11 UTC
Proposed patch: https://reviews.freebsd.org/D30811
Comment 3 Rodney W. Grimes freebsd_committer freebsd_triage 2021-06-18 15:46:44 UTC
I strongly disagree that this is the "expected" route for a /32
placed on an "interface"

Expected route table:
192.0.2.2          link#4             UHS         7  16384       lo0

The bug to find is why we can no longer access 192.0.2.2 via the routing table and the tap0 interface address. This is a local interface address and should be reachable no matter WHAT the routing table looks like.
Comment 4 Zhenlei Huang freebsd_committer freebsd_triage 2021-06-19 18:02:01 UTC
(In reply to Rodney W. Grimes from comment #3)
Compared with stable/12 and stable/11, I call this route "expected".

I spent hours reproducing this on previous releases.

Prior to FreeBSD 8.0, adding an IP address to an interface created only the on-link prefix route. For /32 aliases, that prefix route is a /32. On output, ip_output() consulted rtalloc() to "generate" the loopback route for the local IP address and used it.

From FreeBSD 8.0 to FreeBSD 12.2, the on-link prefix route is created, as well as the loopback route for the local IP address. See [1].

From FreeBSD 13.0 onward, due to a significant rework of the routing subsystem, the route for /32 aliases is treated specially. See [2].

I reviewed the design of the loopback route. If I understand correctly, the loopback route short-circuits the output path for packets destined for a local address: it delivers them locally and bypasses the physical interface. See [3].

So in some cases, if the loopback route is disabled and the hardware/logical interface cannot forward packets destined for a local address, 'No route to host' should be generated.

PS: there is a 'net.link.ether.inet.useloopback' sysctl tunable. It is on by default and controls whether a loopback route is installed. This tunable has existed since at least 4.11 and was removed from stable/11. See [4].


[1]: https://cgit.freebsd.org/src/commit/sys/netinet/in.c?h=stable/8&id=ebc90701ac6c1f814c5bd6f3e19f0113ebe06156
[2]: https://reviews.freebsd.org/D28668
[3]: https://cgit.freebsd.org/src/tree/sys/netinet/if_ether.c?h=stable/4#n109
[4]: https://cgit.freebsd.org/src/commit/sys/netinet/if_ether.c?h=stable/11&id=b1b9dcae46803ae79255a9994584cb03d2a77048
Comment 5 Zhenlei Huang freebsd_committer freebsd_triage 2021-06-21 08:36:50 UTC
> So in some cases, if the loopback route is disabled and the hardware/logical interface can not forward those packets destined for local, 'No route to host' should be generated.

release/8.1 introduced a new feature, 'IFCAP_LINKSTATE'; see [5] and [6]. If an interface has the 'IFCAP_LINKSTATE' capability, its link state should be checked before packets are passed to it.

I verified this feature on stable/12 and stable/13. The steps:
1. ifconfig vxlan0 create vxlanid 100 vxlanlocal 10.x.x.x vxlanremote 10.y.y.y
2. ifconfig vxlan0 inet 192.0.2.1/24
3. ifconfig vxlan0 inet 192.0.2.2/32 alias
4. netstat -rnWf inet | grep 192.0.2
5. ping -c4 192.0.2.1
6. ping -c4 192.0.2.2
7. route delete 192.0.2.1
8. route delete 192.0.2.2
9. repeat step 4
10. ifconfig vxlan0 down && 

If vxlan0 is created without vxlanid, vxlanlocal and vxlanremote, the link state is not ready. If we then delete the loopback routes to 192.0.2.1 and 192.0.2.2, ping responds with 'No route to host'.


[5]: https://cgit.freebsd.org/src/commit/sys/netinet/ip_output.c?h=stable/8&id=c951da56b4f19a637c7fdf734fc500560a9555de
[6]: https://cgit.freebsd.org/src/commit/sys?h=stable/8&id=94190b3925795b145fbd1fbc39df0841ef52f5d5
Comment 6 Zhenlei Huang freebsd_committer freebsd_triage 2021-06-21 08:48:08 UTC
(In reply to Rodney W. Grimes from comment #3)
> The bug to find is why can we no longer access 192.0.2.2 via the routing table and
> the tap0 interface address, this is a local interface address and should be 
> reachable no matter WHAT the routing table looks like.

I conclude that it is not a bug that the local interface address is unreachable on the tap0 interface. It behaves as it should. It is a FEATURE.

@CC Alexander V. Chernikov
Comment 7 Philipp Wuensche 2022-06-01 10:25:19 UTC
This is hitting me in my jail setups also.

Up until 12.3 I had jails running on lo1 interfaces in, e.g., the 127.1.1.0/24 range, providing services to the jails that run on IP addresses of the host's external interfaces.
For example, a jail running postgresql on 127.1.1.1, with several webservice jails on external IP addresses using this postgresql jail as their database.

This had the nice effect that jails on the loopback IP addresses could not reach the internet and vice versa, even without a firewall in place, and that I did not have to guess RFC 1918 IP addresses that might not be in use somewhere else in the network.

Multiple jails on the external interface resulted in /32 aliases on the external interface, which wasn't a problem until FreeBSD 13.
Since then, the jails with a /32 alias IP address have been unable to reach the services running in loopback jails, due to the missing lo0 route.

For me this is a regression, or at least it is somewhat unpleasant that this change in behaviour is mentioned only as "Duplicate routes installation issue for /32 or /128 interface aliases has been fixed" in the release notes of 13.0.

I know there are solutions like VNET for jails etc., but I just wanted to mention this here for all the users who will run into this issue.
Comment 8 Marek Zarychta 2022-06-01 10:56:05 UTC
Maybe you can add one more loopback interface to the jail with the public address? This should fulfil your needs.
Probably it's still not a regression.
Comment 9 Philipp Wuensche 2022-06-01 15:45:34 UTC
(In reply to Marek Zarychta from comment #8)

Yes this is one of the workarounds.
Comment 10 Marek Zarychta 2022-06-01 15:59:05 UTC
(In reply to Philipp Wuensche from comment #9)
>Yes this is one of the workarounds.
I would say it is the correct solution, not a workaround.