Summary: | IPv6 routing problem when using FreeBSD as a VPS at a cloud provider | |
---|---|---|---
Product: | Base System | Reporter: | peos42 <peo_s>
Component: | kern | Assignee: | Alexander V. Chernikov <melifaro>
Status: | New --- | |
Severity: | Affects Some People | CC: | ae, bz, cem, emaste, freebsd.bugs, hrs, jamie, jinmei, lx, melifaro, sephe
Priority: | --- | |
Version: | 11.2-STABLE | |
Hardware: | amd64 | |
OS: | Any | |
Description
peos42
2018-11-18 00:20:28 UTC
What's the problem with configuring the /48 on your gateway interface?

I cannot access some hosts over IPv6 (these are also hosted at RamNode within their block). When the VPS ran Linux with the same IP and netmask on the server, it worked OK (Linux used the default gateway with a /64 set as the netmask). When I reinstalled the VPS with FreeBSD 11.2, nothing worked over IPv6 with the initial config. Then I saw the post I referred to and changed the netmask to a /48 so that the gateway was included in the netmask. After that, the internet and everything else worked, but not IPv6 communication to some hosts. Most likely this is due to the /48 netmask being set... I guess due to a std problem. Why send the traffic on to the gateway if you think the host is on your local network, as the mask says so?

I see. As a workaround, maybe two addresses can be configured on the interface? A /128 to the VPS's gateway, and a second address from your actual /64 allocation? That way it is valid to send traffic to the VPS gateway via the interface, but hosts outside of your /64 are directed over the gateway rather than the local link?

(In reply to peos42 from comment #0)
I used to have such a setup with a very well-known European hoster. It's idiotic IPv4 behaviour (and was exactly that there as well), and it'll eventually cause them a lot of trouble in IPv6 land, as their neighbour tables on the L2/3 device in front of you can easily fill up. My European one, after 1.5 years of silence, has just updated and rolled out the new setup with a transition period years after. They never said anything, but I was happy they listened.

The solution for any hoster is to have fe80::1/64 as a default gateway on all interfaces for all customers. It's a link-local address, there won't be too many of them, and then, given they know the ether address of their customers, they route whatever network their customers get to that; no extra neighbour table addresses; their router is a lot less attackable as there's no public /64 on each interface, etc.
So much more to say about all this, but that's their problem and not yours.

You can still make this work with FreeBSD and some "glue" and magic, and I'll just braindump here what comes to my mind:

(a) Set your ipv6_default_interface to your external interface.
(b) Look at ndp -an to find your router's link-local address and then set ipv6_defaultrouter="fe80:....%${ipv6_default_interface}". Note this is a hack, as that address can change if your hoster changes things or moves the VM around; in more or less static setups it works; it could be "automated".
(c) I wonder if ping6 -n ff02::2%<interface> will give you answers; that should be the same address as in (b). If the address from (b) changes you might be out of luck, and the best you could do is to script a "checker" which validates the address every minute and updates the IPv6 default route accordingly.
(d) The above assumes that calling rtsol on the interface doesn't help you in that setup. It would be great if it did.
(e) Alternatively, you might be able to set the default gateway using -link; can't remember if that works; haven't tried that in years.

Try and see if you can work it out from there. I'd be curious to hear...

I saw that Sepherosa has added support for non-prefix directly reachable routes to DragonflyBSD. I have also seen, several times on our mailing lists, the question of why such routes do not work on FreeBSD without properly configured prefixes. Maybe it is time to rethink this and add such support? The noted commits are:
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/4957d64ac9c5d914414e533e7da909f8162b7973
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/b72db1d3321d7a80f4da3f727765bcc200f30278
But FreeBSD needs a bit more changes.

Maybe there is a reason why DragonflyBSD fixed it.
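
Steps (a) to (c) of the workaround above could be automated roughly as follows. This is only a sketch: it assumes reply lines in the usual `ping6` style ("16 bytes from fe80::...%ifname, ..."), and the function names `router_link_local` and `rc_conf_lines` are made up for illustration. It only parses text and formats settings; running `ping6 -n ff02::2%<interface>` and applying the result is left to the caller.

```python
import re

# Matches reply lines such as:
# "16 bytes from fe80::fc00:ff:fe05:f2a7%vtnet0, icmp_seq=0 hlim=64 time=0.149 ms"
# Assumption: the responder is printed as <link-local>%<interface> followed by a comma.
REPLY_RE = re.compile(r"bytes from (fe80:[0-9a-fA-F:]*)%(\w+),")


def router_link_local(ping6_output, interface):
    """Return the first link-local responder seen on `interface`, or None."""
    for line in ping6_output.splitlines():
        match = REPLY_RE.search(line)
        if match and match.group(2) == interface:
            return match.group(1)
    return None


def rc_conf_lines(interface, router):
    """Format the rc.conf settings named in steps (a) and (b)."""
    return [
        f'ipv6_default_interface="{interface}"',
        f'ipv6_defaultrouter="{router}%{interface}"',
    ]


# Sample text in the ping6 reply format (not live output).
SAMPLE = """\
PING6(56=40+8+8 bytes) fe80::5400:ff:fe05:f2a7%vtnet0 --> ff02::2%vtnet0
16 bytes from fe80::fc00:ff:fe05:f2a7%vtnet0, icmp_seq=0 hlim=64 time=0.149 ms
"""

if __name__ == "__main__":
    router = router_link_local(SAMPLE, "vtnet0")
    if router is not None:
        print("\n".join(rc_conf_lines("vtnet0", router)))
```

A periodic "checker" as suggested in (c) could call `router_link_local` on fresh output and compare the result with the currently configured value. If more than one router answers on ff02::2, taking the first reply is an arbitrary choice.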
The cloud provider, in the same support case I started this thread with, said:
--snip--
Additionally, if BSD followed RFC compliance for neighbour table discovery (https://tools.ietf.org/html/rfc4861) it would not be an issue, but they do not. This has been know and unfortunately has affected *BSD all the back to 2012. It's actually BSD that is not RFC compliant in this case.
--snip--
I have not looked that deep. But is it the case that FreeBSD does not follow RFC 4861 regarding Neighbour Discovery? If it does not, then I suggest adding this to the future to-do list for fixing.

Created attachment 199377 [details]
Proposed patch
I just tried a patch, and it seems that with this patch I can add an on-link route to an address that is not in the list of configured prefixes, and ND6 is able to send NS and receive NA. The patch should be applicable to FreeBSD 11+.
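
The on-link decision discussed in this report can be illustrated with a toy model. This is not the kernel's ND6 code, only a sketch of the idea: a destination is treated as a neighbor if a configured prefix covers it or, as the proposed patch is described to allow, if an explicit on-link host route exists for it. The function name `is_neighbor` and all addresses (taken from the 2001:db8::/32 documentation range) are hypothetical.

```python
import ipaddress


def is_neighbor(dst, prefixes, onlink_hosts=()):
    """Toy on-link check.

    True if `dst` is covered by one of the configured `prefixes`, or if it
    equals one of the explicit on-link host routes in `onlink_hosts`.
    """
    addr = ipaddress.ip_address(dst)
    if any(addr in ipaddress.ip_network(prefix) for prefix in prefixes):
        return True
    return any(addr == ipaddress.ip_address(host) for host in onlink_hosts)


# Hypothetical layout: the provider owns a /48, the VPS is assigned one /64,
# and the gateway lives in a different /64 of the same /48.
PROVIDER_48 = "2001:db8::/48"
MY_64 = "2001:db8:0:42::/64"
GATEWAY = "2001:db8:0:1::1"
OTHER_CUSTOMER = "2001:db8:0:7::10"

if __name__ == "__main__":
    # /64 only: the gateway is not considered a neighbor.
    print(is_neighbor(GATEWAY, [MY_64]))
    # /48: the gateway is a neighbor, but so is every other customer host.
    print(is_neighbor(GATEWAY, [PROVIDER_48]), is_neighbor(OTHER_CUSTOMER, [PROVIDER_48]))
    # /64 plus an on-link host route to the gateway only.
    print(is_neighbor(GATEWAY, [MY_64], [GATEWAY]),
          is_neighbor(OTHER_CUSTOMER, [MY_64], [GATEWAY]))
```

In this model the /48 configuration makes the gateway reachable but also marks other hosts in the provider's block as on-link, which matches the symptom in the original report. The /64 plus a single on-link host route keeps those hosts behind the gateway.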
(In reply to peos42 from comment #6)
Could they be more specific about how they think BSD is non-compliant with that RFC? It's a large document and the critique is not specific.

(In reply to peos42 from comment #6)
Maybe this part?

> Router Advertisements contain a list of prefixes used for on-link determination and/or autonomous address configuration; flags associated with the prefixes specify the intended uses of a particular prefix. Hosts use the advertised on-link prefixes to build and maintain a list that is used in deciding when a packet's destination is on-link or beyond a router.

So far, so good.

> Note that a destination can be on-link even though it is not covered by any advertised on-link prefix. In such cases, a router can send a Redirect informing the sender that the destination is a neighbor.

So I guess that may be the complaint here? Further (§8.3, Host Specification):

> A host receiving a valid redirect SHOULD update its Destination Cache accordingly so that subsequent traffic goes to the specified target. ... If the Target and Destination Addresses are the same, the host MUST treat the Target as on-link. If the Target Address is not the same as the Destination Address, the host MUST set IsRouter to TRUE for the target.

(In reply to Conrad Meyer from comment #8)
RFC 4861 says:
--snip--
If the source address of the packet prompting the solicitation is the same as one of the addresses assigned to the outgoing interface, that address SHOULD be placed in the IP Source Address of the outgoing solicitation. Otherwise, any one of the addresses assigned to the interface should be used.
--snip--
So it IS permissible for another address to appear here. RFC 5942, which updates RFC 4861, does not seem to change this. This is probably why it works on Linux, Windows, DragonflyBSD, etc. I guess they have seen this, as the statement is quite clear.
/Peo

I don't think this text is relevant to the topic:
--snip--
If the source address of the packet prompting the solicitation is the same as one of the addresses assigned to the outgoing interface, that address SHOULD be placed in the IP Source Address of the outgoing solicitation. Otherwise, any one of the addresses assigned to the interface should be used.
--snip--
The "otherwise" case is basically about a forwarding node (router), in which case the source address of the packet being forwarded is normally different from any of the addresses of the outgoing interface of the forwarding node. Obviously this case should be an exception to the SHOULD. As far as I know, FreeBSD is compliant with this spec. Besides, I don't see any relevance of the source address selection of outgoing NS to this issue.

The problem description is a bit unclear, but I don't see anything in FreeBSD's implementation that may be related to this trouble and is not RFC-compliant. If I were to guess, the expected operation here is to allow the user to manually specify an on-link prefix (in this case, that would be <router's IPv6 address>/128). As far as I know, there's no RFC that requires a host to implement such manual configuration. But supporting it may not be a bad idea. And, if we add support for it, I'd do so by extending 'ndp' so that it allows the user to manually create an entry that would be listed by 'ndp -p', rather than allowing route(8) to tweak the routing table to cause the same effect (which b72db1d3321d7a80f4da3f727765bcc200f30278 of the dragonfly patch seems to do).

(In reply to Andrey V. Elsukov from comment #7)
Isn't this patch a bit of a kludge? The existing check for the entry in our L2 entry cache should be sufficient; why don't we populate the LLE cache with on-link off-prefix routers?

The exact ordering is not clear to me, but it seems that somehow we get a router advertisement and insert it into our routing table without populating the LLE of the sender in the LLE cache.
I think we must be violating the following somehow (or ignoring a SHOULD):

> After extracting information from the fixed part of the Router Advertisement message, the advertisement is scanned for valid options. If the advertisement contains a Source Link-Layer Address option, the link-layer address SHOULD be recorded in the Neighbor Cache entry for the router (creating an entry if necessary) and the IsRouter flag in the Neighbor Cache entry MUST be set to TRUE. If no Source Link-Layer Address is included, but a corresponding Neighbor Cache entry exists, its IsRouter flag MUST be set to TRUE.

Maybe it's bogus that nd6_onlink_ns_rfc4861 defaults to off?

(In reply to Conrad Meyer from comment #13)
> (In reply to Andrey V. Elsukov from comment #7)
> Isn't this patch a bit of a kludge? The existing check for the entry in our
> L2 entry cache should be sufficient; why don't we populate LLE cache with
> on-link off-prefix routers?
>
> It's not clear to me the exact ordering, but it seems somehow we get a
> router advertisement and insert it into our routing table without populating
> the LLE of the sender in the LLE cache.

Such a route can be added by the administrator. The main user complaint is that for IPv4 you can add a route like `route add -host A.B.C.D -iface em0`, but for IPv6 this won't work, because you need to have a configured prefix on the interface. Without the prefix, ND6 will think that the address on this link is not a neighbor and won't send NS, and you will get an ENOBUFS error when you try to send a packet to the specified host. This patch adds the check, and now the kernel will at least try to resolve the address on the interface. So, in general, you are able to add an on-link route to your gateway like this:

route -6 add -host fd00::1 -iface em0

(In reply to Andrey V. Elsukov from comment #14)
I see, thanks for explaining, Andrey.

(In reply to Bjoern A. Zeeb from comment #4)
I have some VPSes with Vultr, and their default FreeBSD image seems to be set up in a similar way, though they use accept_rtadv. Your router-ping idea works there:

catflap% netstat -rn6|grep vtnet0
default fe80::fc00:ff:fe05:f2a7%vtnet0 UG 20736465 1500 vtnet0
2001:19f0:300:2185::/64 link#1 U 0 1500 vtnet0
fe80::%vtnet0/64 link#1 U 343447 1500 vtnet0
fe80::5400:ff:fe05:f2a7%vtnet0 link#1 UHS 107 16384 lo0
catflap% ping6 -n ff02::2%vtnet0
PING6(56=40+8+8 bytes) fe80::5400:ff:fe05:f2a7%vtnet0 --> ff02::2%vtnet0
16 bytes from fe80::fc00:ff:fe05:f2a7%vtnet0, icmp_seq=0 hlim=64 time=0.149 ms
16 bytes from fe80::fc00:ff:fe05:f2a7%vtnet0, icmp_seq=1 hlim=64 time=0.130 ms
16 bytes from fe80::fc00:ff:fe05:f2a7%vtnet0, icmp_seq=2 hlim=64 time=0.149 ms
^C
--- ff02::2%vtnet0 ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.130/0.143/0.149/0.009 ms
catflap% grep vtnet0_ipv6 /etc/rc.conf
ifconfig_vtnet0_ipv6="inet6 2001:19f0:300:2185::1:1 prefixlen 64 accept_rtadv"
ifconfig_vtnet0_ipv6="inet6 2001:19f0:300:2185::1:1 prefixlen 64 accept_rtadv"
ipv6_activate_all_interfaces="YES"
rtsold_enable="YES"
rtsold_flags="-Fa" # Flags to an IPv6 router solicitation

This bug is resolved in commit https://reviews.freebsd.org/rGf998535a66b986f51dd65b5153d1a580d50ddfbe

Are there plans to backport this fix to 12-STABLE and/or 13-STABLE?

Assigning to the one who fixed it; best to answer the last question(s) and/or to close this.