Bug 219699

Summary: Issue with IPv6 and neighbor notification
Product: Base System
Reporter: Paul G Webster <paul>
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: New ---
Severity: Affects Some People
CC: dch, rkoberman
Priority: ---
Version: 11.0-RELEASE   
Hardware: Any   
OS: Any   

Description Paul G Webster 2017-06-01 03:59:10 UTC
I will be honest here and admit that I do not know whether FreeBSD is doing IPv6 right and Linux is implementing some hack to get around a problem, or whether FreeBSD is doing something wrong. It's a little too low level for me, so I hope someone with a little more knowledge can clarify the issue.

OK, so I grabbed a cheap VPS to run a small mail server on. It is Xen; all works lovely, virtio etc., but for IPv6 the host implements a shared gateway for all its clients, notably 2a07:4580:b0d::1/48.

I have a /64 at my disposal in this range, 2a07:4580:b0d:27e::/64, from which I use 2a07:4580:b0d:27e::1/128 for my mail server.

Now the issues; for me to reach that gateway from that prefix I needed to set:
ifconfig_vtnet0_ipv6="inet6 2a07:4580:b0d:27e::1/48"
ipv6_defaultrouter="2a07:4580:b0d::1"

because obviously my assigned /64 could not reach their gateway. From what I understand from the IPv6 folks, I should be able to simply set the gateway to the interface without such a hack, but once again, realistically, no idea.
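
A minimal sketch of what "setting the gateway to the interface" might look like in /etc/rc.conf, assuming the stock ipv6_static_routes mechanism and the -iface modifier of route(8). It is untested on this host, the label "gw" is an arbitrary name chosen for the example, and comment 1 reports that an attempt along these lines could not find the gateway.

```shell
# Hypothetical /etc/rc.conf fragment (untested on this host).
# Keep the assigned /64 on the interface instead of widening it to /48.
ifconfig_vtnet0_ipv6="inet6 2a07:4580:b0d:27e::1/64"

# Host route telling the kernel the gateway is directly reachable on vtnet0.
# "gw" is an arbitrary label chosen for this example.
ipv6_static_routes="gw"
ipv6_route_gw="2a07:4580:b0d::1 -prefixlen 128 -iface vtnet0"

# Default route via the gateway, which the host route above makes on-link.
ipv6_defaultrouter="2a07:4580:b0d::1"
```

The point of the host route is that the default router no longer has to fall inside the prefix configured on vtnet0, so the /48 prefix length would not be needed just to reach it.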

Now for the actual problem: IPv6 is spotty, with 95% loss, and what seems to be going on is that their gateway believes I am not really using the address. If I do use it, it temporarily 'comes up', so I did this:

$ cat /etc/rc.local
daemon -f ping6 -i 2 -s 1 2a07:4580:b0d::1

and voilà, perfectly working IPv6. I assume that some sort of 'neighbour notification' is not taking place and the ping is forcing it. But I also have several Linux VPSs with these guys and they are all fine, hence my reason for calling out for help to figure out what exactly is wrong.

-- Cheers paul
Comment 1 Paul G Webster 2017-06-01 04:36:49 UTC
Just a clarification; 

'because obviously my assigned /64 could not reach their gateway. From what I understand from the IPv6 folks, I should be able to simply set the gateway to the interface without such a hack, but once again, realistically, no idea.'

If I do attempt to set the gateway on the interface, it cannot 'find' the gateway, almost as if it were having trouble looking it up.
Comment 2 rkoberman 2017-06-01 19:22:44 UTC
What is in your routing table? "netstat -rnf inet6"
This should be collected both when IPv6 is working (pings running) and when it's not.

What about ndp? "ndp -a" and "ndp -p"

Less important but possibly useful, what does your interface config look like? "ifconfig vtnet0"
Comment 3 Paul G Webster 2017-06-01 19:39:15 UTC
Thank you for the reply; the information you requested is as follows

$ netstat -rnf inet6
you have mail
Routing tables

Internet6:
Destination                       Gateway                       Flags     Netif                               Expire
::/96                             ::1                           UGRS        lo0
default                           2a07:4580:b0d::1              UGS      vtnet0
::1                               link#2                        UH          lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
2a07:4580:b0d::/48                link#1                        U        vtnet0
2a07:4580:b0d:27e::1              link#1                        UHS         lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%vtnet0/64                  link#1                        U        vtnet0
fe80::216:3cff:fe81:38d8%vtnet0   link#1                        UHS         lo0
fe80::%lo0/64                     link#2                        U           lo0
fe80::1%lo0                       link#2                        UHS         lo0
ff02::/16                         ::1                           UGRS        lo0

$ ndp -a
Neighbor                             Linklayer Address  Netif Expire    S Flags
2a07:4580:b0d::1                     00:05:73:a0:00:09 vtnet0 23h59m59s S R
mail.tmp.group                       00:16:3c:81:38:d8 vtnet0 permanent R
fe80::a693:4cff:fe63:547f%vtnet0     a4:93:4c:63:54:7f vtnet0 expired   P  3
fe80::216:3cff:fe81:38d8%vtnet0      00:16:3c:81:38:d8 vtnet0 permanent R

$ ndp -p
2a07:4580:b0d::/48 if=vtnet0
flags=LO vltime=infinity, pltime=infinity, expire=Never, ref=1
  No advertising router
fe80::%vtnet0/64 if=vtnet0
flags=LAO vltime=infinity, pltime=infinity, expire=Never, ref=0
  No advertising router
fe80::%lo0/64 if=lo0
flags=LAO vltime=infinity, pltime=infinity, expire=Never, ref=0
  No advertising router


$ ifconfig vtnet0
vtnet0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=6c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:16:3c:81:38:d8
        inet 185.157.232.30 netmask 0xffff0000 broadcast 185.157.255.255
        inet6 fe80::216:3cff:fe81:38d8%vtnet0 prefixlen 64 scopeid 0x1
        inet6 2a07:4580:b0d:27e::1 prefixlen 48
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 10Gbase-T <full-duplex>
        status: active
Comment 4 rkoberman 2017-06-01 22:44:08 UTC
(In reply to Paul G Webster from comment #3)
At the time that you collected this, everything looks good. The default route is pointing to the correct link (link#1) and all ndp data looks correct.

One suggestion is to accept router advertisements. To do this, add "accept_rtadv" to your ifconfig_vtnet0_ipv6 line in rc.conf. This will cause your system to accept router advertisements on your LAN. Unless you have another local system configured to do IPv6 routing, this should be fine. I also doubt it will help, but it might be worth a try. (If it does not help, I'd remove it.)

You might want to monitor the default route and the NDP for 2a07:4580:b0d:27e::1 to see if they are stable. It sure sounds like something is flapping when the link is inactive for some period. My guess is NDP, but that is far from certain.
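
A rough sketch of how such monitoring could be scripted with sh and awk, assuming the "ndp -an" column layout shown in comment 3 (the fifth field is the state: R reachable, S stale, P probe). The function names neighbor_state and watch_gateway are made up for this example.

```shell
#!/bin/sh
# neighbor_state NEIGHBOR: read `ndp -an` output on stdin and print the
# state column (S) for NEIGHBOR, or "absent" if there is no entry.
neighbor_state() {
    awk -v n="$1" '
        $1 == n { print $5; found = 1 }
        END     { if (!found) print "absent" }'
}

# watch_gateway NEIGHBOR [INTERVAL]: print a UTC timestamp and the current
# state every INTERVAL seconds (default 5) until interrupted.
watch_gateway() {
    while :; do
        printf '%s %s\n' "$(date -u +%H:%M:%S)" "$(ndp -an | neighbor_state "$1")"
        sleep "${2:-5}"
    done
}
```

For example, "watch_gateway 2a07:4580:b0d::1 2" would log the gateway's entry every two seconds. Comparing a log taken with the keepalive ping running against one taken without it shows whether the local entry changes when the loss starts.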

N.B. I was very active in working with IPv6 for over 15 years, but I've been retired for six years and I'll admit that I am not as sharp on it as I once was. Worse, Frontier, my ISP for about half the year, does not yet support IPv6 customer connections.
Comment 5 Paul G Webster 2017-06-01 22:49:10 UTC
root@mail:~ # ping6 google.com
PING6(56=40+8+8 bytes) 2a07:4580:b0d:27e::1 --> 2a00:1450:4009:80f::200e
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=0 hlim=55 time=7.064 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=1 hlim=55 time=6.720 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=2 hlim=55 time=6.686 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=3 hlim=55 time=6.556 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=4 hlim=55 time=6.605 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=5 hlim=55 time=6.741 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=6 hlim=55 time=6.802 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=7 hlim=55 time=6.934 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=8 hlim=55 time=6.600 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=9 hlim=55 time=6.805 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=10 hlim=55 time=6.670 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=11 hlim=55 time=6.668 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=12 hlim=55 time=7.080 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=25 hlim=55 time=6.864 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=26 hlim=55 time=6.725 ms
16 bytes from 2a00:1450:4009:80f::200e, icmp_seq=27 hlim=55 time=6.666 ms
^C
--- google.com ping6 statistics ---
28 packets transmitted, 16 packets received, 42.9% packet loss
round-trip min/avg/max/std-dev = 6.556/6.762/7.080/0.151 ms
root@mail:~ #

# Known working
ifconfig_vtnet0_ipv6="inet6 2a07:4580:b0d:27e::1/48 accept_rtadv"
ipv6_defaultrouter="2a07:4580:b0d::1"


No love; still the same fault after I comment out:
root@mail:~ # cat /etc/rc.local
daemon -f ping6 -i 2 -s 1 2a07:4580:b0d::1
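
The ping6 run in this comment loses its replies in one contiguous burst rather than at random. A small awk sketch that makes this visible by listing the missing icmp_seq ranges; seq_gaps is a made-up helper name and it assumes the reply line format shown above.

```shell
# seq_gaps: read ping6 output on stdin and print each run of missing
# icmp_seq values between two consecutive replies.
seq_gaps() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; the rest is the number.
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (seen && seq > prev + 1)
                printf "missing %d-%d (%d replies)\n", prev + 1, seq - 1, seq - prev - 1
            prev = seq
            seen = 1
        }'
}
```

Fed the output above, it reports sequence numbers 13 through 24 as missing, which is the 12 of 28 packets behind the 42.9% figure. Something like "ping6 -c 60 google.com | seq_gaps" would repeat the check.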
Comment 6 Paul G Webster 2017-06-01 22:49:37 UTC
As an aside, I am on Virgin Media in the UK; they do not even have an upgrade path for IPv6 yet :P so I feel your pain
Comment 7 Paul G Webster 2017-06-03 18:39:57 UTC
A little more on this fault; with the help of the host we have a working solution. From the host themselves:

--quote
FreeBSD appears to use the link local address on the interface to send neighbor advertisements for the addresses it would like to be routed towards it, unfortunately our side only allows you to send neighbor advertisements from an allowed allocated prefix not an fe80:: address.

I have added an exception for this, could you try stopping your work around for now and seeing if IPv6 carries on working?
--/quote

To cut a long story short, and many tickets later: yes, the host's exception did in fact work. The host is using 'ebtables' on their hosts; they have contacted the panel provider hoping they can patch the upstream.

Will update if I can get a little more detail on what the patch was or a copy of it hopefully :)
Comment 8 Paul G Webster 2017-06-03 19:16:55 UTC
The host provided the following information on what they had to do with ebtables to get FreeBSD working:

--quote
We use ebtables on the hosts to prevent IP stealing.

We have a chain set up for each VM which basically says "this VM's MAC can only use these IPs"; this is what was dropping your v6 NAs.

The patch to allow the link local address is simply:
ebtables -A kvm922.0 -p IPv6 --ip6-src fe80::/10 -j ACCEPT

With kvm922.0 being the chain that your VM belongs to.
--/quote
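
To illustrate what the added rule matches: fe80::/10 covers every address whose first 16-bit group lies between fe80 and febf. A small sh sketch, with the made-up helper name is_link_local, that classifies addresses seen in this report.

```shell
# is_link_local ADDR: succeed if ADDR lies in fe80::/10, i.e. its first
# 16-bit group is in the range fe80-febf. A %zone suffix is tolerated.
is_link_local() {
    addr=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$addr" in
        fe[89ab][0-9a-f]:*) return 0 ;;
        *)                  return 1 ;;
    esac
}
```

With this, "is_link_local fe80::216:3cff:fe81:38d8%vtnet0" succeeds, while "is_link_local 2a07:4580:b0d:27e::1" fails. That fits the host's description: packets sourced from the assigned prefix were already permitted, and only those sourced from the link-local address needed the exception.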
Comment 9 rkoberman 2017-06-05 00:08:11 UTC
(In reply to Paul G Webster from comment #8)
That would explain it. I am very surprised that Linux does not use the link-local address for routing information. That was one of the main reasons for the link-local implementation in IPv6.

Can someone confirm that Linux does not use link-local as the default NDP communication connection?

Yes, ebtables (on Linux) should always allow link-local. No, not all routing is over link-local. Protocols that communicate with non-adjacent nodes (e.g. BGP) cannot use link-local.

Glad you tracked this down. You saved the next guy problems.