Bug 263288

Summary: IPv6 system not responding to Neighbor Solicitation
Product: Base System Reporter: wcarson.bugzilla
Component: kern Assignee: freebsd-net (Nobody) <net>
Status: New ---    
Severity: Affects Only Me CC: bugfreebsd, colin, grahamperrin, rblayzor, zlei
Priority: ---    
Version: 13.0-RELEASE   
Hardware: amd64   
OS: Any   
Attachments:
Description Flags
Packet capture none

Description wcarson.bugzilla 2022-04-15 00:01:44 UTC
Hello, recently after enabling ipv6_privacy in /etc/rc.conf and rebooting, I've been unable to get IPv6 connectivity to work in a hosted environment. (I don't know if this is a red herring or not.) I've tried disabling it, and even after rebooting, it still doesn't work. (Doesn't work meaning: I'm unable to ping6 hosts on the Internet that are reachable, e.g. ipv6.google.com.) I confirmed ipv6_privacy is actually disabled:

 # sysctl -a | grep tempaddr
 net.inet6.ip6.use_tempaddr: 0
 net.inet6.ip6.prefer_tempaddr: 0

If I boot into a Linux environment (the provider has a Rescue mode), I'm able to reach IPv6 just fine. Furthermore, if I then reboot back into FreeBSD 13.0-RELEASE-p10, it works for about 5 minutes and then connections time out.

Given the behavior and based on some tcpdumps, it looks like my system is not responding to the upstream router's Neighbor Solicitation messages. If I boot into Linux, it responds to the NS messages, the router caches the MAC address, and IPv6 works. If I'm fast enough and reboot into FreeBSD, IPv6 works until the entry expires, and then I just see this:

13:24:58.901780 IP6 2600:3c00::f03c:91ff:feb0:a56f > 2605:6400:10:968:22:da15:28a6:c800: ICMP6, echo request, seq 40, length 16
13:24:59.277713 IP6 2600:3c00::8678:acff:fe1c:ec41 > ff02::1:ffb0:a56f: ICMP6, neighbor solicitation, who has 2600:3c00::f03c:91ff:feb0:a56f, length 32
13:24:59.277799 IP6 2600:3c00::8678:acff:fe1c:ec41 > ff02::1:ffb0:a56f: ICMP6, neighbor solicitation, who has 2600:3c00::f03c:91ff:feb0:a56f, length 32

3 packets, the echo request, then two NS requests, and no response -- and then it just repeats. 

I confirmed b0:a5:6f is the Device ID part of my MAC: 

 # ifconfig em0
 em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=481209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,NOMAP>
         ether f2:3c:91:b0:a5:6f <---
         inet6 fe80::f03c:91ff:feb0:a56f%em0 prefixlen 64 scopeid 0x1
         inet6 2600:3c00::f03c:91ff:feb0:a56f prefixlen 64 autoconf
         inet6 2600:3c00:e000:137::1 prefixlen 128
         inet6 2600:3c00:e000:137::1:1 prefixlen 128
         inet6 2600:3c00:e000:137::2:1 prefixlen 128
         inet6 2600:3c00:e000:137::3:1 prefixlen 128
         inet6 2600:3c00:e000:137:cafe:8a2e:370:7334 prefixlen 128
         inet 96.126.127.161 netmask 0xffffff00 broadcast 96.126.127.255
         inet 173.255.203.45 netmask 0xffffffff broadcast 173.255.203.45
         inet 96.126.122.129 netmask 0xffffffff broadcast 96.126.122.129
         inet 50.116.26.213 netmask 0xffffffff broadcast 50.116.26.213
         media: Ethernet autoselect (1000baseT <full-duplex>)
         status: active
         nd6 options=8023<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL,DEFAULTIF>
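
For readers who want to re-check the address arithmetic, here is a minimal sketch of how the SLAAC address and its solicited-node multicast group follow from the MAC shown above. It assumes the interface identifier is a modified EUI-64 (which matches the ifconfig output); the function names are made up for illustration and are not part of any FreeBSD tool.

```python
import ipaddress


def eui64_interface_id(mac: str) -> int:
    """Build a modified EUI-64 interface identifier from a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    # Flip the universal/local bit and insert ff:fe between the two halves.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]
    return int.from_bytes(eui, "big")


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with the EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | eui64_interface_id(mac))


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """ff02::1:ff00:0/104 plus the low 24 bits of the unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


addr = slaac_address("2600:3c00::/64", "f2:3c:91:b0:a5:6f")
print(addr)                       # SLAAC address for em0
print(solicited_node(str(addr)))  # group the router's NS is sent to
```

The two printed values should line up with the `autoconf` address in the ifconfig output and with the NS destination seen in the tcpdump.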

Therefore the Solicited-node multicast address ff02::1:ffb0:a56f looks to be correct. I've also confirmed the router's address is within the assigned SLAAC network (Router: 2600:3c00::8678:acff:fe1c:ec41, SLAAC address: 2600:3c00::f03c:91ff:feb0:a56f/64). Furthermore, the multicast address does show up in `ifmcstat`:

 # ifmcstat
 em0:
         inet6 fe80::f03c:91ff:feb0:a56f%em0 scopeid 0x1
         mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
                 group ff02::1:ff70:7334%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:ff:70:73:34
                 group ff02::1:ff03:1%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:ff:03:00:01
                 group ff02::1:ff02:1%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:ff:02:00:01
                 group ff02::1:ff01:1%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:ff:01:00:01
                 group ff02::1:ff00:1%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:ff:00:00:01
         inet 96.126.127.161
         igmpv3 rv 2 qi 125 qri 10 uri 3
                 group 224.0.0.1 mode exclude
                         mcast-macaddr 01:00:5e:00:00:01
         inet6 fe80::f03c:91ff:feb0:a56f%em0 scopeid 0x1
         mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
                 group ff01::1%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:00:00:00:01
                 group ff02::2:bdc6:c84d%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:bd:c6:c8:4d
                 group ff02::2:ffbd:c6c8%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:ff:bd:c6:c8
                 group ff02::1%em0 scopeid 0x1 mode exclude
                         mcast-macaddr 33:33:00:00:00:01
                 group ff02::1:ffb0:a56f%em0 scopeid 0x1 mode exclude <---
                         mcast-macaddr 33:33:ff:b0:a5:6f
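
The `mcast-macaddr` lines above follow the usual IPv6-over-Ethernet mapping: the prefix 33:33 followed by the low 32 bits of the group address. A small sketch to verify any of the pairs in the listing (the helper name is hypothetical):

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv6 multicast group to its Ethernet multicast MAC."""
    # Drop a zone suffix such as %em0 before parsing.
    addr = ipaddress.IPv6Address(group.split("%")[0])
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    # 33:33 followed by the last four bytes of the group address.
    mac = b"\x33\x33" + addr.packed[-4:]
    return ":".join(f"{byte:02x}" for byte in mac)


print(multicast_mac("ff02::1:ffb0:a56f%em0"))
```

If the NIC's multicast filter did not contain this MAC, the NS frames would never reach the IPv6 stack, which is why the ifmcstat check is worth making.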

I can even ping the address and it replies!

 # ping6 ff02::1:ffb0:a56f
 PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> ff02::1:ffb0:a56f
 16 bytes from fe80::f03c:91ff:feb0:a56f%em0, icmp_seq=0 hlim=64 time=0.451 ms
 16 bytes from fe80::f03c:91ff:feb0:a56f%em0, icmp_seq=1 hlim=64 time=0.446 ms
 16 bytes from fe80::f03c:91ff:feb0:a56f%em0, icmp_seq=2 hlim=64 time=0.618 ms
 ^C

Does anyone have any thoughts why it's not responding to the Neighbor Solicitation messages? I've been troubleshooting this for a few days now and can't figure it out. I also tried booting kernel.old (which I think is -p8 or -p9), but it made no difference. I've tried with and without pf enabled -- again, no difference.

I don't know if this is useful, but I validated routes are being discovered:

 # ndp -na
 Neighbor                             Linklayer Address  Netif Expire    S Flags
 2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
 2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
 fe80::1%em0                          00:05:73:a0:0f:ff    em0 23h56m36s S R <---
 2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
 2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
 2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
 fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
 fe80::8678:acff:fe1c:ec41%em0        84:78:ac:1c:ec:41    em0 23h49m7s  S R <---
 2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R

 # netstat -nr6
 Routing tables

 Internet6:
 Destination                       Gateway                       Flags     Netif Expire
 ::/96                             ::1                           UGRS        lo0
 default                           fe80::1%em0                   UG          em0 <---
 ::1                               link#2                        UHS         lo0
 ::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
 2600:3c00::f03c:91ff:feb0:a56f    link#1                        UHS         lo0
 2600:3c00:e000:137::1             link#1                        UHS         lo0
 2600:3c00:e000:137::1:1           link#1                        UHS         lo0
 2600:3c00:e000:137::2:1           link#1                        UHS         lo0
 2600:3c00:e000:137::3:1           link#1                        UHS         lo0
 2600:3c00:e000:137:cafe:8a2e:370:7334 link#1                    UHS         lo0
 fe80::/10                         ::1                           UGRS        lo0
 fe80::%em0/64                     link#1                        U           em0
 fe80::f03c:91ff:feb0:a56f%em0     link#1                        UHS         lo0
 fe80::%lo0/64                     link#2                        U           lo0
 fe80::1%lo0                       link#2                        UHS         lo0
 ff02::/16                         ::1                           UGRS        lo0

And here's the IPv6 part in my rc.conf:

 # ipv6
 rtsold_enable="YES"
 rtsold_flags="-aF"
 #ipv6_activate_all_interfaces="YES"
 ipv6_network_interfaces="em0"
 ipv6_default_interface="em0"
 ifconfig_em0_ipv6="inet6 accept_rtadv"
 ifconfig_em0_aliases="\
                inet6 2600:3c00:e000:0137::0:1/128 \
                inet6 2600:3c00:e000:0137::1:1/128 \
                inet6 2600:3c00:e000:0137::2:1/128 \
                inet6 2600:3c00:e000:0137::3:1/128 \
                inet6 2600:3c00:e000:0137:cafe:8a2e:0370:7334/128"

I'm at a complete loss. Any help troubleshooting this would be greatly appreciated.
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-15 09:59:35 UTC
The `ipv6_privacy` setting is meant to use and prefer privacy addresses as per RFC 4941. It should not have any relationship to this issue.

The `ndp -na` result shows that the neighbor cache entry for your router / gw is stale.

fe80::1%em0                          00:05:73:a0:0f:ff    em0 23h56m36s S R <---
...
fe80::8678:acff:fe1c:ec41%em0        84:78:ac:1c:ec:41    em0 23h49m7s  S R <---

Can you please try to ping your gw and share the ndp result?

```
# ping6 fe80::1%em0
and
# ping6 fe80::8678:acff:fe1c:ec41%em0
```
Comment 2 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-15 10:12:02 UTC
(In reply to Zhenlei Huang from comment #1)
And also try to ping gw's unicast IPv6 address, and see if it is reachable.

# ping6 2600:3c00::8678:acff:fe1c:ec41
Comment 3 wcarson.bugzilla 2022-04-15 17:09:19 UTC
The router seems to be immediately stale. If I clear all entries (ndp -c), they show up but are stale:

root@roast:~ # date ; ndp -c ; echo ; date ; ndp -na ; echo ; sleep 5 ; date ; ndp -na
Fri Apr 15 12:01:12 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted

Fri Apr 15 12:01:12 CDT 2022
Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R

Fri Apr 15 12:01:17 CDT 2022
Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 23h59m56s S R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R


Also, I think they block ping on their routers, or at least I don't think it worked in Linux either.


root@roast:~ # ping ipv6.google.com
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2607:f8b0:4023:1000::64
^C
--- ipv6.l.google.com ping6 statistics ---
24 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 2600:3c00::8678:acff:fe1c:ec41
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2600:3c00::8678:acff:fe1c:ec41
^C
--- 2600:3c00::8678:acff:fe1c:ec41 ping6 statistics ---
10 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 fe80::1%em0
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0
^C
--- fe80::1%em0 ping6 statistics ---
7 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 fe80::8678:acff:fe1c:ec41%em0
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0
^C
--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
243 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ #
Comment 4 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-16 02:00:54 UTC
(In reply to wcarson.bugzilla from comment #3)
Try this:

# ndp -nr

# date; ndp -c; ping6 -c1 -t2 fe80::1%em0; ndp -na; echo; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0; ndp -na;

and

# date; ndp -c; ping6 -c1 -t2 2607:f8b0:4023:1000::64; ndp -na;
Comment 5 wcarson.bugzilla 2022-04-16 17:15:35 UTC
Interestingly after issuing those commands it worked for about 65 seconds:

root@roast:~ # ndp -nr
fe80::1%em0 if=em0, flags=, pref=medium, expire=11s

root@roast:~ # date; ndp -c; ping6 -c1 -t2 fe80::1%em0; ndp -na; echo; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0; ndp -na;

Sat Apr 16 12:11:06 CDT 2022

fe80::1%em0 (fe80::1%em0) deleted
fe80::8678:acff:fe1c:ec41%em0 (fe80::8678:acff:fe1c:ec41%em0) deleted


PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0

--- fe80::1%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss


Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 13s       R R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R

PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0

--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss


Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 11s       R R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
fe80::8678:acff:fe1c:ec41%em0        84:78:ac:1c:ec:41    em0 13s       R R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R


root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
16 bytes from 2605:6400:10:968:22:da15:28a6:c800, icmp_seq=0 hlim=53 time=42.126 ms
16 bytes from 2605:6400:10:968:22:da15:28a6:c800, icmp_seq=1 hlim=53 time=41.984 ms

*snip*

16 bytes from 2605:6400:10:968:22:da15:28a6:c800, icmp_seq=64 hlim=53 time=43.413 ms
^C
--- kyoto.disillusion.net ping6 statistics ---
110 packets transmitted, 64 packets received, 41.8% packet loss
round-trip min/avg/max/std-dev = 41.831/42.388/45.270/0.607 ms
Comment 6 wcarson.bugzilla 2022-04-16 17:17:40 UTC
But if I do the commands again, it doesn't even work for the ~65 seconds:

root@roast:~ # date ; ndp -c ; ping6 -c1 -t2 fe80::1%em0 ; ndp -na ; echo ; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0 ; ndp -na ;
Sat Apr 16 12:16:50 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted
fe80::8678:acff:fe1c:ec41%em0 (fe80::8678:acff:fe1c:ec41%em0) deleted
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0

--- fe80::1%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 13s       R R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R

PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0

--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 11s       R R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
fe80::8678:acff:fe1c:ec41%em0        84:78:ac:1c:ec:41    em0 13s       R R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
^C
--- kyoto.disillusion.net ping6 statistics ---
11 packets transmitted, 0 packets received, 100.0% packet loss
Comment 7 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-20 10:29:51 UTC
It looks like the upstream router's neighbor cache entry expired and then it issued Neighbor Solicitation messages but the host ignored them.

Can you please confirm that there're no firewall rules on your FreeBSD host blocking the NS messages? You can disable IPFW / PF / IPF and reboot to get a clean environment.

Check PF:
# pfctl -s Running

For IPFW:
# ipfw show

And also check statistics for ICMP6:
# netstat -sp icmp6
Comment 8 wcarson.bugzilla 2022-04-20 21:08:17 UTC
I do have these lines in my pf.conf, which have worked for the past many years and not changed. (I also double-checked by comparing to a backup from 2020.) 

    icmp6_types="{ 2, 128 }" # packet too big, echo request (ping6)
    # Neighbor Discovery Protocol (NDP) (types 133-137):
    #   Router Solicitation (RS), Router Advertisement (RA)
    #   Neighbor Solicitation (NS), Neighbor Advertisement (NA)
    #   Route Redirection
    icmp6_types_ext_if="{ 128, 133, 134, 135, 136, 137 }"

    pass in quick on $ext_if inet6 proto ipv6-icmp icmp6-type $icmp6_types keep state
    pass in quick on $ext_if inet6 proto ipv6-icmp from any to { $ext_if, ff02::1/16 } icmp6-type $icmp6_types_ext_if keep state


Additionally, I turned off pf completely (via /etc/rc.conf, pf_enable="NO", and rebooted) -- no change.

root@roast:~ # pfctl -d
pf disabled
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
^C
--- kyoto.disillusion.net ping6 statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # date ; ndp -c ; ping6 -c1 -t2 fe80::1%em0 ; ndp -na ; echo ; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0 ; ndp -na ;
Wed Apr 20 16:05:40 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted
fe80::8678:acff:fe1c:ec41%em0 (fe80::8678:acff:fe1c:ec41%em0) deleted
fe80::e6c7:22ff:fe10:9cc1%em0 (fe80::e6c7:22ff:fe10:9cc1%em0) deleted
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0

--- fe80::1%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 23h59m58s S R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R

PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0

--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor                             Linklayer Address  Netif Expire    S Flags
2600:3c00:e000:137::1:1              f2:3c:91:b0:a5:6f    em0 permanent R
fe80::1%em0                          00:05:73:a0:0f:ff    em0 23h59m56s S R
2600:3c00:e000:137::1                f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::3:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00:e000:137::2:1              f2:3c:91:b0:a5:6f    em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f       f2:3c:91:b0:a5:6f    em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0        f2:3c:91:b0:a5:6f    em0 permanent R
fe80::8678:acff:fe1c:ec41%em0        84:78:ac:1c:ec:41    em0 16s       R R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f   em0 permanent R
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
^C
--- kyoto.disillusion.net ping6 statistics ---
6 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 ipv6.google.com
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2607:f8b0:4023:1000::71
^C
--- ipv6.l.google.com ping6 statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss

It seems it thinks there are lots of bad Neighbor Solicitation messages? Is there a way to understand why it thinks they're bad?


root@roast:~ # netstat -sp icmp6
icmp6:
        1717 calls to icmp6_error
        0 errors not generated in response to an icmp6 message
        0 errors not generated because of rate limitation
        Output histogram:
                unreach: 1717
                echo: 82607
                echo reply: 3
                neighbor solicitation: 8200
                neighbor advertisement: 1120
                MLDv2 listener report: 4
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        Input histogram:
                unreach: 1715
                echo: 3
                echo reply: 105
                router advertisement: 485020
                neighbor solicitation: 359208
                neighbor advertisement: 8191
        Histogram of error messages to be generated:
                0 no route
                0 administratively prohibited
                0 beyond scope
                0 address unreachable
                1717 port unreachable
                0 packet too big
                0 time exceed transit
                0 time exceed reassembly
                0 erroneous header field
                0 unrecognized next header
                0 unrecognized option
                0 redirect
                0 unknown
        3 message responses generated
        0 messages with too many ND options
        0 messages with bad ND options
        357910 bad neighbor solicitation messages   <-----
        0 bad neighbor advertisement messages
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 bad redirect messages
        0 default routers overflows
        0 prefix overflows
        0 neighbour entries overflows
        0 redirect overflows
        0 messages with invalid hop limit
        0 path MTU changes
Comment 9 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-21 02:15:23 UTC
(In reply to wcarson.bugzilla from comment #8)
> It seems it thinks there are lots of bad Neighbor Solicitation messages? 

neighbor solicitation: 359208
357910 bad neighbor solicitation messages

It is about 99.64% bad NS messages. Looks weird.
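
The percentage can be re-derived from the two counters quoted above. This is only a trivial sketch; the counter values are copied from the `netstat -sp icmp6` output in comment #8.

```python
def bad_share(bad: int, total: int) -> float:
    """Percentage of received messages that were counted as bad."""
    return 100.0 * bad / total


ns_received = 359208  # "neighbor solicitation" in the input histogram
ns_bad = 357910       # "bad neighbor solicitation messages"
print(f"{bad_share(ns_bad, ns_received):.2f}% of received NS were rejected")
```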

> Is there a way to understand why it thinks they're bad?

Yes, there is a sysctl knob 'net.inet6.icmp6.nd6_debug', which is off by default. You can turn it on.
# sysctl net.inet6.icmp6.nd6_debug=1

And then monitor the log from kernel:
# tail -F /var/log/messages
Comment 10 wcarson.bugzilla 2022-04-21 05:50:54 UTC
I’m not sure what to make of this, but it does seem to be what I described originally:

Apr 21 00:39:53 roast kernel: nd6_ns_input: NS packet from non-neighbor
Apr 21 00:39:53 roast kernel: nd6_ns_input: src=2600:3c00::8678:acff:fe1c:ec41
Apr 21 00:39:53 roast kernel: nd6_ns_input: dst=ff02:1::1:ffb0:a56f
Apr 21 00:39:53 roast kernel: nd6_ns_input: tgt=2600:3c00::f03c:91ff:feb0:a56f

However in the above debug message it added an extra :1 after ff02 in the destination that does not appear in the tcpdump. Is that normal?

Also how does it decide what is a non-neighbor? The src & tgt look to be on the same /64 to me.
Comment 11 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-21 09:15:06 UTC
(In reply to wcarson.bugzilla from comment #10)

> However in the above debug message it added an extra :1 after ff02 in the 
> destination that does not appear in the tcpdump. Is that normal?
It may be the embedded form of IPv6 link-local scoped address, see https://docs.freebsd.org/en/books/developers-handbook/ipv6/#ipv6-scope-index .
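
To illustrate what the embedded form means: as described in the handbook section linked above, the kernel stores the zone (interface) index in the second 16-bit word of link-local scoped addresses. The sketch below is a reader-side decoder under that assumption, not kernel code, and the function name is invented for illustration.

```python
import ipaddress


def split_embedded_scope(addr: str):
    """Return (address without the embedded zone, zone index or None)."""
    packed = bytearray(ipaddress.IPv6Address(addr).packed)
    is_ll_unicast = ipaddress.IPv6Address(bytes(packed)).is_link_local
    # Multicast with interface-local (1) or link-local (2) scope.
    is_ll_multicast = packed[0] == 0xFF and (packed[1] & 0x0F) in (1, 2)
    if not (is_ll_unicast or is_ll_multicast):
        return str(ipaddress.IPv6Address(bytes(packed))), None
    zone = int.from_bytes(packed[2:4], "big")
    packed[2:4] = b"\x00\x00"  # clear the embedded index
    return str(ipaddress.IPv6Address(bytes(packed))), zone


print(split_embedded_scope("ff02:1::1:ffb0:a56f"))
```

Under this reading, the `dst=ff02:1::1:ffb0:a56f` in the debug log is the same group as ff02::1:ffb0:a56f in the tcpdump, with zone index 1 matching `scopeid 0x1` for em0.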


If your host is not in production, try turning on 'net.inet6.icmp6.nd6_onlink_ns_rfc4861' to see if it helps.

# sysctl net.inet6.icmp6.nd6_onlink_ns_rfc4861=1

Be aware that this knob exists to prevent CVE-2008-2476, see also https://www.freebsd.org/security/advisories/FreeBSD-SA-08:10.nd6.asc.

The symptom is weird, and I could not reproduce it.

Do you have multiple fibs? Check these:
# sysctl net.fibs
# sysctl net.add_addr_allfibs
# ifconfig em0 | grep fib

It will also be helpful if you provide traffic dumps.
# tcpdump -nvi em0 'icmp6' -w dump.pcap
and then 
# service rtsold restart && sleep 3 && ndp -c && ping6 ipv6.google.com
Comment 12 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-21 10:18:35 UTC
(In reply to Zhenlei Huang from comment #11)
I'm not sure if it is the same as wcarson.bugzilla's situation, but I managed to reproduce 'nd6_ns_input: NS packet from non-neighbor' by setting up a router that advertises prefixes without the 'on-link' flag.

I'll confirm when @wcarson.bugzilla shares the traffic dumps.
Comment 13 wcarson.bugzilla 2022-04-21 13:59:01 UTC
I'm not sure how to answer your FIB questions, but here are the results of the commands:

root@roast:~ # sysctl net.fibs
net.fibs: 1
root@roast:~ # sysctl net.add_addr_allfibs
net.add_addr_allfibs: 0
root@roast:~ # ifconfig em0 | grep fib
root@roast:~ #

Also as soon as I changed net.inet6.icmp6.nd6_onlink_ns_rfc4861 -> 1, it started working. I find this very surprising because I hadn't even provisioned this server yet in 2008, and IPv6 had worked for many years up until just recently. Is the on-link flag a setting my provider could have changed?

Changing back net.inet6.icmp6.nd6_onlink_ns_rfc4861 -> 0 I expected it to break so I could take a packet capture, and ... well, even after more than 15 minutes it's still working. 

I did still have this in my scrollback, but I don't know if it has the data you're looking for since it's not a full capture (tcpdump -nnvvv):

00:43:51.408006 IP6 (class 0xe0, hlim 255, next-header ICMPv6 (58) payload length: 32) 2600:3c00::e6c7:22ff:fe10:9cc1 > ff02::1:ffb0:a56f: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2600:3c00::f03c:91ff:feb0:a56f
          source link-address option (1), length 8 (1): e4:c7:22:10:9c:c1
            0x0000:  e4c7 2210 9cc1
00:43:51.408416 IP6 (class 0xe0, hlim 255, next-header ICMPv6 (58) payload length: 32) 2600:3c00::e6c7:22ff:fe10:9cc1 > ff02::1:ffb0:a56f: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2600:3c00::f03c:91ff:feb0:a56f
          source link-address option (1), length 8 (1): e4:c7:22:10:9c:c1
            0x0000:  e4c7 2210 9cc1

I will keep a close eye on this and try a reboot to see if it persists. If it breaks again I will take a packet capture as described.

Thank you so much for all your help thus far!
Comment 14 wcarson.bugzilla 2022-04-22 13:36:06 UTC
Ok, after a reboot the problem comes back. It seems to work very briefly and stops. (This is with nd6_onlink_ns_rfc4861 set to 0.)

root@roast:~ # service rtsold onerestart && sleep 3 && ndp -c && ping6 ipv6.google.com
rtsold not running? (check /var/run/rtsold.pid).
Starting rtsold.
fe80::1%em0 (fe80::1%em0) deleted
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2607:f8b0:4000:805::200e
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=0 hlim=121 time=74.918 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=1 hlim=121 time=1.429 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=2 hlim=121 time=1.257 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=3 hlim=121 time=1.309 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=4 hlim=121 time=1.316 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=5 hlim=121 time=1.328 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=6 hlim=121 time=1.376 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=7 hlim=121 time=1.335 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=8 hlim=121 time=1.374 ms
^C
--- ipv6.l.google.com ping6 statistics ---
25 packets transmitted, 9 packets received, 64.0% packet loss
round-trip min/avg/max/std-dev = 1.257/9.516/74.918/23.123 ms
root@roast:~ #
Comment 15 wcarson.bugzilla 2022-04-22 13:36:36 UTC
Created attachment 233394 [details]
Packet capture
Comment 16 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-25 04:42:40 UTC
(In reply to wcarson.bugzilla from comment #15)
From the pcap you provided, I see your upstream router is advertising the prefix without the 'on-link' flag. It is basically the same as my test environment in comment #12, except that your provider uses HSRP to achieve first-hop router failover.
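
As a rough model of why a missing 'on-link' flag leads to the 'NS packet from non-neighbor' log in comment #10: a prefix only counts as on-link when the L bit (0x80) is set in the RA's Prefix Information option, and with the strict check an NS from a global source that falls in no on-link prefix is dropped. The sketch below is a simplified illustration of that idea, not the kernel's actual logic; the function names and the example flag bytes are assumptions.

```python
import ipaddress

ON_LINK_FLAG = 0x80     # L bit of the Prefix Information option (RFC 4861)
AUTONOMOUS_FLAG = 0x40  # A bit: prefix may be used for SLAAC


def on_link_prefixes(advertised):
    """Keep only the advertised prefixes whose L bit is set."""
    return [ipaddress.IPv6Network(prefix)
            for prefix, flags in advertised if flags & ON_LINK_FLAG]


def is_neighbor(src: str, prefixes) -> bool:
    """Simplified check: link-local sources, or sources in an on-link prefix."""
    ip = ipaddress.IPv6Address(src)
    return ip.is_link_local or any(ip in net for net in prefixes)


router = "2600:3c00::8678:acff:fe1c:ec41"
# Hypothetical RA contents: A set and L clear, versus both set.
without_l = on_link_prefixes([("2600:3c00::/64", AUTONOMOUS_FLAG)])
with_l = on_link_prefixes([("2600:3c00::/64", ON_LINK_FLAG | AUTONOMOUS_FLAG)])
print(is_neighbor(router, without_l), is_neighbor(router, with_l))
```

In this model the host still autoconfigures an address from the prefix (A bit), yet it does not treat the router's global address as a neighbor, so the NS goes unanswered.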

I can confirm that CentOS 8 works fine in this case. It seems Linux is not affected by CVE-2008-2476.

The problem is a little complicated. If advertising IPv6 prefixes without the 'on-link' flag is considered good practice for network admins, I think the problem will eventually become common. As `net.inet6.icmp6.nd6_onlink_ns_rfc4861` has some side effects, I wonder if there is a better way to address CVE-2008-2476.

The problem affects 12.3, 13.0, 13.1-RC4, stable/13 and current. To work around it, set 'net.inet6.icmp6.nd6_onlink_ns_rfc4861' to non-zero.

As IPv6 addresses are plentiful, most cloud providers give each customer at least one dedicated /64 block. In this case I think CVE-2008-2476 could not happen, thus it is safe for providers to advertise prefixes with the 'on-link' flag, or for FreeBSD users to change `net.inet6.icmp6.nd6_onlink_ns_rfc4861` to non-zero.

@wcarson.bugzilla you can contact your provider to confirm if the prefix 2600:3c00::/64 is dedicated for your host.
Comment 17 wcarson.bugzilla 2022-04-26 01:25:14 UTC
(In reply to Zhenlei Huang from comment #16)

Hmm, I don't think 2600:3c00::/64 is dedicated to my host, however 2600:3c00:e000:0137::/64 is. Here is the response I got back from my provider:

#####
To be perfectly clear: 2600:3c00::/64 is shared insofar as other customers have VMs with IPv6 addresses on the same IPv6 subnet. Your SLAAC-assigned IPv6 address - 2600:3c00::f03c:91ff:feb0:a56f - is a /128 range, which essentially makes it a single IPv6 address which only your VM can use. The addresses within your /64 range - 2600:3c00:e000:0137::/64 - can only be used by the VM it's routed towards.
#####
Comment 18 Zhenlei Huang freebsd_committer freebsd_triage 2022-04-26 03:42:20 UTC
(In reply to wcarson.bugzilla from comment #17)

> To be perfectly clear: 2600:3c00::/64 is shared insofar as other customers have VMs
> with IPv6 addresses on the same IPv6 subnet. Your SLAAC-assigned IPv6 address -
> 2600:3c00::f03c:91ff:feb0:a56f - is a /128 range, which essentially makes it a
> single IPv6 address which only your VM can use.

If other customers do NOT send spoofed NS packets, or your provider has a means of preventing spoofed NS packets, then it is safe to turn on 'net.inet6.icmp6.nd6_onlink_ns_rfc4861'.

> The addresses within your /64 range - 2600:3c00:e000:0137::/64 - can only be used
> by the VM it's routed towards.

Since your provider gave you a routed /64 block, the upstream router should have a route to it, so the SLAAC-assigned IPv6 address 2600:3c00::f03c:91ff:feb0:a56f is not required and you can use 2600:3c00:e000:0137::/64 directly. You can leave 'nd6_onlink_ns_rfc4861' untouched and try this:

# ping6 -S 2600:3c00:e000:0137:cafe:8a2e:0370:7334 ipv6.google.com

or disable SLAAC:

# service rtsold stop
# ifconfig em0 inet6 -accept_rtadv
# ifconfig em0 inet6 2600:3c00::f03c:91ff:feb0:a56f delete
# route -6 get default || route -6 add default fe80::1%em0
# ping6 ipv6.google.com
Comment 19 wcarson.bugzilla 2022-04-26 14:06:04 UTC
(In reply to Zhenlei Huang from comment #18)

I've asked them to describe any technology in place to mitigate spoofed NS messages, but I've not yet heard back.

Unfortunately, I think the dedicated /64 is routed to my SLAAC address, as it times out after a while if I remove the address.
Comment 20 wcarson.bugzilla 2022-04-27 02:11:45 UTC
I was able to confirm there is filtering in place to prevent NS spoofing, so at least for me the resolution is the sysctl tunable. Thank you very much for your help figuring this out!

I do wonder if this will become a common issue, but you're much more capable of determining that and the appropriate resolution than I am :) If I can provide any more data, I'm certainly happy to.
Comment 21 Robert Blayzor 2024-05-30 02:54:01 UTC
I have run into this issue a few times now. I have seen hosts try to ping my IPv6 hosts from off network and they just hang. Upon investigation I found that the host is ignoring NS messages from the router, even though the host has NDP entries for both the link-local and the global IP address, i.e.:

# ndp -an | grep lagg1 | grep 00:09:0f
2607:f058:xx::1                      00:09:0f:09:00:01  lagg1 23h34m17s S R
fe80::209:fff:fe09:1%lagg1           00:09:0f:09:00:01  lagg1 23h33m53s S R



The PCAP shows NS messages from the router, but there is zero response; the host just ignores them. No firewall is enabled at all.

I can ping from other hosts on the same subnet, that seems to work.

The kicker is, if I ping6 FROM the host to the router it takes about 5 seconds (give or take) and then you're able to ping the gateway again. Once this happens, packets from remote are able to ping and traffic flows again.

If I stop sending traffic and let things sit for about a minute, the process repeats again. NDP sol messages from the router are ignored again and remain broken until I ping the router from the host again.

If I keep a continuous ping from a host off link, it will never fail. This seems to be some type of NDP timeout/cache issue.

I have tried setting net.inet6.icmp6.nd6_onlink_ns_rfc4861=1, but that does not seem to solve the problem.

I am currently seeing this on 13.1-RELEASE-p9 which is on a TrueNAS host. While I realize 13.3 is current, TrueNAS seems to lag a little behind. I do have other TrueNAS hosts running this version that don't seem to experience this issue. (at least I've not reliably reproduced it on other machines)

I have tried just rebooting the host, but I CAN reliably reproduce this issue.

I have no other ND issues from the router to other hosts on this network. I have confirmed the host *is* receiving the NS messages; it just never replies.
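
For anyone wanting to repeat this check, a capture limited to Neighbor Solicitation and Neighbor Advertisement packets makes the missing replies easy to spot. This is only a sketch: the helper name `nd_filter` is hypothetical, and the filter assumes the ICMPv6 header directly follows the fixed IPv6 header (no extension headers).

```shell
#!/bin/sh
# Hypothetical helper: print a pcap filter that matches ICMPv6 Neighbor
# Solicitation (type 135) and Neighbor Advertisement (type 136).
# ip6[40] is the first byte after the fixed 40-byte IPv6 header, which is
# the ICMPv6 type when no extension headers are present.
nd_filter() {
    printf '%s\n' 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'
}

# On the host (as root), using lagg1 as in the ndp output above:
#   tcpdump -n -i lagg1 "$(nd_filter)"
```

If solicitations from the router show up with no advertisement going back, the host is receiving the NS but not answering it.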
Comment 22 Florent Delahaye 2024-08-02 18:14:48 UTC
Hello,

I have the same issue with the same symptoms as the OP, and the net.inet6.icmp6.nd6_onlink_ns_rfc4861=1 trick solved it for me too.

I have a router (router1) announcing a prefix that FreeBSD (and other hosts) use with SLAAC. There is another router (router2) on a different prefix that announces only a route to itself (its prefix is not announced, since no SLAAC or DHCP is expected on that prefix). All devices (routers and hosts) are on the same segment.
The FreeBSD host receives all RAs and its routing table is properly populated. If I ping from the FreeBSD host to router2, router2 sends back an NS and the FreeBSD host never replies.

I have not checked the code but https://www.freebsd.org/security/advisories/FreeBSD-SA-08:10.nd6.asc says "The solution described below causes IPv6 Neighbor Discovery Neighbor Solicitation messages from non-neighbors to be ignored"
-> It seems the patch misinterprets the definition of a neighbor since all hosts sharing a segment are neighbors.

FYI no issues with Linux/Windows/Android stacks.

Florent