Summary: | IPv6 system not responding to Neighbor Solicitation | |
---|---|---|---
Product: | Base System | Reporter: | wcarson.bugzilla
Component: | kern | Assignee: | freebsd-net (Nobody) <net>
Status: | New --- | |
Severity: | Affects Only Me | CC: | bugfreebsd, colin, grahamperrin, rblayzor, zlei
Priority: | --- | |
Version: | 13.0-RELEASE | |
Hardware: | amd64 | |
OS: | Any | |
Attachments: | | |
Description
wcarson.bugzilla
2022-04-15 00:01:44 UTC
The `ipv6_privacy` setting is meant to use and prefer privacy addresses as per RFC 4941. It should not be related to this issue.

The `ndp -na` result shows that the neighbor cache entries for your router / gw are stale.

fe80::1%em0 00:05:73:a0:0f:ff em0 23h56m36s S R <---
...
fe80::8678:acff:fe1c:ec41%em0 84:78:ac:1c:ec:41 em0 23h49m7s S R <---

Can you please try to ping your gw and share the ndp result?

```
# ping6 fe80::1%em0
and
# ping6 fe80::8678:acff:fe1c:ec41%em0
```

(In reply to Zhenlei Huang from comment #1)
Also try to ping the gw's unicast IPv6 address and see if it is reachable.

# ping6 2600:3c00::8678:acff:fe1c:ec41

The router seems to be immediately stale. If I clear all entries (ndp -c), they show up but are stale:

root@roast:~ # date ; ndp -c ; echo ; date ; ndp -na ; echo ; sleep 5 ; date ; ndp -na
Fri Apr 15 12:01:12 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted

Fri Apr 15 12:01:12 CDT 2022
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R

Fri Apr 15 12:01:17 CDT 2022
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 23h59m56s S R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R

Also, I think they block ping on their
routers, or at least I don't think it worked in Linux either.

root@roast:~ # ping ipv6.google.com
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2607:f8b0:4023:1000::64
^C
--- ipv6.l.google.com ping6 statistics ---
24 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 2600:3c00::8678:acff:fe1c:ec41
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2600:3c00::8678:acff:fe1c:ec41
^C
--- 2600:3c00::8678:acff:fe1c:ec41 ping6 statistics ---
10 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 fe80::1%em0
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0
^C
--- fe80::1%em0 ping6 statistics ---
7 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 fe80::8678:acff:fe1c:ec41%em0
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0
^C
--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
243 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ #

(In reply to wcarson.bugzilla from comment #3)
Try this:

# ndp -nr
# date; ndp -c; ping6 -c1 -t2 fe80::1%em0; ndp -na; echo; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0; ndp -na;

and

# date; ndp -c; ping6 -c1 -t2 2607:f8b0:4023:1000::64; ndp -na;

Interestingly after issuing those commands it worked for about 65 seconds:

root@roast:~ # ndp -nr
fe80::1%em0 if=em0, flags=, pref=medium, expire=11s
root@roast:~ # date; ndp -c; ping6 -c1 -t2 fe80::1%em0; ndp -na; echo; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0; ndp -na;
Sat Apr 16 12:11:06 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted
fe80::8678:acff:fe1c:ec41%em0 (fe80::8678:acff:fe1c:ec41%em0) deleted
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0

--- fe80::1%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 13s R R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R

PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0

--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 11s R R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
fe80::8678:acff:fe1c:ec41%em0 84:78:ac:1c:ec:41 em0 13s R R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
16 bytes from 2605:6400:10:968:22:da15:28a6:c800, icmp_seq=0 hlim=53 time=42.126 ms
16 bytes from 2605:6400:10:968:22:da15:28a6:c800, icmp_seq=1 hlim=53 time=41.984 ms
*snip*
16 bytes from 2605:6400:10:968:22:da15:28a6:c800, icmp_seq=64 hlim=53 time=43.413 ms
^C
--- kyoto.disillusion.net ping6 statistics ---
110 packets transmitted, 64 packets received, 41.8% packet loss
round-trip min/avg/max/std-dev = 41.831/42.388/45.270/0.607 ms

But if I do the commands again, it doesn't even work for the ~65 seconds:

root@roast:~ # date ; ndp -c ; ping6 -c1 -t2 fe80::1%em0 ; ndp -na ; echo ; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0 ; ndp -na ;
Sat Apr 16 12:16:50 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted
fe80::8678:acff:fe1c:ec41%em0 (fe80::8678:acff:fe1c:ec41%em0) deleted
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0

--- fe80::1%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 13s R R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R

PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0

--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 11s R R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
fe80::8678:acff:fe1c:ec41%em0 84:78:ac:1c:ec:41 em0 13s R R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
^C
--- kyoto.disillusion.net ping6 statistics ---
11 packets transmitted, 0 packets received, 100.0% packet loss

It looks like the upstream router's neighbor cache entry expired and then it issued Neighbor Solicitation messages but the host ignored them.
Can you please confirm that there are no firewall rules on your FreeBSD host blocking the NS messages? You can disable IPFW / PF / IPF and reboot to get a clean environment.

Check PF:
# pfctl -s Running
For IPFW:
# ipfw show
And also check statistics for ICMP6:
# netstat -sp icmp6

I do have these lines in my pf.conf, which have worked for many years and have not changed. (I also double-checked by comparing to a backup from 2020.)

icmp6_types="{ 2, 128 }" # packet too big, echo request (ping6)
# Neighbor Discovery Protocol (NDP) (types 133-137):
# Router Solicitation (RS), Router Advertisement (RA)
# Neighbor Solicitation (NS), Neighbor Advertisement (NA)
# Route Redirection
icmp6_types_ext_if="{ 128, 133, 134, 135, 136, 137 }"
pass in quick on $ext_if inet6 proto ipv6-icmp icmp6-type $icmp6_types keep state
pass in quick on $ext_if inet6 proto ipv6-icmp from any to { $ext_if, ff02::1/16 } icmp6-type $icmp6_types_ext_if keep state

Additionally, I turned off pf completely (via /etc/rc.conf, pf_enable="NO", and rebooted) -- no change.
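As a quick sanity check of the pf.conf excerpt above, the following sketch (not part of the original report) confirms that ICMPv6 type 135 (Neighbor Solicitation) is in the allowed list and that the solicited-node multicast destination seen later in this thread falls inside `ff02::1/16`. It assumes pf applies the /16 mask and ignores the host bits; the type names are taken from the comments in the excerpt.

```python
import ipaddress

# ICMPv6 type numbers as described in the pf.conf comments above.
ICMP6_TYPE_NAMES = {
    2: "packet too big",
    128: "echo request",
    133: "router solicitation",
    134: "router advertisement",
    135: "neighbor solicitation",
    136: "neighbor advertisement",
    137: "redirect",
}

# Values copied from the icmp6_types_ext_if macro.
icmp6_types_ext_if = [128, 133, 134, 135, 136, 137]

# Assumption: pf treats "ff02::1/16" as the masked network ff02::/16.
multicast_dst = ipaddress.IPv6Network("ff02::1/16", strict=False)

# Destination of the Neighbor Solicitations reported in this thread.
ns_destination = ipaddress.IPv6Address("ff02::1:ffb0:a56f")

ns_type_allowed = 135 in icmp6_types_ext_if
ns_dst_matches = ns_destination in multicast_dst

print("allowed types:", [ICMP6_TYPE_NAMES[t] for t in icmp6_types_ext_if])
print("NS type allowed:", ns_type_allowed)
print("NS destination matches rule:", ns_dst_matches)
```

Under that assumption both checks pass, which is consistent with the observation that disabling pf made no difference.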
root@roast:~ # pfctl -d
pf disabled
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
^C
--- kyoto.disillusion.net ping6 statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # date ; ndp -c ; ping6 -c1 -t2 fe80::1%em0 ; ndp -na ; echo ; ping6 -c1 -t2 fe80::8678:acff:fe1c:ec41%em0 ; ndp -na ;
Wed Apr 20 16:05:40 CDT 2022
fe80::1%em0 (fe80::1%em0) deleted
fe80::8678:acff:fe1c:ec41%em0 (fe80::8678:acff:fe1c:ec41%em0) deleted
fe80::e6c7:22ff:fe10:9cc1%em0 (fe80::e6c7:22ff:fe10:9cc1%em0) deleted
PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::1%em0

--- fe80::1%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 23h59m58s S R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R

PING6(56=40+8+8 bytes) fe80::f03c:91ff:feb0:a56f%em0 --> fe80::8678:acff:fe1c:ec41%em0

--- fe80::8678:acff:fe1c:ec41%em0 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
Neighbor Linklayer Address Netif Expire S Flags
2600:3c00:e000:137::1:1 f2:3c:91:b0:a5:6f em0 permanent R
fe80::1%em0 00:05:73:a0:0f:ff em0 23h59m56s S R
2600:3c00:e000:137::1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::3:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00:e000:137::2:1 f2:3c:91:b0:a5:6f em0 permanent R
2600:3c00::f03c:91ff:feb0:a56f f2:3c:91:b0:a5:6f em0 permanent R
fe80::f03c:91ff:feb0:a56f%em0 f2:3c:91:b0:a5:6f em0 permanent R
fe80::8678:acff:fe1c:ec41%em0 84:78:ac:1c:ec:41 em0 16s R R
2600:3c00:e000:137:cafe:8a2e:370:7334 f2:3c:91:b0:a5:6f em0 permanent R
root@roast:~ # ping6 kyoto.disillusion.net
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2605:6400:10:968:22:da15:28a6:c800
^C
--- kyoto.disillusion.net ping6 statistics ---
6 packets transmitted, 0 packets received, 100.0% packet loss
root@roast:~ # ping6 ipv6.google.com
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2607:f8b0:4023:1000::71
^C
--- ipv6.l.google.com ping6 statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss

It seems it thinks there are lots of bad Neighbor Solicitation messages? Is there a way to understand why it thinks they're bad?

root@roast:~ # netstat -sp icmp6
icmp6:
  1717 calls to icmp6_error
  0 errors not generated in response to an icmp6 message
  0 errors not generated because of rate limitation
  Output histogram:
    unreach: 1717
    echo: 82607
    echo reply: 3
    neighbor solicitation: 8200
    neighbor advertisement: 1120
    MLDv2 listener report: 4
  0 messages with bad code fields
  0 messages < minimum length
  0 bad checksums
  0 messages with bad length
  Input histogram:
    unreach: 1715
    echo: 3
    echo reply: 105
    router advertisement: 485020
    neighbor solicitation: 359208
    neighbor advertisement: 8191
  Histogram of error messages to be generated:
    0 no route
    0 administratively prohibited
    0 beyond scope
    0 address unreachable
    1717 port unreachable
    0 packet too big
    0 time exceed transit
    0 time exceed reassembly
    0 erroneous header field
    0 unrecognized next header
    0 unrecognized option
    0 redirect
    0 unknown
  3 message responses generated
  0 messages with too many ND options
  0 messages with bad ND options
  357910 bad neighbor solicitation messages <-----
  0 bad neighbor advertisement messages
  0 bad router solicitation messages
  0 bad router advertisement messages
  0 bad redirect messages
  0 default routers overflows
  0 prefix overflows
  0 neighbour entries overflows
  0 redirect overflows
  0 messages with invalid hop limit
  0 path MTU changes

(In reply to wcarson.bugzilla from comment #8)
> It seems it thinks there are lots of bad Neighbor Solicitation messages?

neighbor solicitation: 359208
357910 bad neighbor solicitation messages

That is about 99.64% bad NS messages (357910 of 359208). Looks weird.

> Is there a way to understand why it thinks they're bad?

Yes, there is a sysctl knob 'net.inet6.icmp6.nd6_debug', which is off by default. You can turn it on:
# sysctl net.inet6.icmp6.nd6_debug=1
And then monitor the kernel log:
# tail -F /var/log/messages

I'm not sure what to make of this, but it does seem to be what I described originally:

Apr 21 00:39:53 roast kernel: nd6_ns_input: NS packet from non-neighbor
Apr 21 00:39:53 roast kernel: nd6_ns_input: src=2600:3c00::8678:acff:fe1c:ec41
Apr 21 00:39:53 roast kernel: nd6_ns_input: dst=ff02:1::1:ffb0:a56f
Apr 21 00:39:53 roast kernel: nd6_ns_input: tgt=2600:3c00::f03c:91ff:feb0:a56f

However, in the above debug message it added an extra :1 after ff02 in the destination that does not appear in the tcpdump. Is that normal? Also, how does it decide what is a non-neighbor? The src & tgt look to be on the same /64 to me.

(In reply to wcarson.bugzilla from comment #10)
> However in the above debug message it added an extra :1 after ff02 in the
> destination that does not appear in the tcpdump. Is that normal?

It may be the embedded form of an IPv6 link-local scoped address, see https://docs.freebsd.org/en/books/developers-handbook/ipv6/#ipv6-scope-index .

If your host is not in production, try turning on 'net.inet6.icmp6.nd6_onlink_ns_rfc4861' to see if it helps.
# sysctl net.inet6.icmp6.nd6_onlink_ns_rfc4861=1
Be aware that this knob exists to prevent CVE-2008-2476, see also https://www.freebsd.org/security/advisories/FreeBSD-SA-08:10.nd6.asc.

The symptom is weird, and I could not reproduce it. Do you have multiple fibs? Check these:
# sysctl net.fibs
# sysctl net.add_addr_allfibs
# ifconfig em0 | grep fib

It will also be helpful if you provide traffic dumps.
# tcpdump -nvi em0 'icmp6' -w dump.pcap
and then
# service rtsold restart && sleep 3 && ndp -c && ping6 ipv6.google.com

(In reply to Zhenlei Huang from comment #11)
I'm not sure if it is the same as wcarson.bugzilla's situation, but I managed to reproduce 'nd6_ns_input: NS packet from non-neighbor' by setting up a router that advertises prefixes without the 'on-link' flag.

I'll confirm when @wcarson.bugzilla shares the traffic dumps.

I'm not sure how to answer your FIB questions, but here are the results of the commands:

root@roast:~ # sysctl net.fibs
net.fibs: 1
root@roast:~ # sysctl net.add_addr_allfibs
net.add_addr_allfibs: 0
root@roast:~ # ifconfig em0 | grep fib
root@roast:~ #

Also, as soon as I changed net.inet6.icmp6.nd6_onlink_ns_rfc4861 -> 1, it started working. I find this very surprising because I hadn't even provisioned this server yet in 2008, and IPv6 had worked for many years up until just recently. Is the on-link flag a setting my provider could have changed?

After changing net.inet6.icmp6.nd6_onlink_ns_rfc4861 back to 0, I expected it to break so I could take a packet capture, and ... well, even after more than 15 minutes it's still working.
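The two questions raised about the nd6_debug log (the extra `:1` in the destination, and whether src and tgt share a /64) can be checked with a short sketch. It is not from the original thread, and it assumes the log prints the KAME-style embedded form, in which the zone index is stored in the second 16-bit word of a link-local scoped address, as suggested in comment #11. The function names are made up for illustration.

```python
import ipaddress

def strip_embedded_scope(addr: str):
    """Split an embedded-scope address into (plain address, zone index).

    Assumption: the zone index sits in bytes 2-3 of the 16-byte address.
    """
    packed = bytearray(ipaddress.IPv6Address(addr).packed)
    zone = int.from_bytes(packed[2:4], "big")
    packed[2:4] = b"\x00\x00"
    return ipaddress.IPv6Address(bytes(packed)), zone

def same_slash64(a: str, b: str) -> bool:
    """Return True when both addresses fall inside the same /64."""
    net = ipaddress.IPv6Network(f"{a}/64", strict=False)
    return ipaddress.IPv6Address(b) in net

# Values copied from the nd6_ns_input log lines in this thread.
plain_dst, zone = strip_embedded_scope("ff02:1::1:ffb0:a56f")
shared = same_slash64("2600:3c00::8678:acff:fe1c:ec41",
                      "2600:3c00::f03c:91ff:feb0:a56f")

print(plain_dst, zone)
print(shared)
```

With the embedded word removed, the destination matches the one shown by tcpdump, and src and tgt do share 2600:3c00::/64. Sharing a /64 numerically is not the same as the prefix being treated as on-link, which is the distinction the rest of the thread turns on.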
I did still have this in my scrollback, but I don't know if it has the data you're looking for since it's not a full capture (tcpdump -nnvvv):

00:43:51.408006 IP6 (class 0xe0, hlim 255, next-header ICMPv6 (58) payload length: 32) 2600:3c00::e6c7:22ff:fe10:9cc1 > ff02::1:ffb0:a56f: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2600:3c00::f03c:91ff:feb0:a56f
    source link-address option (1), length 8 (1): e4:c7:22:10:9c:c1
      0x0000: e4c7 2210 9cc1
00:43:51.408416 IP6 (class 0xe0, hlim 255, next-header ICMPv6 (58) payload length: 32) 2600:3c00::e6c7:22ff:fe10:9cc1 > ff02::1:ffb0:a56f: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2600:3c00::f03c:91ff:feb0:a56f
    source link-address option (1), length 8 (1): e4:c7:22:10:9c:c1
      0x0000: e4c7 2210 9cc1

I will keep a close eye on this and try a reboot to see if it persists. If it breaks again I will take a packet capture as described.

Thank you so much for all your help thus far!

Ok, after a reboot the problem comes back. It seems to work very briefly and stops. (This is with nd6_onlink_ns_rfc4861 set to 0.)

root@roast:~ # service rtsold onerestart && sleep 3 && ndp -c && ping6 ipv6.google.com
rtsold not running? (check /var/run/rtsold.pid).
Starting rtsold.
fe80::1%em0 (fe80::1%em0) deleted
PING6(56=40+8+8 bytes) 2600:3c00::f03c:91ff:feb0:a56f --> 2607:f8b0:4000:805::200e
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=0 hlim=121 time=74.918 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=1 hlim=121 time=1.429 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=2 hlim=121 time=1.257 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=3 hlim=121 time=1.309 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=4 hlim=121 time=1.316 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=5 hlim=121 time=1.328 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=6 hlim=121 time=1.376 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=7 hlim=121 time=1.335 ms
16 bytes from 2607:f8b0:4000:805::200e, icmp_seq=8 hlim=121 time=1.374 ms
^C
--- ipv6.l.google.com ping6 statistics ---
25 packets transmitted, 9 packets received, 64.0% packet loss
round-trip min/avg/max/std-dev = 1.257/9.516/74.918/23.123 ms
root@roast:~ #

Created attachment 233394
Packet capture
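For readers following the tcpdump excerpt earlier in the thread: the Neighbor Solicitations are addressed to ff02::1:ffb0:a56f, which is the solicited-node multicast address of the target 2600:3c00::f03c:91ff:feb0:a56f. The sketch below is not part of the original report. It derives that address as RFC 4291 describes, by appending the low 24 bits of the unicast address to ff02::1:ff00:0/104. The function name is made up for illustration.

```python
import ipaddress

def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

# Target address copied from the "who has" field of the capture excerpt.
ns_dst = solicited_node_multicast("2600:3c00::f03c:91ff:feb0:a56f")
print(ns_dst)  # ff02::1:ffb0:a56f
```

The result matches the destination in the capture, so the router's solicitations are well-formed as far as the destination address goes. The host's link-local address ends in the same 24 bits and therefore maps to the same group.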
(In reply to wcarson.bugzilla from comment #15)
From the pcap you provided, I see your upstream router is advertising the prefix without the 'on-link' flag. It is basically the same as my test environment in comment #12, except that your provider uses HSRP to achieve first-hop router failover.

I can confirm CentOS 8 works fine in such a case. It seems Linux is not affected by CVE-2008-2476.

The problem is a little complicated. If it is good practice for network admins to advertise IPv6 prefixes without the 'on-link' flag, I think the problem will eventually become common. As `net.inet6.icmp6.nd6_onlink_ns_rfc4861` has some side effects, I wonder if there is a better solution to resolve CVE-2008-2476.

The problem affects 12.3, 13.0, 13.1-RC4, stable/13 and current. To work around it, set 'net.inet6.icmp6.nd6_onlink_ns_rfc4861' to non-zero.

As IPv6 addresses are plentiful, most cloud providers give each customer at least one dedicated /64 block. In this case I think CVE-2008-2476 could not happen, thus it is safe for providers to advertise prefixes with the 'on-link' flag, or for FreeBSD users to change `net.inet6.icmp6.nd6_onlink_ns_rfc4861` to non-zero.

@wcarson.bugzilla you can contact your provider to confirm whether the prefix 2600:3c00::/64 is dedicated to your host.

(In reply to Zhenlei Huang from comment #16)
Hmm, I don't think 2600:3c00::/64 is dedicated to my host, however 2600:3c00:e000:0137::/64 is. Here is the response I got back from my provider:

#####
To be perfectly clear: 2600:3c00::/64 is shared insofar as other customers have VMs with IPv6 addresses on the same IPv6 subnet. Your SLAAC-assigned IPv6 address - 2600:3c00::f03c:91ff:feb0:a56f - is a /128 range, which essentially makes it a single IPv6 address which only your VM can use.

The addresses within your /64 range - 2600:3c00:e000:0137::/64 - can only be used by the VM it's routed towards.
#####

(In reply to wcarson.bugzilla from comment #17)
> To be perfectly clear: 2600:3c00::/64 is shared insofar as other customers have VMs
> with IPv6 addresses on the same IPv6 subnet. Your SLAAC-assigned IPv6 address -
> 2600:3c00::f03c:91ff:feb0:a56f - is a /128 range, which essentially makes it a
> single IPv6 address which only your VM can use.

If other customers do NOT send spoofed NS packets, or your provider has means to prevent spoofed NS packets, then it is safe to turn on 'net.inet6.icmp6.nd6_onlink_ns_rfc4861'.

> The addresses within your /64 range - 2600:3c00:e000:0137::/64 - can only be used
> by the VM it's routed towards.

Since your provider provides a routed /64 block, the upstream router should have a route to this /64 block. In that case the SLAAC-assigned IPv6 address 2600:3c00::f03c:91ff:feb0:a56f is not required, and you can use 2600:3c00:e000:0137::/64 directly.

You can keep 'nd6_onlink_ns_rfc4861' untouched and try this:
# ping6 -S 2600:3c00:e000:0137:cafe:8a2e:0370:7334 ipv6.google.com

or disable SLAAC:
# service rtsold stop
# ifconfig em0 inet6 -accept_rtadv
# ifconfig em0 inet6 2600:3c00::f03c:91ff:feb0:a56f delete
# route -6 get default || route -6 add default fe80::1%em0
# ping6 ipv6.google.com

(In reply to Zhenlei Huang from comment #18)
I've asked them to describe any technology in place to mitigate spoofed NS messages, but I've not yet heard back.

Unfortunately I think the dedicated /64 is routed to my SLAAC address, as it times out after a while if I remove the address.

I was able to confirm there is filtering in place to prevent NS spoofing, so at least for me the resolution is the sysctl tunable. Thank you very much for your help figuring this out!

I do wonder if this will become a common issue, but you're much more capable of determining that and the appropriate resolution than I am :)

If I can provide any more data, I'm certainly happy to.

I have run into this issue now a few times.
I have seen hosts try to ping my IPv6 hosts from off network and they just hang. Upon investigation I have found that the host is hung up on ignoring NS messages from the router, even though on the host we have NDP entries for both the link-local and the global IP address. For example:

# ndp -an | grep lagg1 | grep 00:09:0f
2607:f058:xx::1 00:09:0f:09:00:01 lagg1 23h34m17s S R
fe80::209:fff:fe09:1%lagg1 00:09:0f:09:00:01 lagg1 23h33m53s S R

PCAP shows NS messages from the router, but there is zero response; the host just ignores them. No firewall is enabled at all.

I can ping from other hosts on the same subnet, and that seems to work.

The kicker is, if I ping6 FROM the host to the router, it takes about 5 seconds (give or take) and then you're able to ping the gateway again. Once this happens, packets from remote are able to ping and traffic flows again.

If I stop sending traffic and let things sit for about a minute, the process repeats. NS messages from the router are ignored again and things remain broken until I ping the router from the host again.

If I keep a continuous ping running from a host off link, it will never fail. This seems to be some type of NDP timeout/cache issue.

I have tried setting net.inet6.icmp6.nd6_onlink_ns_rfc4861=1, but that does not seem to solve the problem.

I am currently seeing this on 13.1-RELEASE-p9, which is on a TrueNAS host. While I realize 13.3 is current, TrueNAS seems to lag a little behind. I do have other TrueNAS hosts running this version that don't seem to experience this issue (at least I've not reliably reproduced it on other machines).

I have tried just rebooting the host, but I CAN reliably reproduce this issue. I have no other ND issues from the router to other hosts on this network.

I have confirmed the host *is* receiving the NS messages; it just never replies.

Hello, I have got the same issue with the same symptoms as OP, and the net.inet6.icmp6.nd6_onlink_ns_rfc4861=1 trick solved it too.
I have got a router (router1) announcing a prefix that FreeBSD (and other hosts) use with SLAAC. There is another router (router2) using another prefix and only announcing a route to itself (its prefix is not announced since no SLAAC or DHCP is expected on that prefix). All devices (routers + host) are on the same segment.

The FreeBSD host gets all RAs and its route table is properly populated. If I ping from the FreeBSD host to router2, then router2 sends back an NS and the FreeBSD host never replies.

I have not checked the code, but https://www.freebsd.org/security/advisories/FreeBSD-SA-08:10.nd6.asc says "The solution described below causes IPv6 Neighbor Discovery Neighbor Solicitation messages from non-neighbors to be ignored" -> It seems the patch misinterprets the definition of a neighbor, since all hosts sharing a segment are neighbors.

FYI, no issues with Linux/Windows/Android stacks.

Florent
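To make the "non-neighbor" discussion concrete, here is a deliberately simplified model, not the kernel's actual logic and not taken from the FreeBSD source. It assumes a source address counts as a neighbor only if it is link-local or falls inside a prefix the host currently treats as on-link. The function name and the prefix lists are hypothetical, and the addresses are the ones from the original report.

```python
import ipaddress

def is_neighbor(src, onlink_prefixes):
    """Simplified model: link-local sources and sources inside an
    on-link prefix are neighbors; everything else is not."""
    addr = ipaddress.IPv6Address(src)
    if addr.is_link_local:
        return True
    return any(addr in ipaddress.IPv6Network(p) for p in onlink_prefixes)

# Global source address of the router's NS, from the nd6_debug log.
router_src = "2600:3c00::8678:acff:fe1c:ec41"

# Prefix advertised WITHOUT the on-link flag: the host has no on-link prefix.
without_onlink = is_neighbor(router_src, [])

# Same prefix advertised WITH the on-link flag.
with_onlink = is_neighbor(router_src, ["2600:3c00::/64"])

# A link-local source is always on the same link.
link_local = is_neighbor("fe80::1", [])

print(without_onlink, with_onlink, link_local)
```

In this model, an NS sent from the router's global address is rejected whenever the prefix lacks the on-link flag, even though router and host share a segment. That matches the symptom reported here and the reproduction in comment #12.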