I'm not sure where to begin with this issue, but I'm noticing strange behavior with IPv6 on FreeBSD 13.0-BETA1. I'm not sure if it's my driver or an issue with FreeBSD 13.0 itself. IPv6 routing works properly for a few moments and then suddenly stops. I found out what is missing when it isn't working compared to when it is.

For one, I use RTSOL to get my IPv6 address, so I have the rtsold daemon running. Once IPv6 routing stops working, I can restart the daemon and routing works again, albeit momentarily.

When I run netstat -r while IPv6 routing doesn't work:

Internet6:
Destination        Gateway            Flags     Netif Expire
dead:beef:x:x::    link#1             U           re0
fe80::%re0/64      link#1             U           re0

And when it works:

Internet6:
Destination        Gateway            Flags     Netif Expire
default            fe80::1:1%re0      UG          re0
dead:beef:x:x::    link#1             U           re0
fe80::%re0/64      link#1             U           re0

I left out my loopback info; I can add it if necessary. So for whatever reason (and nothing shows in the logs), the default route vanishes for IPv6. During this time I still have my address and I can still ping6 hosts internally.

uname -a:

FreeBSD towerDefense 13.0-BETA1 FreeBSD 13.0-BETA1 #8 c48cbd025: Wed Feb 10 10:30:48 PST 2021 root@towerDefense:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

I also compiled the realtek-re-kmod port to get networking. Intel Z490 motherboard, Intel 10700k. Here is my ethernet info from pciconf:

re0@pci0:2:0:0: class=0x020000 rev=0x04 hdr=0x00 vendor=0x10ec device=0x8125 subvendor=0x1462 subdevice=0x7c75
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8125 2.5GbE Controller'
    class      = network
    subclass   = ethernet
Is there any chance you could run `route -n monitor > log.txt` in the background, stopping it when the default route disappears?
Here is what I got. I did service rtsold restart and then waited until I lost my route.

Here is the output of ping6 as the route got deleted:

16 bytes from 2607:f8b0:400a:805::200e, icmp_seq=49 hlim=116 time=11.189 ms
16 bytes from 2607:f8b0:400a:805::200e, icmp_seq=50 hlim=116 time=12.354 ms
ping6: sendmsg: No route to host
ping6: wrote ipv6.l.google.com 16 chars, ret=-1

And here is the output of route -n monitor, from before restarting rtsold through to when the route got deleted:

got message of size 248 on Sat Feb 13 11:09:40 2021
RTM_ADD: Add Route: len 248, pid: 0, seq 0, errno 0, flags:<UP,GATEWAY,DONE>
locks:  inits:
sockaddrs: <DST,GATEWAY,NETMASK>
 :: fe80::1:1%re0 ::

got message of size 248 on Sat Feb 13 11:10:41 2021
RTM_DELETE: Delete Route: len 248, pid: 0, seq 0, errno 0, flags:<GATEWAY,DONE>
locks:  inits:
sockaddrs: <DST,GATEWAY,NETMASK>
 :: fe80::1:1%re0 ::
I forgot to note: I found that if I manually add the route, it does not go away:

sudo route -6 add default fe80::1:1%re0

So is rtsol removing the routes? Or is something upstream telling the routes to get removed? Everything in my environment is constant. The only variables are a hardware change from Intel to Realtek (new mobo/processor) and that I'm now using FreeBSD 13.0 instead of my prior 12.2.
Okay, so it's actually not userland that adds/removes the route, as the "PID" value of the rtsock messages is 0. What is the output of "ndp -r" closer to the route expiration? How does "ndp -p" look? Also: do the router(s) on your network send RA messages periodically, or do you have to explicitly rely on rtsol?
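For anyone wanting to repeat the pid check on a longer log, here is a minimal Python sketch. It assumes the log looks like the `route -n monitor` output quoted earlier in this thread; the function names are made up for illustration, and it only recognizes route messages that carry a `pid:` field in that format.

```python
import re

# Matches the header line of one rtsock route message as printed by
# `route -n monitor`, e.g.
# "RTM_ADD: Add Route: len 248, pid: 0, seq 0, errno 0, flags:<...>"
MSG_RE = re.compile(r"(RTM_[A-Z_]+): [^:]+: len \d+, pid: (\d+),")


def parse_route_monitor(text):
    """Return a list of (message_type, pid) tuples found in the log text."""
    return [(m.group(1), int(m.group(2))) for m in MSG_RE.finditer(text)]


def kernel_originated(text):
    """Return only the messages whose pid is 0, i.e. no userland sender."""
    return [msg for msg in parse_route_monitor(text) if msg[1] == 0]


# Header lines copied from the log posted above.
SAMPLE = """\
got message of size 248 on Sat Feb 13 11:09:40 2021
RTM_ADD: Add Route: len 248, pid: 0, seq 0, errno 0, flags:<UP,GATEWAY,DONE>
got message of size 248 on Sat Feb 13 11:10:41 2021
RTM_DELETE: Delete Route: len 248, pid: 0, seq 0, errno 0, flags:<GATEWAY,DONE>
"""

print(kernel_originated(SAMPLE))
```

Both the add and the delete come back with pid 0, which is what rules out rtsold as the process removing the route.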
I believe my router will send out RA messages periodically. I have just been using rtsol to see what changes. Let me verify some of these things on my other 12.2 box so I can give better info in terms of RA. I'm using pfSense and to my understanding it's using RA. I have these in my rc.conf:

ifconfig_re0_ipv6="inet6 accept_rtadv"
ipv6_activate_all_interfaces="YES"

Note the behavior is no different whether I have rtsold enabled and running or not. I can restart netif and get an IPv6 address without rtsol, so I assume my interface is listening for RAs?

> what is the output of "ndp -r" closer to the route expiration? How does "ndp -p" look?

"ndp -p" before the expiration:

dead:beef:x:x::/64 if=re0
flags=LAO vltime=86400, pltime=14400, expire=23h59m30s, ref=1
  advertised by
    fe80::1:1%re0 (reachable)
fe80::%re0/64 if=re0
flags=LAO vltime=infinity, pltime=infinity, expire=Never, ref=1
  No advertising router
fe80::%lo0/64 if=lo0
flags=LAO vltime=infinity, pltime=infinity, expire=Never, ref=1
  No advertising router

"ndp -p" when I lose my route:

dead:beef:x:x::/64 if=re0
flags=LAO vltime=86400, pltime=14400, expire=23h58m53s, ref=1
  No advertising router
fe80::%re0/64 if=re0
flags=LAO vltime=infinity, pltime=infinity, expire=Never, ref=1
  No advertising router
fe80::%lo0/64 if=lo0
flags=LAO vltime=infinity, pltime=infinity, expire=Never, ref=1
  No advertising router

"ndp -r" before I lose my route:

fe80::1:1%re0 if=re0, flags=, pref=high, expire=32s

and afterwards I get no output.
I watched my systems more. My one that is having issues will let my...I can't think of the words, information(?) expire. Watching my 12.2 machine with ndp -r I see the expiration time refresh. It seems an advertisement is sent out every 20 seconds on my network. I can confirm that my desktop is getting those advertisements every 20 seconds too. So for some reason it just isn't applying them? Also, I have to have rtsold running on my systems.
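The difference between the two machines (expire counting down to zero versus being reset by each advertisement) can be checked mechanically. This is a minimal Python sketch that assumes the `expire=` field looks like the `ndp -r` and `ndp -p` output quoted in this thread (for example `32s` or `23h59m30s`); the function names and the sample readings in the usage lines are hypothetical.

```python
import re

# expire=<days>d<hours>h<minutes>m<seconds>s, every unit optional.
EXPIRE_RE = re.compile(r"expire=(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def expire_seconds(line):
    """Return the expire time of one ndp output line in seconds, or None
    when the line has no numeric expire field (e.g. "expire=Never")."""
    m = EXPIRE_RE.search(line)
    if m is None or not any(m.groups()):
        return None
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def was_refreshed(samples):
    """True if the expire timer ever went up between consecutive samples,
    which is what an accepted router advertisement would cause."""
    return any(later > earlier for earlier, later in zip(samples, samples[1:]))


print(expire_seconds("fe80::1:1%re0 if=re0, flags=, pref=high, expire=32s"))
```

Sampling `ndp -r` every few seconds and feeding the parsed values to `was_refreshed` would show a steady countdown such as `[32, 22, 12, 2]` on the affected machine and a sawtooth such as `[32, 12, 30, 10]` on the healthy one.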
Okay, so we're receiving RAs but for some reason they are ignored. Could you: 1) share the output of netstat -sp icmp6? 2) set sysctl net.inet6.icmp6.nd6_debug=1 and check dmesg for any relevant messages?
netstat -sp icmp6:

icmp6:
    2 calls to icmp6_error
    0 errors not generated in response to an icmp6 message
    0 errors not generated because of rate limitation
    Output histogram:
        unreach: 2
        echo: 99
        router solicitation: 11
        neighbor solicitation: 148
        neighbor advertisement: 205
        MLDv2 listener report: 19
    0 messages with bad code fields
    0 messages < minimum length
    0 bad checksums
    0 messages with bad length
    Input histogram:
        echo reply: 82
        router advertisement: 20
        neighbor solicitation: 205
        neighbor advertisement: 136
    Histogram of error messages to be generated:
        0 no route
        0 administratively prohibited
        0 beyond scope
        0 address unreachable
        2 port unreachable
        0 packet too big
        0 time exceed transit
        0 time exceed reassembly
        0 erroneous header field
        0 unrecognized next header
        0 unrecognized option
        0 redirect
        0 unknown
    0 message responses generated
    0 messages with too many ND options
    0 messages with bad ND options
    0 bad neighbor solicitation messages
    0 bad neighbor advertisement messages
    0 bad router solicitation messages
    0 bad router advertisement messages
    0 bad redirect messages
    0 default routers overflows
    0 prefix overflows
    0 neighbour entries overflows
    0 redirect overflows
    0 messages with invalid hop limit
    0 path MTU changes

And for dmesg I saw this:

nd6_options: unsupported option 24 - option ignored
Good. The ICMP6 input histogram shows that we received these RA messages. Do you see multiple "nd6_options: unsupported option 24 - option ignored" messages? As far as I understand, RA messages are received by nd6_ra_input() and processed at least halfway there (as nd6_options() is called and likely does not fail). The next place it can fail is when forwarding is enabled (net.inet6.ip6.forwarding). Do you have it turned off (or have net.inet6.ip6.rfc6204w3 turned on)? If forwarding is turned off, do you mind sharing "ndp -i re0" and showing an incoming RA message, e.g. with "tcpdump -i re0 -lnpvvs0 icmp6"?
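To make the configuration checks being asked about explicit, here is a deliberately simplified Python model. It is only a sketch of the conditions named in this thread (the accept_rtadv interface flag, net.inet6.ip6.forwarding, and net.inet6.ip6.rfc6204w3), not the kernel's actual nd6_ra_input() logic, and the function name is hypothetical.

```python
def ra_can_install_default_router(accept_rtadv, forwarding, rfc6204w3):
    """Simplified model: would configuration alone let a received RA
    install a default router on this interface?"""
    # The interface must be flagged to accept router advertisements.
    if not accept_rtadv:
        return False
    # A forwarding host is expected to ignore the default router from RAs,
    # unless the rfc6204w3 knob is turned on.
    if forwarding and not rfc6204w3:
        return False
    return True


# A host with accept_rtadv set and forwarding off passes both checks.
print(ra_can_install_default_router(accept_rtadv=True, forwarding=False, rfc6204w3=False))
```

If these checks pass and RAs are still not applied, the configuration is not the culprit and the packets themselves (or their delivery) are the next thing to look at.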
(In reply to courtney.hicks1 from comment #0)

To me this sounds like unsolicited RAs (as sent out periodically by your router) don't make it through, while solicited ones (rtsol "asking") work. That may imply that there are filters somewhere not working properly, or that multicast is not properly received.

If you do see the unsolicited RAs from your router in

tcpdump -ln -s0 -i re0 -vvv icmp6

then can you do the following:

(0) confirm no firewall is active on your local system?
(a) wait for the default route to go away
(b) do not restart anything!
(c) start tcpdump as per above
(d) when you see an unsolicited RA coming back in, check (in a 2nd terminal) whether your default route is back or not.
(d1) if it is not, please report
(d2) if it is back, then stop tcpdump, do an ifconfig re0 promisc, keep watching whether your default route goes away again during the next 25 minutes, and report back.

Otherwise and/or in addition, sysctl net.inet6.icmp6.nd6_debug=0xff may also help Alexander to further debug this.
For the nd6_options message, I only see it twice. It seems to appear around the completion of DAD for my re0 link-local and autoconf IPv6 addresses. Sorry if some of my terminology is poor, my IPv6 knowledge has gotten rusty.

net.inet6.ip6.forwarding = 0
net.inet6.ip6.rfc6204w3 = 0

The output of ndp -i re0:

linkmtu=1500, maxmtu=1500, curhlim=64, basereachable=30s0ms, reachable=18s, retrans=1s0ms
Flags: nud accept_rtadv auto_linklocal

For the tcpdump, I see nothing showing up. I have tried with my firewall both on and off. With promisc mode enabled I am getting router solicitation packets. Actually, what's also interesting is that when I enable promiscuous mode the RAs work and I don't lose my routes. Then if I turn promiscuous mode off I lose my route again.

I see there are nd6 updates available in the releng/13.0 branch, so I'm going to create a new boot environment and see if I have these issues after the update. I'll be keeping my existing boot environment with the issues. I'm getting src updates via gitup. For the future:

# Have: c48cbd0254dedd363ab569692ddf3395b6214412
# Want: 1e76911d62ed4b66bc21cfc22101ef6b20cd6630

I'll update this bug report after things compile and such.
Just updated to the latest changes in releng/13.0. I did see nd6 updates, but no dice on fixing the issue:

FreeBSD towerDefense 13.0-BETA2 FreeBSD 13.0-BETA2 #9 1e76911d6: Mon Feb 15 20:52:00 PST 2021 root@towerDefense:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
It looks like Bjoern's diagnosis was right. So far it looks like a potential problem with programming multicast groups in the driver. Could you share the `ifmcstat` output? Is there any chance you could try it with a different NIC?
Poo, that sucks. I could probably take the extra card out of my server tonight. It's an Intel 82580. Here is the ifmcstat for now:

re0:
    inet 192.168.10.201
    igmpv3 rv 2 qi 125 qri 10 uri 3
        group 224.0.0.1 mode exclude
            mcast-macaddr 01:00:5e:00:00:01
    inet6 fe80::2ef0:5dff:fecc:4ed7%re0 scopeid 0x1
    mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
        group ff01::1%re0 scopeid 0x1 mode exclude
            mcast-macaddr 33:33:00:00:00:01
        group ff02::2:2c1c:8e10%re0 scopeid 0x1 mode exclude
            mcast-macaddr 33:33:2c:1c:8e:10
        group ff02::2:ff2c:1c8e%re0 scopeid 0x1 mode exclude
            mcast-macaddr 33:33:ff:2c:1c:8e
        group ff02::1%re0 scopeid 0x1 mode exclude
            mcast-macaddr 33:33:00:00:00:01
        group ff02::1:ffcc:4ed7%re0 scopeid 0x1 mode exclude
            mcast-macaddr 33:33:ff:cc:4e:d7
lo0:
    inet 127.0.0.1
    igmpv3 rv 2 qi 125 qri 10 uri 3
        group 224.0.0.1 mode exclude
    inet6 fe80::1%lo0 scopeid 0x2
    mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
        group ff01::1%lo0 scopeid 0x2 mode exclude
        group ff02::2:2c1c:8e10%lo0 scopeid 0x2 mode exclude
        group ff02::2:ff2c:1c8e%lo0 scopeid 0x2 mode exclude
        group ff02::1%lo0 scopeid 0x2 mode exclude
        group ff02::1:ff00:1%lo0 scopeid 0x2 mode exclude
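As a cross-check of the ifmcstat listing, the group-to-MAC pairs follow the standard IPv6-over-Ethernet mapping: 33:33 followed by the low 32 bits of the group address. Periodic RAs are normally addressed to the all-nodes group ff02::1, so they arrive on 33:33:00:00:00:01. A minimal Python sketch, with hypothetical function names:

```python
import ipaddress


def ipv6_multicast_mac(group):
    """Map an IPv6 multicast group to its Ethernet multicast MAC:
    33:33 followed by the last four bytes of the group address."""
    addr = ipaddress.IPv6Address(group.split("%")[0])  # drop the %re0 scope
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return "33:33:" + ":".join(f"{b:02x}" for b in addr.packed[-4:])


def solicited_node_group(unicast):
    """Build the solicited-node group ff02::1:ffXX:XXXX from the low
    24 bits of a unicast address."""
    addr = ipaddress.IPv6Address(unicast.split("%")[0])
    prefix = ipaddress.IPv6Address("ff02::1:ff00:0").packed[:13]
    return str(ipaddress.IPv6Address(prefix + addr.packed[-3:]))


print(ipv6_multicast_mac("ff02::1%re0"))
print(solicited_node_group("fe80::2ef0:5dff:fecc:4ed7%re0"))
```

The results match what ifmcstat reports for re0, so the stack has joined the right groups. That points at how the NIC's receive filter is programmed rather than at group membership.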
I can confirm this is an issue with the driver. I popped in my PCIe NIC and my problem is gone:

igb3@pci0:3:0:3: class=0x020000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x150e subvendor=0x8086 subdevice=0x12a2
    vendor     = 'Intel Corporation'
    device     = '82580 Gigabit Network Connection'
    class      = network
    subclass   = ethernet

ifmcstat:

igb3:
    inet 192.168.10.204
    igmpv3 rv 2 qi 125 qri 10 uri 3
        group 224.0.0.1 mode exclude
            mcast-macaddr 01:00:5e:00:00:01
    inet6 fe80::21b:21ff:fed7:dbab%igb3 scopeid 0x4
    mldv2 flags=2<USEALLOW> rv 2 qi 125 qri 10 uri 3
        group ff01::1%igb3 scopeid 0x4 mode exclude
            mcast-macaddr 33:33:00:00:00:01
        group ff02::2:2c1c:8e10%igb3 scopeid 0x4 mode exclude
            mcast-macaddr 33:33:2c:1c:8e:10
        group ff02::2:ff2c:1c8e%igb3 scopeid 0x4 mode exclude
            mcast-macaddr 33:33:ff:2c:1c:8e
        group ff02::1%igb3 scopeid 0x4 mode exclude
            mcast-macaddr 33:33:00:00:00:01
        group ff02::1:ffd7:dbab%igb3 scopeid 0x4 mode exclude
            mcast-macaddr 33:33:ff:d7:db:ab

Not sure if there is any more info you need, or what to do about a driver issue in ports. I suppose I could open an issue with the port, but it seems more like it's something simply pulled down from upstream? I know that FreeBSD has some sort of Realtek driver in base already. Would there be any plans to update it?
jmg@ has also reported a very similar problem on net@. John-Mark, can you share what hardware you use?
Try this. The patch should be applied on top of all patches in the realtek driver port. It might be easier to apply it by hand.

diff --git a/if_re.c b/if_re.c
index 47466f9..d8f0176 100644
--- a/if_re.c
+++ b/if_re.c
@@ -8663,7 +8663,7 @@ struct re_softc *sc;
         /* now program new ones */
 #if OS_VER >= VERSION(13,0)
-        if_foreach_llmaddr(ifp, re_hash_maddr, hashes);
+        mcnt = if_foreach_llmaddr(ifp, re_hash_maddr, hashes);
 #else
 #if OS_VER >= VERSION(12,0)
         if_maddr_rlock(ifp);
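For readers wondering why a dropped return value would matter, here is a rough Python model of the code path. It assumes, as the patch implies but this thread does not show, that the driver enables multicast reception only when mcnt is greater than zero. All function names are hypothetical, and zlib.crc32 is only a stand-in for whatever hash the driver really uses.

```python
import zlib


def hash_maddr(mac, hashes):
    """Per-address callback: set one bit in the 64-bit hash filter and
    report that one address was processed."""
    h = zlib.crc32(bytes.fromhex(mac.replace(":", ""))) >> 26  # 0..63, stand-in hash
    hashes[h // 32] |= 1 << (h % 32)
    return 1


def foreach_llmaddr(maddrs, callback, arg):
    """Model of an iterator that returns the sum of the callback results."""
    return sum(callback(mac, arg) for mac in maddrs)


def program_filter(maddrs, keep_count):
    """Return (multicast_enabled, hashes) as the model would program them."""
    hashes = [0, 0]
    mcnt = 0
    if keep_count:
        mcnt = foreach_llmaddr(maddrs, hash_maddr, hashes)  # patched behaviour
    else:
        foreach_llmaddr(maddrs, hash_maddr, hashes)  # count thrown away
    if mcnt == 0:
        # ASSUMPTION: with no counted groups the filter is cleared.
        return False, [0, 0]
    return True, hashes


GROUPS = ["33:33:00:00:00:01", "33:33:ff:cc:4e:d7"]
print(program_filter(GROUPS, keep_count=False)[0])
print(program_filter(GROUPS, keep_count=True)[0])
```

Under that assumption, the unpatched path leaves multicast reception off even though groups are joined, which would be consistent with RAs only arriving in promiscuous mode.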
Looks like it was bge. It may also affect ure as well, but the testing that I was doing for the thread was bge: https://docs.freebsd.org/cgi/mid.cgi?20210112213707.GP31099@funkthat.com and later I was able to reproduce w/ epair as well: https://docs.freebsd.org/cgi/mid.cgi?20210114193429.GT31099@funkthat.com
Hey everyone, sorry for leaving this issue hanging. I had a lot of changes in life happen and I ended up forgetting about this. Since I posted last, I took an Intel 4x1Gbps card out of my server and put it in my desktop, but I want the card back in my server.

I'm using FreeBSD 13.0-RELEASE right now, and the issue has gotten both worse and better since I last tried. Currently, the driver doesn't appear to pay any attention to any rtadvs, rtsold does nothing, and at boot time I don't get any address via DHCP. Here is the part of my dmesg pertaining to re0:

re0: <Realtek PCIe 2.5GbE Family Controller> port 0x3000-0x30ff mem 0xb3600000-0xb360ffff,0xb3610000-0xb3613fff at device 0.0 on pci5
re0: Using Memory Mapping!
re0: Using 1 MSI-X message
re0: ASPM disabled
re0: version:1.96.04
re0: Ethernet address: <macaddr>

This product is covered by one or more of the following patents:
US6,570,884, US6,115,776, and US6,327,625.
re0: Ethernet address: <macaddr>
re0: link state changed to UP

I have to issue a dhclient once logged in to get an IPv4 address, then we're fine. Also interestingly, soon after manually adding an IPv6 address via ifconfig as well as an IPv6 default route, my device starts accepting rtadvs and I get an autoconf address as well as a temporary one (as I've configured). Previously, my default router entries would expire; I could see the expire time in the output of ndp -r count down to 0. Now I see the expire time in the upper 20 minute range, and the counter resets on each router advertisement.

So, IPv4 has regressed. I now have to configure it via dhclient when I log in. Maybe it'll expire, I don't know yet. IPv6 works once I set a static IPv6 address and route, and from there everything rtadv-related seems to work.