Created attachment 146727 [details] Wireshark

The system doesn't send correct NDP neighbor solicitation messages for IPv6. For these messages, carp uses the hardware MAC address of the interface and not the virtual MAC of the carp IP. This results in packet loss for the virtual IPv6 address.

On IPv4, the arp cache shows the virtual MAC entry. (correct)
On IPv6, the ndp cache shows the hardware MAC entry. <-- problem

The interface is configured with two IPv4 virtual addresses and one IPv6 virtual address. The IPv4 addresses work like a charm.

The Wireshark attachment shows that the server sends the hardware MAC address instead of the virtual MAC address ( 00:00:5e:00:01:01 ).

system-info:
FreeBSD XXHOSTNAMEXX 10.0-RELEASE-p7 FreeBSD 10.0-RELEASE-p7 #0: Tue Jul 8 06:37:44 UTC 2014 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
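For anyone comparing their own arp and ndp cache entries against the expected value, here is a minimal Python sketch of the check. It assumes the usual carp(4) layout where the virtual MAC is 00:00:5e:00:01 followed by the vhid, which matches the 00:00:5e:00:01:01 address in this report (so vhid 1 is assumed here). The function names are made up for illustration.

```python
# Sketch only: assumes the carp(4) virtual MAC is 00:00:5e:00:01:<vhid>.
CARP_MAC_PREFIX = bytes.fromhex("00005e0001")


def carp_virtual_mac(vhid: int) -> str:
    """Return the virtual MAC expected for a given vhid (1..255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("vhid must be between 1 and 255")
    raw = CARP_MAC_PREFIX + bytes([vhid])
    return ":".join(f"{b:02x}" for b in raw)


def is_carp_virtual_mac(mac: str) -> bool:
    """True if a cache entry's MAC looks like a carp virtual MAC."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return len(raw) == 6 and raw[:5] == CARP_MAC_PREFIX and raw[5] != 0
```

An ndp entry for the shared IPv6 address that fails is_carp_virtual_mac() would be showing the symptom described in this report.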
I will take a look into this issue.
Any news here?!
We need a solution as soon as possible. Please investigate... Do you need any further information?
Is there any progress on this subject, or better yet: has it been fixed in the upcoming 10.2-RELEASE? I'm having the same issues as the OP. How can I help fix this issue?
batch change:

For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.

Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
This bug is still present in FreeBSD 12.0 (-p5).
Still seems to be present on 13.0-RELEASE-p4
I also hit this issue today, very frustrating problem! This was on a fresh 13-STABLE from a few days ago. This issue basically makes carp for ipv6 useless :(

An observation is that it seems to go away with preempt disabled, meaning net.inet.carp.preempt=0. In my current setup I am able to run without preempt because I have full routing between the nodes. But other setups (including others I manage) are not so lucky.

I wonder what triggers it, because I've used carp and ipv6 together in many setups over the years (and still do), and I've never come across this issue before. Maybe we can share some overall information about our setups and see if we can find some common thing which triggers it.

I hit this issue on a pair of BGP (bird2) routers when I tried to make the bgp source IP be a CARP IP. Worked fine for ipv4, but not for ipv6.

Things which might be a factor?:
- NIC driver is igb

Things which are probably not a factor:
- VLAN tagging
- lagg(4) lacp

I see the same issue on a customer facing interface which is VLAN tagged on top of a lagg(4) interface. So VLAN and lagg(4) do not appear to matter.

I will update this issue if I think of more information which could be relevant.
OK so with a lot of help from my friends at semaphor.dk I managed to get quite a bit further with this issue today.

The issue is, as the original bug report correctly identified, that some ndp neighbor solicitation messages are sent out with the wrong source mac when the source IP is a carp IP. This is a big deal because it breaks carp completely for v6.

Triggering it requires some software, say ping(1), on the BACKUP node to initiate traffic to something, say the default gateway, with the source IP set to the shared CARP IP. This will break ndp for the shared IP. If you just need a workaround then stop doing that (maybe use devd to start whatever makes outbound connections when the node becomes MASTER).

Because the ndp cache is empty, the ping packet will trigger a neighbor solicitation packet, which will have the shared CARP IP as source IP (as per RFC), but the packet incorrectly has the NIC's real mac address as source mac rather than the shared virtual mac.

It gets a bit long to try to describe everything, so I have created a kyua(1) testcase to illustrate the problem. The testcase creates three jails, two of them with a shared IP, and then runs ping -6 towards the third from both jails, checks the exit codes, and finally checks that the ndp table on the third jail contains the virtual mac for the shared IP. The testcase is attached, along with a patch which appears to fix the issue on my end.

The issue and the patch:

The codepath for sending neighbor *advertisements*, nd6_na_output_fib(), checks whether the IP it is advertising is a carp IP and sets the source mac accordingly via carp_macmatch6_p() - this works.

This check is missing in the codepath for sending neighbor *solicitations*, nd6_ns_output_fib(). This means the mbuf tag PACKET_TAG_CARP is not set and carp_output never changes the source mac.

The attached patch attempts to fix this by calling carp_macmatch6_p() from nd6_ns_output_fib() if it is a carp IP.
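To make the missing check easier to picture, here is a small userland model in Python of the decision described above. It is not the kernel code or the patch: Iface, carp_macmatch6() and ns_source_lladdr() are hypothetical stand-ins, with carp_macmatch6() reduced to a dictionary lookup in place of the real carp_macmatch6_p() hook, and the boolean in the result standing in for whether PACKET_TAG_CARP would be attached.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Iface:
    """Toy interface: one hardware MAC plus carp IPv6 address -> virtual MAC."""
    hw_mac: str
    carp_addrs: Dict[str, str]


def carp_macmatch6(ifp: Iface, src: str) -> Optional[str]:
    """Stand-in for carp_macmatch6_p(): virtual MAC if src is a carp IP."""
    return ifp.carp_addrs.get(src)


def ns_source_lladdr(ifp: Iface, src: str, carp_aware: bool) -> Tuple[str, bool]:
    """Pick the link-layer address for an outgoing NS.

    Returns (mac, tagged), where tagged models PACKET_TAG_CARP being set.
    carp_aware=False models the behaviour reported in this bug,
    carp_aware=True models the NS path doing the same check as the NA path.
    """
    if carp_aware:
        mac = carp_macmatch6(ifp, src)
        if mac is not None:
            return mac, True
    return ifp.hw_mac, False
```

With carp_aware=False the shared IP always goes out with the hardware MAC, which is the behaviour the testcase catches; with carp_aware=True only carp addresses get the virtual MAC and ordinary addresses are unaffected.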
The patch works and appears to be stable but it comes with a big warning: this needs someone who knows the code better than we do to make sure it doesn't break everything (or is just plain wrong).

Thanks!

ps. this lovely dtrace snippet helped to understand how carp_output is called from nd6_ns_output, which was very useful. dtrace is fantastic

dtrace -n 'fbt:kernel:nd6_ns_output:entry{this->ja=1}
fbt:carp:carp_output:entry /this->ja/ {this->ifp=args[0];this->m=args[1]; this->sa=args[2]; this->loc=1;stack();}
fbt:carp:carp_output:return /this->ja/ {this->loc=0;this->eh=(struct ether_header*)this->m->m_data;this->s=(u_char*)this->eh->ether_shost;printf("%u, %x %s %02x%02x%02x%02x%02x%02x sa_fam:%u ifp->t %u master=%u", args[1], args[0], this->ifp->if_xname,this->s[0],this->s[1],this->s[2],this->s[3],this->s[4],this->s[5], this->sa->sa_family, this->ifp->if_type,this->ifp->if_carp->cif_vrs.tqh_first->sc_state);tracemem(this->m->m_data,86); printf(" tags_head:%p", this->m->m_pkthdr.tags.slh_first);}
fbt:kernel:m_tag_locate:return /this->loc/ {printf("PACKET_TAG_CARP %p,%u", args[1], args[1]->m_tag_len)}
fbt:kernel:nd6_ns_output:return{this->ja=0}'
Created attachment 230926 [details] kyua testcase to demonstrate the problem
Created attachment 230927 [details] ndp neighbor solicitation carp source mac fix
A couple of things I forgot to mention:

- net.inet.carp.preempt=0 or 1 has no effect on this.
- This patch is against stable/13-n248794-802ff7fcee2 but I suspect the bug is also present in -current since it has been with us at least since the carp rewrite in 10.
- I failed to mention that the wrong mac is also in the source link-layer address option inside the icmp6 ndp NS packet, which I guess is the mac that really matters.

I am happy to answer questions, I am available on IRC as Tykling
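For checking a capture by hand, the source link-layer address option can be pulled out of the raw ICMPv6 payload of an NS. The Python sketch below assumes the standard neighbor discovery layout (type 135, a 24-byte fixed part ending with the 16-byte target address, then options encoded as type, length in units of 8 octets, value, with option type 1 being the source link-layer address). The function name is invented for this example and it does no checksum validation.

```python
from typing import Optional

ND_NEIGHBOR_SOLICIT = 135      # ICMPv6 type of a neighbor solicitation
ND_OPT_SOURCE_LINKADDR = 1     # option type of the SLLAO
NS_FIXED_LEN = 24              # type, code, checksum, reserved, target


def ns_source_lladdr_option(icmp6: bytes) -> Optional[str]:
    """Return the MAC carried in the SLLAO of an NS, or None if absent."""
    if len(icmp6) < NS_FIXED_LEN or icmp6[0] != ND_NEIGHBOR_SOLICIT:
        raise ValueError("not a neighbor solicitation")
    offset = NS_FIXED_LEN
    while offset + 2 <= len(icmp6):
        opt_type = icmp6[offset]
        opt_size = icmp6[offset + 1] * 8   # length field counts 8-octet units
        if opt_size == 0:
            raise ValueError("option with zero length")
        if offset + opt_size > len(icmp6):
            raise ValueError("truncated option")
        if opt_type == ND_OPT_SOURCE_LINKADDR:
            mac = icmp6[offset + 2:offset + 8]
            return ":".join(f"{b:02x}" for b in mac)
        offset += opt_size
    return None
```

For an NS sent from the shared carp IP, the returned value should be the virtual mac (00:00:5e:00:01:01 in the original report) rather than the NIC's hardware mac.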
Thomas Steen Rasmussen, thanks a lot for your thorough analysis! Very much appreciated.

I don't consider myself the ultimate IPv6 expert, but to me your patch looks correct. I have put it on the reviews board and I will work on getting the attention of IPv6 experts. Either way, I'm going to commit it in a week, trusting your judgement and mine. Feel free to register at the reviews board and improve/edit/comment on your changes before they hit git:

https://reviews.freebsd.org/D33858
https://reviews.freebsd.org/D33859
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4a178afb4aa9876094c19faf6d3bf065a5ebe163

commit 4a178afb4aa9876094c19faf6d3bf065a5ebe163
Author: Thomas Steen Rasmussen <thomas@gibfest.dk>
AuthorDate: 2022-01-25 05:02:47 +0000
Commit: Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2022-01-25 05:02:47 +0000

    tests/netinet: add test for IPv6 NS and CARP

    PR: 193280
    Reviewed by: melifaro
    Differential revision: https://reviews.freebsd.org/D33859

 tests/sys/netinet/carp.sh | 64 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bc6abdd97e951b54294d331698317a607246255d

commit bc6abdd97e951b54294d331698317a607246255d
Author: Thomas Steen Rasmussen <thomas@gibfest.dk>
AuthorDate: 2022-01-25 05:02:47 +0000
Commit: Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2022-01-25 05:02:47 +0000

    nd6: use CARP link level address in SLLAO for NS sent out

    When sending an NS, check if we are using a IPv6 CARP address and if we do, then put proper CARP link level address into ND_OPT_SOURCE_LINKADDR option and also put PACKET_TAG_CARP tag on the packet.

    The latter will enforce CARP link level address at the data link layer too, which might be necessary for broken implementations.

    The code really follows what NA sending code has been doing since introduction of carp(4). While here, bring to style(9) the whole block of code.

    PR: 193280
    Differential revision: https://reviews.freebsd.org/D33858

 sys/netinet6/nd6_nbr.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)
Thank you Gleb. Any chance of an MFC to 13? Otherwise I would say this can be closed.
Thank you, Thomas! Let's wait two weeks from the commit date before the MFC.
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d2e24c54ef8311a053eddb05a0ce336daf890abb

commit d2e24c54ef8311a053eddb05a0ce336daf890abb
Author: Thomas Steen Rasmussen <thomas@gibfest.dk>
AuthorDate: 2022-01-25 05:02:47 +0000
Commit: Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2022-02-07 18:55:54 +0000

    nd6: use CARP link level address in SLLAO for NS sent out

    When sending an NS, check if we are using a IPv6 CARP address and if we do, then put proper CARP link level address into ND_OPT_SOURCE_LINKADDR option and also put PACKET_TAG_CARP tag on the packet.

    The latter will enforce CARP link level address at the data link layer too, which might be necessary for broken implementations.

    The code really follows what NA sending code has been doing since introduction of carp(4). While here, bring to style(9) the whole block of code.

    PR: 193280
    Differential revision: https://reviews.freebsd.org/D33858

    (cherry picked from commit bc6abdd97e951b54294d331698317a607246255d)

 sys/netinet6/nd6_nbr.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)