|Summary:||svn rev 303171 breaks Layer 2 with IPv6 on the freebsd.org cluster|
|Product:||Base System||Reporter:||Peter Wemm <peter>|
|Component:||kern||Assignee:||Mike Karels <karels>|
|Severity:||Affects Some People||CC:||delphij, des, emaste, gnn, karels, peter, re|
|Priority:||---||Keywords:||needs-qa, patch, regression|
Description Peter Wemm 2016-08-17 07:58:45 UTC
After rev 303171 we are seeing multiple network stack problems. The most urgent is that Layer-2 routing is broken and packets are being sent to the wrong address.

In the paste below,
0c:c4:7a:49:48:70 = halo
00:25:90:30:d7:48 = ns1
00:00:5e:00:01:64 = default gateway

04:18:06.156769 0c:c4:7a:49:48:70 > 00:25:90:30:d7:48, ethertype IPv6 (0x86dd), length 102: halo.40045 > ns1.domain: 26984+% [1au] DS? freebsd.org. (40)
04:18:06.156942 00:25:90:30:d7:48 > 00:00:5e:00:01:64, ethertype IPv6 (0x86dd), length 313: ns1.domain > halo.40045: 26984 2/0/1 DS, RRSIG (251)

You can see the reply is being incorrectly sent to the default gateway. We have confirmed with tcpdump that the gateway actually is receiving the packets and it isn't a display error.

From the broken machine we can see that halo is known to ndp:

# ndp -n halo
Neighbor                 Linklayer Address  Netif Expire    S Flags
2610:1c1:1:6002::16:12   0c:c4:7a:49:48:70    em0 23h59m45s S

And the default gateway is also present in ndp:

# ndp -an
...
fe80::1%em0              00:00:5e:00:01:64    em0 23h59m57s S R
...

A 'route get' shows the correct answer on the affected machine.

# route -n get -inet6 halo
   route to: 2610:1c1:1:6002::16:12
destination: 2610:1c1:1:6002::
       mask: ffff:ffff:ffff:ffff::
  interface: em0
      flags: <UP,DONE>

Reverting 303171 locally restores correct behavior where both machines are able to communicate directly on the same ethernet segment again. When the packets arrive at the router it (understandably) refuses to route them back out the same interface they arrived on.

I am aware that 303171 has been mfc'ed to 11-stable. It appears to work when we tried stable/11 temporarily but I cannot explain why. (These are in redundant paired machines, one runs 11, the other runs 12.)

The machine does have jails. ns1 is a dual-stack jail. The host runs an em0 interface, but we have also seen it on igb, bce, bge and vlan.

Jail addresses:
127.0.1.8
22.214.171.124
2610:1c1:1:6002::100

Interface addresses:
inet 126.96.36.199 netmask 0xffffffe0 broadcast 188.8.131.52
inet 184.108.40.206 netmask 0xffffffff broadcast 220.127.116.11
inet 18.104.22.168 netmask 0xffffffff broadcast 22.214.171.124
inet6 2610:1c1:1:6002::1004 prefixlen 64
inet6 2610:1c1:1:6002::7b:1 prefixlen 128
inet6 2610:1c1:1:6002::100 prefixlen 128
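
The check done by eye on the two tcpdump lines in the description can be sketched in a few lines of Python. This is only an illustration, assuming `tcpdump -e` output in the format pasted above; the names `FRAME`, `SAMPLE` and `misdirected_replies` are made up for this sketch and are not part of any tool. It remembers which MAC each host.port endpoint transmitted from and flags any later frame addressed to that endpoint but sent to a different MAC.

```python
import re

# Link-layer header as printed by `tcpdump -e` in the paste above:
# <time> <src MAC> > <dst MAC>, ethertype ..., length N: <src> > <dst>: ...
FRAME = re.compile(
    r"^\S+ (?P<smac>[0-9a-f:]{17}) > (?P<dmac>[0-9a-f:]{17}), "
    r"ethertype \S+ \(0x[0-9a-f]+\), length \d+: "
    r"(?P<src>\S+) > (?P<dst>\S+):"
)

# The two frames quoted in the description.
SAMPLE = [
    "04:18:06.156769 0c:c4:7a:49:48:70 > 00:25:90:30:d7:48, ethertype IPv6 "
    "(0x86dd), length 102: halo.40045 > ns1.domain: 26984+% [1au] DS? freebsd.org. (40)",
    "04:18:06.156942 00:25:90:30:d7:48 > 00:00:5e:00:01:64, ethertype IPv6 "
    "(0x86dd), length 313: ns1.domain > halo.40045: 26984 2/0/1 DS, RRSIG (251)",
]


def misdirected_replies(lines):
    """Return (endpoint, expected MAC, actual MAC) for frames sent to an
    endpoint at a MAC other than the one that endpoint transmitted from."""
    seen = {}  # "host.port" -> source MAC observed for that endpoint
    bad = []
    for line in lines:
        m = FRAME.match(line)
        if m is None:
            continue  # not a line in the expected format; skip it
        expected = seen.get(m["dst"])
        if expected is not None and m["dmac"] != expected:
            bad.append((m["dst"], expected, m["dmac"]))
        seen[m["src"]] = m["smac"]
    return bad


for endpoint, expected, actual in misdirected_replies(SAMPLE):
    print(f"reply to {endpoint} went to {actual}, expected {expected}")
```

On a healthy segment the list would be empty for on-link peers; a frame that is legitimately routed through the gateway would also be flagged, so this is only meaningful for hosts known to share the segment.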
Comment 1 Mark Linimon 2016-08-17 16:32:44 UTC
Mike, this seems to have been via one of your commits?
Comment 2 Peter Wemm 2016-08-17 18:15:53 UTC
Argh, I appear to have mixed up a test last night. I can confirm that it *is* broken on stable/11 now as well.

From stable/11 r304269:

18:11:56.768302 0c:c4:7a:49:48:70 > 00:25:90:30:da:0e, ethertype IPv6 (0x86dd), length 102: halo.33215 > ns2.domain: 46162+% [1au] TXT? freebsd.org. (40)
18:11:56.768432 00:25:90:30:da:0e > 00:00:5e:00:01:64, ethertype IPv6 (0x86dd), length 833: ns2.domain > halo.33215: 46162$ 2/4/1 TXT "v=spf1 redirect=_spf.freebsd.org", RRSIG (771)

Replies are going to the default gateway rather than the machine on the local network. The behavior is now the same in stable/11 as with head after patch 303171. Of note it has been merged to releng/11 as well.

I'm going to try a local backout of r303698 to get the freebsd.org cluster working again.
Comment 3 Peter Wemm 2016-08-17 22:01:20 UTC
A backout of r304086 on stable/11 and releng/11.0 fixes the problem of packets going to the wrong MAC address.
Comment 4 Peter Wemm 2016-08-17 22:34:18 UTC
On a hunch, I changed one of the machines from old-style IPv6 jail / alias configuration to something more modern. With the following changes:

 ifconfig_em0="inet 126.96.36.199/27 -tso -vlanhwtso"
-ifconfig_em0_ipv6="inet6 2610:1c1:1:6002::1005/64"
+ifconfig_em0_ipv6="inet6 2610:1c1:1:6002::1005/64 prefer_source"
-ifconfig_em0_alias0="inet6 2610:01c1:0001:6002::7b:2/128"
+ifconfig_em0_alias0="inet6 2610:01c1:0001:6002::7b:2/64"
 ifconfig_em0_alias1="inet 188.8.131.52/32"
...
 jail_ns2_hostname="ns2.nyi.freebsd.org"
-jail_ns2_ip="lo1|127.0.1.9,184.108.40.206,2610:01c1:0001:6002::200"
+jail_ns2_ip="lo1|127.0.1.9,220.127.116.11,2610:01c1:0001:6002::200/64"

The problem no longer manifests.

The test scenario in which I was seeing packets go to the default gateway was traffic between 2610:1c1:1:6002::16:12 <-> 2610:01c1:0001:6002::200, both in the same /64.

Note that it is *still* using /32 aliases for IPv4 for the jails, and that still works as expected. The problem was with the IPv6 equivalent, /128.
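
To make the /128 versus /64 difference concrete, here is a small sketch using Python's standard `ipaddress` module. It only shows which prefix each alias spelling describes and whether the peer address from the test falls inside it; it does not model how the kernel picks a route or neighbor entry. The helper name `peer_in_alias_prefix` is made up for this illustration.

```python
import ipaddress

# Peer used in the test scenario (halo).
PEER = ipaddress.ip_address("2610:1c1:1:6002::16:12")


def peer_in_alias_prefix(alias, peer=PEER):
    """True if `peer` lies inside the prefix implied by `alias`
    (an address/prefixlen string as written in rc.conf)."""
    return peer in ipaddress.ip_interface(alias).network


# Old-style alias: the prefix covers only the jail address itself.
print(peer_in_alias_prefix("2610:01c1:0001:6002::200/128"))  # False
# New-style alias: the prefix is the whole on-link /64.
print(peer_in_alias_prefix("2610:01c1:0001:6002::200/64"))   # True
```

Both addresses are in 2610:1c1:1:6002::/64 either way; what changes is only the prefix attached to the source address the jail uses.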
Comment 5 Peter Wemm 2016-08-17 22:38:47 UTC
I should have pasted the actual configuration.

$ ifconfig | grep inet
inet 18.104.22.168 netmask 0xffffffe0 broadcast 22.214.171.124
inet 126.96.36.199 netmask 0xffffffff broadcast 188.8.131.52
inet 184.108.40.206 netmask 0xffffffff broadcast 220.127.116.11
inet6 2610:1c1:1:6002::1005 prefixlen 64 prefer_source
inet6 2610:1c1:1:6002::7b:2 prefixlen 64
inet6 2610:1c1:1:6002::200 prefixlen 64

The addresses tested:
18.104.22.168 <-> 22.214.171.124 (jail alias)
2610:1c1:1:6002::16:12 <-> 2610:1c1:1:6002::200 (jail alias)
Comment 6 Mike Karels 2016-08-18 00:50:00 UTC
(In reply to Mark Linimon from comment #1) Yes, and I'm looking at this.
Comment 8 Mike Karels 2016-08-18 08:02:02 UTC
This appears to be a dup of 211872. I'm attaching a proposed patch that Peter Wemm is testing, with good initial results.
Comment 9 Glen Barber 2016-08-18 18:48:28 UTC
I'll follow up with peter@ internally, but as this update is only 10 hours old, wanted to follow up with you on the status.
Comment 10 Dag-Erling Smørgrav 2016-08-19 07:19:53 UTC
The casts in the patch are unnecessary (and a style(9) violation).
Comment 11 Kubilay Kocak 2016-08-19 13:36:43 UTC
Annotate / bring up to date. @Mike if/when you're confident this issue and bug 211872 are duplicates, please close one as a duplicate (using 'Mark as Duplicate'). Ideally close the newer as the dupe, but failing that the one with the most context/activity/content.
Comment 12 Mike Karels 2016-08-19 23:52:28 UTC
(In reply to Dag-Erling Smørgrav from comment #10) Thanks, I'll remember that. This is actually fairly old code. I don't think this patch is going in now, but a simpler one (TBD).
Comment 13 commit-hook 2016-08-20 20:47:17 UTC
A commit references this bug:

Author: karels
Date: Sat Aug 20 20:46:54 UTC 2016
New revision: 304545
URL: https://svnweb.freebsd.org/changeset/base/304545

Log:
  Disable L2 caching for UDP over IPv6

  The ip6_output routine is missing L2 cache invalidation as done
  in ip_output. Even with that code, some problems with UDP over
  IPv6 have been reported. Disabling L2 cache for that problem works
  around the problem for now.

  PR: 211872 211926
  Reviewed by: gnn
  Approved by: gnn (mentor)
  MFC after: immediate

Changes:
  head/sys/netinet6/udp6_usrreq.c
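
For readers unfamiliar with the idea behind the commit message, the following is a toy model in Python of a per-socket link-layer cache. It is not the kernel code and does not claim to reproduce the exact sequence that occurred on the cluster; every name in it (`NeighborTable`, `ToySocket`, `second_send`) is invented for illustration. It shows why a cached L2 address that is never checked against later table changes keeps being used, and why either validating the cache or turning it off (the workaround taken here) avoids that.

```python
GATEWAY = "00:00:5e:00:01:64"  # default gateway MAC from the report
HALO = "0c:c4:7a:49:48:70"     # on-link peer MAC from the report


class NeighborTable:
    """Toy stand-in for route/neighbor state with a change counter."""

    def __init__(self):
        self.generation = 0
        self._lladdr = {}

    def set(self, dst, lladdr):
        self._lladdr[dst] = lladdr
        self.generation += 1  # every change makes older cache entries stale

    def resolve(self, dst):
        return self._lladdr[dst]


class ToySocket:
    """Caches the link-layer address used for its single destination."""

    def __init__(self, table, use_cache, validate):
        self.table = table
        self.use_cache = use_cache
        self.validate = validate
        self._cached = None  # (generation, lladdr) or None

    def send(self, dst):
        """Return the link-layer address the frame would be sent to."""
        if self.use_cache and self._cached is not None:
            gen, lladdr = self._cached
            # Without validation the cached address is trusted forever.
            if not self.validate or gen == self.table.generation:
                return lladdr
        lladdr = self.table.resolve(dst)
        if self.use_cache:
            self._cached = (self.table.generation, lladdr)
        return lladdr


def second_send(use_cache, validate):
    """Hypothetical scenario: the next hop changes between two sends."""
    table = NeighborTable()
    table.set("halo", GATEWAY)   # first resolution points at the gateway
    sock = ToySocket(table, use_cache, validate)
    sock.send("halo")
    table.set("halo", HALO)      # table now has the direct neighbor
    return sock.send("halo")


print(second_send(use_cache=True, validate=False))   # stale: gateway MAC
print(second_send(use_cache=True, validate=True))    # direct neighbor MAC
print(second_send(use_cache=False, validate=False))  # direct neighbor MAC
```

Disabling the cache trades a lookup on every send for correctness, which matches the commit's description of the change as a workaround "for now".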
Comment 14 commit-hook 2016-08-20 20:57:22 UTC
A commit references this bug:

Author: karels
Date: Sat Aug 20 20:56:37 UTC 2016
New revision: 304546
URL: https://svnweb.freebsd.org/changeset/base/304546

Log:
  MFC r304545:

  Disable L2 caching for UDP over IPv6

  The ip6_output routine is missing L2 cache invalidation as done
  in ip_output. Even with that code, some problems with UDP over
  IPv6 have been reported. Disabling L2 cache for that problem works
  around the problem for now.

  PR: 211872 211926
  Reviewed by: gnn
  Approved by: gnn (mentor)
  Tested by: peter@, Mike Andrews
  MFC after: immediate

Changes:
_U  stable/11/
  stable/11/sys/netinet6/udp6_usrreq.c
Comment 15 Mike Karels 2016-08-21 00:45:11 UTC
(In reply to Kubilay Kocak from comment #11) I am reasonably confident that both bugs are essentially the same. There is a fair amount of history on both. My inclination is to mark this bug as a dup of 211872 if no one objects.
Comment 16 commit-hook 2016-08-22 22:30:42 UTC
A commit references this bug:

Author: karels
Date: Mon Aug 22 22:29:57 UTC 2016
New revision: 304642
URL: https://svnweb.freebsd.org/changeset/base/304642

Log:
  MFC r304546:

  Disable L2 caching for UDP over IPv6

  The ip6_output routine is missing L2 cache invalidation as done
  in ip_output. Even with that code, some problems with UDP over
  IPv6 have been reported. Disabling L2 cache for that problem works
  around the problem for now.

  PR: 211872 211926
  Reviewed by: gnn
  Approved by: gnn (mentor)
  Approved by: re (gjb)
  Tested by: peter@, Mike Andrews

Changes:
_U  releng/11.0/
  releng/11.0/sys/netinet6/udp6_usrreq.c