| Field | Value |
|---|---|
| Summary | [carp] carp breaks the network |
| Product | Base System |
| Component | kern |
| Status | Closed FIXED |
| Severity | Affects Many People |
| Priority | --- |
| Version | 10.0-RELEASE |
| Hardware | Any |
| OS | Any |
| Reporter | Steven Hartland <smh> |
| Assignee | Steven Hartland <smh> |
| CC | glebius |
Description
Steven Hartland
2014-07-12 02:43:32 UTC
The problem occurs when we reboot one of the machines that host jails backed
by CARP IPs.
An example jail.conf entry:-
== machine01 ==
test01 {
host.hostname = "test01a";
ip4.addr = "10.10.10.5";
ip4.addr += "10.10.10.11";
ip4.addr += "10.10.10.12";
exec.prestart += "/sbin/ifconfig igb0 vhid 1 pass testpass alias 10.10.10.11/32";
exec.prestart += "/sbin/ifconfig igb0 vhid 2 pass testpass alias 10.10.10.12/32";
}
== machine02 ==
test01 {
host.hostname = "test01b";
ip4.addr = "10.10.10.6";
ip4.addr += "10.10.10.11";
ip4.addr += "10.10.10.12";
exec.prestart += "/sbin/ifconfig igb0 vhid 1 pass testpass advskew 100 alias 10.10.10.11/32";
exec.prestart += "/sbin/ifconfig igb0 vhid 2 pass testpass advskew 100 alias 10.10.10.12/32";
}
On reboot of machine02 the machines complain about their IPs being in use, e.g.
Jul 12 01:12:50 machine01 kernel: Trying to mount root from zfs:tank/root []...
Jul 12 01:12:51 machine01 ntpd[1136]: ntpd 4.2.4p5-a (1)
Jul 12 01:12:51 machine01 kernel: .
Jul 12 01:12:53 machine01 kernel:
Jul 12 01:12:53 machine01 kernel: arp: 00:00:5e:00:01:02 is using my IP address 10.10.10.12 on igb0!
Jul 12 01:12:53 machine01 kernel: igb0: promiscuous mode enabled
Jul 12 01:12:53 machine01 kernel: carp: VHID 1@igb0: INIT -> BACKUP
Jul 12 01:12:54 machine01 kernel: arp: 00:00:5e:00:01:01 is using my IP address 10.10.10.11 on igb0!
-----------
Jul 12 01:12:53 machine02 kernel: arp: 10.10.10.10 moved from 00:00:5e:00:01:01 to 00:25:90:79:67:9a on igb0
In our particular case we have 6 carp interfaces on each machine, but I don't
believe that's a factor.
The machines are both connected to Cisco 6509 routers and when this happens
the Ciscos end up with an ARP entry for the carp IP's pointing to the physical
nic MAC instead of the CARP MAC e.g.
> sh ip arp 10.10.10.11
> Protocol Address Age (min) Hardware Addr Type Interface
> Internet 10.10.10.11 78 0025.9079.679a ARPA Vlan10
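A quick way to spot this condition is to compare the MAC in an ARP entry with the virtual MAC CARP uses for the vhid, which follows the 00:00:5e:00:01:<vhid> pattern visible in the kernel log above. The following is only a minimal sketch; the helper names (carp_mac, normalize_mac, arp_entry_is_carp) are hypothetical and not part of any FreeBSD or Cisco tooling.

```python
import re


def carp_mac(vhid: int) -> str:
    """Virtual MAC used by CARP for an IPv4 vhid: 00:00:5e:00:01:<vhid>."""
    if not 1 <= vhid <= 255:
        raise ValueError("vhid must be in 1..255")
    return f"00:00:5e:00:01:{vhid:02x}"


def normalize_mac(mac: str) -> str:
    """Accept aa:bb:cc:dd:ee:ff or Cisco aabb.ccdd.eeff; return the colon form."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def arp_entry_is_carp(mac: str, vhid: int) -> bool:
    """True if the ARP entry points at the CARP MAC for this vhid."""
    return normalize_mac(mac) == carp_mac(vhid)


# The Cisco entry for 10.10.10.11 (vhid 1) shown above holds the physical NIC MAC.
print(arp_entry_is_carp("0025.9079.679a", 1))     # False
print(arp_entry_is_carp("00:00:5e:00:01:01", 1))  # True
```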
We also have the following settings in sysctl.conf:
net.inet.carp.preempt=1
net.inet.carp.senderr_demotion_factor=0
The first setting is there because we want the main master to stay master if it's running.
The second setting is for when we've used CARP on top of LAGG to prevent CARP
breaking while LAGG negotiates, after which it will never recover. This however
is not the case here as these machines aren't using LAGG.
I'm not really familiar with the network code flow, but tracing through from the arp "is using my IP address" warnings I'm wondering if the issue is a race condition in sys/netinet/in.c:in_ifinit, where it adds the address to ia->ia_addr.sin_addr.s_addr before it calls carp attach.
Does this mean it's possible for the address to respond before it knows it's a carp address, and hence the problem?
Also, the machines on which we're seeing the issue are hosting very busy sites on the carp addresses, so this could be a requirement for reproduction.
Created attachment 145182
Shows CARP MAC address conflict in progress
After adding a DELAY between adding the address to ia_hash and calling carp_attach_p in in_ifinit I've confirmed that we do indeed have a race condition between the address being available and it being attached to carp.
This means that ARP requests for the IP can result in a response using the interface MAC instead of the CARP MAC.
When this happens, communication with the machines participating in the CARP group is disrupted, with packets destined for the already running MASTER being sent to the initialising BACKUP.
This can be clearly seen in the attached trace.
It can also be seen via ifconfig
After add to ia_hash but before CARP attach:
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
ether f0:4d:a2:75:41:5a
inet 10.10.1.240 netmask 0xffffff00 broadcast 10.10.1.255
inet6 fe80::f24d:a2ff:fe75:415a%igb0 prefixlen 64 scopeid 0x1
inet 10.10.1.241 netmask 0xffffffff
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: INIT vhid 1 advbase 1 advskew 0
----
After CARP attach completes
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
ether f0:4d:a2:75:41:5a
inet 10.10.1.240 netmask 0xffffff00 broadcast 10.10.1.255
inet6 fe80::f24d:a2ff:fe75:415a%igb0 prefixlen 64 scopeid 0x1
inet 10.10.1.241 netmask 0xffffffff broadcast 10.10.1.241 vhid 1
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
carp: MASTER vhid 1 advbase 1 advskew 0
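The window between the address being added to ia_hash and the CARP attach can be illustrated with a toy model. This is not the kernel code, only a sketch of the ordering problem: the Interface class and its methods are hypothetical, and the model ignores MASTER/BACKUP state (it behaves like a node that ends up MASTER, as in the ifconfig output above).

```python
PHYS_MAC = "f0:4d:a2:75:41:5a"  # physical MAC from the ifconfig output above


class Interface:
    """Toy stand-in for an interface; not the real kernel data structures."""

    def __init__(self, mac):
        self.mac = mac
        self.addrs = set()  # stands in for ia_hash: addresses the host answers for
        self.carp = {}      # address -> vhid, filled in by the CARP attach step

    def add_address(self, ip):
        self.addrs.add(ip)

    def carp_attach(self, ip, vhid):
        self.carp[ip] = vhid

    def arp_reply(self, ip):
        """Return the MAC an ARP reply for ip would carry, or None if not ours."""
        if ip not in self.addrs:
            return None
        vhid = self.carp.get(ip)
        if vhid is None:
            # Address is live but not yet known to be a CARP address.
            return self.mac
        return f"00:00:5e:00:01:{vhid:02x}"


igb0 = Interface(PHYS_MAC)
igb0.add_address("10.10.1.241")                  # step 1: address becomes visible
during_window = igb0.arp_reply("10.10.1.241")    # ARP request arrives in the window
igb0.carp_attach("10.10.1.241", 1)               # step 2: CARP attach
after_attach = igb0.arp_reply("10.10.1.241")

print(during_window)  # f0:4d:a2:75:41:5a
print(after_attach)   # 00:00:5e:00:01:01
```

Any request answered between the two steps advertises the physical MAC, which is the stale entry the upstream router ends up caching.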
Fix for this has been committed to head as r269340 http://svnweb.freebsd.org/changeset/base/269340
This currently requires gleb's in_control rewrite, as without it it panics, so an MFC of this would require all of those dependencies. Given how this essentially breaks CARP, this should seriously be considered.
In addition to the race condition between IP allocation and CARP attachment, it turns out that while jail exec.prestart runs before the prison is created, it runs after any IP alias creation. This explains why we're seeing gratuitous ARP happening for the IP with the interface MAC instead of the CARP MAC. Changes to jail are hence required to allow CARP-backed IPs to be used in jails.
Changes required for jails to properly support CARP are being reviewed here: https://phabric.freebsd.org/D528
A commit references this bug:
Author: smh
Date: Mon Aug 4 16:32:09 UTC 2014
New revision: 269522
URL: http://svnweb.freebsd.org/changeset/base/269522
Log:
Added support for extra ifconfig args to jail ip4.addr & ip6.addr params
This allows for CARP interfaces to be used in jails e.g.
ip4.addr = "em0|10.10.1.20/32 vhid 1 pass MyPass advskew 100"
Before this change using exec.prestart to configure a CARP address would result in the wrong MAC being broadcast on startup as jail creates IP aliases to support ip[4|6].addr before exec.prestart is executed.
PR: 191832
Reviewed by: jamie
MFC after: 1 week
X-MFC-With: r269340
Phabric: D528
Sponsored by: Multiplay
Changes:
head/usr.sbin/jail/command.c
head/usr.sbin/jail/config.c
head/usr.sbin/jail/jail.8
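For anyone converting existing exec.prestart based configs, the ip4.addr value format shown in the commit log ("interface|address/prefix" followed by ifconfig args) can be generated mechanically. The helper below is a hypothetical sketch, not part of jail(8); applying it to the machine02 example from the description assumes that each CARP-backed ip4.addr line can carry its own args in the same way as the single example in the commit log.

```python
def carp_ip4_addr(iface, cidr, vhid, password, advskew=None):
    """Build an ip4.addr value in the form shown in the r269522 commit log."""
    parts = [f"{iface}|{cidr}", f"vhid {vhid}", f"pass {password}"]
    if advskew is not None:
        parts.append(f"advskew {advskew}")
    return " ".join(parts)


# Reproduces the example value from the commit log.
print(carp_ip4_addr("em0", "10.10.1.20/32", 1, "MyPass", advskew=100))
# em0|10.10.1.20/32 vhid 1 pass MyPass advskew 100

# Assumed equivalent of the two machine02 exec.prestart lines.
for vhid, cidr in ((1, "10.10.10.11/32"), (2, "10.10.10.12/32")):
    value = carp_ip4_addr("igb0", cidr, vhid, "testpass", advskew=100)
    print(f'ip4.addr += "{value}";')
```

With this form the ifconfig args are supplied in the ip4.addr value itself, so the exec.prestart ifconfig lines would no longer be needed.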