Bug 240106 - VNET issue with ARP and routing sockets in jails
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-jail mailing list
Reported: 2019-08-25 19:37 UTC by John Westbrook
Modified: 2019-10-09 11:43 UTC

Description John Westbrook 2019-08-25 19:37:31 UTC
I'm experiencing an intermittent connectivity issue on FreeBSD 12.0 with jails using VNET, which appears to be related to lost ARP replies.

There are several discussion threads on forums that appear related:

https://forums.freebsd.org/threads/vnet-arp-replies-are-lost.71082
https://www.ixsystems.com/community/threads/arp-replies-loss-in-vnet.77027
https://www.ixsystems.com/community/threads/jails-eero.59477

One insightful comment from the first thread:

"""On step #2 the reply is mistakenly padded with 14 bytes which is exactly the number of bytes beyond the 18 bytes in the request (the request was padded with 32 bytes). I bet this is part of the bug. By looking at FreeBSD ARP reply code it actually creates the reply by editing the request bytes in place. For some reason it removes only 18 bytes from the request padding. However, this happens only on VNET interface as noted above."""
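The byte counts in that comment can be sanity-checked with a small sketch. It assumes standard Ethernet II framing (14-byte header, 28-byte IPv4 ARP payload, 60-byte minimum frame excluding the FCS) and assumes that the "18 bytes" in the quote is the normal minimum-frame padding; the function and variable names are hypothetical and are not taken from the FreeBSD source.

```python
ETH_HEADER_LEN = 14     # destination MAC + source MAC + EtherType
ARP_IPV4_LEN = 28       # ARP payload for IPv4 over Ethernet
ETH_MIN_FRAME_LEN = 60  # minimum Ethernet frame size, excluding the 4-byte FCS


def expected_padding(payload_len: int) -> int:
    """Return the padding needed to reach the minimum Ethernet frame size."""
    return max(0, ETH_MIN_FRAME_LEN - ETH_HEADER_LEN - payload_len)


# Padding an untagged ARP frame normally needs.
normal_padding = expected_padding(ARP_IPV4_LEN)

# Padding on the request as reported in the quoted forum comment.
observed_request_padding = 32

# Bytes left over if only the normal padding is stripped from the request.
leftover = observed_request_padding - normal_padding

print(f"normal ARP padding: {normal_padding} bytes")   # 18
print(f"leftover after stripping: {leftover} bytes")   # 14
```

Under these assumptions the leftover matches the 14 bytes of padding seen on the reply, which is consistent with the quoted hypothesis but does not by itself confirm it.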

I can see the ARP traffic with tcpdump, but the ARP table (arp -a) is not updated with the corresponding entries. Also, in an affected jail, I can't add static ARP entries:

# arp -s 10.0.0.1 XX:XX:XX:XX:XX:XX
arp: writing to routing socket: Cannot allocate memory

whereas in an unaffected jail the arp command succeeds. Jails should have access to routing sockets by default, so perhaps the problem is related to accessing routing sockets in VNET jails?

The test setup where I'm observing this uses an SR-IOV VF (Chelsio cxlv0) passed into the jail (via vnet.interface in jail.conf). The setup has two jails on each of two directly attached hosts. I observe the problem on both hosts, but it comes and goes with reboots.
Comment 1 Andrey V. Elsukov freebsd_committer 2019-08-27 10:51:18 UTC
Can you describe the steps required to reproduce the problem on the 12.0/13.0 system?
Comment 2 John Westbrook 2019-08-29 16:54:20 UTC
I have SR-IOV configured as described in this thread:

https://forums.freebsd.org/threads/sr-iov-chelsio-error-in-guest.70653

such that cxlv[0-3] are shown in ifconfig. The jail.conf is:

vnet;
vnet.interface = "vnet0";
exec.prestart  = "ifconfig ${vnet0} name vnet0";
exec.poststop  = "ifconfig vnet0 name ${vnet0}";

exec.start += "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.consolelog = "/var/log/${name}.log";
host.hostname = "${name}";
path = "/jail/${name}";

j1 {
   $vnet0 = "cxlv1";
}

j2 {
   $vnet0 = "cxlv2";
}

There are two hosts directly connected via cxl0. The problem is visible when pinging (1) between jails on the same host and (2) from an affected jail on host 1 to host 2. On an unaffected host both of these operations succeed.

Using tcpdump on the physical (cxl) and virtual (cxlv) interfaces shows the ARP requests and responses, but in an affected jail the ARP tables aren't updated.
Comment 3 Alexander Lunev 2019-10-09 11:43:04 UTC
I think the bug I wanted to report is somewhat similar: all the main actors (VNET, jails and ARP) are the same.

So I have a problem with network connectivity between jails and host when using jails with VNET and VLANs. 

I've written about it to the freebsd-net@ mailing list, in these threads:
https://lists.freebsd.org/pipermail/freebsd-net/2019-September/054391.html
https://lists.freebsd.org/pipermail/freebsd-net/2019-October/054437.html

There's a topic on the FreeBSD forums that confirms this and explains, in great detail, the configuration in which the problem occurs. However, the author "solved" his problem by simply avoiding that configuration: he no longer bridges the physical interface with the jail's VNET interface and no longer uses VLANs on the jail's VNET interface.

https://forums.freebsd.org/threads/bridge-epair-not-passing-through-tagged-vlan-traffic-between-host-and-vnet-jail.71646/

I'll add some more observations here. I recreated the configuration in a virtual machine, as I wrote in my last message to freebsd-net@ here: https://lists.freebsd.org/pipermail/freebsd-net/2019-October/054475.html. The jail's vlan interface IP is 10.15.15.2 and the host's vlan interface IP is 10.15.15.1. Neither the jail nor the host has an ARP entry for the other's address.

So I ping from 10.15.15.2 to 10.15.15.1. 

1. In the initial configuration, I see this on em0:

HOST# tcpdump -i em0 -e | grep 10.15.15
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
08:57:52.051429 02:95:ce:33:dc:0b (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 22, p 0, ethertype ARP, Request who-has 10.15.15.1 tell 10.15.15.2, length 28
08:57:53.071451 02:95:ce:33:dc:0b (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 22, p 0, ethertype ARP, Request who-has 10.15.15.1 tell 10.15.15.2, length 28
08:57:54.101515 02:95:ce:33:dc:0b (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 22, p 0, ethertype ARP, Request who-has 10.15.15.1 tell 10.15.15.2, length 28

2. Then I added an ARP entry in the jail:

JAIL# arp -s 10.15.15.1 00:0c:29:2f:6c:08

HOST# tcpdump -i em0 -e | grep 10.15.15
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:07:10.321257 00:0c:29:2f:6c:08 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 22, p 0, ethertype ARP, Request who-has 10.15.15.2 tell 10.15.15.1, length 28
09:07:11.391300 00:0c:29:2f:6c:08 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 22, p 0, ethertype ARP, Request who-has 10.15.15.2 tell 10.15.15.1, length 28
09:07:12.415232 00:0c:29:2f:6c:08 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 22, p 0, ethertype ARP, Request who-has 10.15.15.2 tell 10.15.15.1, length 28

3. Then I added the jail's ARP entry on the host:

HOST# arp -s 10.15.15.2 02:95:ce:33:dc:0b

After that, ICMP requests started to pass from the jail to the host, and the vlan22 interface on the host is receiving packets and sending replies:

HOST# tcpdump -i vlan22 -e | grep 10.15.15
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan22, link-type EN10MB (Ethernet), capture size 262144 bytes
09:37:11.517054 02:95:ce:33:dc:0b (oui Unknown) > 00:0c:29:2f:6c:08 (oui Unknown), ethertype IPv4 (0x0800), length 98: 10.15.15.2 > 10.15.15.1: ICMP echo request, id 25864, seq 0, length 64
09:37:11.517063 00:0c:29:2f:6c:08 (oui Unknown) > 02:95:ce:33:dc:0b (oui Unknown), ethertype IPv4 (0x0800), length 98: 10.15.15.1 > 10.15.15.2: ICMP echo reply, id 25864, seq 0, length 64

But I don't see the replies on the host's epair0a interface, which is bridged with em0 in bridge0; there are only requests on epair0a:

HOST# tcpdump -i epair0a -e | grep 10.15.15
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on epair0a, link-type EN10MB (Ethernet), capture size 262144 bytes
09:40:44.178363 02:95:ce:33:dc:0b (oui Unknown) > 00:0c:29:2f:6c:08 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, 10.15.15.2 > 10.15.15.1: ICMP echo request, id 32264, seq 0, length 64
09:40:45.221713 02:95:ce:33:dc:0b (oui Unknown) > 00:0c:29:2f:6c:08 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, 10.15.15.2 > 10.15.15.1: ICMP echo request, id 32264, seq 1, length 64
09:40:46.253079 02:95:ce:33:dc:0b (oui Unknown) > 00:0c:29:2f:6c:08 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, 10.15.15.2 > 10.15.15.1: ICMP echo request, id 32264, seq 2, length 64

And on em0 I see only replies:

HOST# tcpdump -i em0 -e | grep 10.15.15
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:41:11.092092 00:0c:29:2f:6c:08 (oui Unknown) > 02:95:ce:33:dc:0b (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, 10.15.15.1 > 10.15.15.2: ICMP echo reply, id 34568, seq 0, length 64
09:41:12.096310 00:0c:29:2f:6c:08 (oui Unknown) > 02:95:ce:33:dc:0b (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, 10.15.15.1 > 10.15.15.2: ICMP echo reply, id 34568, seq 1, length 64
09:41:13.121890 00:0c:29:2f:6c:08 (oui Unknown) > 02:95:ce:33:dc:0b (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, 10.15.15.1 > 10.15.15.2: ICMP echo reply, id 34568, seq 2, length 64
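The frame lengths reported by tcpdump in the captures above can be cross-checked with a short sketch. It assumes a 14-byte Ethernet header, a 4-byte 802.1Q tag, and a 20-byte IPv4 header without options; the helper name is hypothetical.

```python
ETH_HEADER_LEN = 14   # destination MAC + source MAC + EtherType
VLAN_TAG_LEN = 4      # 802.1Q tag inserted after the source MAC
IPV4_HEADER_LEN = 20  # assumes no IP options
ARP_IPV4_LEN = 28     # "length 28" in the ARP lines
ICMP_LEN = 64         # "length 64" in the ICMP lines (ICMP header + data)


def frame_len(l3_len: int, tagged: bool) -> int:
    """Return the on-wire frame length (without FCS) for a given L3 payload."""
    return ETH_HEADER_LEN + (VLAN_TAG_LEN if tagged else 0) + l3_len


arp_tagged = frame_len(ARP_IPV4_LEN, tagged=True)                   # em0 ARP lines
icmp_untagged = frame_len(IPV4_HEADER_LEN + ICMP_LEN, tagged=False)  # vlan22 lines
icmp_tagged = frame_len(IPV4_HEADER_LEN + ICMP_LEN, tagged=True)     # em0 / epair0a lines

print(arp_tagged, icmp_untagged, icmp_tagged)  # 46 98 102
```

The results match the lengths 46, 98 and 102 in the captures, so the frames on em0 and epair0a carry the VLAN 22 tag while the ones seen on vlan22 have already had it stripped, as expected.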

And on the bridge interface neither requests nor replies are shown:

HOST# tcpdump -i bridge0 -e | grep 10.15.15
... silence ...
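To compare what each interface sees, the saved tcpdump output could be tallied with a sketch like the one below. The function name is hypothetical, and the two sample lines are copied from the epair0a and em0 captures above.

```python
from collections import Counter


def tally_icmp(lines):
    """Count ICMP echo requests and replies in tcpdump text output."""
    counts = Counter()
    for line in lines:
        if "ICMP echo request" in line:
            counts["request"] += 1
        elif "ICMP echo reply" in line:
            counts["reply"] += 1
    return counts


epair0a_sample = [
    "09:40:44.178363 02:95:ce:33:dc:0b (oui Unknown) > 00:0c:29:2f:6c:08 (oui Unknown), "
    "ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, "
    "10.15.15.2 > 10.15.15.1: ICMP echo request, id 32264, seq 0, length 64",
]
em0_sample = [
    "09:41:11.092092 00:0c:29:2f:6c:08 (oui Unknown) > 02:95:ce:33:dc:0b (oui Unknown), "
    "ethertype 802.1Q (0x8100), length 102: vlan 22, p 0, ethertype IPv4, "
    "10.15.15.1 > 10.15.15.2: ICMP echo reply, id 34568, seq 0, length 64",
]

for name, sample in (("epair0a", epair0a_sample), ("em0", em0_sample)):
    counts = tally_icmp(sample)
    print(f"{name}: requests={counts['request']} replies={counts['reply']}")
```

Running the same tally over full captures from epair0a, em0, bridge0 and vlan22 would show at a glance which direction is missing on which interface.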

Is this normal, or am I doing something wrong?
I wanted to make jails act like a normal FreeBSD host with one dedicated VNET interface carrying VLANs.