Summary: | ixv driver in 11.0-CURRENT(10.1 & 10.2 RELEASE) doesn't pass traffic using XEN hypervisor(AWS EC2) | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Jarrod Petz <jlpetz> | ||||
Component: | kern | Assignee: | freebsd-net (Nobody) <net> | ||||
Status: | Closed FIXED | ||||||
Severity: | Affects Some People | CC: | davdunc, erj, jeffrey.e.pieper, jlpetz, meyer.sydney | ||||
Priority: | --- | Keywords: | IntelNetworking | ||||
Version: | CURRENT | ||||||
Hardware: | amd64 | ||||||
OS: | Any | ||||||
Description
Jarrod Petz
2015-09-09 06:06:20 UTC
We are able to reproduce this, but we will need logs from the host during the time of the failure.

We are able to reproduce this with a KVM hypervisor. This occurs when the PF has MTU set to 9000 and the VF has MTU set to default (1500). We are investigating.

After some additional testing using KVM, I've found that if the MTU on the PF is set to 9000 BEFORE the VF is created, ixv behaves as expected. If the MTU is changed AFTER the VF is created and then attached to ixv, then the issue is reproducible. We are continuing to investigate.

I have had feedback from other engineers who confirmed that this patch fixes the issue:
https://reviews.freebsd.org/D4186
However, there were some small issues with it, as detailed below.

-------------------------------------------------------------------------------------
I applied the changes from https://reviews.freebsd.org/D4186 to 11.0-CURRENT (which among other things adds the missing VF-PF API renegotiation on the reset path) and saw packets arriving in the instance, but tagged with vlan 2048.

# tcpdump -i ixv0 -e -vvv
tcpdump: listening on ixv0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:39:07.551985 12:8d:18:b1:e5:6b (oui Unknown) > 12:39:94:73:0b:1d (oui Unknown), ethertype 802.1Q (0x8100), length 60: vlan 2048, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has ip-10-0-3-114.ec2.internal tell ip-10-0-3-1.ec2.internal, length 42
10:39:08.552133 12:8d:18:b1:e5:6b (oui Unknown) > 12:39:94:73:0b:1d (oui Unknown), ethertype 802.1Q (0x8100), length 60: vlan 2048, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has ip-10-0-3-114.ec2.internal tell ip-10-0-3-1.ec2.internal, length 42

After creating a vlan0 interface with ID 2048 on top of ixv0, I saw traffic passing and DHCP worked.
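For anyone reading the capture above: the `vlan 2048` that tcpdump prints is the 12-bit VLAN ID carried in the 802.1Q tag that follows the two MAC addresses. Below is a minimal sketch of how that field is decoded from a raw frame. The function name `frame_vlan_id` and the `TPID_8021Q` macro are hypothetical names made up for this illustration; they are not part of the ixv driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 802.1Q Tag Protocol Identifier, shown by tcpdump as "ethertype 802.1Q (0x8100)". */
#define TPID_8021Q 0x8100u

/*
 * Hypothetical helper: return the 12-bit VLAN ID of an Ethernet frame,
 * or -1 if the frame is too short or carries no 802.1Q tag.
 *
 * Layout assumed: dst MAC (6) | src MAC (6) | TPID (2) | TCI (2) | inner ethertype (2) ...
 */
int
frame_vlan_id(const uint8_t *frame, size_t len)
{
	assert(frame != NULL);

	if (len < 18)
		return (-1);

	/* Multi-byte fields are big-endian on the wire. */
	uint16_t tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != TPID_8021Q)
		return (-1);

	/* TCI = 3 bits priority, 1 bit DEI, 12 bits VLAN ID. */
	uint16_t tci = (uint16_t)((frame[14] << 8) | frame[15]);
	return (tci & 0x0FFF);
}
```

Fed the first 18 bytes of the ARP request in the capture (a TCI of 0x0800 with priority 0), this returns 2048. An untagged frame returns -1.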
# ifconfig vlan0 create
# ifconfig vlan0 vlan 2048 vlandev ixv0
# tcpdump -i vlan0 -vvv -s65534 -n
tcpdump: listening on vlan0, link-type EN10MB (Ethernet), capture size 65534 bytes
10:42:00.342629 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from 12:39:94:73:0b:1d, length 300, xid 0x5d968cbb, Flags [none] (0x0000)
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Client-ID Option 61, length 7: ether 12:39:94:73:0b:1d
            Hostname Option 12, length 13: "ip-10-0-0-203"
            Parameter-Request Option 55, length 9:
              Subnet-Mask, BR, Time-Zone, Classless-Static-Route
              Default-Gateway, Domain-Name, Domain-Name-Server, Hostname
              Option 119
            END Option 255, length 0
            PAD Option 0, length 0, occurs 21
10:42:00.342916 IP (tos 0x10, ttl 16, id 0, offset 0, flags [none], proto UDP (17), length 337)
    10.0.3.1.67 > 10.0.3.114.68: [udp sum ok] BOOTP/DHCP, Reply, length 309, xid 0x5d968cbb, Flags [none] (0x0000)
          Your-IP 10.0.3.114
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 10.0.3.1
            Lease-Time Option 51, length 4: 3600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.0.3.255
            Default-Gateway Option 3, length 4: 10.0.3.1
            Domain-Name Option 15, length 12: "ec2.internal"
            Domain-Name-Server Option 6, length 4: 10.0.0.2
            Hostname Option 12, length 13: "ip-10-0-3-114"
            END Option 255, length 0
10:42:02.365085 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from 12:39:94:73:0b:1d, length 300, xid 0x5d968cbb, Flags [none] (0x0000)
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Request
            Server-ID Option 54, length 4: 10.0.3.1
            Requested-IP Option 50, length 4: 10.0.3.114
            Client-ID Option 61, length 7: ether 12:39:94:73:0b:1d
            Hostname Option 12, length 13: "ip-10-0-0-203"
            Parameter-Request Option 55, length 9:
              Subnet-Mask, BR, Time-Zone, Classless-Static-Route
              Default-Gateway, Domain-Name, Domain-Name-Server, Hostname
              Option 119
            END Option 255, length 0
            PAD Option 0, length 0, occurs 9
10:42:02.365274 IP (tos 0x10, ttl 16, id 0, offset 0, flags [none], proto UDP (17), length 337)
    10.0.3.1.67 > 10.0.3.114.68: [udp sum ok] BOOTP/DHCP, Reply, length 309, xid 0x5d968cbb, Flags [none] (0x0000)
          Your-IP 10.0.3.114
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: ACK
            Server-ID Option 54, length 4: 10.0.3.1
            Lease-Time Option 51, length 4: 3600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.0.3.255
            Default-Gateway Option 3, length 4: 10.0.3.1
            Domain-Name Option 15, length 12: "ec2.internal"
            Domain-Name-Server Option 6, length 4: 10.0.0.2
            Hostname Option 12, length 13: "ip-10-0-3-114"
            END Option 255, length 0
10:42:02.370732 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.3.114 tell 10.0.3.114, length 28
10:42:16.345260 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.3.114 tell 10.0.3.1, length 42
10:42:16.345280 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.3.114 is-at 12:39:94:73:0b:1d, length 28
^C

So I added the following patch to the VF driver in the instance to force the VF into stripping VLAN tags on RX and now the instance is able to acquire a DHCP lease and pass traffic on the interface.
diff --git a/dev/ixgbe/if_ixv.c b/dev/ixgbe/if_ixv.c
index bd06492..a90b4f2 100644
--- a/dev/ixgbe/if_ixv.c
+++ b/dev/ixgbe/if_ixv.c
@@ -1700,6 +1700,7 @@ ixv_initialize_receive_units(struct adapter *adapter)
 		/* Do the queue enabling last */
 		rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(i));
 		rxdctl |= IXGBE_RXDCTL_ENABLE;
+		rxdctl |= IXGBE_RXDCTL_VME;
 		IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(i), rxdctl);
 		for (int k = 0; k < 10; k++) {
 			if (IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(i)) &

All this with an unmodified host driver. The patch probably breaks VLANs inside the instance in some way.
-------------------------------------------------------------------------------------

I didn't see any action from others on this, so I have submitted a diff for review:
https://reviews.freebsd.org/D4788

This allows DHCP to work and obtain a lease. Be mindful, though, that if you use this you should ensure the MTU is set correctly for AWS instances. See my diff for details on why.

I am resolving this. The commits below have fixed FreeBSD CURRENT/11, and 10 will be fixed if these get MFC'ed.

https://reviews.freebsd.org/D4186
https://reviews.freebsd.org/rS292674
https://reviews.freebsd.org/D4788
https://reviews.freebsd.org/rS293338
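
A note on the one-line `IXGBE_RXDCTL_VME` patch quoted earlier in this report: it sets the VLAN-strip bit unconditionally, which is why it was flagged as likely to break VLANs inside the instance. The sketch below shows the same read-modify-write step with the strip bit made conditional. It is only an illustration under assumptions. The function `rxdctl_enable_queue` and its `strip_vlan` parameter are hypothetical, the bit values are what I believe `ixgbe_type.h` defines and should be checked against the tree, and nothing here says how the driver should decide when to strip.

```c
#include <stdint.h>

/* Assumed to mirror IXGBE_RXDCTL_ENABLE and IXGBE_RXDCTL_VME; verify against ixgbe_type.h. */
#define RXDCTL_ENABLE	0x02000000u	/* enable this RX queue */
#define RXDCTL_VME	0x40000000u	/* strip the VLAN tag on receive */

/*
 * Hypothetical helper: compute the value to write back to a queue's
 * RX descriptor control register. It operates on a plain integer so it
 * can be exercised without hardware; in the driver the value would come
 * from IXGBE_READ_REG() and go back through IXGBE_WRITE_REG().
 */
uint32_t
rxdctl_enable_queue(uint32_t rxdctl, int strip_vlan)
{
	/* Always enable the queue, as the existing code does. */
	rxdctl |= RXDCTL_ENABLE;

	/* Set or clear tag stripping explicitly instead of forcing it on. */
	if (strip_vlan)
		rxdctl |= RXDCTL_VME;
	else
		rxdctl &= ~RXDCTL_VME;

	return (rxdctl);
}
```

Calling it with `strip_vlan` set to 1 reproduces what the quoted patch does for every queue. Passing 0 leaves tags in place so that a vlan(4) interface stacked on ixv0 still sees them.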