Bug 229435 - ixv SR-IOV currently broken (affects EC2)
Summary: ixv SR-IOV currently broken (affects EC2)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Colin Percival
URL:
Keywords:
Depends on:
Blocks: 228911
Reported: 2018-06-30 21:17 UTC by Colin Percival
Modified: 2018-09-11 18:46 UTC
CC List: 5 users

See Also:


Attachments
AWS c4.xlarge ixv hang on boot screenshoot (697.97 KB, image/jpeg)
2018-08-05 03:28 UTC, pete
c4.xlarge verbose boot (7.67 KB, text/plain)
2018-08-05 03:48 UTC, pete

Description Colin Percival freebsd_committer 2018-06-30 21:17:13 UTC
SR-IOV support in the ixv driver is currently broken.  As a result, FreeBSD 12.0 is unable to access the network on EC2 instances where this hardware is used.

If this is not fixed before the release, we should remove the --sriov flag from src/release/Makefile.ec2 so that the AMIs will be marked as not supporting the ixv hardware (at which point those EC2 instances will provide a Xen virtual network device instead).
Comment 1 Ed Maste freebsd_committer 2018-07-31 16:16:05 UTC
Do we have any more details / reproduction recipe? Are the Intel driver folks on it?
Comment 2 pete 2018-08-05 03:28:44 UTC
Created attachment 195875
AWS c4.xlarge ixv hang on boot screenshoot
Comment 3 pete 2018-08-05 03:31:28 UTC
This is how I can reproduce the problem:

1) Deploy the FreeBSD 11.2-RELEASE AMI (for us-west-2, ami-206a2158).
- Choose a non-ixv EC2 instance type, for example t2.xlarge (4 cores, 16 GB, xn NIC).

2) Build 12-CURRENT on this system.

3) After you have booted into this system and verified that it works as expected, stop the instance and change the instance type to one that exposes an ixv NIC, for example c4.xlarge.

4) The instance will hang on startup. Viewing the instance screenshot (Actions -> Instance Settings -> Get Instance Screenshot in the AWS web console) should show the instance hung.

I've attached a screenshot to this ticket showing where in boot it has hung.

You can stop the instance and revert it to a t2.xlarge to update things and test again. I am happy to test any patches or anything else that may help get this resolved. I'm currently using 11.2-REL in my infrastructure but want to make sure 12 works as soon as it's ready.
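
For anyone who prefers to script steps 3 and 4 rather than click through the web console, here is a minimal sketch that only assembles the corresponding AWS CLI invocations. It assumes the AWS CLI is installed and configured; the helper name repro_commands and the instance ID are hypothetical placeholders, not something from this report. The instance will be reported as running well before boot finishes, so the final screenshot command may need to be repeated after a delay.

```python
import shlex


def repro_commands(instance_id, target_type="c4.xlarge"):
    """Return AWS CLI invocations (as argv lists) that switch an instance
    to target_type, restart it, and fetch a console screenshot.

    instance_id is a placeholder; nothing here is executed.
    """
    ids = ["--instance-ids", instance_id]
    return [
        # The instance type can only be changed while the instance is stopped.
        ["aws", "ec2", "stop-instances", *ids],
        ["aws", "ec2", "wait", "instance-stopped", *ids],
        ["aws", "ec2", "modify-instance-attribute",
         "--instance-id", instance_id,
         "--instance-type", "Value=" + target_type],
        ["aws", "ec2", "start-instances", *ids],
        ["aws", "ec2", "wait", "instance-running", *ids],
        # Returns JSON containing a base64-encoded image of the console.
        ["aws", "ec2", "get-console-screenshot", "--instance-id", instance_id],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in repro_commands("i-0123456789abcdef0"):
        print(shlex.join(argv))
```

Calling repro_commands(instance_id, "t2.xlarge") produces the same sequence for reverting to the working instance type.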
Comment 4 pete 2018-08-05 03:48:48 UTC
Created attachment 195876
c4.xlarge verbose boot

This doesn't seem to get to the point of loading the ixv interface; not sure if this is helpful.
Comment 5 Colin Percival freebsd_committer 2018-08-08 21:30:07 UTC
I've narrowed this down to between r326378 and r326622, based on launching weekly snapshot AMIs on c4.large EC2 instances.
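
The range above could in principle be narrowed further by bisection. The following is only a sketch of that idea, not the procedure actually used here: it assumes r326378 works, r326622 is broken, the breakage persists once introduced, and that a hypothetical is_broken(rev) check exists (in practice, building that revision and booting it on an ixv instance). The culprit revision in the demo is made up purely to exercise the function.

```python
def first_bad_revision(good, bad, is_broken):
    """Binary-search the interval (good, bad] for the first broken revision.

    Assumes `good` works, `bad` is broken, and every revision after the
    first broken one is also broken. Returns (revision, tests_run).
    """
    tests_run = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests_run += 1
        if is_broken(mid):
            bad = mid    # breakage was introduced at or before mid
        else:
            good = mid   # breakage was introduced after mid
    return bad, tests_run


if __name__ == "__main__":
    HYPOTHETICAL_CULPRIT = 326500  # invented for the demo, not the real commit

    def is_broken(rev):
        # Stand-in for "build this revision and boot it on a c4.large".
        return rev >= HYPOTHETICAL_CULPRIT

    rev, tests = first_bad_revision(326378, 326622, is_broken)
    print("first bad revision: r%d after %d test boots" % (rev, tests))
```

Because the interval spans 244 revisions, this needs at most 8 test boots under those assumptions; many of the intermediate revisions will not touch the driver at all, so inspecting the commit log for the range may be quicker.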
Comment 6 Kevin Bowling freebsd_committer 2018-08-11 05:08:17 UTC
https://reviews.freebsd.org/D16429
Comment 7 Colin Percival freebsd_committer 2018-09-11 18:46:56 UTC
Fixed by r338593