Bug 226086

Summary: [nvme] Intel Optane 900P kernel panic when device is passed through ESXi (6.5)
Product: Base System
Reporter: Wessel van Norel <wessel.van.norel>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed Not A Bug
Severity: Affects Some People
CC: 35gh
Priority: ---
Version: CURRENT   
Hardware: arm64   
OS: Any   
Attachments:
Description Flags
Screenshot of the kernel panic none

Description Wessel van Norel 2018-02-21 09:24:00 UTC
Created attachment 190851
Screenshot of the kernel panic

I'm trying to build a FreeNAS system on an ESXi host with an Intel Optane 900P as the ZIL/SLOG device via passthrough. But when I boot the VM, it crashes with a kernel panic. It boots correctly only when I remove the Intel Optane 900P from the VM.

After running into this, I found a report of the same issue in the FreeNAS bug tracker:

https://redmine.ixsystems.com/issues/26508

In the bug there is a comment (Warner is Warner Losh <imp@FreeBSD.org>):

Yes, Warner replied to my emails.

His latest suggestion was
"There is a small chance https://reviews.freebsd.org/D14053 fixes this."


So I tried to boot the most recent nightly ISO:

FreeBSD-12.0-CURRENT-amd64-20180215-r329338-disc1.iso

And that failed with the same kernel panic (I assume review D14053 is included, since the revision number of the ISO is higher than the revision in which review D14053 was committed).

When I boot the system from a USB stick with the nightly ISO (so without ESXi), the kernel panic does not occur and the nvme device is detected correctly.
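The revision comparison assumed above can be sketched roughly as follows. This is only an illustration: the helper names `snapshot_revision` and `snapshot_includes` are hypothetical, the revision at which D14053 landed is not stated in this report and has to be supplied by the reader, and the check is only meaningful if that commit went to the same branch (head) the snapshot was built from, because Subversion revision numbers are repository-wide.

```python
import re


def snapshot_revision(iso_name: str) -> int:
    """Extract the SVN revision (the rNNNNNN part) from a snapshot name."""
    match = re.search(r"-r(\d+)\b", iso_name)
    if match is None:
        raise ValueError(f"no revision found in {iso_name!r}")
    return int(match.group(1))


def snapshot_includes(iso_name: str, commit_revision: int) -> bool:
    """Return True if the snapshot was built at or after commit_revision.

    Assumption: the commit landed on the branch the snapshot was built from.
    """
    return snapshot_revision(iso_name) >= commit_revision


iso = "FreeBSD-12.0-CURRENT-amd64-20180215-r329338-disc1.iso"
print(snapshot_revision(iso))  # prints 329338
```

Calling `snapshot_includes(iso, commit_revision)` with the actual revision of the D14053 commit would then confirm or refute the assumption.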

The kernel panic I'm getting is:

Fatal trap 12: page fault while in kernel mode. I've attached a screenshot of the exact kernel panic.

When I add only my Samsung 960 PRO to the VM, it works. So it's the Intel Optane 900P that is causing this issue, but only when it is passed through.


How can I help to find the root cause?

Comment 1 Wessel van Norel 2018-02-26 14:29:50 UTC
I've updated the ESXi drivers for the nvme device and that solved the issue.

https://my.vmware.com/group/vmware/details?downloadGroup=DT-ESX65-INTEL-INTEL-NVME-1328&productId=614

With this driver installed the FreeBSD current ISO boots.

So this issue can be closed.

Comment 2 35gh 2018-06-15 11:35:49 UTC
This bug needs to be re-opened, as the issue is not fixed.

Tested configuration:
- Intel 900P 280GB U.2
- SuperMicro X10SDV, 64GB RAM

- ESXi 6.5 2018-05-03 (Update 2) (Imageprofile ESXi-6.5.0-20180502001-standard (Build 8294253))
- ESXi 6.7 2018-04-17 (GA) (Imageprofile ESXi-6.7.0-8169922-standard (Build 8169922))
- Tested with and without the Intel NVMe driver (intel-nvme-1.3.2.4-1OEM.650.0.0.4598673.x86_64.vib)

- FreeBSD 11.1 prod (FreeBSD-11.1-STABLE-amd64-20171227-r327234)
- FreeBSD 12 (FreeBSD-12.0-CURRENT-amd64-20180125-r328383)

Apparently, this is related to the order of PCI devices in the VM; see https://redmine.ixsystems.com/issues/26508