Created attachment 161991 [details]
Crashinfo file for NVME kernel panic
We are running a ZFS file server on 10.2-RELEASE. We recently encountered a kernel panic in the NVMe driver under heavy load, with an Intel DC P3700 NVMe SSD serving as both ZIL and L2ARC. The core.txt file from the crash dump is attached. All nvme(4) sysctl parameters were left at their defaults. The stack trace from the crash dump is:
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:219
#1 0xffffffff80948642 in kern_reboot (howto=260)
#2 0xffffffff80948a25 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:758
#3 0xffffffff809488b3 in panic (fmt=0x0)
#4 0xffffffff80d4aadb in trap_fatal (frame=<value optimized out>,
eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
#5 0xffffffff80d4addd in trap_pfault (frame=0xfffffe3fd0568870,
usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
#6 0xffffffff80d4a47a in trap (frame=0xfffffe3fd0568870)
#7 0xffffffff80d307f2 in calltrap ()
#8 0xfffff8345ad7aaa8 in ?? ()
#9 0xffffffff80dedc03 in nvme_bio_child_done (arg=<value optimized out>,
cpl=<value optimized out>) at /usr/src/sys/dev/nvme/nvme_ns.c:235
#10 0xffffffff80dee55e in nvme_qpair_complete_tracker (
cpl=<value optimized out>, print_on_error=<value optimized out>)
#11 0xffffffff80dee370 in nvme_qpair_process_completions (
qpair=0xfffff801268b9900) at /usr/src/sys/dev/nvme/nvme_qpair.c:433
#12 0xffffffff8091482b in intr_event_execute_handlers (
p=<value optimized out>, ie=0xfffff801268bf900)
#13 0xffffffff80914c76 in ithread_loop (arg=0xfffff801268d0380)
#14 0xffffffff8091244a in fork_exit (
callout=0xffffffff80914be0 <ithread_loop>, arg=0xfffff801268d0380,
frame=0xfffffe3fd0568ac0) at /usr/src/sys/kern/kern_fork.c:1018
#15 0xffffffff80d30d2e in fork_trampoline ()
CC'ing Jim, who works on nvme(4).
I've received a similar report from a second party offline. I am actively debugging this. In the e-mail to -current, it was suggested that this filer was using NVMe on 10.1 without any issues, and this issue was only observed after upgrading to 10.2. Can you confirm?
The NVMe boards were not installed very long before the upgrade to 10.2, so I don't think there was enough run time to say this wasn't an issue in 10.1.
BTW, we have two more crash dumps from panics in the driver if they would help.
Thanks for the quick reply. Yes - please post the additional crashinfo files.
Created attachment 162398 [details]
Crash info file
Created attachment 162399 [details]
Crash info file
Attached two more crash info files.
A commit references this bug:
Date: Fri Oct 30 16:06:34 UTC 2015
New revision: 290198
nvme: fix race condition in split bio completion path
Fixes race condition observed under following circumstances:
1) I/O split on 128KB boundary with Intel NVMe controller.
Current Intel controllers produce better latency when
I/Os do not span a 128KB boundary - even if the I/O size
itself is less than 128KB.
2) Per-CPU I/O queues are enabled.
3) Child I/Os are submitted on different submission queues.
4) Interrupts for child I/O completions occur almost
simultaneously.
5) ithread for child I/O A increments bio_inbed, then
immediately is preempted (rendezvous IPI, higher priority
interrupt).
6) ithread for child I/O B increments bio_inbed, then completes
parent bio since all children are now completed.
7) parent bio is freed, and immediately reallocated for a VFS
or gpart bio (including setting bio_children to 1 and
bio_inbed to 0).
8) ithread for child I/O A resumes processing. bio_children
for what it thinks is the parent bio is set to 1, so it
thinks it needs to complete the parent bio.
Result is either calling a NULL callback function, or double freeing
the bio to its uma zone.
Reported by: Drew Gallatin <email@example.com>,
Marc Goroff <firstname.lastname@example.org>
Tested by: Drew Gallatin <email@example.com>
MFC after: 3 days
Sponsored by: Intel
Issue should be fixed with SVN r290198 - waiting for confirmation from submitter.
Marc confirmed offline that this bug can be closed.