Bug 211713 - NVME controller failure: resetting (Samsung SM961 SSD Drives)
Summary: NVME controller failure: resetting (Samsung SM961 SSD Drives)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-bugs mailing list
URL: https://reviews.freebsd.org/D20873
Keywords: needs-qa, patch
Depends on:
Blocks:
 
Reported: 2016-08-10 05:51 UTC by IPTRACE
Modified: 2019-09-06 00:07 UTC
31 users

See Also:
koobs: mfc-stable10?
koobs: mfc-stable11?
lwhsu: mfc-stable12?


Attachments
Boot capture with uname and "resetting controller" (600.05 KB, image/bmp)
2017-03-14 20:50 UTC, Terry Kennedy
no flags Details
dd fails with error, but made some progress (600.05 KB, image/bmp)
2017-03-14 20:51 UTC, Terry Kennedy
no flags Details
newfs output - also makes some progress (600.05 KB, image/bmp)
2017-03-14 20:51 UTC, Terry Kennedy
no flags Details
Log of failed patch application on 11-STABLE (621 bytes, text/plain)
2018-03-16 08:22 UTC, Terry Kennedy
no flags Details
dmesg output (17.08 KB, text/plain)
2018-11-06 22:24 UTC, David
no flags Details
A patch trying to fix the missing interrupt issue on SM961. (1.10 KB, text/plain)
2019-06-30 05:02 UTC, Ka Ho Ng
no flags Details
Fix SM961 issue (1.24 KB, patch)
2019-07-06 11:16 UTC, Ka Ho Ng
no flags Details | Diff

Description IPTRACE 2016-08-10 05:51:43 UTC
Dear all!

I've encountered a problem with the nvme0 controller driver on FreeBSD.
The driver/system works properly with Samsung 950 Pro 512GB nvme.

Unfortunately, during installation of FreeBSD 10.3-RELEASE-p0 from a USB stick onto a Samsung SM961 1TB NVMe drive, I see the following messages:

 nvme0: resetting controller
 nvme0: aborting outstanding i/o
 nvme0: WRITE sqid:8 cid:127 nsid:1 lba:5131264 len:64
 nvme0: ABORTED - BY REQUEST (00/07) sqid:8: cid:127 cdw0:0
or
 nvme0: resetting controller
 nvme0: aborting outstanding i/o
 nvme0: WRITE sqid:8 cid:127 nsid:1 lba:2047 len:1
 nvme0: ABORTED - BY REQUEST (00/07) sqid:8: cid:127 cdw0:0

The above messages repeat.
The failure most often occurs while mounting partitions and before the installation files are copied.
I can still partition and format the disk, and I can use disklabel, diskinfo and fdisk to check the drive, but sometimes the error also occurs during system boot.

The drive works properly on Windows 10 and has no errors.

Thanks for help.
Comment 1 Warner Losh freebsd_committer 2016-08-10 06:08:15 UTC
Can you post more of the dmesg to give proper context to when this happens?
Comment 2 IPTRACE 2016-08-10 07:24:37 UTC
Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.3-RELEASE #0 r297264: Fri Mar 25 02:10:02 UTC 2016
    root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
VT(efifb): resolution 1280x1024
CPU: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (2300.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Structured Extended Features=0x37ab<FSGSBASE,TSCADJ,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,NFPUSG>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 274869518336 (262136 MB)
avail memory = 267073318912 (254700 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <GBT    GBTUACPI>
FreeBSD/SMP: Multiprocessor System Detected: 40 CPUs
FreeBSD/SMP: 2 package(s) x 10 core(s) x 2 SMT threads
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): APIC ID:  6
 cpu7 (AP): APIC ID:  7
 cpu8 (AP): APIC ID:  8
 cpu9 (AP): APIC ID:  9
 cpu10 (AP): APIC ID: 16
 cpu11 (AP): APIC ID: 17
 cpu12 (AP): APIC ID: 18
 cpu13 (AP): APIC ID: 19
 cpu14 (AP): APIC ID: 20
 cpu15 (AP): APIC ID: 21
 cpu16 (AP): APIC ID: 22
 cpu17 (AP): APIC ID: 23
 cpu18 (AP): APIC ID: 24
 cpu19 (AP): APIC ID: 25
 cpu20 (AP): APIC ID: 32
 cpu21 (AP): APIC ID: 33
 cpu22 (AP): APIC ID: 34
 cpu23 (AP): APIC ID: 35
 cpu24 (AP): APIC ID: 36
 cpu25 (AP): APIC ID: 37
 cpu26 (AP): APIC ID: 38
 cpu27 (AP): APIC ID: 39
 cpu28 (AP): APIC ID: 40
 cpu29 (AP): APIC ID: 41
 cpu30 (AP): APIC ID: 48
 cpu31 (AP): APIC ID: 49
 cpu32 (AP): APIC ID: 50
 cpu33 (AP): APIC ID: 51
 cpu34 (AP): APIC ID: 52
 cpu35 (AP): APIC ID: 53
 cpu36 (AP): APIC ID: 54
 cpu37 (AP): APIC ID: 55
 cpu38 (AP): APIC ID: 56
 cpu39 (AP): APIC ID: 57
random: <Software, Yarrow> initialized
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
module_register_init: MOD_LOAD (vesa, 0xffffffff80dc6500, 0) error 19
kbd0 at kbdmux0
acpi0: <GBT GBTUACPI> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
cpu4: <ACPI CPU> on acpi0
cpu5: <ACPI CPU> on acpi0
cpu6: <ACPI CPU> on acpi0
cpu7: <ACPI CPU> on acpi0
cpu8: <ACPI CPU> on acpi0
cpu9: <ACPI CPU> on acpi0
cpu10: <ACPI CPU> on acpi0
cpu11: <ACPI CPU> on acpi0
cpu12: <ACPI CPU> on acpi0
cpu13: <ACPI CPU> on acpi0
cpu14: <ACPI CPU> on acpi0
cpu15: <ACPI CPU> on acpi0
cpu16: <ACPI CPU> on acpi0
cpu17: <ACPI CPU> on acpi0
cpu18: <ACPI CPU> on acpi0
cpu19: <ACPI CPU> on acpi0
cpu20: <ACPI CPU> on acpi0
cpu21: <ACPI CPU> on acpi0
cpu22: <ACPI CPU> on acpi0
cpu23: <ACPI CPU> on acpi0
cpu24: <ACPI CPU> on acpi0
cpu25: <ACPI CPU> on acpi0
cpu26: <ACPI CPU> on acpi0
cpu27: <ACPI CPU> on acpi0
cpu28: <ACPI CPU> on acpi0
cpu29: <ACPI CPU> on acpi0
cpu30: <ACPI CPU> on acpi0
cpu31: <ACPI CPU> on acpi0
cpu32: <ACPI CPU> on acpi0
cpu33: <ACPI CPU> on acpi0
cpu34: <ACPI CPU> on acpi0
cpu35: <ACPI CPU> on acpi0
cpu36: <ACPI CPU> on acpi0
cpu37: <ACPI CPU> on acpi0
cpu38: <ACPI CPU> on acpi0
cpu39: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
Event timer "HPET3" frequency 14318180 Hz quality 340
Event timer "HPET4" frequency 14318180 Hz quality 340
Event timer "HPET5" frequency 14318180 Hz quality 340
Event timer "HPET6" frequency 14318180 Hz quality 340
Event timer "HPET7" frequency 14318180 Hz quality 340
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pci255: <ACPI PCI bus> on pcib0
pcib1: <ACPI Host-PCI bridge> on acpi0
pci127: <ACPI PCI bus> on pcib1
pcib2: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 26 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> irq 32 at device 2.0 on pci0
pci2: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 32 at device 2.2 on pci0
pci3: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> irq 40 at device 3.0 on pci0
pci4: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> irq 40 at device 3.2 on pci0
pci5: <ACPI PCI bus> on pcib7
pci0: <unknown> at device 17.0 (no driver attached)
xhci0: <XHCI (generic) USB 3.0 controller> mem 0xc7200000-0xc720ffff irq 19 at device 20.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
usbus0 on xhci0
pci0: <simple comms> at device 22.0 (no driver attached)
pci0: <simple comms> at device 22.1 (no driver attached)
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xc7214000-0xc72143ff irq 18 at device 26.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
pcib8: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci6: <ACPI PCI bus> on pcib8
pcib9: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pci7: <ACPI PCI bus> on pcib9
pcib10: <ACPI PCI-PCI bridge> at device 0.0 on pci7
pci8: <ACPI PCI bus> on pcib10
vgapci0: <VGA-compatible display> port 0x6000-0x607f mem 0xc6000000-0xc6ffffff,0xc7000000-0xc701ffff irq 16 at device 0.0 on pci8
vgapci0: Boot video device
pcib11: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
pci9: <ACPI PCI bus> on pcib11
nvme0: <Generic NVMe Device> mem 0xc7100000-0xc7103fff irq 16 at device 0.0 on pci9
ehci1: <EHCI (generic) USB 2.0 controller> mem 0xc7213000-0xc72133ff irq 18 at device 29.0 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Wellsburg AHCI SATA controller> port 0x7050-0x7057,0x7040-0x7043,0x7030-0x7037,0x7020-0x7023,0x7000-0x701f mem 0xc7212000-0xc72127ff irq 16 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
pcib12: <ACPI Host-PCI bridge> on acpi0
pci128: <ACPI PCI bus> on pcib12
pcib13: <ACPI PCI-PCI bridge> irq 50 at device 1.0 on pci128
pci129: <ACPI PCI bus> on pcib13
igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xf020-0xf03f mem 0xfbd20000-0xfbd3ffff,0xfbd44000-0xfbd47fff irq 50 at device 0.0 on pci129
igb0: Using MSIX interrupts with 9 vectors
igb0: Ethernet address: 40:8d:5c:6d:c1:81
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: Bound queue 4 to cpu 4
igb0: Bound queue 5 to cpu 5
igb0: Bound queue 6 to cpu 6
igb0: Bound queue 7 to cpu 7
igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xf000-0xf01f mem 0xfbd00000-0xfbd1ffff,0xfbd40000-0xfbd43fff irq 52 at device 0.1 on pci129
igb1: Using MSIX interrupts with 9 vectors
igb1: Ethernet address: 40:8d:5c:6d:c1:82
igb1: Bound queue 0 to cpu 8
igb1: Bound queue 1 to cpu 9
igb1: Bound queue 2 to cpu 10
igb1: Bound queue 3 to cpu 11
igb1: Bound queue 4 to cpu 12
igb1: Bound queue 5 to cpu 13
igb1: Bound queue 6 to cpu 14
igb1: Bound queue 7 to cpu 15
pcib14: <ACPI PCI-PCI bridge> irq 56 at device 2.0 on pci128
pci130: <ACPI PCI bus> on pcib14
pcib15: <ACPI PCI-PCI bridge> at device 0.0 on pci130
pci131: <ACPI PCI bus> on pcib15
pcib16: <PCI-PCI bridge> at device 2.0 on pci131
pci132: <PCI bus> on pcib16
igb2: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe020-0xe03f mem 0xfbc20000-0xfbc3ffff,0xfb800000-0xfbbfffff,0xfbc44000-0xfbc47fff irq 61 at device 0.0 on pci132
igb2: Using MSIX interrupts with 9 vectors
igb2: Ethernet address: 90:e2:ba:06:6a:d8
igb2: Bound queue 0 to cpu 16
igb2: Bound queue 1 to cpu 17
igb2: Bound queue 2 to cpu 18
igb2: Bound queue 3 to cpu 19
igb2: Bound queue 4 to cpu 20
igb2: Bound queue 5 to cpu 21
igb2: Bound queue 6 to cpu 22
igb2: Bound queue 7 to cpu 23
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe000-0xe01f mem 0xfbc00000-0xfbc1ffff,0xfb000000-0xfb3fffff,0xfbc40000-0xfbc43fff irq 62 at device 0.1 on pci132
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address: 90:e2:ba:06:6a:d9
igb3: Bound queue 0 to cpu 24
igb3: Bound queue 1 to cpu 25
igb3: Bound queue 2 to cpu 26
igb3: Bound queue 3 to cpu 27
igb3: Bound queue 4 to cpu 28
igb3: Bound queue 5 to cpu 29
igb3: Bound queue 6 to cpu 30
igb3: Bound queue 7 to cpu 31
pcib17: <PCI-PCI bridge> at device 4.0 on pci131
pci133: <PCI bus> on pcib17
igb4: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xd020-0xd03f mem 0xfa820000-0xfa83ffff,0xfa400000-0xfa7fffff,0xfa844000-0xfa847fff irq 56 at device 0.0 on pci133
igb4: Using MSIX interrupts with 9 vectors
igb4: Ethernet address: 90:e2:ba:06:6a:dc
igb4: Bound queue 0 to cpu 32
igb4: Bound queue 1 to cpu 33
igb4: Bound queue 2 to cpu 34
igb4: Bound queue 3 to cpu 35
igb4: Bound queue 4 to cpu 36
igb4: Bound queue 5 to cpu 37
igb4: Bound queue 6 to cpu 38
igb4: Bound queue 7 to cpu 39
igb5: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xd000-0xd01f mem 0xfa800000-0xfa81ffff,0xf9c00000-0xf9ffffff,0xfa840000-0xfa843fff irq 60 at device 0.1 on pci133
igb5: Using MSIX interrupts with 9 vectors
igb5: Ethernet address: 90:e2:ba:06:6a:dd
igb5: Bound queue 0 to cpu 0
igb5: Bound queue 1 to cpu 1
igb5: Bound queue 2 to cpu 2
igb5: Bound queue 3 to cpu 3
igb5: Bound queue 4 to cpu 4
igb5: Bound queue 5 to cpu 5
igb5: Bound queue 6 to cpu 6
igb5: Bound queue 7 to cpu 7
acpi_button0: <Power Button> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est4: <Enhanced SpeedStep Frequency Control> on cpu4
est5: <Enhanced SpeedStep Frequency Control> on cpu5
est6: <Enhanced SpeedStep Frequency Control> on cpu6
est7: <Enhanced SpeedStep Frequency Control> on cpu7
est8: <Enhanced SpeedStep Frequency Control> on cpu8
est9: <Enhanced SpeedStep Frequency Control> on cpu9
est10: <Enhanced SpeedStep Frequency Control> on cpu10
est11: <Enhanced SpeedStep Frequency Control> on cpu11
est12: <Enhanced SpeedStep Frequency Control> on cpu12
est13: <Enhanced SpeedStep Frequency Control> on cpu13
est14: <Enhanced SpeedStep Frequency Control> on cpu14
est15: <Enhanced SpeedStep Frequency Control> on cpu15
est16: <Enhanced SpeedStep Frequency Control> on cpu16
est17: <Enhanced SpeedStep Frequency Control> on cpu17
est18: <Enhanced SpeedStep Frequency Control> on cpu18
est19: <Enhanced SpeedStep Frequency Control> on cpu19
est20: <Enhanced SpeedStep Frequency Control> on cpu20
est21: <Enhanced SpeedStep Frequency Control> on cpu21
est22: <Enhanced SpeedStep Frequency Control> on cpu22
est23: <Enhanced SpeedStep Frequency Control> on cpu23
est24: <Enhanced SpeedStep Frequency Control> on cpu24
est25: <Enhanced SpeedStep Frequency Control> on cpu25
est26: <Enhanced SpeedStep Frequency Control> on cpu26
est27: <Enhanced SpeedStep Frequency Control> on cpu27
est28: <Enhanced SpeedStep Frequency Control> on cpu28
est29: <Enhanced SpeedStep Frequency Control> on cpu29
est30: <Enhanced SpeedStep Frequency Control> on cpu30
est31: <Enhanced SpeedStep Frequency Control> on cpu31
est32: <Enhanced SpeedStep Frequency Control> on cpu32
est33: <Enhanced SpeedStep Frequency Control> on cpu33
est34: <Enhanced SpeedStep Frequency Control> on cpu34
est35: <Enhanced SpeedStep Frequency Control> on cpu35
est36: <Enhanced SpeedStep Frequency Control> on cpu36
est37: <Enhanced SpeedStep Frequency Control> on cpu37
est38: <Enhanced SpeedStep Frequency Control> on cpu38
est39: <Enhanced SpeedStep Frequency Control> on cpu39
random: unblocking device.
usbus0: 5.0Gbps Super Speed USB v3.0
Timecounters tick every 1.000 msec
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <0x8086> at usbus0
uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
usbus2: 480Mbps High Speed USB v2.0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
nvd0: <SAMSUNG MZVKW1T0HMLH-00000> NVMe namespace
nvd0: 976762MB (2000409264 512 byte sectors)
ses0 at ahciem0 bus 0 scbus4 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <HGST HDN724040ALE640 MJAOA5E0> ATA8-ACS SATA 3.x device
ada0: Serial Number PK1334PEJH5ULS
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 3815447MB (7814037168 512 byte sectors)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <HGST HDN724040ALE640 MJAOA5E0> ATA8-ACS SATA 3.x device
ada1: Serial Number PK1334PEJH1VBS
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 3815447MB (7814037168 512 byte sectors)
ada1: Previously was known as ad6
ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
ada2: <HGST HDN724040ALE640 MJAOA5E0> ATA8-ACS SATA 3.x device
uhub1: 2 ports with 2 removable, self powered
ada2: Serial Number PK1334PEJGY25S
uhub2: 2 ports with 2 removable, self powered
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 3815447MB (7814037168 512 byte sectors)
ada2: Previously was known as ad8
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <HGST HDN724040ALE640 MJAOA5E0> ATA8-ACS SATA 3.x device
ada3: Serial Number PK1334PEJH1UZS
ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 3815447MB (7814037168 512 byte sectors)
uhub0: 21 ports with 21 removable, self powered
ada3: Previously was known as ad10
SMP: AP CPU #1 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #30 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #34 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #20 Launched!
SMP: AP CPU #25 Launched!
SMP: AP CPU #16 Launched!
SMP: AP CPU #31 Launched!
SMP: AP CPU #17 Launched!
SMP: AP CPU #38 Launched!
SMP: AP CPU #11 Launched!
SMP: AP CPU #33 Launched!
SMP: AP CPU #32 Launched!
SMP: AP CPU #14 Launched!
SMP: AP CPU #21 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #22 Launched!
SMP: AP CPU #10 Launched!
SMP: AP CPU #35 Launched!
SMP: AP CPU #8 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #23 Launched!
SMP: AP CPU #12 Launched!
SMP: AP CPU #18 Launched!
SMP: AP CPU #27 Launched!
SMP: AP CPU #36 Launched!
SMP: AP CPU #39 Launched!
SMP: AP CPU #19 Launched!
SMP: AP CPU #15 Launched!
SMP: AP CPU #29 Launched!
SMP: AP CPU #24 Launched!
SMP: AP CPU #26 Launched!
SMP: AP CPU #37 Launched!
SMP: AP CPU #28 Launched!
SMP: AP CPU #9 Launched!
SMP: AP CPU #13 Launched!
Timecounter "TSC-low" frequency 1150026386 Hz quality 1000
Root mount waiting for: usbus2 usbus1 usbus0
ugen0.2: <no manufacturer> at usbus0
uhub3: <no manufacturer Gadget USB HUB, class 9/0, rev 2.00/0.00, addr 1> on usbus0
ugen2.2: <vendor 0x8087> at usbus2
uhub4: <vendor 0x8087 product 0x8002, class 9/0, rev 2.00/0.05, addr 2> on usbus2
ugen1.2: <vendor 0x8087> at usbus1
uhub5: <vendor 0x8087 product 0x800a, class 9/0, rev 2.00/0.05, addr 2> on usbus1
uhub5: 6 ports with 6 removable, self powered
uhub4: 8 ports with 8 removable, self powered
uhub3: 5 ports with 5 removable, self powered
Root mount waiting for: usbus0
ugen0.3: <Avocent> at usbus0
ukbd0: <Keyboard> on usbus0
kbd1 at ukbd0
ugen0.4: <Dell> at usbus0
ukbd1: <EP1 Interrupt> on usbus0
kbd2 at ukbd1
Root mount waiting for: usbus0
Root mount waiting for: usbus0
ugen0.5: <SanDisk> at usbus0
umass0: <SanDisk Extreme, class 0/0, rev 3.00/0.10, addr 4> on usbus0
umass0:  SCSI over Bulk-Only; quirks = 0x0100
umass0:5:0:-1: Attached to scbus5
Trying to mount root from ufs:/dev/da0p3 [rw,noatime]...
mountroot: waiting for device /dev/da0p3 ...
da0 at umass-sim0 bus 0 scbus5 target 0 lun 0
da0: <SanDisk Extreme 0001> Removable Direct Access SPC-4 SCSI device
da0: Serial Number AA011021141312316547
da0: 400.000MB/s transfers
da0: 61057MB (125045424 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>
WARNING: / was not properly dismounted
igb0: link state changed to UP
igb1: link state changed to UP
igb2: link state changed to UP
ums0: <Mouse> on usbus0
ums0: 3 buttons and [Z] coordinates ID=0
ums1: <Mouse REL> on usbus0
ums1: 3 buttons and [XYZ] coordinates ID=0
Comment 3 IPTRACE 2016-08-10 07:25:15 UTC
crw-r-----  1 root  operator   0x53 Aug 10 06:24 /dev/nvd0
crw-r-----  1 root  operator   0x5a Aug 10 06:24 /dev/nvd0s1
crw-------  1 root  wheel      0x27 Aug 10 06:24 /dev/nvme0
crw-------  1 root  wheel      0x51 Aug 10 06:24 /dev/nvme0ns1

/dev/nvd0       512     1024209543168   2000409264      512     0

******* Working on device /dev/nvd0 *******
parameters extracted from in-core disklabel are:
cylinders=124519 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=124519 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 7 (0x07),(NTFS, OS/2 HPFS, QNX-2 (16 bit) or Advanced UNIX)
    start 2048, size 2000404480 (976760 Meg), flag 0
        beg: cyl 0/ head 32/ sector 33;
        end: cyl 1023/ head 254/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>

******* Working on device /dev/nvd0 *******
parameters extracted from in-core disklabel are:
cylinders=124519 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=124519 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 2000397672 (976756 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 614/ head 254/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
=>        34  2000409197  nvd0  GPT  (954G)
          34        2014     1  efi  (1.0M)
        2048    10485760     2  freebsd-ufs  (5.0G)
    10487808   104857600     3  freebsd-ufs  (50G)
   115345408    20971520     4  freebsd-ufs  (10G)
   136316928    10485760     5  freebsd-ufs  (5.0G)
   146802688    10485760     6  freebsd-ufs  (5.0G)
   157288448    10485760     7  freebsd-ufs  (5.0G)
   167774208  1832635023     8  freebsd-ufs  (874G)
Comment 4 frederic.lassel 2016-11-04 17:46:14 UTC
I've got the same problem with a Samsung SSD SM961 256GB and FreeBSD 11.0. The SSD works fine with Ubuntu and Windows.

my equipment:
Xeon E3-1245v5
Gigabyte GA-X170-Extreme ECC
2x Kingston KVR21E15D8K2/16I = 32GB
Samsung SSD SM961 256GB, M.2 (MZVPW256HEGL-00000)
Comment 5 Terry Kennedy 2016-11-28 01:22:51 UTC
I have the same problem on 10-STABLE (r309209) with a 128GB SM961 and can also reproduce it on CURRENT (20161117 snapshot). Even attempting to read the device with dd causes the error:

(0:21) pool13:/sysprog/terry# dd if=/dev/nvme0ns1 of=/dev/null count=1
nvme0: resetting controller
nvme0: aborting outstanding i/o
nvme0: READ sqid:8 cid:127 nsid:1 lba:3 len:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:8 cid:127 cdw0:0

nvmecontrol reports that the drive has never experienced an error, even after the above:

(0:30) pool13:/sysprog/terry# nvmecontrol logpage -p 1 nvme0
Error Information Log
=====================
No error entries found

smartctl similarly reports no problems.

Booting Arch Linux 2016.11.01 lets me read and write to the drive with no problems, so I don't think this is a hardware problem.

I have the module in a dedicated test system and can provide a https:// remote console to the developer(s) if it would help to pin down the problem. Let me know if you'd like any further information.
Comment 6 Robin Randhawa 2016-12-18 16:02:49 UTC
Hi all.

I have a similar issue. My hardware is a Lenovo Thinkpad P50 with a 512GB NVMe SSD running CURRENT (synced yesterday), with a custom kernel configuration that differs from GENERIC only in that VESA is disabled (otherwise I hit nvidia issues while resuming from ACPI S3).

$ nvmecontrol devlist
 nvme0: SAMSUNG MZVKV512HAJH-000L1
    nvme0ns1 (488386MB)

The particular manifestation I see is on resume when disk access seems to stall for 5-10 seconds. The relevant messages in the kernel log buffer are a scary stream of:

nvme0: READ sqid:6 cid:124 nsid:1 lba:697441256 len:8
nvme0: ABORTED - BY REQUEST (00/07) sqid:6 cid:124 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:8 cid:109 nsid:1 lba:399615616 len:40
nvme0: ABORTED - BY REQUEST (00/07) sqid:8 cid:109 cdw0:0

Please let me know if you need any more specifics. Thanks.
Comment 7 Terry Kennedy 2016-12-20 09:14:49 UTC
(In reply to Robin Randhawa from comment #6)

That certainly looks like the same issue. I'm working with the driver developer on this and we have reproduced it on my system. I've sent some debugging info, but may need to run a specially-instrumented version of the driver to track down where the fault is being triggered.

If you need this working (with possibly VERY reduced performance), you can add:

hw.nvme.per_cpu_io_queues=0

to your /boot/loader.conf file. Note that this is a workaround, not a fix.
Comment 8 Terry Kennedy 2016-12-20 09:18:57 UTC
(In reply to Terry Kennedy from comment #7)

On second thought, if yours only happens on resume, it may not be the same bug at all - it may be something in the suspend/resume code that isn't restoring the NVMe state until the request times out and the recovery code kicks in.

Plus, I think your module is a 951 and people are successfully using that module - all the other reports in this thread are on the 961.
Comment 9 Robin Randhawa 2017-01-03 12:49:15 UTC
(In reply to Terry Kennedy from comment #8)

Hi Terry.

Thanks for the responses. I concur with your view that this is likely something to do with the suspend-resume pathway and missing context that isn't restored correctly (or at all).

BTW, I incorrectly stated a ~5 second 'stall'; it is in fact closer to ~30 seconds. So after a resume from ACPI S3 there is a ~30 second freeze before the abort messages appear and the system becomes usable again. Most irritating.

I'm trying to grok the nvme driver source to see if there are some basic suspend/resume callbacks that need fleshing out. I will update if I make any headway.

I also worry that there is some ACPI <-> NVMe overlap that may be to blame. This is a fairly new laptop, and a whole lot of ACPI-related messages appear in the kernel log buffer at suspend/resume time.

Cheers.
Comment 10 MacLemon 2017-02-24 12:42:20 UTC
I'm having this issue on a fairly recent SuperMicro Xeon-D 1541 (SuperServer 5018D-FN4T) system as soon as I put _two_ NVMe SSDs in. When using only the onboard M.2/M-key slot on the motherboard, I can install FreeBSD 11.0-RELEASE-p1 just fine.
As soon as I put another NVMe SSD in via a PCIe card adapter, I get the resetting controller messages that IPTRACE already mentioned. There is no difference whether I put both on the PCIe card or one on the motherboard and one on the PCIe card.

I'm using the SuperMicro AOC-SLG3-2M2 to add the second NVMe SSD; the SSDs are Samsung SM961/256GB models.

Since this bug basically prevents me from putting the system into production at the moment, I'm happy to test anything that may help track it down and fix it. If you need any more details of the hardware or BIOS/UEFI settings used, I'm happy to provide them.
I'll also try Terry Kennedy's loader.conf hint and check whether it helps, and how much of a performance hit the system takes.

Best regards
MacLemon
Comment 11 tkurmann 2017-03-14 10:58:32 UTC
Any news on this bug?

I'm having the same error with two SM961 1TB NVMe drives on PCIe adapters in a Dell R720 running FreeNAS 10 RC1.
Comment 12 Warner Losh freebsd_committer 2017-03-14 16:42:02 UTC
(In reply to tkurmann from comment #11)
I just pushed a few changes into -head that may help. Any chance you can try booting a snapshot?
Comment 13 Terry Kennedy 2017-03-14 20:50:31 UTC
Created attachment 180826 [details]
Boot capture with uname and "resetting controller"
Comment 14 Terry Kennedy 2017-03-14 20:51:20 UTC
Created attachment 180827 [details]
dd fails with error, but made some progress
Comment 15 Terry Kennedy 2017-03-14 20:51:43 UTC
Created attachment 180828 [details]
newfs output - also makes some progress
Comment 16 Terry Kennedy 2017-03-14 20:54:04 UTC
Looks like my text got lost when I added the images. This is a boot with 12-CURRENT 20170309 (the latest snapshot available). It looks like the driver is making a bit more progress with the nvd0 device, but still ends up in a "resetting controller" loop.

Sorry about the .BMP attachments, that's the way my remote management card does screenshots.
Comment 17 MacLemon 2017-03-14 22:59:25 UTC
I've done some testing now on FreeBSD 11.0p3-RELEASE and 12.0-CURRENT.
I've tested two of these Samsung SM961/256GB in a SuperServer 5018D-FN4T with SuperMicro AOC-SLG3-2M2.
I've done all possible combinations of the two SSDs in the 3 available M.2 slots of that combo.
I've also tried pretty much every combination of Legacy/BIOS/UEFI setting for the PCIe slots to no avail.
The SSDs are only recognized at all every few reboots, and I never managed to get both to show up. Using the onboard slot or the 2M2 card doesn't make any difference.
I did not manage to successfully initialize these SSDs, let alone create a bootable ZFS mirror from them.

Just for completeness, I've tested them under Debian/Sid and they are just as unusable there.

I'll be returning these Samsung SSDs and try to get Toshiba X3 instead.
Comment 18 Terry Kennedy 2017-03-14 23:11:31 UTC
(In reply to MacLemon from comment #17)
I'd suggest testing with a single card first, to rule out some potential PCIe bifurcation* problems. My single SM961 works as expected under Linux (some random LiveCD I downloaded), but gives the "resetting controller" message under FreeBSD.

A lot of stuff makes an assumption about there being a single "thing" in a PCIe slot. There seem to be 2 types of PCIe / NVMe multi-module adapters. The first just takes the 4 lanes from each NVMe and puts them on the PCIe bus - so a 2 * M.2 adapter uses 8 PCIe lanes, 4 for each NVMe. The other type has a PCIe / PCIe bridge on the board.
Comment 19 Warner Losh freebsd_committer 2017-03-15 06:05:30 UTC
My Samsung 960 PRO works great. We have other drives (hundreds of them) at work that sustain close to 3.8GB/s for hours... So it can work... Let's dig down a level...

So the 'reset' messages that Terry is seeing in the two screen shots he just posted are either the result of some prankster doing an nvmecontrol reset (quite unlikely), or the result of the driver calling reset internally. It does this only when it gets a timeout for a command. Assuming for the moment that the timeout code is good, there's a command that's coming back bad and we wind up here:

nvme_timeout(void *arg)
...
        /* Read csts to get value of cfs - controller fatal status. */
        csts.raw = nvme_mmio_read_4(ctrlr, csts);

        if (ctrlr->enable_aborts && csts.bits.cfs == 0) {
                /*
                 * If aborts are enabled, only use them if the controller is
                 * not reporting fatal status.
                 */
                nvme_ctrlr_cmd_abort(ctrlr, tr->cid, qpair->id,
                    nvme_abort_complete, tr);
        } else
                nvme_ctrlr_reset(ctrlr);

So we read CSTS (the controller status). If aborts are enabled (which you can do by setting the tunable hw.nvme.enable_aborts=1; it defaults to 0) and the controller isn't reporting fatal status, we abort the command; otherwise we reset the controller. Since the tunable defaults to 0, the reset path is likely the one being taken unless you've already changed it.

The reset turns out to be unsuccessful, and we drive off the road into the ditch with the follow-on errors.

So, maybe try setting the tunable and trying again. I'd normally ask about all the stupid issues: is power good, are the connections good, are you seeing PCIe errors (pciconf -lbace nvmeX), etc., but with so many reports I assume that's unlikely to be fruitful for everybody.

Maybe I'll try to find a Samsung 950 Pro 512GB (which form factor do you have?) and try as well, but that process will take about a week or two since I have an offsite soon and I don't think I can get one here before then.
Comment 20 Warner Losh freebsd_committer 2017-03-15 06:26:43 UTC
In addition, you can control the timeout period with the sysctl dev.nvme.X.timeout=S where S is the number of seconds. The min is 5, max is 120. The default is 30. It might be helpful to see if setting it lower causes this to happen more often or if setting it higher causes it to happen less often.
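As a small illustrative sketch of the bounds described above (min 5 s, max 120 s, default 30 s) — the function name here is hypothetical, not an actual symbol from the driver:

```c
#include <assert.h>

/*
 * Illustrative clamp for the dev.nvme.X.timeout sysctl bounds stated
 * above (min 5 s, max 120 s). Hypothetical name, not the driver's code.
 */
static int
clamp_nvme_timeout(int seconds)
{
	if (seconds < 5)
		return (5);
	if (seconds > 120)
		return (120);
	return (seconds);
}
```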

It's possible that we're missing a completion interrupt, or that we get one and somehow take a code path that doesn't cancel the timeout (though given there were actual I/Os that were aborted, that seems unlikely).

Disabling TRIM might also make things not suck so badly. But that wouldn't help Terry's simple newfs. We had issues with insanely slow TRIMs on a drive we were evaluating under NDA that might be relevant. We had no issues with newfs, nor with the drive itself, once we turned TRIM off in UFS.

I couldn't find a 950 PRO, but was able to find a SM961. I'll see if I can recreate this issue on my NUC6.
Comment 21 Terry Kennedy 2017-03-15 06:47:39 UTC
(In reply to Warner Losh from comment #19)
It is a MZVPW128HEGM-00000 which is a SM961. This has happened in multiple systems (identical hardware configs - see below) which otherwise are operating flawlessly.

In previous correspondence with jimharris@, he had me try setting hw.nvme.per_cpu_io_queues=0 which concealed the problem but resulted in abysmal I/O performance (as expected). He also had me set some debug loader tunables and post the results. Those are at https://www.glaver.org/transient/nvme

I just moved the card (SM961 on generic PCIe slot adapter) to a Dell PowerEdge R710. It had been in a Supermicro X8DTH-iF. It works fine in the R710, even on 10.3-STABLE. Both of those systems are as similar as I could make them - both use the Intel 5520 chipset, both have 2 * X5680 CPU, both have 48GB of RAM (same part number in both systems). So it seems (at least in my case) to be related to the system hardware.

I can try ordering some other M.2 NVMe module to see if this issue is specific to the SM961, or if it is a problem with any NVMe on the X8DTH-iF board.
Comment 22 tkurmann 2017-03-15 21:59:22 UTC
I can confirm the findings of Terry Kennedy. With the latest snapshot 12-CURRENT 20170309, dd can read and write a couple of kilobytes until a reset is requested. Further, after the 5 "aborting outstanding i/o" messages (30 s apart), the drive can no longer read/write using dd. nvmecontrol reset also seems to have no effect; identify and devlist still work, though.

Here is my exact hardware:
Dell PowerEdge R720
2x Xeon E5-2670 
192 GB RAM
SM961 on ASUS PCIe x4 to M.2 breakout

I will also try the drive on a desktop tomorrow with the snapshot and report back.
Comment 23 tkurmann 2017-03-17 16:53:08 UTC
Progress!
I updated the BIOS of the R720 to 2.5.4 and rebooted FreeBSD with the latest snapshot 12-CURRENT 20170309. To my surprise, reading using dd works with any block size. The speed is capped at 2.0 GB/s, but it works. Writing, on the other hand, seems to only work up to a block size of around 512k (I haven't found the exact threshold yet), after which it times out again. The speed cap made me suspicious, so I checked which PCIe version the card was running at (pciconf -lbace nvme0), and of course it was version 2.0. Under Ubuntu 16.04 the card was reported at version 3.0 and the speed limit was 3.2 GB/s. I assume this is related somehow; any thoughts?
Comment 24 Warner Losh freebsd_committer 2017-03-17 20:18:47 UTC
Gen2 PCIe is limited to 2 GB/s for that setup. That's your problem, and likely an indicator of the solution...

When you say 'under ubuntu' is that on the same physical hardware or a different system? If it is just a reboot between the two performance profiles, that tells me one thing. If it is in a physically separate box, that tells me something else.

At work we have some drives that are defective (bad resistors that need to be swapped out) because they can't keep the link established at x4 PCIe3 speeds. Either they fall back to x1 PCIe3 speeds or x2 and/or PCIe2 speeds. And when they do, they aren't super reliable, in addition to being slow.

FreeBSD currently does a poor job of dealing with PCIe errors, so links can get into crazy states where they perform horribly. Maybe Linux is better able to reset the links on errors. If so, then that's up the alley of some uncommitted AER / Link retrain code I've been working on.
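The rough arithmetic behind the 2 GB/s figure above can be sketched like this (back-of-the-envelope per-lane rates from the PCIe spec; real throughput is lower still due to protocol overhead, and the function names here are mine, not the driver's):

```c
/*
 * Back-of-the-envelope usable PCIe bandwidth in MB/s per lane:
 * Gen1: 2.5 GT/s with 8b/10b encoding    -> ~250 MB/s
 * Gen2: 5.0 GT/s with 8b/10b encoding    -> ~500 MB/s
 * Gen3: 8.0 GT/s with 128b/130b encoding -> ~985 MB/s
 */
static int
pcie_mbs_per_lane(int gen)
{
	switch (gen) {
	case 1: return (250);
	case 2: return (500);
	case 3: return (985);
	default: return (0);
	}
}

/* Aggregate link bandwidth for a given generation and lane count. */
static int
pcie_link_mbs(int gen, int lanes)
{
	return (pcie_mbs_per_lane(gen) * lanes);
}
```

By this arithmetic a Gen2 x4 link tops out around 2000 MB/s, matching the observed 2.0 GB/s cap, while a Gen3 x4 link would allow roughly 3940 MB/s.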
Comment 25 Terry Kennedy 2017-03-17 20:47:58 UTC
(In reply to Warner Losh from comment #24)
I think there may be multiple bugs all getting lumped into this PR.

On my Dell R710 (same exact CPUs, memory modules, and system chipset as the Supermicro X8DTH-iF where I get the hangs) with FreeBSD 10.3-STABLE, I get:

(0:4) pool20:/sysprog/terry# dd if=/dev/nvd0 of=/dev/null bs=16m
7631+1 records in
7631+1 records out
128035676160 bytes transferred in 86.524463 secs (1479762738 bytes/sec)
(0:5) pool20:/sysprog/terry# dd if=/dev/zero of=/dev/nvd0 bs=16m
dd: /dev/nvd0: short write on character device
dd: /dev/nvd0: end of device
7632+0 records in
7631+1 records out
128035676160 bytes transferred in 164.568004 secs (778010750 bytes/sec)
(1:6) pool20:/sysprog/terry# pciconf -lbace nvme0
nvme0@pci0:5:0:0:       class=0x010802 card=0xa801144d chip=0xa804144d rev=0x00 hdr=0x00
    bar   [10] = type Memory, range 64, base rxdf2fc000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 32 messages, 64 bit 
    cap 10[70] = PCI-Express 2 endpoint max data 256(256) FLR RO NS link x4(x4)
                 speed 5.0(8.0) ASPM disabled(L1)
    cap 11[b0] = MSI-X supports 33 messages, enabled
                 Table in map 0x10[0x3000], PBA in map 0x10[0x2000]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
    ecap 0003[148] = Serial 1 0000000000000000
    ecap 0004[158] = Power Budgeting 1
    ecap 0019[168] = PCIe Sec 1 lane errors 0
    ecap 0018[188] = LTR 1
    ecap 001e[190] = unknown 1

This is in a PCIe 2 x4 slot, but the write speed matches the SM961 datasheet for the 128GB version - "700MB/Sec sequential write". Sequential read is probably being limited by PCIe 2 speeds, as the datasheet specifies "Up to 3100MB/Sec". These numbers are from "Samsung Rev 1.0, June 2016". I can't post the whole thing as it is marked Company Confidential.

On the Supermicro X8DTH-iF where FreeBSD gets the controller resets, Arch Linux 2016.11.01 reads and writes the SM961 at approximately the same speeds as on the Dell R710, with no resets.
Comment 26 Warner Losh freebsd_committer 2017-03-18 00:10:50 UTC
Terry: Is this one Supermicro on which you booted FreeBSD, then booted Linux and observed the difference? Or is it two different Supermicros that are otherwise identical?
Comment 27 Terry Kennedy 2017-03-18 00:35:06 UTC
(In reply to Warner Losh from comment #26)
Exact same system. Wasn't even power-cycled.
Comment 28 Warner Losh freebsd_committer 2017-03-18 19:30:04 UTC
OK. It isn't the SM961 generally. I have one in a NUC6 on my desk now (it arrived last night). With 5 dd's reading it, I can get 2.5GB/s. More dd's don't help. Data sheet says up to 3.1GB/s. But I'm seeing no errors with my card, though it has AER reporting. I was able to newfs the 256GB version w/o issue.

I suspect, but cannot prove, that this may be a signal-integrity issue causing errors. Linux seems to cope better, maybe? Or maybe it knows to push the link less hard? Not sure. I need to polish off the AER code I've written to monitor other devices I have.
Comment 29 Terry Kennedy 2017-03-19 02:47:11 UTC
(In reply to Warner Losh from comment #28)
As I said, my SM961 works fine in a Dell PowerEdge R710 with the same model of CPU, same amount and model of memory chips, and same platform controller as the Supermicro X8DTH-iF where it doesn't work and gets the controller resets. So the Dell and the Supermicro are doing something differently that the drive and / or card don't like.

As I mentioned a few replies further up, I can try purchasing some other brand of NVMe card to see if this issue is specific to the SM961 or something more generic.
Comment 30 tkurmann 2017-03-28 10:16:08 UTC
(In reply to Warner Losh from comment #24)

Same machine same hardware. Read performance (single dd if.. of=/dev/zero) on Ubuntu is 3.0 GB/s on the R720 and 2.0 GB/s with FreeBSD (snapshot 20170323). On a single CPU system I measured around 3.2 GB/s. Read speeds are faster and writes have no errors on Linux, whereas FreeBSD fails at writing with block sizes > 1K.
Comment 31 Greg 2017-05-06 17:35:03 UTC
Intel Skull Canyon NUC with WDC WDS512G1X0C-00ENX0 on VEN_15B7 DEV_5001 on C230 Chipset

Linux / Win10 - fine

FreeBSD / TrueOS (FreeBSD 12 Current) recognize card but soon after boot card goes offline and stays offline until machine is powered off. (Reboot alone won't clear the state. Reboot into other OS not possible. UEFI can not see drive until powered off and back on.)

Played with BIOS PCI power settings to little avail.

- G
Comment 32 Warner Losh freebsd_committer 2017-07-11 14:17:34 UTC
So something is hanging the card so that posted transactions don't complete. There's a small chance this is some other runaway thread in a different driver (we see that at work), but it would be useful to know which transactions are pending prior to it hanging.
Comment 33 Terry Kennedy 2017-07-11 14:24:41 UTC
(In reply to Warner Losh from comment #32)

Back when I first ran into this, I sent Jim Harris a bunch of "sysctl dev.nvme.0.ioq*.dump_debug=1" traces that he requested. He had me configure the driver with "hw.nvme.per_cpu_io_queues=0" which caused the card to work, but also crippled performance.

I recently obtained an Optane 16GB NVMe stick, and that one does work in the Supermicro board (where the PM961 didn't). I don't know if that proves anything.
Comment 34 stb 2017-07-21 18:54:58 UTC
I'm having the same problem on a SuperMicro SYS-5019S-M with a Samsung SM961 128GB.

Right now, the boot does not complete, I guess due to ZFS probing the disk.

I'll see if hw.nvme.per_cpu_io_queues=0 will make the kernel complete booting.
Comment 35 stb 2017-07-22 10:56:29 UTC
(In reply to stb from comment #34)

Setting hw.nvme.per_cpu_io_queues=0 works.
Comment 36 IPTRACE 2017-07-22 18:46:26 UTC
(In reply to stb from comment #35)
Can you provide more information about reduced performance?

# diskinfo -t /dev/nvme0ns1
Comment 37 stb 2017-07-22 22:24:36 UTC
[root@foo ~]# diskinfo -t /dev/nvd0
/dev/nvd0
	512         	# sectorsize
	128035676160	# mediasize in bytes (119G)
	250069680   	# mediasize in sectors
	0           	# stripesize
	0           	# stripeoffset
	S347NY0HB01730	# Disk ident.

Seek times:
	Full stroke:	  250 iter in   0.014551 sec =    0.058 msec
	Half stroke:	  250 iter in   0.015022 sec =    0.060 msec
	Quarter stroke:	  500 iter in   0.029067 sec =    0.058 msec
	Short forward:	  400 iter in   0.015134 sec =    0.038 msec
	Short backward:	  400 iter in   0.015675 sec =    0.039 msec
	Seq outer:	 2048 iter in   0.063374 sec =    0.031 msec
	Seq inner:	 2048 iter in   0.057973 sec =    0.028 msec

Transfer rates:
	outside:       102400 kbytes in   0.094174 sec =  1087349 kbytes/sec
	middle:        102400 kbytes in   0.089065 sec =  1149722 kbytes/sec
	inside:        102400 kbytes in   0.089141 sec =  1148742 kbytes/sec

I think I should be getting 2.2GB/s. With 4 concurrent dd's, gstat shows:

[root@foo ~]# gstat -I60s -f '^....$'
dT: 60.002s  w: 60.000s  filter: ^....$
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    4  13578  13578 1737975    0.3      0      0    0.0  100.0| nvd0

[root@foo ~]# for i in 0 1 2 3; do dd if=/dev/nvd0 of=/dev/null bs=1m count=100k & done; wait; echo 'done'
[1] 41520
[2] 44578
[3] 46696
[4] 47833
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 192.262522 secs (558476927 bytes/sec)
[1]   Done                    dd if=/dev/nvd0 of=/dev/null bs=1m count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 241.421031 secs (444759026 bytes/sec)
[2]   Done                    dd if=/dev/nvd0 of=/dev/null bs=1m count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 241.552144 secs (444517613 bytes/sec)
[3]-  Done                    dd if=/dev/nvd0 of=/dev/null bs=1m count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 241.559861 secs (444503412 bytes/sec)
[4]+  Done                    dd if=/dev/nvd0 of=/dev/null bs=1m count=100k
done

So I'm guessing the penalty is not too big.  The 128 GB model has a significantly lower write speed compared to the 256GB and 512GB models (around 800MB/s I believe), so I didn't test that.
Comment 38 stb 2017-07-22 22:30:48 UTC
(In reply to stb from comment #34)

One more detail: the SM961 supports PCIe 3.0 with 4 lanes, but the M.2 socket on the X11SSH-F provides only two lanes.  I have no idea whether this should make a difference in function or not, but it limits the theoretical performance to about 2 GB/s.

With that in mind, my numbers look like the drive is using the available bandwidth fully.  I guess there is no performance penalty, at least for large, linear transfers.
Comment 39 stb 2017-07-22 22:45:36 UTC
This is what I get from pciconf:

[root@foo ~]# pciconf -lBbcevV nvme0@pci0:3:0:0
nvme0@pci0:3:0:0:	class=0x010802 card=0xa801144d chip=0xa804144d rev=0x00 hdr=0x00
    vendor     = 'Samsung Electronics Co Ltd'
    device     = 'NVMe SSD Controller SM961/PM961'
    class      = mass storage
    subclass   = NVM
    bar   [10] = type Memory, range 64, base rxdf100000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 32 messages, 64 bit 
    cap 10[70] = PCI-Express 2 endpoint max data 256(256) FLR NS
                 link x2(x4) speed 8.0(8.0) ASPM L1(L1)
    cap 11[b0] = MSI-X supports 33 messages, enabled
                 Table in map 0x10[0x3000], PBA in map 0x10[0x2000]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[148] = Serial 1 0000000000000000
    ecap 0004[158] = Power Budgeting 1
    ecap 0019[168] = PCIe Sec 1 lane errors 0
    ecap 0018[188] = LTR 1
    ecap 001e[190] = unknown 1
  PCI-e errors = Correctable Error Detected
                 Unsupported Request Detected
     Corrected = Advisory Non-Fatal Error
Comment 40 lzd 2017-10-05 16:30:22 UTC
Having same issue. Using Lenovo 4xb0m52449 256gb-Nvme-M.2 SSD. 
VMware operates the SSD fine, but if I try to do a clean install of FreeBSD on it, I'm stuck in the "resetting controller" loop.
Comment 41 Martin Stafford 2017-12-18 01:44:17 UTC
Same problem here.
Works fine under Debian Stretch 9.2 with same hardware.

Supermicro X10DRL-I-O motherboard
ASUS Hyper M.2 x16 NVMe card
2 X Samsung PM961 256gb
1 X Samsung PM961 128gb

I've disabled hw.nvme.per_cpu_io_queues and it's working, but slowly, I think.

Here's some dmesg lines with hw.nvme.per_cpu_io_queues=0:

FreeBSD 11.1-STABLE #0 r321665+d4625dcee3e(freenas/11.1-stable): Wed Dec 13 16:33:42 UTC 2017
CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (2100.04-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs
FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads
nvme0: <Generic NVMe Device> mem 0xc7800000-0xc7803fff irq 40 at device 0.0 numa-domain 0 on pci6
nvme1: <Generic NVMe Device> mem 0xc7700000-0xc7703fff irq 40 at device 0.0 numa-domain 0 on pci7
nvme2: <Generic NVMe Device> mem 0xc7600000-0xc7603fff irq 40 at device 0.0 numa-domain 0 on pci8
nvd0: <SAMSUNG MZVLW128HEGR-00000> NVMe namespace
nvd0: 122104MB (250069680 512 byte sectors)
nvd1: <SAMSUNG MZVLW256HEHP-000L7> NVMe namespace
nvd1: 244198MB (500118192 512 byte sectors)
nvd2: <SAMSUNG MZVLW256HEHP-000L7> NVMe namespace
nvd2: 244198MB (500118192 512 byte sectors)
Comment 42 igor.zenyuk 2018-01-02 12:19:51 UTC
same problem during FreeBSD installation on Lenovo Thinkpad t470p
(StorageSamsung PM961 NVMe MZVLW512HMJP, 512 GB, M.2 SSD)

 nvme0: resetting controller
 nvme0: aborting outstanding i/o
 nvme0: READ sqid:8 cid:127 nsid:1 lba:1000215153 len:4
 nvme0: ABORTED - BY REQUEST (00/07) sqid:8: cid:127 cdw0:0
Comment 43 bambivr98 2018-02-08 00:13:16 UTC
There are several additional complications while using a Lenovo (IdeaPad 700).  The most frustrating one is that sometimes the system decides the NVMe disk is okay, with no timeouts, and on the next boot it times out continually.  Aack!  Out of desperation, I built a system disk on an external drive.  Most of the time that works, making no use of the NVMe drive, and starts without an issue; but if the timeouts occur during boot, it may take several hours to complete the error checking.

Stupid Question:  Is the driver detecting the proper device and selecting the correct flavor of the driver?  The messages look right, but the inconsistency would suggest that something of that sort is happening.  How would I verify this?
Comment 44 Luka Boulagnon 2018-02-11 16:52:28 UTC
I have exactly the same issue on my SSD: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
The issue happened with the 12.0-CURRENT installation media, and also with the system installed.

I am able to work around this issue with kern.smp.disabled=1, so I can boot; however, it also disables the multiprocessor feature (and I end up with a single core).
(The “Fail Safe” option at boot includes this fix)

Hope this helps!
Comment 45 Warner Losh freebsd_committer 2018-02-11 17:15:29 UTC
Talked to Jim Harris the other day... 
What might be going on here is a lost interrupt, so we timeout.
I'm going to modify the timeout code to check completions before doing a reset. If we find any, we'll complete the I/Os and continue, otherwise we'll reset the card. This may help.
Comment 46 strangeqargo 2018-03-01 15:17:41 UTC
same as here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713#c43

dumps on heavy load during update, Lenovo Ideapad 700
Comment 47 Tommi Pernila 2018-03-07 13:00:59 UTC
(In reply to Warner Losh from comment #45)
This is also my conclusion about the problem.

I have managed to overcome the interrupt-timeout issue by disabling PCI Express MSI interrupt signalling in loader.conf with 'hw.pci.enable_msi="0"'.


Disabling this globally works around the issue, but it will cause problems with other PCI Express devices that do not function properly without MSI/MSI-X interrupt signalling.

Should these specific PCI Express controllers be added to a quirk list, or what would be the correct way of solving the issue?


What does this tunable mean? An excerpt from man pci:
hw.pci.enable_msi (Defaults to 1)
 Enable support for Message Signalled Interrupts (MSI).  MSI
 interrupts can be disabled by setting this tunable to 0.

some additional details on pci express interrupts:
https://en.wikipedia.org/wiki/Message_Signaled_Interrupts
https://electronics.stackexchange.com/questions/76867/what-do-the-different-interrupts-in-pcie-do-i-referring-to-msi-msi-x-and-intx
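A quirk list of the kind suggested above could boil down to a simple PCI-id lookup table. This is only an illustrative sketch; the struct, flag, and function names are hypothetical and are not the FreeBSD driver's, and the only real datum is the 0xa804144d device id taken from the pciconf output earlier in this thread:

```c
#include <stddef.h>
#include <stdint.h>

#define HYP_QUIRK_AVOID_MSI	0x01	/* hypothetical flag: fall back from MSI */

struct hyp_nvme_quirk {
	uint32_t	devid;	/* device(31:16) | vendor(15:0), as pciconf prints it */
	uint32_t	quirks;
};

static const struct hyp_nvme_quirk hyp_quirk_table[] = {
	/* Samsung SM961/PM961, chip id from the pciconf output above */
	{ 0xa804144dU, HYP_QUIRK_AVOID_MSI },
};

/* Return the quirk flags for a device id, or 0 if it has no entry. */
static uint32_t
hyp_lookup_quirks(uint32_t devid)
{
	for (size_t i = 0;
	    i < sizeof(hyp_quirk_table) / sizeof(hyp_quirk_table[0]); i++)
		if (hyp_quirk_table[i].devid == devid)
			return (hyp_quirk_table[i].quirks);
	return (0);
}
```

The attach path could then consult the table once and avoid MSI/MSI-X only for the listed controllers, instead of disabling MSI globally.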
Comment 48 bambivr98 2018-03-09 00:00:16 UTC
(In reply to tommi.pernila from comment #47)
I wish this had really fixed the problem, but it doesn't.  It did, however, reduce the frequency of occurrence.

nvme0: resetting controller
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:8 cid:127 nsid:1 lba:2064 len:64
nvme0: ABORTED - BY REQUEST (00/07) sqid:8: cid:127 cdw0:0

This is from a cold boot.  Room temperature is about 24 C

I doubt this is heat-related.

loader.conf.local contains:

kern.cam.boot.delay=10000
kern.cam.scsi.delay=10000
vfs.root.mountfrom="ufs:/dev/da0p2"
hw.pci.enable_msi="0"
Comment 49 Terry Kennedy 2018-03-15 21:43:09 UTC
(In reply to tommi.pernila from comment #47)

I found that the same hardware (exact same NVMe card, not just same model) works fine when moved to a Dell PowerEdge R710, even though it shows the problem in a Supermicro X8DTH-iF (see comment 21).

I also found that although FreeBSD has problems with it on that Supermicro board, some random Linux distro does not (see comment #18).

This makes me think that it is something we're not handling properly, either timing-related or motherboard chipset-related.
Comment 50 commit-hook freebsd_committer 2018-03-16 05:24:29 UTC
A commit references this bug:

Author: imp
Date: Fri Mar 16 05:23:49 UTC 2018
New revision: 331046
URL: https://svnweb.freebsd.org/changeset/base/331046

Log:
  Try polling the qpairs on timeout.

  On some systems, we're getting timeouts when we use multiple queues on
  drives that work perfectly well on other systems. On a hunch, Jim
  Harris suggested I poll the completion queue when we get a timeout.
  This patch polls the completion queue if no fatal status was
  indicated. If it had pending I/O, we complete that request and
  return. Otherwise, if aborts are enabled and no fatal status, we abort
  the command and return. Otherwise we reset the card.

  This may clear up the problem, or we may see it result in lots of
  timeouts and a performance problem. Either way, we'll know the next
  step. We may also need to pay attention to the fatal status bit
  of the controller.

  PR: 211713
  Suggested by: Jim Harris
  Sponsored by: Netflix

Changes:
  head/sys/dev/nvme/nvme_private.h
  head/sys/dev/nvme/nvme_qpair.c
Comment 51 Warner Losh freebsd_committer 2018-03-16 05:28:46 UTC
You might try hw.nvme.enable_aborts=1 in loader.conf. This will enable aborting the command on timeouts when there's no fatal error indicated. This might help.

Also, r331046 has a workaround suggested by Jim Harris. If there's no fatal error signaled, we'll poll the completion queue. If that works, we move on (with a loud printf; that path likely comes with a performance hit, but we'll see it happening). If not, and no fatal error is signaled and aborts are enabled, we'll abort the command. Otherwise we'll reset the card (the current behavior).

I could never recreate this problem, despite buying the exact card (I think) that others have reported as being bad. So, if you can reproduce this problem, please try r331046 or later and let me know if that helps or not.
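The decision flow that r331046 adds, as described above, can be sketched like this (the enum and function names are illustrative, not the driver's actual symbols):

```c
#include <stdbool.h>

enum timeout_action { POLL_RECOVERED, ABORT_COMMAND, RESET_CONTROLLER };

/*
 * Sketch of the r331046 timeout handling described above:
 * 1. If no fatal status, poll the completion queue; if the completion
 *    is found there, finish the I/O and move on.
 * 2. Otherwise, if aborts are enabled and no fatal status, abort.
 * 3. Otherwise, reset the controller (the previous behavior).
 */
static enum timeout_action
choose_timeout_action(bool fatal_status, bool completion_found,
    bool aborts_enabled)
{
	if (!fatal_status && completion_found)
		return (POLL_RECOVERED);
	if (!fatal_status && aborts_enabled)
		return (ABORT_COMMAND);
	return (RESET_CONTROLLER);
}
```

If the missing completion really is sitting in the queue (a lost interrupt), the first branch recovers it without the disruptive reset.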
Comment 52 Warner Losh freebsd_committer 2018-03-16 05:30:54 UTC
If the workaround 'fixes' the issue, Jim thinks it might mean that we have an MSI-X interrupt mapping issue, or similar, to track down: either the driver making bad assumptions, it getting fed bad data, or some issue in the MSI-X code. I'm skeptical, but we'll know after the retesting.
Comment 53 Terry Kennedy 2018-03-16 05:49:15 UTC
(In reply to Warner Losh from comment #52)

The system I was using to test this has been off and in storage - when I booted it just now to update it, it said it was 11.1-PRERELEASE 8-}. I'm in the process of updating it to 11-STABLE and will try the Samsung NVMe device again with this patch (assuming I can apply it to 11-STABLE). Right now the box has an Intel Optane NVMe drive in it, so I can also test for regressions with a known-working module.

If nobody else tries this and reports back in a few days, ping me to make sure I'm still working on testing.

Thanks!
Comment 54 Terry Kennedy 2018-03-16 08:22:48 UTC
Created attachment 191543 [details]
Log of failed patch application on 11-STABLE
Comment 55 Terry Kennedy 2018-03-16 08:25:24 UTC
(In reply to Terry Kennedy from comment #54)

The previous comment seems to be missing my comment text...

I tried applying the patch to 11-STABLE (r331049) and it didn't apply cleanly. Before I dig into this, would be possible to get a version for 11?
Comment 56 stan 2018-03-24 16:06:42 UTC
Exact same issue when trying to install FreeBSD on a brand-new Lenovo E480. I tried `FreeBSD-12.0-CURRENT-amd64-20180322-r331345` and `FreeBSD-11.1-STABLE-amd64-20180322-r331337`. Both failed, continuously logging nvme0 failures. Help welcome! I can provide more information if needed.
Comment 57 stan 2018-03-31 06:35:20 UTC
[update]: trying to install FreeBSD-12.0-CURRENT-amd64-20180329-r331740 in 'normal' mode on a Lenovo E480 with a Samsung SSD MZVLW256HEHP-000L7.

output: `nvme0: missing interrupt` many times, then the graphical installer displays. I select Auto (ZFS) install on nvd0. During installation, I still see `nvme0: missing interrupt` messages show up a few more times.

Then installation failed with error displaying in graphical window : `Error: gpart provider: Device not configured`.

Booting in 'safe mode' ended with the same gpart error.

Hope it helps !
Comment 58 stan 2018-03-31 07:04:15 UTC
following my comment #57, here is more debug info in another context with the same hardware:
I am able to boot TrueOS-Desktop-201803131015 with `hw.nvme.per_cpu_io_queues="0"` set in /boot/loader.conf. Everything works well, BUT a fatal error comes when trying to resume from S3 suspend mode:

I see kernel messages ending with : 

```
(…) kernel: WARN_ON(…stripped…) CSR SSP Base Not fine
(…) kernel: CSR HTP Not fine
(…) kernel: WARN_ON(…stripped…) Clearing unexpected auxiliary request for power well 2
```

then :

```
nvme0: resetting controller
nvme0: controller ready did not become 0 within 30000 ms
nvme0: failing queued i/o
nvme0: READ sqid:1 cid:0 nsid: 1 lba:324015968 len:20
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:0 cdw0:0
```

and similar errors repeated a dozen times, 

then the fatal :

```
nvd0: lost device - 0 outstanding
nvd0: removing device entry
nvme0: WRITE sqid:1 cid:0 nsid:1 lba:4416948 len:48
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:0 cdw0:0

Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address   = 0x8
fault code			   = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80a3b141
stack pointer           = 0x28:0xfffffe0000545820
frame pointer           = 0x28:0xfffffe0000545860
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (nvme taskq)
[ thread pid 0 tid 100077 ]
stopped at      g_disk_done+0xc1:       movq   0x8(%rax),%rdi
db>
```
Comment 59 clutton 2018-04-07 11:17:10 UTC
```
nvme0: resetting controller
nvme0: controller ready did not become 0 within 30000 ms
nvme0: failing queued i/o
nvme0: READ sqid:1 cid:0 nsid: 1 lba:324015968 len:20
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:0 cdw0:0
```

I can observe the same on my host on resume; everything works except resuming.
Sometimes it manages to reset the bloody controller after 5-30 seconds, and then it works properly.
Comment 60 clutton 2018-04-07 11:27:08 UTC
Here's actual output from my system; it had woken up successfully this morning.

nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: resetting controller
nvme0: Resetting controller due to a timeout.
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:1 cid:124 nsid:1 lba:302574390 len:72
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:124 cdw0:0
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:1 cid:127 nsid:1 lba:308798193 len:8
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:127 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:1 cid:90 nsid:1 lba:146704418 len:5
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:90 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:2 cid:126 nsid:1 lba:436423099 len:54
nvme0: ABORTED - BY REQUEST (00/07) sqid:2 cid:126 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:3 cid:125 nsid:1 lba:785815849 len:14
nvme0: ABORTED - BY REQUEST (00/07) sqid:3 cid:125 cdw0:0
nvme0: aborting outstanding i/o
nvme0: READ sqid:3 cid:75 nsid:1 lba:859171570 len:2
nvme0: ABORTED - BY REQUEST (00/07) sqid:3 cid:75 cdw0:0
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:4 cid:100 nsid:1 lba:306185450 len:2
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:100 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:79 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:79 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:119 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:119 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:118 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:118 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:95 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:95 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:117 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:117 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:101 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:101 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:109 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:109 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:75 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:75 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:107 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:107 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:123 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:123 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:110 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:110 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:93 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:93 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:115 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:115 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:98 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:98 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:72 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:72 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:65 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:65 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:111 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:111 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:108 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:108 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:74 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:74 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:92 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:92 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:87 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:87 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:96 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:96 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:94 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:94 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:77 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:77 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:104 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:104 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:113 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:113 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:66 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:66 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:120 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:120 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:67 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:67 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:71 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:71 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:88 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:88 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:106 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:106 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:116 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:116 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:121 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:121 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:126 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:126 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:84 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:84 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:70 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:70 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:76 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:76 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:99 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:99 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:124 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:124 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:69 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:69 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:91 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:91 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:81 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:81 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:103 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:103 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:114 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:114 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:89 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:89 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:127 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:127 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:85 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:85 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:125 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:125 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:73 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:73 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:83 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:83 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:68 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:68 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:86 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:86 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:82 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:82 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:102 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:102 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:78 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:78 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:122 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:122 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:90 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:90 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:112 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:112 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:105 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:105 cdw0:0
nvme0: aborting outstanding i/o
nvme0: DATASET MANAGEMENT sqid:4 cid:80 nsid:1
nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:80 cdw0:0
Comment 61 Len White 2018-04-29 15:14:05 UTC
I've recently hit the error "nvme0: Missing interrupt" on 10.3 and 12.0-CURRENT.  I discovered a workaround that may help lead to a proper fix, though I don't know for certain whether the missing interrupt is related to the resetting controller message.

If I have nvme_load="YES" and nvd_load="YES" in /boot/loader.conf, I get the missing interrupt every time; the more the device gets used, the more of the messages show up.  Each time that message is shown, the read/write operation seems to fail, as the end result is corruption.  (Oddly, half the time my Intel ix card doesn't work properly when this happens: it spits out "ix0: TX(0) desc avail = 34, pidx = 87" and the link status stays "no carrier".)

If I load nvme/nvd AFTER the system finishes booting, it behaves normally and doesn't affect ix.

So it seems loading nvme/nvd early in the boot process causes some kind of interrupt conflict with other driver(s).
Comment 62 Ali Abdallah 2018-06-19 06:02:31 UTC
Same problem on FreeBSD 11.2-RC3 on a ThinkPad T480. After resuming from suspend the system is unusable for 10-30 seconds, showing the following messages:

nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: resetting controller
nvme0: Resetting controller due to a timeout.
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:1 cid:124 nsid:1 lba:302574390 len:72
.....
.....
Comment 63 JMN 2018-10-17 05:08:46 UTC
Experiencing a similar suspend/resume issue with a Samsung NVMe PCIe 960 EVO.
I am running FreeBSD 11.2.
When it comes back from a suspend/resume cycle, I receive a long list of ABORTs on pending writes to the NVMe. I have also verified that after each such resume, the "unsafe shutdowns" count in the NVMe increments by 1. I was running the same NVMe with a Windows OS for some months, doing many suspend/resumes, and that count had not incremented, so I do not believe it is an issue with the NVMe but with FreeBSD.
The unsafe shutdown count does NOT increment when FreeBSD shuts down (e.g. shutdown -p now).

I can simulate a similar list of pending I/O actions in the queue by using nvmecontrol to send a reset to the NVMe device, but in that circumstance it repopulates the queue instead of aborting the commands.

Also note that fsck very frequently finds newly missing data fragments in the NVMe partition after the aborted I/O queue is reported.

I suspect that the ability to at least send a FLUSH command to the NVMe would allow us to avoid the lost/corrupted data by putting such an action in /etc/rc.suspend, but I have not discovered a way to send that flush command. Using nvmecontrol to set a very low power level on the NVMe during rc.suspend does not prevent the behavior.
Possibly a "shutdown" command sent to the NVMe during suspend would produce the same results.

nvmecontrol does not seem to expose flush or shutdown functionality. FreshPorts appears to have an nvme-cli port that has FLUSH, but it is flagged as broken for 11.2. I have not attempted to update/test under FreeBSD 12.
Comment 64 Warner Losh freebsd_committer 2018-10-17 05:12:39 UTC
'sync' will force all dirty buffers to be scheduled in the nvme controller and won't return until they are complete. No other 'flush' operation is needed; the errors arise because we suspend while there is still pending I/O in the nvme controller. That might need to be attended to, but it isn't currently.

But a suspend / resume bug is very different than this bug. Please file a new bug to track that. This bug is, during normal operations, something bad happens, and we stop being able to talk to the NVMe drive and error recovery is insufficient to cope.
Comment 65 JMN 2018-10-17 05:16:49 UTC
(In reply to Warner Losh from comment #64)
I have attempted to sync in many combinations during suspend, and it doesn't change the behavior.
The flush I am referring to is a command defined in the NVMe spec that forces the drive to flush its internal buffer.
Comment 66 Warner Losh freebsd_committer 2018-10-17 13:53:14 UTC
The drive should likely be properly shut down before suspend/resume. I agree. That's a different bug. There's code to do this on shutdown. The FLUSH command won't help, because it would have to be integrated into the driver to be useful (since I/O can happen after the sync but before things suspend).

The errors people are seeing from pending commands, however, are a different issue.

Both of which are different issues from this bug. This isn't an omnibus NVME error bug.
Comment 67 David 2018-11-06 22:24:26 UTC
Created attachment 199031 [details]
dmesg output
Comment 68 David 2018-11-06 22:25:44 UTC
I'm testing FreeBSD 12.0-BETA3 r340039 GENERIC, and I have a PM961 PCIe NVMe M.2 1TB drive that came with my Lenovo ThinkPad P50.
P/N: MZSLW1T0HMLH-000L1 Produced Oct 2016

That drive is recognized by FreeBSD 12, but is not usable whatsoever (can't read/write to it).  I've used this drive with Debian testing since 2016 without trouble on my ThinkPad P50.

I installed FreeBSD 12 on an internal 2TB HDD in the ThinkPad in order to test FreeBSD, but the PM961 continued to cause boot delays -- I would see "nvme0: Missing interrupt" messages until the system finally gave up and continued with the boot process.

I attempted to install FreeBSD 11 on the 2TB HDD but the install failed when it had trouble recognizing the nvme drive.

Initially I thought the missing interrupt problem with FreeBSD was caused by the LUKS encryption on the nvme drive because I had not formatted that drive yet since I was dual booting. So I purchased another Samsung NVMe SSD 960 PRO m.2 1TB drive P/N: MZVKP1T0HMJP, and that drive works with FreeBSD 12. The new nvme was installed in the ThinkPad along with the original nvme and HDD drive.  The 2TB HDD and the new 1TB nvme drives are dedicated to FreeBSD using ZFS.  I attempted to create a ZFS mirror using the two nvme drives and FreeBSD successfully wrote to the original nvme drive (because it overwrote my Linux partitions) but the overall `zfs_create_diskpart` process failed and I had to start over using only the new nvme drive, which worked.  I eventually removed the original nvme drive from my laptop because of the constant "missing interrupt" delays.

However, after removing the original nvme drive, and while installing a virtual machine in VirtualBox on my new nvme, my laptop went into (what seemed to be) ACPI S3 suspended mode, and after I woke the machine the laptop rebooted itself.  Thinking the problem was VirtualBox, I removed that software and set up bhyve instead.  During a virtual machine install in bhyve, the laptop went into an S3-style suspended mode again, and this time when I woke the machine I noticed the nvme0 resetting controller, write, read, and aborted-by-request messages in `dmesg` (output attached above).

For the most part, the new nvme device seems stable with FreeBSD 12.  I haven't tested it with FreeBSD 11.  I don't know if KDE's Baloo service crashing and creating a 256GB core dump every single time I log in is part of the problem using this drive.  Today I disabled Baloo file indexing and installed another virtual machine using bhyve, and the system hasn't reported any problems with the nvme.  I also used `dd` to create some 10GB and 100GB files using input from /dev/urandom, and that hasn't caused any issues so far.

Lastly, cold boots on the new nvme (without the old nvme installed in the laptop) are normal. However, reboots can literally take 2 minutes to complete.  This includes an extended delay on the BIOS screen before reaching the GELI password prompt, and a delay after loading the kernel before moving on to the ---<<BOOT>>--- screen; the entire boot process is sluggish until finally reaching the login prompt.  I've never experienced this with Debian testing, and I suspect the FreeBSD nvme driver is leaving the system in a weird state.

IIRC, setting hw.nvme.enable_aborts=1 while the original nvme drive is still in the laptop causes a kernel panic while booting.  I haven't tried setting hw.nvme.per_cpu_io_queues=0, since the system is usable and not completely unstable.

Hardware details:

# nvmecontrol devlist
 nvme0: SAMSUNG MZSLW1T0HMLH-000L1
    nvme0ns1 (976762MB)
 nvme1: Samsung SSD 960 PRO 1TB
    nvme1ns1 (976762MB)

# pciconf -lbace nvme0
nvme0@pci0:2:0:0:	class=0x010802 card=0xa801144d chip=0xa804144d rev=0x00 hdr=0x00
    bar   [10] = type Memory, range 64, base 0xd4400000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 32 messages, 64 bit 
    cap 10[70] = PCI-Express 2 endpoint max data 256(256) FLR RO NS
                 link x4(x4) speed 8.0(8.0) ASPM L1(L1)
    cap 11[b0] = MSI-X supports 33 messages, enabled
                 Table in map 0x10[0x3000], PBA in map 0x10[0x2000]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[148] = Serial 1 0000000000000000
    ecap 0004[158] = Power Budgeting 1
    ecap 0019[168] = PCIe Sec 1 lane errors 0
    ecap 0018[188] = LTR 1
    ecap 001e[190] = unknown 1
  PCI-e errors = Correctable Error Detected
                 Unsupported Request Detected
     Corrected = Advisory Non-Fatal Error

# pciconf -lbace nvme1
nvme1@pci0:62:0:0:	class=0x010802 card=0xa801144d chip=0xa804144d rev=0x00 hdr=0x00
    bar   [10] = type Memory, range 64, base 0xd4200000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 32 messages, 64 bit 
    cap 10[70] = PCI-Express 2 endpoint max data 256(256) FLR RO NS
                 link x4(x4) speed 8.0(8.0) ASPM L1(L1)
    cap 11[b0] = MSI-X supports 8 messages, enabled
                 Table in map 0x10[0x3000], PBA in map 0x10[0x2000]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[148] = Serial 1 0000000000000000
    ecap 0004[158] = Power Budgeting 1
    ecap 0019[168] = PCIe Sec 1 lane errors 0
    ecap 0018[188] = LTR 1
    ecap 001e[190] = unknown 1
  PCI-e errors = Correctable Error Detected
                 Unsupported Request Detected
     Corrected = Advisory Non-Fatal Error

# diskinfo -t /dev/nvme0ns1
/dev/nvme0ns1
	512         	# sectorsize
	1024209543168	# mediasize in bytes (954G)
	2000409264  	# mediasize in sectors
	0           	# stripesize
	0           	# stripeoffset
	No          	# TRIM/UNMAP support
	Unknown     	# Rotation rate in RPM

Seek times:
	Full stroke:^C
Nov  6 18:58:09 fenixbsd kernel: nvme0: Missing interrupt
Nov  6 18:58:39 fenixbsd syslogd: last message repeated 1 times

# diskinfo -t /dev/nvme1ns1
/dev/nvme1ns1
	512         	# sectorsize
	1024209543168	# mediasize in bytes (954G)
	2000409264  	# mediasize in sectors
	0           	# stripesize
	0           	# stripeoffset
	No          	# TRIM/UNMAP support
	Unknown     	# Rotation rate in RPM

Seek times:
	Full stroke:	  250 iter in   0.011499 sec =    0.046 msec
	Half stroke:	  250 iter in   0.010018 sec =    0.040 msec
	Quarter stroke:	  500 iter in   0.015302 sec =    0.031 msec
	Short forward:	  400 iter in   0.013087 sec =    0.033 msec
	Short backward:	  400 iter in   0.012144 sec =    0.030 msec
	Seq outer:	 2048 iter in   0.041548 sec =    0.020 msec
	Seq inner:	 2048 iter in   0.042294 sec =    0.021 msec

Transfer rates:
	outside:       102400 kbytes in   0.066412 sec =  1541890 kbytes/sec
	middle:        102400 kbytes in   0.064908 sec =  1577618 kbytes/sec
	inside:        102400 kbytes in   0.064534 sec =  1586760 kbytes/sec
Comment 69 Wolf Noble 2018-11-28 00:37:31 UTC
I can confirm this bug rears its head on a Dell R510 running FreeNAS 11.2 (which tracks FreeBSD 11.2) if I use a Synology M2D18 (https://www.synology.com/en-us/products/M2D18) dual NVMe PCIe card. I cannot get diskinfo to run cleanly without the target going out to lunch. Setting hw.nvme.enable_aborts=1 via a loader.conf tunable (and rebooting) had no impact in my case.

Disabling or enabling hyperthreading does not seem to affect this hang, although having hyperthreading enabled seems to prevent MSI-X from being used.

If I take the NVMe devices out of the Synology dual card and put them into a single NVMe-to-PCIe adapter, I do not (yet) seem to have the problem.

Happy to provide whatever details might be useful in making progress, but I figured the additional info might be sufficiently useful without superfluous noise.
Comment 70 Daniel Duerr 2018-12-24 16:00:55 UTC
I'm having the same reset-controller issue on 11.2-RELEASE with an SM961.  I tried it on two different SuperMicro systems: one very new system with a mobo-based M.2 slot, and one older system with a PCIe M.2 adapter.  Let me know if I can be of assistance with troubleshooting.  I would really love to be able to use this hardware.
Comment 71 Ka Ho Ng 2019-06-30 05:02:02 UTC
Created attachment 205429 [details]
A patch trying to fix the missing interrupt issue on SM961.

Comment 72 Ka Ho Ng 2019-06-30 05:03:11 UTC
(In reply to Ka Ho Ng from comment #71)

For anyone affected by this bug: can you try whether the patch works for you?
Comment 73 Ka Ho Ng 2019-06-30 05:04:40 UTC
(In reply to Ka Ho Ng from comment #72)

One more thing to add: the patch is meant to be applied to FreeBSD 12.0-RELEASE, but porting it to FreeBSD 11 should also be trivial.
Comment 74 Ka Ho Ng 2019-06-30 05:21:19 UTC
(In reply to Ka Ho Ng from comment #72)
wait. please wait for the next revision...
Comment 75 Ka Ho Ng 2019-06-30 15:48:47 UTC
(In reply to Ka Ho Ng from comment #71)
This patch is only a workaround for the issue on machines with 8 or fewer CPU threads, and it has issues of its own. It is not a fix, so don't try it out.
Comment 76 Ka Ho Ng 2019-07-06 11:16:56 UTC
Created attachment 205541 [details]
Fix SM961 issue

For people using FreeBSD 11.3 or FreeBSD 12.0: please try whether this patch fixes the issue instead.
Comment 77 Warner Losh freebsd_committer 2019-07-06 16:12:03 UTC
Why do you need to change pci_mask_msix and pci_unmask_msix? Surely that can't be right?
The nvme patches look good, I think, but that one seems like a non-starter to do unconditionally.
Comment 78 Ka Ho Ng 2019-07-06 17:20:29 UTC
(In reply to Warner Losh from comment #77)

The commit message of the patch is actually inside this commit:
https://github.com/khng300/freebsd/commit/c75f08495fde5dee08e4b24f399f2d70a77254a6

To put it simply, some controllers return zeroes for MMIO reads on certain regions, which leads to MSI-X not being enabled at all: the initial mask of the interrupt takes effect, but the subsequent unmask does not work, because the read of the vector control word always returns zero. As a result the corresponding bit in the PBA is set by the controller, since the interrupt is still masked after the attempted unmask. The modification takes its cue from the interrupt-unmask implementation in the illumos kernel: instead of basing the write on the existing content of the vector control bit, it simply overwrites the whole word.
Comment 79 Ka Ho Ng 2019-07-06 17:24:28 UTC
(In reply to Warner Losh from comment #77)

The NVMe patch was a mistake on my part: I thought the corresponding feature was 1-based, when in fact it is zero-based. The resulting behavior is that the number of I/O queues is the number of CPUs minus one, which is undesirable. Although this sort of fixes the behavior on machines with 8 or fewer CPUs, it will still cause trouble if there are more than 8.
Comment 80 Luka Boulagnon 2019-07-06 21:23:17 UTC
(In reply to Ka Ho Ng from comment #76)

I just rebuilt the installer with a kernel including your patch.
That's amazing! It works!

Thank you for your work :)
Comment 81 Terry Kennedy 2019-07-07 21:15:21 UTC
My only potential concern with this patch is that in my original testing, I found that the NVMe drive worked on some systems and not others (under FreeBSD; under Linux I could not get it to fail anywhere). Is it possible that we're seeing a difference in the way the BIOS sets things up? If so, is the proposed patch the way to go, or should we do further diagnosis to see if we can find what the actual BIOS initialization differences are?

OTOH, if nobody else thinks there are issues with the patch, it's good to go. I'm just concerned about changing this behavior globally, as opposed to just for the NVMe device.
Comment 82 Tomasz "CeDeROM" CEDRO 2019-08-07 21:14:34 UTC
Hello World :-)

The same problem is still here in 12.0-RELEASE amd64 on a Panasonic Toughbook CF-MX4 with an M.2 SSD SAMSUNG MZNTE256HMP (model MZ-NTE2560)! I cannot install or use FreeBSD on this machine. I constantly get lots of:

CAM status: Uncorrectable parity/CRC error
Retrying command, N more tries remain
SEND_FPDMA_QUEUED DATA SET MANAGEMENT. ACB 61 04 00 ....

The worst thing is that everything works fine on Linux and Windoze :-(
Comment 83 Ka Ho Ng 2019-08-08 05:45:53 UTC
(In reply to Tomasz "CeDeROM" CEDRO from comment #82)

Well, it seems this SSD uses AHCI, so it may be unrelated to this ticket.
Comment 84 commit-hook freebsd_committer 2019-09-05 23:54:59 UTC
A commit references this bug:

Author: imp
Date: Thu Sep  5 23:54:45 UTC 2019
New revision: 351915
URL: https://svnweb.freebsd.org/changeset/base/351915

Log:
  MFC r349845:

    Work around devices which return all zeros for reads of existing MSI-X table
    VCTRL registers.

  Note: This is confirmed to fix the nvme lost interrupt issues, seen on both
  virtual and real cards.
  PR: 211713

Changes:
_U  stable/12/
  stable/12/sys/dev/pci/pci.c
Comment 85 commit-hook freebsd_committer 2019-09-06 00:07:11 UTC
A commit references this bug:

Author: imp
Date: Fri Sep  6 00:06:55 UTC 2019
New revision: 351917
URL: https://svnweb.freebsd.org/changeset/base/351917

Log:
  MFC r349845:

    Work around devices which return all zeros for reads of existing MSI-X table
    VCTRL registers.

  PR: 211713

Changes:
_U  stable/11/
  stable/11/sys/dev/pci/pci.c