274974 – nvme: resetting controller after mounting a partition

Status:	New

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	14.0-RELEASE
Hardware:	Any Any

Importance:	--- Affects Only Me
Assignee:	freebsd-bugs (Nobody)

URL:
Keywords:

Depends on:
Blocks:

Reported:	2023-11-08 21:01 UTC by Piotr Kubaj
Modified:	2023-11-12 23:14 UTC
CC List:	4 users

See Also:

Attachments
dmesg.boot (19.44 KB, text/plain) 2023-11-10 06:58 UTC, Piotr Kubaj

Description Piotr Kubaj freebsd_committer

2023-11-08 21:01:54 UTC

Having previously used an M.2 disk in a simple pass-through adapter card, I bought a Sonnet McFiver card to plug in a second disk. However, when the card is used, simply mounting any partition on any of the disks plugged into the card results in:
nvme1: Resetting controller due to a timeout and possible hot unplug.
nvme1: resetting controller
nvme1: failing outstanding i/o
nvme1: READ sqid:5 cid:124 nsid:1 lba:73 len:8
nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 p:0 sqid:5 cid:124 cdw0:0
(nda1:nvme1:0:0:1): READ. NCB: opc=2 fuse=0 nsid=1000000 prp1=0 prp2=0 cdw=49000000 0 7000000 0 0 0
(nda1:nvme1:0:0:1): CAM status: Unknown (0x420)
(nda1:nvme1:0:0:1): Error 5, Retries exhausted
g_vfs_done():nda1p1[READ(offset=4608, length=4096)]error = 5
nda1 at nvme1 bus 0 scbus5 target 0 lun 1
nda1: <Samsung SSD 980 1TB 3B4QFXO7 S649NS0W619970N> s/n S649NS0W619970N detached
(nda1:nvme1:0:0:1): Periph destroyed


On Linux, accessing both drives works fine. The card and its devices are detected on Linux as:
0000:01:00.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca)
0000:02:01.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca)
0000:02:02.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca)
0000:02:08.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca)
0000:02:09.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca)
0000:02:0a.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca)
0000:03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
0000:04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 980
0000:0a:00.0 USB controller: ASMedia Technology Inc. ASM2142/ASM3142 USB 3.1 Host Controller
0000:0b:00.0 Ethernet controller: Aquantia Corp. AQC113CS NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 03)

Comment 1 Warner Losh freebsd_committer

2023-11-08 22:26:03 UTC

So, the unknown status is my fault: I forgot to add something to a table. Ignore that.

Second, this error message appears when we post a transaction to the card, never get an interrupt, and time out. When we time out, we look at the card's status registers and find that the card is gone (they read as 0xffffffff); that's the "possible hot unplug" part. With the card reading that, there's no hope: it's game over.
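
As a rough illustration of the check described above (a sketch only, not the nvme(4) driver code; the function name, constant name, and return strings are hypothetical), the decision made after a timeout can be modeled like this:

```python
# All-ones is what a register read returns when no device decodes the access.
REGISTER_GONE = 0xFFFFFFFF


def classify_timeout(status_register: int) -> str:
    """Classify a command timeout from the value read back from the
    controller's status register (hypothetical helper for illustration)."""
    if status_register == REGISTER_GONE:
        # The device no longer answers reads: treat it as unplugged.
        return "possible hot unplug"
    # The device still answers, so the timeout has some other cause.
    return "timeout, controller still responding"


print(classify_timeout(0xFFFFFFFF))
print(classify_timeout(0x00000001))
```

The logs in this report correspond to the first branch: the read comes back as all ones, so the controller is reset and outstanding I/O is failed.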

So the question is what changed, because presumably it wasn't like that when we booted the system. We had to read these registers, and a lot of others, to find the card at boot, and later we had to do I/Os to the card when CAM started up to get the card to attach (these are trivial, but they'd freak out if they read back all ff's). So what happened between probe time and now to get it into this state? Did a bridge go away, get renumbered, or move its memory windows? Was something else mapped in conflict, so that the fight over the decoding results in ff's? And why didn't the interrupt happen, so that we got into the timeout path?

Comment 2 Warner Losh freebsd_committer

2023-11-08 22:32:04 UTC

Please attach a dmesg.

Can you interact with the card with nvmecontrol before you try to boot it (I suspect not)?

Are there one or two nvme cards behind that bridge?

Comment 3 Piotr Kubaj freebsd_committer

2023-11-10 06:58:45 UTC

Created attachment 246227
dmesg.boot

Here's a full dmesg from booting.

Comment 4 Piotr Kubaj freebsd_committer

2023-11-10 14:19:20 UTC

Trying to mount on CURRENT:
nvme1: Resetting controller due to a timeout and possible hot unplug.
nvme1: resetting controller
nvme1: failing outstanding i/o
nvme1: READ sqid:5 cid:127 nsid:1 lba:73 len:8
nvme1: ABORTED - BY REQUEST (00/07) crd:0 m:0 dnr:1 p:0 sqid:5 cid:127 cdw0:0
(nda1:nvme1:0:0:1): READ. NCB: opc=2 fuse=0 nsid=1 prp1=0 prp2=0 cdw=49 0 7 0 0 0
(nda1:nvme1:0:0:1): CAM status: NVME Status Error
(nda1:nvme1:0:0:1): Error 5, Retries exhausted
g_vfs_done():nda1p1[READ(offset=4608, length=4096)]error = 5
nda1 at nvme1 bus 0 scbus5 target 0 lun 1
nda1: <Samsung SSD 980 1TB 3B4QFXO7 S649NS0W619970N> s/n S649NS0W619970N detached
mount_msdosfs: /dev/nda1p1: Input/output error
(nda1:nvme1:0:0:1): Periph destroyed
nvme1: Failed controller, stopping watchdog timeout.

root@:~ # uname -a
FreeBSD  15.0-CURRENT FreeBSD 15.0-CURRENT #0 main-n266315-b2b381d365fc: Thu Nov  9 04:12:28 UTC 2023     root@releng3.nyi.freebsd.org:/usr/obj/usr/src/powerpc.powerpc64le/sys/GENERIC64LE powerpc
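
For readers comparing the two logs: the NCB line from 14.0-RELEASE prints nsid=1000000 and cdw=49000000 0 7000000, while the one from CURRENT prints nsid=1 and cdw=49 0 7. The short sketch below only shows the arithmetic relating them (each 14.0 value is the 32-bit byte swap of the CURRENT value) and how the CURRENT values match the "lba:73 len:8" line. It assumes the first and third cdw values are CDW10 (starting LBA, low 32 bits) and CDW12 (zero-based block count in the low 16 bits), as in the NVMe Read command layout; the report itself does not say why the two printouts differ.

```python
def bswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


# Fields copied from the two NCB lines in this report.
# The cdw10/cdw12 labels are an assumption (see the text above).
release_14 = {"nsid": 0x1000000, "cdw10": 0x49000000, "cdw12": 0x7000000}
current_15 = {"nsid": 0x1, "cdw10": 0x49, "cdw12": 0x7}

# Byte-swapping the 14.0-RELEASE values gives the CURRENT values.
swapped = {name: bswap32(value) for name, value in release_14.items()}

lba = current_15["cdw10"]                     # starting LBA, low 32 bits
blocks = (current_15["cdw12"] & 0xFFFF) + 1   # block count is zero-based

print(swapped == current_15, lba, blocks)
```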

Comment 5 Alexander Motin freebsd_committer

2023-11-12 17:11:57 UTC

As far as I can see, this Sonnet McFiver card has its own PCIe bridge.  I wonder whether it (probably falsely) advertises PCIe hot-plug support on the M.2 slots, and may be falsely triggering it.  Since the system seems to be PowerPC, I wonder what is the status of the PCIe hot-plug support there.

Comment 6 Warner Losh freebsd_committer

2023-11-12 17:35:36 UTC

(In reply to Alexander Motin from comment #5)
> Since the system seems to be PowerPC, I wonder what is the status of the PCIe hot-plug support there.

GENERIC has it compiled in...
Can't comment on whether or not it is working.

Comment 7 Alexander Motin freebsd_committer

2023-11-12 17:44:19 UTC

I wonder if a verbose dmesg could say more about hot-plug.  The same goes for `pciconf -lvcb` on the PCIe bridge ports that are parents of the nvme devices.

Comment 8 Piotr Kubaj freebsd_committer

2023-11-12 23:14:41 UTC

Disks are on pci3 and pci4:
nvme0: <Generic NVMe Device> mem 0x80000000-0x80003fff irq 1044473 at device 0.0 numa-domain 0 on pci3
nvme1: <Generic NVMe Device> mem 0x80400000-0x80403fff irq 1044474 at device 0.0 numa-domain 0 on pci4

pci3 and pci4 are on pcib3 and pcib4, respectively:
pci3: <OFW PCI bus> numa-domain 0 on pcib3
pci4: <OFW PCI bus> numa-domain 0 on pcib4

Indeed, one of those has HotPlug, but the other does not:
pcib3@pci0:2:1:0:       class=0x060400 rev=0xca hdr=0x01 vendor=0x10b5 device=0x8724 subvendor=0x16b8 subdevice=0x7404
    vendor     = 'PLX Technology, Inc.'
    device     = 'PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA'
    class      = bridge
    subclass   = PCI-PCI
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[48] = MSI supports 8 messages, 64 bit, vector masks
    cap 10[68] = PCI-Express 2 downstream port max data 256(1024) RO NS ARI disabled
                 max read 128
                 link x4(x4) speed 8.0(8.0) ASPM disabled(L1)
                 slot 1 power limit 25000 mW HotPlug(present) Attn Button PC(off) MRL(open)
    cap 0d[a4] = PCI Bridge subvendor=0x16b8 subdevice=0x7404
    ecap 0003[100] = Serial 1 ca870010b5df0e00
    ecap 0001[fb4] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0004[138] = Power Budgeting 1
    ecap 0019[10c] = PCIe Sec 1 lane errors 0
    ecap 0002[148] = VC 1 max VC0
    ecap 0012[e00] = Multicast 1
    ecap 000d[f24] = ACS 1 Source Validation disabled, Translation Blocking disabled
                     P2P Req Redirect disabled, P2P Cmpl Redirect disabled
                     P2P Upstream Forwarding disabled, P2P Egress Control disabled
                     P2P Direct Translated disabled, Enhanced Capability unavailable
    ecap 000b[b70] = Vendor [1] ID 0001 Rev 0 Length 16
pcib4@pci0:2:2:0:       class=0x060400 rev=0xca hdr=0x01 vendor=0x10b5 device=0x8724 subvendor=0x16b8 subdevice=0x7404
    vendor     = 'PLX Technology, Inc.'
    device     = 'PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA'
    class      = bridge
    subclass   = PCI-PCI
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[48] = MSI supports 8 messages, 64 bit, vector masks
    cap 10[68] = PCI-Express 2 downstream port max data 256(1024) RO NS ARI disabled
                 max read 128
                 link x4(x4) speed 8.0(8.0) ASPM disabled(L1)
                 slot 2 power limit 25000 mW
    cap 0d[a4] = PCI Bridge subvendor=0x16b8 subdevice=0x7404
    ecap 0003[100] = Serial 1 ca870010b5df0e00
    ecap 0001[fb4] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0004[138] = Power Budgeting 1
    ecap 0019[10c] = PCIe Sec 1 lane errors 0
    ecap 0002[148] = VC 1 max VC0
    ecap 0012[e00] = Multicast 1
    ecap 000d[f24] = ACS 1 Source Validation disabled, Translation Blocking disabled
                     P2P Req Redirect disabled, P2P Cmpl Redirect disabled
                     P2P Upstream Forwarding disabled, P2P Egress Control disabled
                     P2P Direct Translated disabled, Enhanced Capability unavailable
    ecap 000b[b70] = Vendor [1] ID 0001 Rev 0 Length 16


Stranger still, I can now mount the nvme0 drive on pci3 (which has HotPlug), even though it didn't work before. Attempting to mount nvme1 still fails.
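
The HotPlug comparison between bridge ports can also be done mechanically. The following is a minimal sketch, assuming `pciconf -lvcb` output shaped like the listing in this comment (an unindented `name@pci...` header per device, followed by indented capability lines); the function name `hotplug_ports` and the shortened `SAMPLE` text are made up for illustration.

```python
import re

# Shortened excerpt of the pciconf output from this report.
SAMPLE = """\
pcib3@pci0:2:1:0:       class=0x060400 rev=0xca hdr=0x01 vendor=0x10b5 device=0x8724
    cap 10[68] = PCI-Express 2 downstream port max data 256(1024) RO NS ARI disabled
                 link x4(x4) speed 8.0(8.0) ASPM disabled(L1)
                 slot 1 power limit 25000 mW HotPlug(present) Attn Button PC(off) MRL(open)
pcib4@pci0:2:2:0:       class=0x060400 rev=0xca hdr=0x01 vendor=0x10b5 device=0x8724
    cap 10[68] = PCI-Express 2 downstream port max data 256(1024) RO NS ARI disabled
                 link x4(x4) speed 8.0(8.0) ASPM disabled(L1)
                 slot 2 power limit 25000 mW
"""


def hotplug_ports(pciconf_text: str) -> dict[str, bool]:
    """Map each device name to whether its listing mentions HotPlug."""
    result: dict[str, bool] = {}
    current = None
    for line in pciconf_text.splitlines():
        # Device headers start in column 0, e.g. "pcib3@pci0:2:1:0:".
        header = re.match(r"^([^@\s]+)@pci", line)
        if header:
            current = header.group(1)
            result[current] = False
        elif current is not None and "HotPlug" in line:
            result[current] = True
    return result


print(hotplug_ports(SAMPLE))
```

On the excerpt above this reports HotPlug for pcib3 only, which matches the listing in this comment.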