Bug 243422 - NVME controller failure: resetting (AMD Radeon R5 NVMe Series)
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.1-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Warner Losh
Reported: 2020-01-18 08:31 UTC by shamaz.mazum
Modified: 2021-10-01 08:45 UTC
Description shamaz.mazum 2020-01-18 08:31:58 UTC
Hello. I use a machine running FreeBSD 12.1-RELEASE, equipped with an AMD Radeon R5 NVMe 120 GB drive.

When I try to write something to this NVMe drive (I have UFS on it), I constantly get these errors:

nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: Resetting controller due to a timeout.
nvme0: resetting controller
nvme0: aborting outstanding i/o
nvme0: aborting outstanding i/o
nvme0: aborting outstanding i/o
nvme0: aborting outstanding i/o
nvme0: aborting outstanding i/o
nvme0: aborting outstanding i/o
nvme0: aborting outstanding i/o

This is the message printed when the drive attaches:
nvme0: <Generic NVMe Device> mem 0xfe900000-0xfe903fff irq 24 at device 0.0 on pci1

The write speed drops greatly, and written files are sometimes corrupt. Reading, by contrast, works just fine, and the read speed is really impressive.

There are many bugs reported concerning NVMe. Here is what I found and how each differs from my situation:

bug #211713 — probably the first bug reported. It is about suspend/resume and I do not use suspend/resume on my PC.
bug #232466 — again, this is about suspend/resume
bug #243148 — I am not sure, maybe this is the same problem as mine.
bug #243063 — This guy has a similar problem in bhyve guest.

I've decided to report a new bug because this may be a manufacturer-specific bug (some of the bugs I listed above are closed and people have their NVMe drives working). So, I am sorry if it's a duplicate.
Comment 1 shamaz.mazum 2020-01-18 08:35:48 UTC
Also, at the end of the discussion of bug #243422, someone said that switching TRIM off helps.

I've tried tunefs -t disable, but the problem is still there.
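
For anyone retrying this, here is a minimal sketch of how the UFS TRIM flag is usually toggled with tunefs(8). It is not a confirmed fix for this bug. The device node /dev/nvd0p1 and the mount point /mnt are assumptions (substitute your own), as is the small run helper, which only prints the commands unless DRY_RUN=0 is set. tunefs cannot change an active file system, so the sketch unmounts it first.

```shell
#!/bin/sh
# Assumed device node and mount point; replace with the real ones.
DEV="${DEV:-/dev/nvd0p1}"
MNT="${MNT:-/mnt}"

# Hypothetical helper: print the command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

run umount "$DEV"              # tunefs needs the file system unmounted or read-only
run tunefs -t disable "$DEV"   # clear the TRIM enable flag
run tunefs -p "$DEV"           # print the current tunables to verify the flag
run mount "$DEV" "$MNT"        # mount it again
```

Run it once as is to review the commands, then as root with DRY_RUN=0 to apply them.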
Comment 2 Warner Losh freebsd_committer freebsd_triage 2021-07-10 22:03:20 UTC
Does this problem happen with FreeBSD 13? Many changes have gone in there and it's likely fixed.