Bug 225666 - WDC WDS120G2G0A-00JH30 SSD needs quirks=0x3<4K,NCQ_TRIM_BROKEN>
Summary: WDC WDS120G2G0A-00JH30 SSD needs quirks=0x3<4K,NCQ_TRIM_BROKEN>
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.1-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Scott Long
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-04 20:36 UTC by anders.lundgren
Modified: 2021-06-11 10:42 UTC
CC List: 7 users

See Also:


Attachments
system log (9.45 KB, text/plain)
2018-02-04 20:36 UTC, anders.lundgren
wd-green-ssd-quirk.diff (446 bytes, patch)
2019-01-18 05:12 UTC, Oleksandr Tymoshenko

Description anders.lundgren 2018-02-04 20:36:10 UTC
Created attachment 190320
system log

The WD Green SSD drives (tested in my case with a WDS120G2G0A-00JH30, firmware UE300000) need kern.cam.ada.1.quirks=3 to operate properly. Without this quirk, the drive suffers silent data corruption.
It would be nice if this could be built into the kernel in ata_da.c.
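
The value 3 is a bitmask. Going by the flag names in the bug title (0x3<4K,NCQ_TRIM_BROKEN>), bit 0x1 is 4K and bit 0x2 is NCQ_TRIM_BROKEN. Below is a small shell sketch that decodes just those two bits. The function name is made up for illustration, and any other bits are reported as unknown rather than guessed:

```shell
# Decode the two quirk bits named in this report's title.
# decode_ada_quirks is a hypothetical helper, not a FreeBSD tool.
decode_ada_quirks() {
    value=$(( $1 ))          # accepts decimal (3) or hex (0x3)
    names=""
    if [ $(( $value & 0x1 )) -ne 0 ]; then
        names="4K"
    fi
    if [ $(( $value & 0x2 )) -ne 0 ]; then
        names="${names:+$names,}NCQ_TRIM_BROKEN"
    fi
    # Bits outside 0x3 are not described in this report; flag them as unknown.
    rest=$(( $value & ~0x3 ))
    if [ "$rest" -ne 0 ]; then
        names="${names:+$names,}unknown($(printf '0x%x' "$rest"))"
    fi
    printf '0x%x<%s>\n' "$value" "$names"
}

decode_ada_quirks 3    # prints 0x3<4K,NCQ_TRIM_BROKEN>
```

The output format mirrors the quirks= line that the ada driver prints at boot.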

camcontrol devlist
<ST3000DM001-1CH166 CC24>          at scbus0 target 0 lun 0 (ada0,pass0)
<Marvell Console 1.01>             at scbus7 target 0 lun 0 (pass1)
<WDC WDS120G2G0A-00JH30 UE300000>  at scbus8 target 0 lun 0 (ada1,pass2)
<USB Flash Drive PMAP>             at scbus10 target 0 lun 0 (da0,pass3)

camcontrol identify ada1
pass2: <WDC WDS120G2G0A-00JH30 UE300000> ACS-2 ATA SATA 3.x device
pass2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)

protocol              ATA/ATAPI-9 SATA 3.x
device model          WDC WDS120G2G0A-00JH30
firmware revision     UE300000
serial number         174802802899
WWN                   5001b448b6978e35
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 512, offset 0
LBA supported         234455040 sectors
LBA48 supported       234455040 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6 
media RPM             non-rotating

Feature                      Support  Enabled   Value           Vendor
read ahead                     yes	yes
write cache                    yes	yes
flush cache                    yes	yes
overlap                        no
Tagged Command Queuing (TCQ)   no	no
Native Command Queuing (NCQ)   yes		32 tags
NCQ Queue Management           no
NCQ Streaming                  no
Receive & Send FPDMA Queued    no
SMART                          yes	yes
microcode download             yes	yes
security                       yes	no
power management               yes	yes
advanced power management      yes	no	0/0x00
automatic acoustic management  no	no
media status notification      no	no
power-up in Standby            no	no
write-read-verify              no	no
unload                         no	no
general purpose logging        yes	yes
free-fall                      no	no
Data Set Management (DSM/TRIM) yes
DSM - max 512byte blocks       yes              8
DSM - deterministic read       yes              any value
Host Protected Area (HPA)      yes      no      234455040/234455040
HPA - Security                 no
Comment 1 Oleksandr Tymoshenko freebsd_committer freebsd_triage 2019-01-18 04:44:55 UTC
Do you still have hardware to verify the fix?
Comment 2 Oleksandr Tymoshenko freebsd_committer freebsd_triage 2019-01-18 05:12:48 UTC
Created attachment 201227
wd-green-ssd-quirk.diff

Patch with the quirk for WD Green SSD
Comment 3 Mario Olofo 2020-02-27 03:26:43 UTC
Hi,
I have a WD Green SSD, model WDC WDS480G2G0B-00EPW0 with firmware UK450000, and the quirk did not work on FreeBSD 12.0. I tried setting it in loader.conf and device.hints, but it was not recognized at all, so I recompiled the kernel with this patch. I still get silent data corruption.
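
For reference, here is a minimal sketch of the manual override from the original report, assuming the SSD attaches as ada1. The unit number is part of the tunable name, so it has to match the drive's adaN unit on the machine in question:

```shell
# /boot/loader.conf
# Assumption: the WD Green SSD is ada1, as in the original report.
# 3 = 0x1 (4K) | 0x2 (NCQ_TRIM_BROKEN)
kern.cam.ada.1.quirks="3"
```

After a reboot, the ada probe lines in dmesg should show quirks=0x3<4K,NCQ_TRIM_BROKEN> for that drive if the tunable was picked up.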
Comment 4 commit-hook freebsd_committer freebsd_triage 2020-02-27 05:00:56 UTC
A commit references this bug:

Author: scottl
Date: Thu Feb 27 05:00:21 UTC 2020
New revision: 358366
URL: https://svnweb.freebsd.org/changeset/base/358366

Log:
  Add a quirk for the WDC Green series of SSDs to disable NCQ TRIM, as this
  avoids silent data corruption.

  PR:		225666
  Submitted by:	anders lundgren
  MFC after:	3 days

Changes:
  head/sys/cam/ata/ata_da.c
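
A built-in quirk of this kind is selected by comparing the drive's reported model string against wildcard patterns in the driver's quirk table. The sketch below imitates that with shell globbing. The pattern shown is an assumption made up for illustration, not copied from the commit, and the function name is hypothetical:

```shell
# Hypothetical pattern for illustration only; check ata_da.c for the real entry.
wd_green_pattern='WDC WDS?????G0*'

matches_wd_green() {
    # An unquoted variable in a case pattern is treated as a glob,
    # and case patterns are not field-split, so the space is safe.
    case "$1" in
        $wd_green_pattern) return 0 ;;
        *) return 1 ;;
    esac
}

# Model strings taken from this bug report.
for model in "WDC WDS120G2G0A-00JH30" "WDC WDS480G2G0B-00EPW0" \
             "WDC WDS240G2G0A-00JH30" "ST3000DM001-1CH166"; do
    if matches_wd_green "$model"; then
        echo "$model: quirk entry would apply"
    else
        echo "$model: no match"
    fi
done
```

With this assumed pattern, the three WD Green models reported here match and the Seagate disk does not.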
Comment 5 Scott Long freebsd_committer freebsd_triage 2020-02-27 05:01:40 UTC
I've committed this patch to HEAD and will merge it to 12.x and 11.x in a few days.  That will eliminate the need for a manual quirk entry.  Can you say more about what you tried and what kind of corruption resulted?
Comment 6 Mario Olofo 2020-02-27 12:42:58 UTC
Yes, I had this data corruption a while ago, so I tried to get help on the forums, but no luck. The link is https://forums.freebsd.org/threads/fixing-metadata-errors-after-zfs-clear-zfs-scrub.72139/

A few days ago I had time to try installing FreeBSD again, so I sent an email to the freebsd-stable list. Despite very good support from the people there, I haven't found a solution so far, but I tested a few things:

1- Installed FreeBSD on a hybrid HDD; it worked without errors.
2- Ran a memory test from Windows and the Dell diagnostics tool; no errors found.
3- Installed on the SSD. After installing, rebooting, and downloading some tools (npm, node, git), zpool scrub already showed checksum errors.
4- Reinstalled, and this time just applied the patch and built the kernel. After a reboot, the detected quirks were 0x03 (previously 0x0D). I downloaded the same tools, did a git clone of one of my projects, and ran npm install to get a lot of small files in node_modules, then ran zpool scrub. The scrub found data errors. I think that enabling the 4K bit just delays the problem, at least in my case.
Comment 7 Mario Olofo 2020-02-27 14:46:00 UTC
I reinstalled FreeBSD with vfs.zfs.trim.enabled=0, and the silent data corruption appears to be gone. I'll rebuild the kernel and install more packages to use more disk space and see if the problem shows up.
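
Below is a sketch of the setting mentioned above, assuming the legacy ZFS implementation shipped with FreeBSD 12.x, where vfs.zfs.trim.enabled is set from /boot/loader.conf. The pool name zroot is only an example:

```shell
# /boot/loader.conf
# Disable ZFS TRIM entirely (a workaround under test, not a confirmed fix).
vfs.zfs.trim.enabled=0

# After rebooting, from a root shell:
#   sysctl vfs.zfs.trim.enabled    # should report 0
#   zpool scrub zroot              # "zroot" is a hypothetical pool name
#   zpool status -v zroot          # check the CKSUM column and error list
```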
Comment 8 Scott Long freebsd_committer freebsd_triage 2020-02-27 15:41:11 UTC
Thanks for your feedback.  It sounds like TRIM in general is problematic for you, not just NCQ TRIM.  I'll see if I can reproduce.
Comment 9 Mario Olofo 2020-02-27 16:17:40 UTC
Ok, thank you.
If you need help with anything, let me know.
Comment 10 commit-hook freebsd_committer freebsd_triage 2020-03-01 18:02:54 UTC
A commit references this bug:

Author: scottl
Date: Sun Mar  1 18:02:00 UTC 2020
New revision: 358489
URL: https://svnweb.freebsd.org/changeset/base/358489

Log:
  Add a quirk for the WDC Green series of SSDs to disable NCQ TRIM, as this
  avoids silent data corruption.

  PR:             225666
  Submitted by:   anders lundgren

Changes:
_U  stable/12/
  stable/12/sys/cam/ata/ata_da.c
Comment 11 commit-hook freebsd_committer freebsd_triage 2020-03-01 18:03:57 UTC
A commit references this bug:

Author: scottl
Date: Sun Mar  1 18:03:09 UTC 2020
New revision: 358490
URL: https://svnweb.freebsd.org/changeset/base/358490

Log:
  Add a quirk for the WDC Green series of SSDs to disable NCQ TRIM, as this
  avoids silent data corruption.

  PR:             225666
  Submitted by:   anders lundgren

Changes:
_U  stable/11/
  stable/11/sys/cam/ata/ata_da.c
Comment 12 Jason W. Bacon freebsd_committer freebsd_triage 2020-10-06 13:33:32 UTC
Not sure if it's related, but I'm seeing system hangs followed by spontaneous reboots in 12.2-RC1 with a <WDC WDS240G2G0A-00JH30 UF510000> ACS-2 ATA SATA 3.x.

It happens consistently during portsnap extract.

No clues in dmesg or /var/log/messages.

The model seems to match the pattern in your previous ata_da.c patch, so maybe this is a different issue.
Comment 13 Jason W. Bacon freebsd_committer freebsd_triage 2020-10-06 22:50:47 UTC
Disabling hardware write cache works around the issue:

pkg install auto-admin
auto-write-cache-toggle off cli

which simply adds

# Added by auto-admin from cli
kern.cam.ada.write_cache=0
# End auto-admin addition

to /boot/loader.conf and reminds you to reboot.

Before disabling the cache, the system would crash within a minute of the onset of heavy disk writing.  I've now been running most of the day with no issues.

Some relevant system info:

dmesg.boot:

ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WDS240G2G0A-00JH30 UF510000> ACS-2 ATA SATA 3.x device
ada0: Serial Number 203058803017
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 228944MB (468877312 512 byte sectors)
ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>

FreeBSD quagga.acadix  bacon ~ 25: tunefs -p /
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)                                 

FreeBSD quagga.acadix  bacon ~ 26: mount
/dev/ada0p2 on / (ufs, local, soft-updates)
Comment 14 Mark Linimon freebsd_committer freebsd_triage 2021-06-11 10:42:43 UTC
^Triage: assign to committer who resolved.