Created attachment 190320 [details]
system log

The WD Green SSD drives (tested in my case with a WDS120G2G0A-00JH30, firmware UE300000) need kern.cam.ada.1.quirks=3 to operate properly. Without this quirk, the SSD suffers silent data corruption. It would be nice if this could be built into the kernel in ata_da.c.

camcontrol devlist
<ST3000DM001-1CH166 CC24>             at scbus0 target 0 lun 0 (ada0,pass0)
<Marvell Console 1.01>                at scbus7 target 0 lun 0 (pass1)
<WDC WDS120G2G0A-00JH30 UE300000>     at scbus8 target 0 lun 0 (ada1,pass2)
<USB Flash Drive PMAP>                at scbus10 target 0 lun 0 (da0,pass3)

camcontrol identify ada1
pass2: <WDC WDS120G2G0A-00JH30 UE300000> ACS-2 ATA SATA 3.x device
pass2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)

protocol              ATA/ATAPI-9 SATA 3.x
device model          WDC WDS120G2G0A-00JH30
firmware revision     UE300000
serial number         174802802899
WWN                   5001b448b6978e35
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 512, offset 0
LBA supported         234455040 sectors
LBA48 supported       234455040 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6
media RPM             non-rotating

Feature                      Support  Enabled   Value           Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
overlap                        no
Tagged Command Queuing (TCQ)   no       no
Native Command Queuing (NCQ)   yes              32 tags
NCQ Queue Management           no
NCQ Streaming                  no
Receive & Send FPDMA Queued    no
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      yes      no      0/0x00
automatic acoustic management  no       no
media status notification      no       no
power-up in Standby            no       no
write-read-verify              no       no
unload                         no       no
general purpose logging        yes      yes
free-fall                      no       no
Data Set Management (DSM/TRIM) yes
DSM - max 512byte blocks       yes              8
DSM - deterministic read       yes              any value
Host Protected Area (HPA)      yes      no      234455040/234455040
HPA - Security                 no
Do you still have hardware to verify the fix?
Created attachment 201227 [details] wd-green-ssd-quirk.diff Patch with the quirk for WD Green SSD
Hi, I have a WD Green SSD, model WDC WDS480G2G0B-00EPW0 with firmware UK450000, and the quirk did not work on FreeBSD 12.0. I tried setting it in loader.conf and device.hints, but it was not recognized at all, so I recompiled the kernel with this patch. I still get silent data corruption.
A commit references this bug:

Author: scottl
Date: Thu Feb 27 05:00:21 UTC 2020
New revision: 358366
URL: https://svnweb.freebsd.org/changeset/base/358366

Log:
  Add a quirk for the WDC Green series of SSDs to disable NCQ TRIM, as this avoids silent data corruption.

  PR: 225666
  Submitted by: anders lundgren
  MFC after: 3 days

Changes:
  head/sys/cam/ata/ata_da.c
I've committed this patch to HEAD, will merge it to 12.x and 11.x in a few days. That will eliminate the need for a manual quirk entry. Can you say more about what you tried and what kind of corruption resulted?
Yes. I ran into this data corruption a while ago and asked for help on the forums, but had no luck. The thread is https://forums.freebsd.org/threads/fixing-metadata-errors-after-zfs-clear-zfs-scrub.72139/

A few days ago I had time to try installing FreeBSD again, so I sent an email to the freebsd-stable list. Despite very good support from the people there, I haven't found a solution so far, but I tested a few things:

1- Installed FreeBSD on a hybrid HDD; it worked without errors.
2- Ran a memory test from Windows and the Dell diagnostics tool; no errors found.
3- Installed on the SSD; after installing, rebooting, and downloading some tools (npm, node, git), zpool scrub already showed checksum errors.
4- Reinstalled, and this time just applied the patch and built the kernel. After a reboot, the detected quirks were 0x03 (previously 0x0D). I downloaded the same tools, did a git clone of one of my projects, ran npm install to create a lot of small files in node_modules, and then ran zpool scrub. The scrub found data errors.

I think that enabling the 4K bit just delays the problem, at least in my case.
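For anyone comparing the two quirk values mentioned above (0x03 and 0x0D), the following is a minimal sketch of how such a bitmask can be decoded into names. It is not the kernel's code. The names for bits 0x01 and 0x02 are taken from the "quirks=0x3<4K,NCQ_TRIM_BROKEN>" dmesg line quoted elsewhere in this report; the names for 0x04 and 0x08 are an assumption from memory of ata_da.c and should be checked against the source tree in use. The struct and the decode_quirks() helper are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One named bit of the ada(4) quirks mask. */
struct quirk_bit {
	unsigned int mask;
	const char *name;
};

static const struct quirk_bit quirk_bits[] = {
	{ 0x01, "4K" },              /* confirmed by dmesg output in this report */
	{ 0x02, "NCQ_TRIM_BROKEN" }, /* confirmed by dmesg output in this report */
	{ 0x04, "LOG_BROKEN" },      /* ASSUMPTION: verify against ata_da.c */
	{ 0x08, "SMR_DM" },          /* ASSUMPTION: verify against ata_da.c */
};

/*
 * Write a comma-separated list of the quirk names set in `quirks`
 * into buf (at most len bytes, always NUL-terminated). Returns buf.
 */
static char *
decode_quirks(unsigned int quirks, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(quirk_bits) / sizeof(quirk_bits[0]); i++) {
		if ((quirks & quirk_bits[i].mask) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, quirk_bits[i].name, len - strlen(buf) - 1);
	}
	return (buf);
}
```

With a 64-byte buffer, decode_quirks(0x03, buf, sizeof(buf)) yields "4K,NCQ_TRIM_BROKEN". Under the assumed table, 0x0D would decode to "4K,LOG_BROKEN,SMR_DM", which would mean the NCQ TRIM quirk was not set before the patch was applied.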
I reinstalled FreeBSD again with vfs.zfs.trim.enabled=0, and the silent data corruption appears to be gone. I'll rebuild the kernel and install more packages to use more disk space and see if the problem shows up.
Thanks for your feedback. It sounds like TRIM in general is problematic for you, not just NCQ TRIM. I'll see if I can reproduce.
Ok, thank you. If you need help with anything, let me know.
A commit references this bug:

Author: scottl
Date: Sun Mar 1 18:02:00 UTC 2020
New revision: 358489
URL: https://svnweb.freebsd.org/changeset/base/358489

Log:
  Add a quirk for the WDC Green series of SSDs to disable NCQ TRIM, as this avoids silent data corruption.

  PR: 225666
  Submitted by: anders lundgren

Changes:
_U  stable/12/
  stable/12/sys/cam/ata/ata_da.c
A commit references this bug:

Author: scottl
Date: Sun Mar 1 18:03:09 UTC 2020
New revision: 358490
URL: https://svnweb.freebsd.org/changeset/base/358490

Log:
  Add a quirk for the WDC Green series of SSDs to disable NCQ TRIM, as this avoids silent data corruption.

  PR: 225666
  Submitted by: anders lundgren

Changes:
_U  stable/11/
  stable/11/sys/cam/ata/ata_da.c
Not sure if it's related, but I'm seeing system hangs followed by spontaneous reboots in 12.2-RC1 with a <WDC WDS240G2G0A-00JH30 UF510000> ACS-2 ATA SATA 3.x. It happens consistently during portsnap extract. No clues in dmesg or /var/log/messages. The model seems to match the pattern in your previous ata_da.c patch, so maybe this is a different issue.
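To illustrate what "matching the pattern" means here: as far as I understand, quirk entries in ata_da.c are selected by comparing the drive's model string against a wildcard pattern. The sketch below is a simplified stand-alone matcher, not the kernel's own matching routine, and it supports only '*' and '?'. The pattern string is hypothetical, chosen so that it covers the three WD Green model strings quoted in this report; it is not copied from the committed patch. glob_match() and model_matches_quirk() are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* HYPOTHETICAL pattern, for illustration only; see the real quirk table. */
#define	HYPOTHETICAL_WD_GREEN_PATTERN	"WDC WDS?????G0*"

/*
 * Minimal glob matcher: '*' matches any run of characters (including
 * none), '?' matches exactly one character, anything else must match
 * literally. The whole string must be consumed for a match.
 */
static bool
glob_match(const char *str, const char *pat)
{
	for (; *pat != '\0'; pat++, str++) {
		if (*pat == '*') {
			/* Collapse consecutive stars. */
			while (*pat == '*')
				pat++;
			if (*pat == '\0')
				return (true);
			/* Try the rest of the pattern at every suffix. */
			for (; *str != '\0'; str++) {
				if (glob_match(str, pat))
					return (true);
			}
			return (false);
		}
		if (*str == '\0')
			return (false);
		if (*pat != '?' && *pat != *str)
			return (false);
	}
	return (*str == '\0');
}

/* True if the model string matches the hypothetical quirk pattern. */
static bool
model_matches_quirk(const char *model)
{
	assert(model != NULL);
	return (glob_match(model, HYPOTHETICAL_WD_GREEN_PATTERN));
}
```

Under that assumed pattern, "WDC WDS120G2G0A-00JH30", "WDC WDS480G2G0B-00EPW0", and "WDC WDS240G2G0A-00JH30" all match, while "ST3000DM001-1CH166" does not. Whether a given drive really picks up the quirk is best confirmed from the quirks= line in dmesg.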
Disabling the hardware write cache works around the issue:

pkg install auto-admin
auto-write-cache-toggle off cli

which simply adds

# Added by auto-admin from cli
kern.cam.ada.write_cache=0
# End auto-admin addition

to /boot/loader.conf and reminds you to reboot.

Before disabling the cache, the system would crash within a minute of the onset of heavy disk writing. I've now been running most of the day with no issues.

Some relevant system info:

dmesg.boot:

ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WDS240G2G0A-00JH30 UF510000> ACS-2 ATA SATA 3.x device
ada0: Serial Number 203058803017
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 228944MB (468877312 512 byte sectors)
ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>

FreeBSD quagga.acadix bacon ~ 25: tunefs -p /
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

FreeBSD quagga.acadix bacon ~ 26: mount
/dev/ada0p2 on / (ufs, local, soft-updates)
^Triage: assign to committer who resolved.