Bug 277673

Summary: TRIM visibility bugs
Product: Base System Reporter: mgrooms
Component: kernAssignee: freebsd-scsi (Nobody) <scsi>
Status: Closed FIXED    
Severity: Affects Many People CC: imp, stephan
Priority: --- Keywords: cam
Version: 14.0-RELEASE   
Hardware: Any   
OS: Any   

Description mgrooms 2024-03-13 17:02:36 UTC
Hey All,

While toying with different storage options to use with bhyve, I've run across a few frustrations related to visibility into TRIM operations. Specifically, it's not always easy to tell if trim operations are being processed by a hardware device due to the counters not being updated by the SCSI da device. The counters exported as sysctl values are updated by the SCCI UNMAP method, but not by the TRIM or WD methods. You can see the BIO_DELETE opterations being processed in real time by running the gstatus -d command, but the counters never change from 0. This gives the false illusion that nothing is happening. With the following patch, I'm now able to see the counters reflect trim operations that are processed by da devices ...

mgrooms@mgrooms:~/trim $ cat scsi_da.diff
--- scsi_da.c.orig    2024-03-13 11:32:32.098922000 -0500
+++ scsi_da.c    2024-03-13 11:31:37.255187000 -0500
@@ -4197,6 +4197,9 @@
               da_default_timeout * 1000);
     ccb->ccb_h.ccb_state = DA_CCB_DELETE;
     ccb->ccb_h.flags |= CAM_UNLOCKED;
+    softc->trim_count++;
+    softc->trim_ranges += ranges;
+    softc->trim_lbas += block_count;
     cam_iosched_submit_trim(softc->cam_iosched);
 }
 
@@ -4257,6 +4260,8 @@
             da_default_timeout * 1000);
     ccb->ccb_h.ccb_state = DA_CCB_DELETE;
     ccb->ccb_h.flags |= CAM_UNLOCKED;
+    softc->trim_count++;
+    softc->trim_lbas += count;
     cam_iosched_submit_trim(softc->cam_iosched);
 }

Additionally, while attempting to test geom mirror+stripe to provide software RAID10, the diskinfo utility reports that a mirror supports UNMAP/TRIM when at least one underlying devices supports it, but a stripe does not. I added a small patch that attempts to use the same logic as mirror so that it will report that UNMAP/TRIM is supported when one of the underlying devices does ...

--- g_stripe.c.orig    2024-03-12 18:23:52.960025000 -0500
+++ g_stripe.c    2024-03-12 18:25:01.009378000 -0500
@@ -26,6 +26,7 @@
  * SUCH DAMAGE.
  */
 
+#include <sys/cdefs.h>
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/kernel.h>
@@ -568,7 +569,7 @@
     off_t offset, start, length, nstripe, stripesize;
     struct g_stripe_softc *sc;
     u_int no;
-    int error, fast = 0;
+    int error, fast = 0, val = 0;
 
     sc = bp->bio_to->geom->softc;
     /*
@@ -591,6 +592,12 @@
         g_stripe_pushdown(sc, bp);
         return;
     case BIO_GETATTR:
+        if (!strcmp(bp->bio_attribute, "GEOM::candelete")) {
+            if (sc->sc_flags & G_STRIPE_FLAG_CANDELETE)
+                val = 1;
+            g_handleattr(bp, "GEOM::candelete", &val, sizeof(val));
+            return;
+        }
         /* To which provider it should be delivered? */
     default:
         g_io_deliver(bp, EOPNOTSUPP);
@@ -745,7 +752,7 @@
 {
     struct g_consumer *cp, *fcp;
     struct g_geom *gp;
-    int error;
+    int error, i;
 
     g_topology_assert();
     /* Metadata corrupted? */
@@ -792,8 +799,19 @@
             goto fail;
         }
     }
-
     sc->sc_disks[no] = cp;
+
+    /* cascade candelete */
+    error = g_access(cp, 1, 0, 0);
+    if (error == 0)
+    {
+        error = g_getattr("GEOM::candelete", cp, &i);
+        if (error == 0 && i != 0)
+            sc->sc_flags |= G_STRIPE_FLAG_CANDELETE;
+        G_STRIPE_DEBUG(1, "Provider %s candelete %i.", pp->name, i);
+        g_access(cp, -1, 0, 0);
+    }
+
     G_STRIPE_DEBUG(0, "Disk %s attached to %s.", pp->name, sc->sc_name);
     g_stripe_check_and_run(sc);
 
--- g_stripe.h.orig    2024-03-12 18:24:00.960741000 -0500
+++ g_stripe.h    2024-03-12 12:25:22.842925000 -0500
@@ -47,6 +47,8 @@
 #define    G_STRIPE_TYPE_MANUAL    0
 #define    G_STRIPE_TYPE_AUTOMATIC    1
 
+#define    G_STRIPE_FLAG_CANDELETE        0x00000001UL
+
 #define    G_STRIPE_DEBUG(lvl, ...) \
     _GEOM_DEBUG("GEOM_STRIPE", g_stripe_debug, (lvl), NULL, __VA_ARGS__)
 #define    G_STRIPE_LOGREQ(bp, ...) \
@@ -61,6 +63,7 @@
     uint16_t     sc_ndisks;
     off_t         sc_stripesize;
     uint32_t     sc_stripebits;
+    uint32_t     sc_flags;
     struct mtx     sc_lock;
 };
 #define    sc_name    sc_geom->name
Comment 1 Warner Losh freebsd_committer freebsd_triage 2024-05-03 14:48:28 UTC
da part of this pushed. It was almost correct (I think it would be better to treat each WRITE SAME as 1 range so the stats are consistent with other methods).
Comment 2 commit-hook freebsd_committer freebsd_triage 2024-05-03 15:04:21 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ea2d874cca7cdfe6133c1835dadd8f0672723fa6

commit ea2d874cca7cdfe6133c1835dadd8f0672723fa6
Author:     Matthew Grooms <mgrooms@shrew.net>
AuthorDate: 2024-05-03 15:01:21 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-05-03 15:03:31 +0000

    geom_stripe: Cascade cantrim just like we do for gmirror

    If any of the disks can support trim, cascade that up the
    stack. Otherwise, trims won't pass through striped raid setups.

    PR: 277673
    Reviewed by: imp (minor style tweaks from bug report)

 sys/geom/stripe/g_stripe.c | 21 ++++++++++++++++++++-
 sys/geom/stripe/g_stripe.h |  3 +++
 2 files changed, 23 insertions(+), 1 deletion(-)
Comment 3 Warner Losh freebsd_committer freebsd_triage 2024-05-03 15:05:21 UTC
And the second one has landed. I've queued these for MFC next time I do a MFC run, but it likely won't be in 14.1 due to the lateness of the hour.
Comment 4 mgrooms 2024-05-03 15:36:46 UTC
I appreciate you looking at this Warner. I was concerned that second patch might be incorrect for the case that a disk with trim support was added to gstripe and later removed. The new behavior would report a false positive since the G_STRIPE_FLAG_CANDELETE value is irreversibly set to true when one or more disks that support TRIM are added to the stripe. I could rework the patch to be smarter for that case if that would be helpful.
Comment 5 Warner Losh freebsd_committer freebsd_triage 2024-05-04 14:23:41 UTC
(In reply to mgrooms from comment #4)
Yes. keeping a mask of disks that support it and/or recomputing the single bit when a disk is added / removed would be a bit better. Though the consequences of having it advertised when it isn't possible are low.
Comment 6 commit-hook freebsd_committer freebsd_triage 2024-05-20 20:07:09 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f65f02ccf2a5656e96b32705bad52b11fbc3177c

commit f65f02ccf2a5656e96b32705bad52b11fbc3177c
Author:     Matthew Grooms <mgrooms@shrew.net>
AuthorDate: 2024-05-03 15:01:21 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-05-20 19:23:40 +0000

    geom_stripe: Cascade cantrim just like we do for gmirror

    If any of the disks can support trim, cascade that up the
    stack. Otherwise, trims won't pass through striped raid setups.

    PR: 277673
    Reviewed by: imp (minor style tweaks from bug report)

    (cherry picked from commit ea2d874cca7cdfe6133c1835dadd8f0672723fa6)

 sys/geom/stripe/g_stripe.c | 21 ++++++++++++++++++++-
 sys/geom/stripe/g_stripe.h |  3 +++
 2 files changed, 23 insertions(+), 1 deletion(-)