Hey All, While toying with different storage options to use with bhyve, I've run across a few frustrations related to visibility into TRIM operations. Specifically, it's not always easy to tell if trim operations are being processed by a hardware device due to the counters not being updated by the SCSI da device. The counters exported as sysctl values are updated by the SCCI UNMAP method, but not by the TRIM or WD methods. You can see the BIO_DELETE opterations being processed in real time by running the gstatus -d command, but the counters never change from 0. This gives the false illusion that nothing is happening. With the following patch, I'm now able to see the counters reflect trim operations that are processed by da devices ... mgrooms@mgrooms:~/trim $ cat scsi_da.diff --- scsi_da.c.orig 2024-03-13 11:32:32.098922000 -0500 +++ scsi_da.c 2024-03-13 11:31:37.255187000 -0500 @@ -4197,6 +4197,9 @@ da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; + softc->trim_count++; + softc->trim_ranges += ranges; + softc->trim_lbas += block_count; cam_iosched_submit_trim(softc->cam_iosched); } @@ -4257,6 +4260,8 @@ da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; + softc->trim_count++; + softc->trim_lbas += count; cam_iosched_submit_trim(softc->cam_iosched); } Additionally, while attempting to test geom mirror+stripe to provide software RAID10, the diskinfo utility reports that a mirror supports UNMAP/TRIM when at least one underlying devices supports it, but a stripe does not. I added a small patch that attempts to use the same logic as mirror so that it will report that UNMAP/TRIM is supported when one of the underlying devices does ... --- g_stripe.c.orig 2024-03-12 18:23:52.960025000 -0500 +++ g_stripe.c 2024-03-12 18:25:01.009378000 -0500 @@ -26,6 +26,7 @@ * SUCH DAMAGE. */ +#include <sys/cdefs.h> #include <sys/param.h> #include <sys/systm.h> #include <sys/kernel.h> @@ -568,7 +569,7 @@ off_t offset, start, length, nstripe, stripesize; struct g_stripe_softc *sc; u_int no; - int error, fast = 0; + int error, fast = 0, val = 0; sc = bp->bio_to->geom->softc; /* @@ -591,6 +592,12 @@ g_stripe_pushdown(sc, bp); return; case BIO_GETATTR: + if (!strcmp(bp->bio_attribute, "GEOM::candelete")) { + if (sc->sc_flags & G_STRIPE_FLAG_CANDELETE) + val = 1; + g_handleattr(bp, "GEOM::candelete", &val, sizeof(val)); + return; + } /* To which provider it should be delivered? */ default: g_io_deliver(bp, EOPNOTSUPP); @@ -745,7 +752,7 @@ { struct g_consumer *cp, *fcp; struct g_geom *gp; - int error; + int error, i; g_topology_assert(); /* Metadata corrupted? */ @@ -792,8 +799,19 @@ goto fail; } } - sc->sc_disks[no] = cp; + + /* cascade candelete */ + error = g_access(cp, 1, 0, 0); + if (error == 0) + { + error = g_getattr("GEOM::candelete", cp, &i); + if (error == 0 && i != 0) + sc->sc_flags |= G_STRIPE_FLAG_CANDELETE; + G_STRIPE_DEBUG(1, "Provider %s candelete %i.", pp->name, i); + g_access(cp, -1, 0, 0); + } + G_STRIPE_DEBUG(0, "Disk %s attached to %s.", pp->name, sc->sc_name); g_stripe_check_and_run(sc); --- g_stripe.h.orig 2024-03-12 18:24:00.960741000 -0500 +++ g_stripe.h 2024-03-12 12:25:22.842925000 -0500 @@ -47,6 +47,8 @@ #define G_STRIPE_TYPE_MANUAL 0 #define G_STRIPE_TYPE_AUTOMATIC 1 +#define G_STRIPE_FLAG_CANDELETE 0x00000001UL + #define G_STRIPE_DEBUG(lvl, ...) \ _GEOM_DEBUG("GEOM_STRIPE", g_stripe_debug, (lvl), NULL, __VA_ARGS__) #define G_STRIPE_LOGREQ(bp, ...) \ @@ -61,6 +63,7 @@ uint16_t sc_ndisks; off_t sc_stripesize; uint32_t sc_stripebits; + uint32_t sc_flags; struct mtx sc_lock; }; #define sc_name sc_geom->name
da part of this pushed. It was almost correct (I think it would be better to treat each WRITE SAME as 1 range so the stats are consistent with other methods).
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=ea2d874cca7cdfe6133c1835dadd8f0672723fa6 commit ea2d874cca7cdfe6133c1835dadd8f0672723fa6 Author: Matthew Grooms <mgrooms@shrew.net> AuthorDate: 2024-05-03 15:01:21 +0000 Commit: Warner Losh <imp@FreeBSD.org> CommitDate: 2024-05-03 15:03:31 +0000 geom_stripe: Cascade cantrim just like we do for gmirror If any of the disks can support trim, cascade that up the stack. Otherwise, trims won't pass through striped raid setups. PR: 277673 Reviewed by: imp (minor style tweaks from bug report) sys/geom/stripe/g_stripe.c | 21 ++++++++++++++++++++- sys/geom/stripe/g_stripe.h | 3 +++ 2 files changed, 23 insertions(+), 1 deletion(-)
And the second one has landed. I've queued these for MFC next time I do a MFC run, but it likely won't be in 14.1 due to the lateness of the hour.
I appreciate you looking at this Warner. I was concerned that second patch might be incorrect for the case that a disk with trim support was added to gstripe and later removed. The new behavior would report a false positive since the G_STRIPE_FLAG_CANDELETE value is irreversibly set to true when one or more disks that support TRIM are added to the stripe. I could rework the patch to be smarter for that case if that would be helpful.
(In reply to mgrooms from comment #4) Yes. keeping a mask of disks that support it and/or recomputing the single bit when a disk is added / removed would be a bit better. Though the consequences of having it advertised when it isn't possible are low.
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=f65f02ccf2a5656e96b32705bad52b11fbc3177c commit f65f02ccf2a5656e96b32705bad52b11fbc3177c Author: Matthew Grooms <mgrooms@shrew.net> AuthorDate: 2024-05-03 15:01:21 +0000 Commit: Warner Losh <imp@FreeBSD.org> CommitDate: 2024-05-20 19:23:40 +0000 geom_stripe: Cascade cantrim just like we do for gmirror If any of the disks can support trim, cascade that up the stack. Otherwise, trims won't pass through striped raid setups. PR: 277673 Reviewed by: imp (minor style tweaks from bug report) (cherry picked from commit ea2d874cca7cdfe6133c1835dadd8f0672723fa6) sys/geom/stripe/g_stripe.c | 21 ++++++++++++++++++++- sys/geom/stripe/g_stripe.h | 3 +++ 2 files changed, 23 insertions(+), 1 deletion(-)