Bug 234705

Summary: zpool initialize does not BIO_DELETE
Product: Base System Reporter: Poul-Henning Kamp <phk>
Component: kernAssignee: Steven Hartland <smh>
Status: New ---    
Severity: Affects Only Me CC: imp, smh
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
bio_delete dtrace script none

Description Poul-Henning Kamp freebsd_committer freebsd_triage 2019-01-07 20:38:16 UTC
On devices which support BIO_DELETE, zpool initialize should use it.

Not obvious to me if the BIO_DELETE should replace the write or supplement it.
Comment 1 Conrad Meyer freebsd_committer freebsd_triage 2019-01-07 21:22:30 UTC
This could probably be implemented in vdev_initialize_ranges or vdev_initialize_thread using something like zio_trim (or otherwise invoking ZIO_TYPE_FREE); the 0xdeadbeef fill is likely useful for debugging, but probably not useful for end users.  It could default differently based on INVARIANTS.
Comment 2 Warner Losh freebsd_committer freebsd_triage 2019-01-08 03:36:42 UTC
These days, BIO_DELETE is almost always a win. Writing zeros is a big lose often (though if the lower-layers implement a special WRITE ZEROS command, there might be some cases where that's useful: we currently do not). I doubt BOTH are ever anything but extra work. I would argue, however, that we should think carefully here. The difference between DSM Deallocate and WRITE ZEROS + Deallocate flag (for those nvme drives that support it) is very small (trimmed blocks can return a variety of things depending: 0, ff or last written data)...
Comment 3 Warner Losh freebsd_committer freebsd_triage 2019-01-08 03:48:44 UTC
(In reply to Warner Losh from comment #2)
Which is a long way of saying BIO_DELETE is our deallocation abstraction, and we should use it, but it is (in FreeBSD) non-deterministic. We don't have a way to 'widen' the abstraction to one of the three possible values (0, f's or last) today. If ZFS cares what the data is, it should write the data. If it doesn't care, it should just use TRIM on devices that support it. I'm not sure how useful widening the abstraction would be, though, given the hair involved to make it work and the current lack of a real use case.
Comment 4 Steven Hartland freebsd_committer freebsd_triage 2019-01-08 07:43:54 UTC
This has been the case for a very long time check to see if you have trim on init sysctl set to zero.
Comment 5 Steven Hartland freebsd_committer freebsd_triage 2019-01-08 08:19:20 UTC
Back at a PC now so just to confirm the specific sysctl is:
vfs.zfs.vdev.trim_on_init
Comment 6 Steven Hartland freebsd_committer freebsd_triage 2019-01-08 10:39:40 UTC
Created attachment 200907 [details]
bio_delete dtrace script

Just checked on my 12.0-RELEASE box by creating an md backed pool while running the attached dtrace script and checking the output of:
sysctl kstat.zfs.misc.zio_trim

These show both BIO_DELETE's being processed and no unsupported increment so as far as I can tell vdev's are still being erased on init.

So I guess the question is what lead to come to the conclusion they aren't?
Comment 7 Poul-Henning Kamp freebsd_committer freebsd_triage 2019-04-09 19:23:17 UTC
Sorry for not noticing your reponse.

I was watching gstat() during zpool init, and it was doing bio_write rather than bio_delete.

Underlying media was SATA (or possibly SAS) disks.

I will try to reproduce when I have a chance.
Comment 8 Steven Hartland freebsd_committer freebsd_triage 2019-04-10 09:26:20 UTC
For sata you can check the sysctl's to see what delete method is enabled for each disk:
`sysctl kern.cam.ada |grep delete`