Bug 204641 - 10.2 UNMAP/TRIM not available on a zfs zpool that uses iSCSI disks, backed on a zpool file target
Status: Closed Works As Intended
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: 10.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Alexander Motin
Reported: 2015-11-17 20:33 UTC by Christopher Forgeron
Modified: 2016-05-02 09:05 UTC

Description Christopher Forgeron 2015-11-17 20:33:50 UTC
Consider this scenario:

Virtual FreeBSD Machine, with a zpool created out of iSCSI disks.
Physical FreeBSD Machine, with a zpool holding a sparse file that is the target for the iSCSI disk. 

This setup works when all machines run 10.1, but not when all machines run 10.2.

- The 10.2 machines are 10.2-p7 RELEASE, updated via freebsd-update, no customizations.
- The 10.1 machines are 10.1-p24 RELEASE, updated via freebsd-update, no customizations.
- iSCSI is all CAM iSCSI, not the old istgt platform. 
- The iSCSI Target is a sparse file, stored on a zpool (not a vdev Target)

The target machine is the same physical machine, with the same zpools - I either boot 10.1 or 10.2 for testing, and use the same zpool/disks to ensure nothing is changing. 

If I have a 10.2 iSCSI Initiator (client) connected to a 10.2 iSCSI Target, TRIM doesn't work (shows as NONE below).
If I have a 10.2 iSCSI Initiator (client) connected to a 10.1 iSCSI Target, TRIM does work. 

(There is another bug with that last scenario as well, but I will open it separately)

...for clarity, a 10.1 iSCSI Initiator connected to a 10.1 iSCSI Target also works perfectly. I have ~20 of these in the field. 

On the 10.1 / 10.2 Targets, the ctl.conf file is identical. Zpools are identical, because they are shared between reboots of the same iSCSI target machine. 



On the 10.2 initiator machine, connected to a 10.2 Target machine:

# sysctl -a | grep cam.da

kern.cam.da.2.minimum_cmd_size: 6
kern.cam.da.2.delete_max: 131072
kern.cam.da.2.delete_method: NONE
kern.cam.da.1.error_inject: 0
kern.cam.da.1.sort_io_queue: 0
kern.cam.da.1.minimum_cmd_size: 6
kern.cam.da.1.delete_max: 131072
kern.cam.da.1.delete_method: NONE
kern.cam.da.0.error_inject: 0
kern.cam.da.0.sort_io_queue: -1
kern.cam.da.0.minimum_cmd_size: 6
kern.cam.da.0.delete_max: 131072
kern.cam.da.0.delete_method: NONE

Note the delete_method is NONE
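
When several initiators need checking, the per-disk delete method can be pulled out of the sysctl output programmatically. The following is a minimal Python sketch, assuming the output format shown above. The function name delete_methods is hypothetical, and the sample text is copied from this report rather than read from a live system.

```python
import re

# Matches lines such as "kern.cam.da.2.delete_method: NONE".
_DELETE_METHOD = re.compile(r"^kern\.cam\.da\.(\d+)\.delete_method:\s*(\S+)$")


def delete_methods(sysctl_text):
    """Map da unit number -> delete method from `sysctl kern.cam.da` style text."""
    methods = {}
    for line in sysctl_text.splitlines():
        match = _DELETE_METHOD.match(line.strip())
        if match:
            methods[int(match.group(1))] = match.group(2)
    return methods


# Sample lines taken from the 10.2 initiator output in this report.
SAMPLE = """\
kern.cam.da.2.minimum_cmd_size: 6
kern.cam.da.2.delete_max: 131072
kern.cam.da.2.delete_method: NONE
kern.cam.da.1.delete_max: 131072
kern.cam.da.1.delete_method: NONE
"""

if __name__ == "__main__":
    for unit, method in sorted(delete_methods(SAMPLE).items()):
        print(f"da{unit}: {method}")
```

On a real initiator the text would come from running sysctl kern.cam.da and capturing its output. That step is left out here so the sketch stays runnable anywhere.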


# sysctl -a | grep trim
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 10000
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
kstat.zfs.misc.zio_trim.failed: 0
kstat.zfs.misc.zio_trim.unsupported: 181
kstat.zfs.misc.zio_trim.success: 0
kstat.zfs.misc.zio_trim.bytes: 0

Note no trimmed bytes. 
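
The same symptom can be detected from the ZFS TRIM counters. Below is a small Python sketch, assuming the kstat.zfs.misc.zio_trim.* lines appear exactly as printed above. The helper names trim_counters and trim_looks_unsupported are made up for illustration, and the rule "some requests unsupported, none successful" is only a heuristic, not something ZFS itself reports.

```python
PREFIX = "kstat.zfs.misc.zio_trim."


def trim_counters(sysctl_text):
    """Return the kstat.zfs.misc.zio_trim.* counters as a dict of ints."""
    counters = {}
    for line in sysctl_text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if sep and name.startswith(PREFIX):
            counters[name[len(PREFIX):]] = int(value)
    return counters


def trim_looks_unsupported(counters):
    """Heuristic: at least one request was counted as unsupported and none succeeded."""
    return counters.get("unsupported", 0) > 0 and counters.get("success", 0) == 0


# Counter values copied from the output in this report.
SAMPLE = """\
vfs.zfs.trim.enabled: 1
kstat.zfs.misc.zio_trim.failed: 0
kstat.zfs.misc.zio_trim.unsupported: 181
kstat.zfs.misc.zio_trim.success: 0
kstat.zfs.misc.zio_trim.bytes: 0
"""

if __name__ == "__main__":
    counters = trim_counters(SAMPLE)
    print(counters)
    print("TRIM appears unsupported:", trim_looks_unsupported(counters))
```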


On the target machine, 10.1 and 10.2 share the same config file: /etc/ctl.conf

portal-group pg0 {
        discovery-auth-group no-authentication
        listen 0.0.0.0
        listen [::]
}

        lun 0 {
                path /pool92/iscsi/iscsi.zvol
                blocksize 4K
                size 5T
                option unmap "on"
                option scsiname "pool92"
                option vendor "pool92"
                option insecure_tpc "on"
        }
}


target iqn.iscsi1.zvol {
        auth-group no-authentication
        portal-group pg0

        lun 0 {
                path /pool92_1/iscsi/iscsi.zvol
                blocksize 4K
                size 5T
                option unmap "on"
                option scsiname "pool92_1"
                option vendor "pool92_1"
                option insecure_tpc "on"
        }
}


When I boot a 10.1 Target server, the 10.2 initiator connects, and we do see proper UNMAP ability:


kern.cam.da.2.minimum_cmd_size: 6
kern.cam.da.2.delete_max: 5497558138880
kern.cam.da.2.delete_method: UNMAP
kern.cam.da.1.error_inject: 0
kern.cam.da.1.sort_io_queue: 0
kern.cam.da.1.minimum_cmd_size: 6
kern.cam.da.1.delete_max: 5497558138880
kern.cam.da.1.delete_method: UNMAP
kern.cam.da.0.error_inject: 0
kern.cam.da.0.sort_io_queue: -1
kern.cam.da.0.minimum_cmd_size: 6
kern.cam.da.0.delete_max: 131072
kern.cam.da.0.delete_method: NONE


Please let me know what you'd like to know next. 

Thanks.
Comment 1 Alexander Motin freebsd_committer 2015-11-18 08:15:18 UTC
Are your iSCSI LUNs backed by regular files or ZVOLs? I am asking because FreeBSD simply has no API to punch a hole in an existing file, which is why UNMAP cannot work for file-backed LUNs and is automatically disabled for them. CTL supports UNMAP for all other cases: ZVOLs in both device and file modes, and raw devices.
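
The distinction drawn in this comment (regular file versus device node) can be checked for a given backing path. The Python sketch below only inspects the file type of the path with os.stat; it does not query CTL, and the function name backing_kind is hypothetical. Reading the result as "file means no UNMAP on 10.2" is an interpretation of this comment, not an API guarantee.

```python
import os
import stat
import tempfile


def backing_kind(path):
    """Classify a LUN backing path as 'file', 'device', or 'other'."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "file"      # regular file: per this comment, UNMAP is disabled
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        return "device"    # device node, e.g. a ZVOL or raw disk
    return "other"         # directory, socket, and so on


if __name__ == "__main__":
    # Demonstrate with a temporary regular file instead of a real LUN path.
    with tempfile.NamedTemporaryFile() as handle:
        print(handle.name, "->", backing_kind(handle.name))
```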
Comment 2 Edward Tomasz Napierala freebsd_committer 2015-11-18 12:02:22 UTC
Hm, but then why did ZFS report it as working with 10.1 target?
Comment 3 Alexander Motin freebsd_committer 2015-11-18 12:22:32 UTC
(In reply to Edward Tomasz Napierala from comment #2)
Probably because of the "unmap" option set in the config, while CTL in 10.1 was dumb enough to let the user shoot himself in the foot. Present CTL allows disabling UNMAP when it is supported, but not enabling it when it is not supported.
Comment 4 Christopher Forgeron 2015-11-18 13:42:12 UTC
Very interesting:

1) Yes, it's file based iSCSI, not zvol

2) I enabled zle compression on the Target's zfs dataset, so the behaviour in 10.1 was that it tried UNMAP, then BIO_DELETE, then finally went to ZERO as the delete method. Since ZLE was on, writing all zeros essentially became an UNMAP.

So what you are saying is that in 10.1, I was enabling UNMAP, and ctl wasn't checking to make sure it was really enabled, thus I was able to succeed?

I thought the old behaviour was rather handy. I was able to use file based iSCSI (long story why I needed this over zvol), make sparse files with 'truncate', and the files stayed sparse as they were used.
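
For reference, the sparse-file creation described here can be sketched in Python. This assumes a POSIX system where st_blocks is reported in 512-byte units; the helper names make_sparse_file and allocated_bytes are hypothetical. Whether the file really stays sparse depends on the underlying filesystem, so the sketch prints both sizes rather than promising a result.

```python
import os
import tempfile


def make_sparse_file(path, size_bytes):
    """Create a new file of size_bytes without writing data.

    An existing file at the same path is overwritten.
    """
    with open(path, "wb") as handle:
        handle.truncate(size_bytes)


def allocated_bytes(path):
    """Bytes actually allocated on disk (st_blocks is in 512-byte units)."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "lun.img")  # illustrative file name
        make_sparse_file(target, 1 << 20)      # 1 MiB apparent size
        print("apparent size:", os.path.getsize(target))
        print("allocated    :", allocated_bytes(target))
```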
Comment 5 Alexander Motin freebsd_committer 2015-11-18 14:17:14 UTC
(In reply to Christopher Forgeron from comment #4)
Writing zeroes to implement UNMAP when ZFS has compression enabled is possible, but CTL has never supported doing it this way. At the same time, CTL supports WRITE SAME commands, so if the initiator is clever enough, it may be configured to use them. FreeBSD's da driver more or less supports that, but the functionality was close to broken until r289146 a month ago. After that change it is possible to set sysctl kern.cam.da.0.delete_method="ZERO", and the initiator will convert all BIO_DELETEs into WRITE SAME with a zero buffer.
Comment 6 Christopher Forgeron 2015-11-18 15:57:34 UTC
Thanks for the prompt responses.

# sysctl kern.cam.da.1.delete_method="ZERO"
kern.cam.da.1.delete_method: NONE
sysctl: kern.cam.da.1.delete_method=ZERO: Invalid argument


I can't seem to set this. I will try /boot/loader.conf, but since it isn't reported as a loader variable, I suspect that won't work either.

I've tried on both the Target and the Initiator (so one is an iSCSI disk)

Is there something I'm missing?
Comment 7 Christopher Forgeron 2015-11-18 17:31:05 UTC
Just wanted to confirm that it looks to work properly when the iSCSI target is a zvol, not a file.
Comment 8 Nick Wolff 2016-04-30 19:52:05 UTC
I'm seeing this on a 10.3 (r297687) system writing to iSCSI backed by a zvol on a 10.1 (r274348) system. There is also a geli layer on the iSCSI disk. The stats below were taken after a zfs destroy of a 100G+ dataset. The backing zvol did not shrink. If I need to open a new ticket I will, but I am concerned this is related to this ticket.

sysctl kern.cam.da.5
kern.cam.da.5.error_inject: 0
kern.cam.da.5.sort_io_queue: 0
kern.cam.da.5.minimum_cmd_size: 6
kern.cam.da.5.delete_max: 1099511627776
kern.cam.da.5.delete_method: UNMAP

sysctl -a|grep trim
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 10000
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
kstat.zfs.misc.zio_trim.failed: 0
kstat.zfs.misc.zio_trim.unsupported: 2768
kstat.zfs.misc.zio_trim.success: 0
kstat.zfs.misc.zio_trim.bytes: 0
Comment 9 Nick Wolff 2016-04-30 20:44:23 UTC
Never mind, my issue appears to be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198863

I thought geli trim passthrough was working in stable.