vn_generic_copy_file_range can copy both data and holes. However, it copies holes much more slowly than theoretically possible. In my experiments using NFS 4.2 with FreeBSD 13.0-RELEASE servers and clients and ZFS local file systems, I can only achieve about 300 MBps when copying a file from NFS to local disk, when that file consists of a single giant hole. In theory, the maximum speed should be infinite, given a sufficiently "large" sparse file. Steps to Reproduce ================== 1) Mount a ZFS-backed file system using nfsv4 with minorversion=2 2) On the server, create a giant sparse file with "truncate -s 1t sparsefile" 3) On the client, copy it to local storage with "cp sparsefile /tmp" Analysis ======== dtrace shows that vn_rdwr gets called repeatedly with len=64k. Basically, it extends the file by 64kB at a time. That's due to the "xfer = blksize" assignment in vn_generic_copy_file_range. vn_generic_copy_file_range proceeds to loop xfer bytes at a time. Instead, it should write the entire hole at once. Note that this only happens when the hole extends to EOF. If instead there is a small data region at EOF, then vn_generic_copy_file_range does indeed handle the hole efficiently.
Would this deal with: https://cgit.freebsd.org/src/commit/?id=19fe23fa2bd52d6a42fb408d21b9d49c4bee81ef Titled: "Make vn_generic_copy_file_range() interruptible via a signal." QUOTE This patch adds checks for signals that need to be processed after each successful data copy cycle. END QUOTE Might making the "data copy cycle" huge in some contexts reintroduce such problems by making the sig_intr() calls too infrequent? In other words: might vn_rdwr(. . .) and/or vn_write_outvp(. . .) sometimes take too long relative to checking sig_intr() frequently enough? There is also the cantseek related mem_iszero(dat,xfer) if xfer were huge. Note: I've not done a deep analyzis. I'm just asking based on what I see in the vn_generic_copy_file_range(. . .) code.
Created attachment 224599 [details] add support for a hole to eof to vn_generic_copy_range() You can try this patch. I cannot verify if it helps, because I only use UFS I left assorted debug printfs in it.
vn_generic_copy_file_range() was coded the way it is, since UFS (the only file system type I use/test on) always allocates a data block (all 0 bytes) at the end of the file. --> Never has a hole extending to EOF. I've attached a patch that tries to handle the hole to EOF case. Maybe you can test it? Btw, the data size in the copy loop is pegged at whatever the file system uses as a block size, capped at 1Mbyte. This implies that interrupting signals will be handled quickly. --> Recall that read(2)/write(2) ignore signals, except for the weird NFS "intr" mount option case.
Works for me! I just have two comments about the patch: * The (error == 0) check at line 3118 can be removed if you just move VOP_GETATTR two lines lower. * Assigning false to holetoeof doesn't need to be done inside the loop. You can initialize it in its declaration.
Created attachment 224603 [details] add support for a hole to eof to vn_generic_copy_range() Update patch as suggested by asomers@. Moving the VOP_GETATTR() is actually a fix, since "goto out;" assumes an unlocked invp. It also is slightly optimized, since the inva are only needed if holein > 0. I moved the initialization of holetoeof to above the loop, since it easier to spot there (and I thoght style(9) frowned on initializations in declarations). No semntics change unless VOP_GETATTR() fails for some reason.
Latest patch LGTM.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=4f592683c356379c5bac56b52807ed4ad54ee647 commit 4f592683c356379c5bac56b52807ed4ad54ee647 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-05-02 23:04:27 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-05-02 23:04:27 +0000 copy_file_range(2): improve copying of a large hole to EOF PR#255523 reported that a file copy for a file with a large hole to EOF on ZFS ran slowly over NFSv4.2. The problem was that vn_generic_copy_file_range() would loop around reading the hole's data and then see it is all 0s. It was coded this way since UFS always allocates a data block near the end of the file, such that a hole to EOF never exists. This patch modifies vn_generic_copy_file_range() to check for a ENXIO returned from VOP_IOCTL(..FIOSEEKDATA..) and handle that case as a hole to EOF. asomers@ confirms that it works for his ZFS test case. PR: 255523 Tested by: asomers Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30076 sys/kern/vfs_vnops.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-)
Patch has been tested by asomers@ and committed to main. It will be MFC'd in 2 weeks.
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=cd192185662ef90f52614ea17cb739f5047d5e0c commit cd192185662ef90f52614ea17cb739f5047d5e0c Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2021-05-02 23:04:27 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2021-05-16 23:48:06 +0000 copy_file_range(2): improve copying of a large hole to EOF PR#255523 reported that a file copy for a file with a large hole to EOF on ZFS ran slowly over NFSv4.2. The problem was that vn_generic_copy_file_range() would loop around reading the hole's data and then see it is all 0s. It was coded this way since UFS always allocates a data block near the end of the file, such that a hole to EOF never exists. This patch modifies vn_generic_copy_file_range() to check for a ENXIO returned from VOP_IOCTL(..FIOSEEKDATA..) and handle that case as a hole to EOF. asomers@ confirms that it works for his ZFS test case. PR: 255523 (cherry picked from commit 4f592683c356379c5bac56b52807ed4ad54ee647) sys/kern/vfs_vnops.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-)
Patch has been MFC'd.