Stress test with "sendfile(2) && block size > page size": panic: vnode_pager_generic_getpages: sector size 8192 too large cpuid = 8 time = 1550044270 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b6976050 vpanic() at vpanic+0x1b4/frame 0xfffffe00b69760b0 panic() at panic+0x43/frame 0xfffffe00b6976110 vnode_pager_generic_getpages() at vnode_pager_generic_getpages+0xf58/frame 0xfffffe00b6976240 vnode_pager_local_getpages_async() at vnode_pager_local_getpages_async+0x4e/frame 0xfffffe00b6976260 VOP_GETPAGES_ASYNC_APV() at VOP_GETPAGES_ASYNC_APV+0x1f5/frame 0xfffffe00b69762d0 VOP_GETPAGES_ASYNC() at VOP_GETPAGES_ASYNC+0x79/frame 0xfffffe00b6976360 vnode_pager_getpages_async() at vnode_pager_getpages_async+0x78/frame 0xfffffe00b69763c0 sendfile_swapin() at sendfile_swapin+0x594/frame 0xfffffe00b69764d0 vn_sendfile() at vn_sendfile+0x995/frame 0xfffffe00b6976800 fo_sendfile() at fo_sendfile+0x8e/frame 0xfffffe00b69768a0 sendfile() at sendfile+0x1f2/frame 0xfffffe00b6976960 sys_sendfile() at sys_sendfile+0x24/frame 0xfffffe00b6976990 amd64_syscall() at amd64_syscall+0x291/frame 0xfffffe00b6976ab0 https://people.freebsd.org/~pho/stress/log/sendfile11.txt
It looks like FFS is still using the generic pager for its getpages_async VNOP, which doesn't work with block sizes larger than the system page size. The synchronous variant uses vfs_bio_getpages() to handle larger blocksizes, but since that iteratively calls bread() it is a fundamentally synchronous operation. I need to do more digging, but right now it seems like trying to implement an asynchronous-ish variant of vfs_bio_getpages() might be too much work for little real benefit in a not-common use case. It seems like it might be acceptable to implement an ffs_getpages_async() which still uses the asynchronous generic pager when possible, but if blocksize > pagesize will fall back to synchronous vfs_bio_getpages() and then immediately call iodone. In fact, vop_stdgetpages_async() is implemented as a call to synchronous VOP_GETPAGES() followed by iodone. This would allow sendfile(2) to work with larger blocksizes while still preserving the performance benefit of asynchronous completion for the majority (blocksize <= pagesize) case.
Created attachment 202317 [details] proposed patch
(In reply to Jason A. Harmening from comment #2) It fixed the problem seen with the test. I ran all of the stress2 tests on amd64 with this patch without finding any problems.
A commit references this bug: Author: jah Date: Tue Feb 26 04:56:10 UTC 2019 New revision: 344562 URL: https://svnweb.freebsd.org/changeset/base/344562 Log: FFS: allow sendfile(2) to work with block sizes greater than the page size Implement ffs_getpages_async(), which when possible calls the asynchronous flavor of the generic pager's getpages function. When the underlying block size is larger than the system page size, however, it will invoke the (synchronous) buffer cache pager, followed by a call to the client completion routine. This retains true asynchronous completion in the most common (block size <= page size) case, which is important for the performance of the new sendfile(2). The behavior in the larger block size case mirrors the default implementation of VOP_GETPAGES_ASYNC, which most other filesystems use anyway as they do not override the getpages_async method. PR: 235708 Reported by: pho Reviewed by: kib, glebius MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D19340 Changes: head/sys/ufs/ffs/ffs_vnops.c
A commit references this bug: Author: jah Date: Sun Mar 17 06:05:19 UTC 2019 New revision: 345240 URL: https://svnweb.freebsd.org/changeset/base/345240 Log: MFC r344562: FFS: allow sendfile(2) to work with block sizes greater than the page size Implement ffs_getpages_async(), which when possible calls the asynchronous flavor of the generic pager's getpages function. When the underlying block size is larger than the system page size, however, it will invoke the (synchronous) buffer cache pager, followed by a call to the client completion routine. This retains true asynchronous completion in the most common (block size <= page size) case, which is important for the performance of the new sendfile(2). The behavior in the larger block size case mirrors the default implementation of VOP_GETPAGES_ASYNC, which most other filesystems use anyway as they do not override the getpages_async method. PR: 235708 Changes: _U stable/12/ stable/12/sys/ufs/ffs/ffs_vnops.c