The below is a copy of, for example:

https://lists.freebsd.org/archives/freebsd-arm/2025-November/005408.html

Because my normal environments are based on main [so: 16], the examples are from various vintages of 16. The main builds are from official pkgbase distributions, not personal builds. I do use the non-debug kernel variant.

On a real armv7 system, booted from an armv7 kernel, the example fio command has worked correctly so far. But moving that same USB3 capable media to an aarch64 system, mounting it, and chrooting into it to run the test in the armv7 chroot fails. Both the RPi5 and the Windows Dev Kit 2023 that I tested show the issue.

A lib32 example on aarch64:

# /usr/obj/DESTDIRs/main-armv7-chroot-ports-main/usr/local/bin/fio --name=random_rw_test --filename=./testfile1 --rw=randrw --bs=128k \
    --ioengine=posixaio --iodepth=256 --numjobs=4 --runtime=120 --time_based \
    --group_reporting --direct=1 --size=1G
random_rw_test: (g=0): rw=randrw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=posixaio, iodepth=256
...
fio-3.40
Starting 4 processes
fio: io_u error on file ./testfile1: File too large: write offset=772538368, buflen=131072
. . . (lots of the above)
fio: io_u error on file ./testfile1: Bad address: write offset=524288, buflen=131072
. . . (more "File too large: write offset" notices)
fio: pid=2078, err=27/file:io_u.c:1982, func=io_u error, error=File too large
. . . (more "File too large: write offset" notices)
fio: pid=2077, err=27/file:io_u.c:1982, func=io_u error, error=File too large
. . . (more "File too large: write offset" notices)
fio: pid=2079, err=27/file:io_u.c:1982, func=io_u error, error=File too large

random_rw_test: (groupid=0, jobs=4): err=27 (file:io_u.c:1982, func=io_u error, error=File too large): pid=2077: Tue Nov 11 13:31:32 2025
  lat (usec)   : 500=1.17%, 750=0.68%, 1000=1.66%
  lat (msec)   : 2=3.71%, 4=5.27%, 20=3.22%, 50=32.81%, 2000=0.59%
  cpu          : usr=0.19%, sys=0.64%, ctx=85, majf=3, minf=56
  IO depths    : 1=0.4%, 2=0.8%, 4=1.6%, 8=3.1%, 16=6.2%, 32=12.5%, >=64=75.4%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=99.3%, 8=0.1%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.4%
     issued rwts: total=503,521,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):

Running the same test from inside an armv7 chroot (with a plain fio command instead of the full path) looks the same.
Another thing that all the aarch64 and armv7 testing here has in common: UFS instead of ZFS.
Could you provide a truss log on both armv7 and aarch64? Ideally run with one thread so it's easier to read.
(In reply to Robert Clausecker from comment #2) It will be some time before I have anything useful.
(In reply to Robert Clausecker from comment #2)

Truss greatly changes the fio test behavior, eventually overflowing a queue and doing assert(0):

Assertion failed: (0), function io_u_qpush, file ./io_u_queue.h, line 37.
48748 100340: write(2,"Assertion failed: (0), function "...,74) = 74 (0x4a)

I've worked some on finding smaller test runs that generate a subset of the errors, such as:

48869 100333: aio_write({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.883112608 }) = 0 (0x0)
48869 100333: aio_error({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 36 (0x24)
48869 100333: aio_suspend([{ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }],1,0x0) = 0 (0x0)
48869 100333: aio_error({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 14 (0xe)
48869 100333: clock_gettime(4,{ 94521.888278246 }) = 0 (0x0)
48869 100333: issetugid() = 0 (0x0)
48869 100333: mmap(0x0,28672,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 691605504 (0x29391000)
48869 100333: fstatat(AT_FDCWD,"/usr/share/nls/C/libc.cat",0xffff9590,0x0) ERR#2 'No such file or directory'
48869 100333: fstatat(AT_FDCWD,"/usr/share/nls/libc/C",0xffff9590,0x0) ERR#2 'No such file or directory'
48869 100333: fstatat(AT_FDCWD,"/usr/local/share/nls/C/libc.cat",0xffff9590,0x0) ERR#2 'No such file or directory'
48869 100333: fstatat(AT_FDCWD,"/usr/local/share/nls/libc/C",0xffff9590,0x0) ERR#2 'No such file or directory'
fio: io_u error on file ./testfile1: Bad address: write offset=524288, buflen=131072
48869 100333: write(2,"fio: io_u error on file ./testfi"...,85) = 85 (0x55)
48869 100333: clock_gettime(4,{ 94521.888931965 }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.888980765 }) = 0 (0x0)
48869 100333: getrusage(RUSAGE_THREAD,{ u=0.000000,s=0.001056,in=0,out=0 }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.889085707 }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.889142580 }) = 0 (0x0)
fio: pid=48869, err=14/file:io_u.c:1982, func=io_u error, error=Bad address
48869 100333: write(1,"fio: pid=48869, err=14/file:io_u"...,76) = 76 (0x4c)
48869 100333: close(6) = 0 (0x0)
48869 100333: _exit(0xe)
48869 100333: process exit, rval = 14
48868 100393: nanosleep({ 0.010000000 }) = 0 (0x0)
(In reply to Mark Millard from comment #4) Surprising! I don't see any call returning EFAULT, so I wonder where it gets that from.
(In reply to Robert Clausecker from comment #5)

Well, compare:

/usr/include/sys/errno.h:#define EFAULT 14 /* Bad address */

vs.:

48869 100333: aio_error({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 14 (0xe)

Given that the prior sequence followed from an aio_write and the man page for aio_error reports:

RETURN VALUES
     If the asynchronous I/O request has completed successfully, aio_error() returns 0. If the request has not yet completed, EINPROGRESS is returned. If the request has completed unsuccessfully the error status is returned as described in read(2), readv(2), write(2), writev(2), or fsync(2).

I see for write:

     [EFAULT] Part of iov or data to be written to the file points outside the process's allocated address space.

So it would seem to be some sort of mishandling of the iov or data such that the aarch64 kernel ends up being told to reference outside the process's allocated address space. Not that I have any clue how to get evidence of the details.
(In reply to Mark Millard from comment #6) Hm this could very well be. aio is a sufficiently rare interface that it could just be poorly tested.
kgdb on armv7 for struct aiocb offsets vs. kgdb on aarch64 for struct aiocb32 offsets (through the first few differences):

kgdb on armv7 for struct aiocb offsets:

(kgdb) ptype /o *(struct aiocb*)0
/* offset | size */ type = struct aiocb {
/*  0 | 4 */ int aio_fildes;
/* XXX 4-byte hole */
/*  8 | 8 */ off_t aio_offset;
/* 16 | 4 */ volatile void *aio_buf;
/* 20 | 4 */ size_t aio_nbytes;
. . .
/* total size (bytes): 104 */

vs.:

kgdb on aarch64 for struct aiocb32 offsets:

(kgdb) ptype /o *(struct aiocb32*)0
/* offset | size */ type = struct aiocb32 {
/*  0 | 4 */ int32_t aio_fildes;
/*  4 | 8 */ uint64_t aio_offset;
/* 12 | 4 */ uint32_t aio_buf;
/* 16 | 4 */ uint32_t aio_nbytes;
. . .
/* total size (bytes): 96 */
As the native armv7 struct aiocb vs. aarch64 struct aiocb32 mismatch is very old, not just specific to 16 or CURRENT, I've changed the settings below as indicated:

Version: Unspecified (i.e., here, all available to select)
Hardware: arm64
Importance: Affects Many People

(That last is just for accuracy; my understanding is that the field is basically officially ignored.)

Side note: It may be that things are okay for i386 on amd64 as they are, and that more differentiation of the two 32-bit contexts is needed for both to work on the matching 64-bit systems that also support lib32 and the related chroots/jails with 32-bit worlds.
(In reply to Mark Millard from comment #8)

I cannot seem to type what I actually selected: I should have typed Some, not Many.

Importance: Affects Some People
```
#ifdef COMPAT_FREEBSD6
typedef struct oaiocb32 {
	int	aio_fildes;		/* File descriptor */
	uint64_t aio_offset __packed;	/* File offset for I/O */
	uint32_t aio_buf;		/* I/O buffer in process space */
	uint32_t aio_nbytes;		/* Number of bytes for I/O */
	struct	osigevent32 aio_sigevent; /* Signal to deliver */
	int	aio_lio_opcode;		/* LIO opcode */
	int	aio_reqprio;		/* Request priority -- ignored */
	struct	__aiocb_private32 _aiocb_private;
} oaiocb32_t;
#endif

typedef struct aiocb32 {
	int32_t	aio_fildes;		/* File descriptor */
	uint64_t aio_offset __packed;	/* File offset for I/O */
	uint32_t aio_buf;		/* I/O buffer in process space */
	uint32_t aio_nbytes;		/* Number of bytes for I/O */
	int	__spare__[2];
	uint32_t __spare2__;
	int	aio_lio_opcode;		/* LIO opcode */
	int	aio_reqprio;		/* Request priority -- ignored */
	struct	__aiocb_private32 _aiocb_private;
	struct	sigevent32 aio_sigevent; /* Signal to deliver */
} aiocb32_t;
```

The __packed on the fields means the uint64_t has alignment 1, not 8 as should be the case for the ABI. It is likely there to bodge the structure layout for freebsd32 on amd64, as i386's ABI only uses 4-byte alignment for 8-byte long long.
Created attachment 265404 [details] proposed patch Here's a proposed patch discussed with jrtc27. Please apply it and let me know if that helps. Unfortunately I can't reboot my aarch64 box right now (it's building ports...) so I'll have to rely on you to check.
(In reply to Robert Clausecker from comment #12) I have tested the patch and can confirm that it works. No change on amd64 (test case works before and after in i386 jail). Fixes test case on aarch64 (test case fails before and works after in armv7 jail). The bug fix modifies the layout of struct oaiocb32 and struct aiocb32 in sys/kern/vfs_aio.c when building with freebsd32 support on !amd64. These structs are local to the file and only used there for freebsd32 support. There should thus be no impact on other parts of the kernel. Please MFC into stable/15 and releng/15.0 if possible.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f0af21824331648a41b4e5d3323bea9216bcb7e2

commit f0af21824331648a41b4e5d3323bea9216bcb7e2
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-11-14 00:56:12 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment. This causes the structure to mismatch on
    armv7 binaries running under aarch64, breaking the aio interface.

    Fixes: 3858a1f4f501d00000447309aae14029f8133946
    Approved by: markj (mentor)
    Reported by: Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR: 290962
    MFC after: immediately (for 15.0)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
A commit in branch stable/15 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ea26fd52a949116d03f59066d364eee2af6c9f51

commit ea26fd52a949116d03f59066d364eee2af6c9f51
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-11-14 01:03:44 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment. This causes the structure to mismatch on
    armv7 binaries running under aarch64, breaking the aio interface.

    Fixes: 3858a1f4f501d00000447309aae14029f8133946
    Approved by: markj (mentor)
    Reported by: Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR: 290962
    MFC after: immediately (for 15.0)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=498a4c7f48a33c60b1768b8a6daf519bad84e18d

commit 498a4c7f48a33c60b1768b8a6daf519bad84e18d
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-11-14 01:04:53 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment. This causes the structure to mismatch on
    armv7 binaries running under aarch64, breaking the aio interface.

    Fixes: 3858a1f4f501d00000447309aae14029f8133946
    Approved by: markj (mentor)
    Reported by: Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR: 290962
    MFC after: immediately (for 15.0)

    (cherry picked from commit f0af21824331648a41b4e5d3323bea9216bcb7e2)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Should be fixed now. Let's hope this gets added to 15.0.
A commit in branch releng/15.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=cb9075718bc66844c6d25fed1df18de61af5267b

commit cb9075718bc66844c6d25fed1df18de61af5267b
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Colin Percival <cperciva@FreeBSD.org>
CommitDate: 2025-11-14 02:27:28 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment. This causes the structure to mismatch on
    armv7 binaries running under aarch64, breaking the aio interface.

    Approved by: re (cperciva)
    Fixes: 3858a1f4f501d00000447309aae14029f8133946
    Approved by: markj (mentor)
    Reported by: Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR: 290962
    MFC after: immediately (for 15.0)

    (cherry picked from commit ea26fd52a949116d03f59066d364eee2af6c9f51)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
(In reply to commit-hook from comment #16)

For stable/14, the update could change the status of powerpc64 support for 32-bit powerpc programs. 32-bit powerpc has tier 2 status for 14.* and before but is not supported for 15.* and later.

However, I've no clue whether aio is working or failing for 14.3 and 13.5 32-bit powerpc code on powerpc64 systems. I've not had an operational powerpc context (32-bit or 64-bit) in years. As I remember, the little-endian powerpc64 systems do not support 32-bit in the hardware. The old PowerMac G5s were 64-bit and did support it.

I'll send a note to Justin Hibbits (with you cc'd) asking if he knows the status.
(In reply to Jessica Clarke from comment #11)
> The __packed on the fields means the uint64_t has alignment 1 not 8 as should
> be the case for the ABI. It is likely there to bodge the structure layout
> for freebsd32 on amd64, as i386's ABI only uses 4-byte alignment for 8-byte
> long long.

This diagnosis seems to be correct, since the commit referenced in "Fixes" had the structure in a COMPAT_IA32 block. Thus the "Fixes" reference in the recent commits is wrong.
(In reply to Gunther Nikl from comment #20)

The __packed attribute is there to make the struct conform to the i386 alignment when building the kernel for amd64. No such alignment difference exists on the other platform pairs with freebsd32 support (that is, powerpc64/powerpc and aarch64/armv7), so the __packed attribute is wrong there. This is what my commit fixes.

The linked commit (base 3858a1f4f501d00000447309aae14029f8133946) is the one where the structures in question were introduced.
(In reply to Robert Clausecker from comment #21)

I think the reference is just indicating that 3858a1f4f501, being from:

committer John Baldwin <jhb@FreeBSD.org> 2008-12-10 20:56:19 +0000

may be so old that aarch64/armv7 and powerpc64/powerpc might not have been part of the criteria at the time. The error may have come from a later update that first tried to generalize things beyond amd64/i386. Looking . . .

committer Nathan Whitehorn <nwhitehorn@FreeBSD.org> 2010-03-11 14:49:06 +0000
commit 841c0c7ec75bef3c9920cd811270f9f84791ee04 (patch)
. . .
Provide groundwork for 32-bit binary compatibility on non-x86 platforms,
(In reply to Robert Clausecker from comment #21)
> [...] __packed attribute is wrong there. This is what my commit fixes.

Yes, I do not dispute that.

> The linked commit (base 3858a1f4f501d00000447309aae14029f8133946) is the one
> where the structures in question were introduced.

Correct as well (thank you for providing the reference!), *but* it was solely for COMPAT_IA32, aka i386, meaning the commit was correct for its purpose at that time. A later change is the real culprit, since the code is now wrapped in COMPAT_FREEBSD32. I believe that later change is what is fixed here.
(In reply to Mark Millard from comment #22)
> commit 841c0c7ec75bef3c9920cd811270f9f84791ee04 (patch)

Yeah, that is the commit fixed now. Given how old it is, the issue appears to happen rarely. It is no surprise that you stumbled upon it ;-)