Bug 290962 - armv7 chroot and lib32 use on aarch64: example fio command works on real armv7 system boots but fails for aarch64-to-armv7 chroot or lib32 use
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm
Version: Unspecified
Hardware: arm64 Any
Importance: --- Affects Some People
Assignee: Robert Clausecker
Reported: 2025-11-11 22:15 UTC by Mark Millard
Modified: 2025-11-14 14:58 UTC (History)

See Also:
fuz: mfc-stable15?
fuz: mfc-stable14?


Attachments
proposed patch (1.55 KB, patch)
2025-11-13 23:55 UTC, Robert Clausecker

Description Mark Millard 2025-11-11 22:15:21 UTC
The below is a copy of, for example:

https://lists.freebsd.org/archives/freebsd-arm/2025-November/005408.html

Because my normal environments are based on main [so: 16],
the examples are from various vintages of 16. The main
builds are from official pkgbase distributions, not personal
builds. I do use the non-debug kernel variant.

On a real armv7 system, booted from an armv7 kernel,
so far the example fio command has worked correctly.

But moving that same USB3 capable media to an aarch64
system and mounting and chrooting into it to do the test
in the armv7 chroot fails.

Both the RPi5 tested and the Windows Dev Kit 2023
tested show the issue.

A lib32 example on aarch64:

# /usr/obj/DESTDIRs/main-armv7-chroot-ports-main/usr/local/bin/fio --name=random_rw_test --filename=./testfile1 --rw=randrw --bs=128k \
--ioengine=posixaio --iodepth=256 --numjobs=4 --runtime=120 --time_based \
--group_reporting --direct=1 --size=1G
random_rw_test: (g=0): rw=randrw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=posixaio, iodepth=256
...
fio-3.40
Starting 4 processes
fio: io_u error on file ./testfile1: File too large: write offset=772538368, buflen=131072
. . . (lots of the above)
fio: io_u error on file ./testfile1: Bad address: write offset=524288, buflen=131072
. . . (more "File too large: write offset" notices)
fio: pid=2078, err=27/file:io_u.c:1982, func=io_u error, error=File too large
. . . (more "File too large: write offset" notices)
fio: pid=2077, err=27/file:io_u.c:1982, func=io_u error, error=File too large
. . . (more "File too large: write offset" notices)
fio: pid=2079, err=27/file:io_u.c:1982, func=io_u error, error=File too large

random_rw_test: (groupid=0, jobs=4): err=27 (file:io_u.c:1982, func=io_u error, error=File too large): pid=2077: Tue Nov 11 13:31:32 2025
lat (usec)   : 500=1.17%, 750=0.68%, 1000=1.66%
lat (msec)   : 2=3.71%, 4=5.27%, 20=3.22%, 50=32.81%, 2000=0.59%
cpu          : usr=0.19%, sys=0.64%, ctx=85, majf=3, minf=56
IO depths    : 1=0.4%, 2=0.8%, 4=1.6%, 8=3.1%, 16=6.2%, 32=12.5%, >=64=75.4%
   submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
   complete  : 0=0.1%, 4=99.3%, 8=0.1%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.4%
   issued rwts: total=503,521,0,0 short=0,0,0,0 dropped=0,0,0,0
   latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):


Running the same test inside an armv7 chroot (with a plain
fio command instead of the full path above) looks the same.
Comment 1 Mark Millard 2025-11-12 00:08:04 UTC
Another thing that all the aarch64 and armv7 testing
here has in common: UFS instead of ZFS.
Comment 2 Robert Clausecker freebsd_committer freebsd_triage 2025-11-12 00:31:16 UTC
Could you provide a truss log on both armv7 and aarch64?  Ideally run with one thread so it's easier to read.
Comment 3 Mark Millard 2025-11-12 06:21:39 UTC
(In reply to Robert Clausecker from comment #2)

It will be some time before I have anything
useful.
Comment 4 Mark Millard 2025-11-13 05:53:22 UTC
(In reply to Robert Clausecker from comment #2)

Truss greatly changes the fio test behavior, eventually
overflowing a queue and doing assert(0):

Assertion failed: (0), function io_u_qpush, file ./io_u_queue.h, line 37.
48748 100340: write(2,"Assertion failed: (0), function "...,74) = 74 (0x4a)

I've worked some on finding smaller test runs that generate
a subset of the errors, such as:

48869 100333: aio_write({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.883112608 }) = 0 (0x0)
48869 100333: aio_error({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 36 (0x24)
48869 100333: aio_suspend([{ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }],1,0x0) = 0 (0x0)
48869 100333: aio_error({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 14 (0xe)
48869 100333: clock_gettime(4,{ 94521.888278246 }) = 0 (0x0)
48869 100333: issetugid()                        = 0 (0x0)
48869 100333: mmap(0x0,28672,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 691605504 (0x29391000)
48869 100333: fstatat(AT_FDCWD,"/usr/share/nls/C/libc.cat",0xffff9590,0x0) ERR#2 'No such file or directory'
48869 100333: fstatat(AT_FDCWD,"/usr/share/nls/libc/C",0xffff9590,0x0) ERR#2 'No such file or directory'
48869 100333: fstatat(AT_FDCWD,"/usr/local/share/nls/C/libc.cat",0xffff9590,0x0) ERR#2 'No such file or directory'
48869 100333: fstatat(AT_FDCWD,"/usr/local/share/nls/libc/C",0xffff9590,0x0) ERR#2 'No such file or directory'
fio: io_u error on file ./testfile1: Bad address: write offset=524288, buflen=131072
48869 100333: write(2,"fio: io_u error on file ./testfi"...,85) = 85 (0x55)
48869 100333: clock_gettime(4,{ 94521.888931965 }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.888980765 }) = 0 (0x0)
48869 100333: getrusage(RUSAGE_THREAD,{ u=0.000000,s=0.001056,in=0,out=0 }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.889085707 }) = 0 (0x0)
48869 100333: clock_gettime(4,{ 94521.889142580 }) = 0 (0x0)
fio: pid=48869, err=14/file:io_u.c:1982, func=io_u error, error=Bad address
48869 100333: write(1,"fio: pid=48869, err=14/file:io_u"...,76) = 76 (0x4c)
48869 100333: close(6)                           = 0 (0x0)
48869 100333: _exit(0xe)                        
48869 100333: process exit, rval = 14
48868 100393: nanosleep({ 0.010000000 })         = 0 (0x0)
Comment 5 Robert Clausecker freebsd_committer freebsd_triage 2025-11-13 09:59:17 UTC
(In reply to Mark Millard from comment #4)

Surprising!  I don't see any call returning EFAULT, so I wonder where it gets that from.
Comment 6 Mark Millard 2025-11-13 18:07:39 UTC
(In reply to Robert Clausecker from comment #5)

Well, compare:

/usr/include/sys/errno.h:#define	EFAULT		14		/* Bad address */

vs.:

48869 100333: aio_error({ 6,524288,0x29369000,131072,LIO_NOP,{ sigev_notify=SIGEV_NONE } }) = 14 (0xe)

Given that the prior sequence followed from an aio_write
and the man page for aio_error reports:

RETURN VALUES
     If the asynchronous I/O request has completed successfully, aio_error()
     returns 0.  If the request has not yet completed, EINPROGRESS is
     returned.  If the request has completed unsuccessfully the error status
     is returned as described in read(2), readv(2), write(2), writev(2), or
     fsync(2).

I see for write:

     [EFAULT]           Part of iov or data to be written to the file points
                        outside the process's allocated address space.

So it would seem to be some sort of mishandling of the
iov or data such that the aarch64 kernel ends up being
told to reference outside the process's allocated
address space.

Not that I have any clue how to gather evidence for the details.
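
The status values in the truss trace can be read against the errno table. The following is only a minimal sketch for following the trace: the enum names and the helper classify_aio_status() are hypothetical, and the numeric values are the FreeBSD ones (14, 27, 36) hardcoded so that the sketch behaves the same on systems that number their errno values differently.

```c
/* FreeBSD errno values, hardcoded for illustration (see sys/errno.h). */
enum {
	FBSD_EFAULT      = 14,	/* Bad address */
	FBSD_EFBIG       = 27,	/* File too large */
	FBSD_EINPROGRESS = 36	/* Operation now in progress */
};

/*
 * Hypothetical helper: interpret the value returned by aio_error()
 * along the lines of the RETURN VALUES text quoted above.
 */
static const char *
classify_aio_status(int status)
{
	switch (status) {
	case 0:
		return ("completed successfully");
	case FBSD_EINPROGRESS:
		return ("still in progress");
	case FBSD_EFAULT:
		return ("failed: Bad address");
	case FBSD_EFBIG:
		return ("failed: File too large");
	default:
		return ("failed: other error");
	}
}
```

Read this way, the first aio_error() result in the trace (36) means the request was still pending, and the second (14) is the completed request's error status rather than a failure of the aio_error() call itself.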
Comment 7 Robert Clausecker freebsd_committer freebsd_triage 2025-11-13 19:14:21 UTC
(In reply to Mark Millard from comment #6)

Hm this could very well be.  aio is a sufficiently rare interface that it could just be poorly tested.
Comment 8 Mark Millard 2025-11-13 21:52:33 UTC
kgdb on armv7   for struct aiocb   offsets vs.
kgdb on aarch64 for struct aiocb32 offsets
(through first few differences) :

kgdb on armv7   for struct aiocb   offsets :

(kgdb) ptype /o *(struct aiocb*)0
/* offset      |    size */  type = struct aiocb {
/*      0      |       4 */    int aio_fildes;
/* XXX  4-byte hole      */
/*      8      |       8 */    off_t aio_offset;
/*     16      |       4 */    volatile void *aio_buf;
/*     20      |       4 */    size_t aio_nbytes;
. . .
                               /* total size (bytes):  104 */

vs.:

kgdb on aarch64 for struct aiocb32 offsets :

(kgdb) ptype /o *(struct aiocb32*)0
/* offset      |    size */  type = struct aiocb32 {
/*      0      |       4 */    int32_t aio_fildes;
/*      4      |       8 */    uint64_t aio_offset;
/*     12      |       4 */    uint32_t aio_buf;
/*     16      |       4 */    uint32_t aio_nbytes;
. . .
                               /* total size (bytes):   96 */
Comment 9 Mark Millard 2025-11-13 22:20:03 UTC
As the native armv7 struct aiocb vs.
aarch64 struct aiocb32 mismatch is
very old, not just specific to
16 or CURRENT, I've changed the
settings below as indicated:

Version:    Unspecified (i.e., here, all available to select)
Hardware:   arm64
Importance: Affects Many People

(That last is just for accuracy, my understanding
is that the field is basically: officially ignored.)


Side note:

It may be that things are okay for i386 on amd64 as
things are and that more differentiation of the two
32-bit contexts is needed in order for both to work
on the matching 64 systems that also support lib32
and the related chroots/jails with 32-bit worlds.
Comment 10 Mark Millard 2025-11-13 22:22:25 UTC
(In reply to Mark Millard from comment #8)

I cannot seem to type what I actually select:
I should have typed Some, not Many.

Importance: Affects Some People
Comment 11 Jessica Clarke freebsd_committer freebsd_triage 2025-11-13 23:47:32 UTC
```
#ifdef COMPAT_FREEBSD6
typedef struct oaiocb32 {
        int     aio_fildes;             /* File descriptor */
        uint64_t aio_offset __packed;   /* File offset for I/O */
        uint32_t aio_buf;               /* I/O buffer in process space */
        uint32_t aio_nbytes;            /* Number of bytes for I/O */
        struct  osigevent32 aio_sigevent; /* Signal to deliver */
        int     aio_lio_opcode;         /* LIO opcode */
        int     aio_reqprio;            /* Request priority -- ignored */
        struct  __aiocb_private32 _aiocb_private;
} oaiocb32_t;
#endif

typedef struct aiocb32 {
        int32_t aio_fildes;             /* File descriptor */
        uint64_t aio_offset __packed;   /* File offset for I/O */
        uint32_t aio_buf;       /* I/O buffer in process space */
        uint32_t aio_nbytes;    /* Number of bytes for I/O */
        int     __spare__[2];
        uint32_t __spare2__;
        int     aio_lio_opcode;         /* LIO opcode */
        int     aio_reqprio;            /* Request priority -- ignored */
        struct  __aiocb_private32 _aiocb_private;
        struct  sigevent32 aio_sigevent;        /* Signal to deliver */
} aiocb32_t;
```

The __packed on the fields means the uint64_t has alignment 1 not 8 as should be the case for the ABI. It is likely there to bodge the structure layout for freebsd32 on amd64, as i386's ABI only uses 4-byte alignment for 8-byte long long.
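
To illustrate what such a layout difference does to a request, the sketch below copies an armv7-style control block into a packed view, using the values from the truss trace (descriptor 6, offset 524288, buffer 0x29369000, length 131072). The struct names and misread() are hypothetical, and the sketch assumes a little-endian host, GCC/Clang attribute syntax, and zeroed padding. It models the mismatch only; the thread does not record the exact values the kernel saw.

```c
#include <stdint.h>
#include <string.h>

/* What an armv7 program stores (64-bit field is 8-byte aligned). */
struct user_view {
	int32_t  fildes;
	uint32_t hole;		/* padding, made explicit for the demo */
	uint64_t offset;
	uint32_t buf;
	uint32_t nbytes;
};

/* What a reader using the packed layout expects at the same address. */
struct packed_view {
	int32_t  fildes;
	uint64_t offset __attribute__((packed));
	uint32_t buf;
	uint32_t nbytes;
};

/* Reinterpret the leading bytes of the user's block with the packed layout. */
static struct packed_view
misread(struct user_view u)
{
	struct packed_view k;

	memcpy(&k, &u, sizeof(k));
	return (k);
}
```

Under these assumptions the packed reader sees the real offset shifted into the upper 32 bits of its offset field, a buffer address of 0, and the real buffer address where it expects the length. A huge offset and a bad buffer pointer would be consistent with the "File too large" and "Bad address" errors reported.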
Comment 12 Robert Clausecker freebsd_committer freebsd_triage 2025-11-13 23:55:35 UTC
Created attachment 265404 [details]
proposed patch

Here's a proposed patch discussed with jrtc27.  Please apply it and let me know if that helps.  Unfortunately I can't reboot my aarch64 box right now (it's building ports...) so I'll have to rely on you to check.
Comment 13 Robert Clausecker freebsd_committer freebsd_triage 2025-11-14 01:03:06 UTC
(In reply to Robert Clausecker from comment #12)

I have tested the patch and can confirm that it works.
No change on amd64 (test case works before and after in i386 jail).
Fixes test case on aarch64 (test case fails before and works after in armv7 jail).

The bug fix modifies the layout of struct oaiocb32 and struct aiocb32 in sys/kern/vfs_aio.c when building with freebsd32 support on !amd64.  These structs are local to the file and only used there for freebsd32 support.  There should thus be no impact on other parts of the kernel.

Please MFC into stable/15 and releng/15.0 if possible.
Comment 14 commit-hook freebsd_committer freebsd_triage 2025-11-14 01:04:11 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f0af21824331648a41b4e5d3323bea9216bcb7e2

commit f0af21824331648a41b4e5d3323bea9216bcb7e2
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-11-14 00:56:12 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment.  This causes the structure to mismatch
    on armv7 binaries running under aarch64, breaking the aio interface.

    Fixes:          3858a1f4f501d00000447309aae14029f8133946
    Approved by:    markj (mentor)
    Reported by:    Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR:             290962
    MFC after:      immediately (for 15.0)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 15 commit-hook freebsd_committer freebsd_triage 2025-11-14 01:04:12 UTC
A commit in branch stable/15 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ea26fd52a949116d03f59066d364eee2af6c9f51

commit ea26fd52a949116d03f59066d364eee2af6c9f51
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-11-14 01:03:44 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment.  This causes the structure to mismatch
    on armv7 binaries running under aarch64, breaking the aio interface.

    Fixes:          3858a1f4f501d00000447309aae14029f8133946
    Approved by:    markj (mentor)
    Reported by:    Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR:             290962
    MFC after:      immediately (for 15.0)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 16 commit-hook freebsd_committer freebsd_triage 2025-11-14 01:05:13 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=498a4c7f48a33c60b1768b8a6daf519bad84e18d

commit 498a4c7f48a33c60b1768b8a6daf519bad84e18d
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-11-14 01:04:53 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment.  This causes the structure to mismatch
    on armv7 binaries running under aarch64, breaking the aio interface.

    Fixes:          3858a1f4f501d00000447309aae14029f8133946
    Approved by:    markj (mentor)
    Reported by:    Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR:             290962
    MFC after:      immediately (for 15.0)

    (cherry picked from commit f0af21824331648a41b4e5d3323bea9216bcb7e2)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 17 Robert Clausecker freebsd_committer freebsd_triage 2025-11-14 01:09:02 UTC
Should be fixed now.  Let's hope this gets added to 15.0.
Comment 18 commit-hook freebsd_committer freebsd_triage 2025-11-14 02:30:27 UTC
A commit in branch releng/15.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=cb9075718bc66844c6d25fed1df18de61af5267b

commit cb9075718bc66844c6d25fed1df18de61af5267b
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-11-14 00:55:59 +0000
Commit:     Colin Percival <cperciva@FreeBSD.org>
CommitDate: 2025-11-14 02:27:28 +0000

    aio: fix alignment of struct (o)aiocb32 on non-amd64

    Only i386 has a four-byte alignment for uint64_t, others have
    eight-byte alignment.  This causes the structure to mismatch
    on armv7 binaries running under aarch64, breaking the aio interface.

    Approved by:    re (cperciva)
    Fixes:          3858a1f4f501d00000447309aae14029f8133946
    Approved by:    markj (mentor)
    Reported by:    Mark Millard <marklmi26-fbsd@yahoo.com>
    Discussed with: jrtc27
    PR:             290962
    MFC after:      immediately (for 15.0)

    (cherry picked from commit ea26fd52a949116d03f59066d364eee2af6c9f51)

 sys/kern/vfs_aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 19 Mark Millard 2025-11-14 03:27:06 UTC
(In reply to commit-hook from comment #16)

For stable/14 the update could change the
status of powerpc64 support for 32-bit
powerpc programs. 32-bit powerpc has
tier 2 status for 14.* and before but
is not supported for 15.* and later.

However, I've no clue if aio is working
vs. failing for 14.3 and 13.5 32-bit
powerpc code on powerpc64's. I've not had
operational powerpc context (32-bit,
64-bit) in years. As I remember, the
little endian powerpc64's do not support
32-bit in the hardware. The old PowerMac
G5's were 64-bit and did support such.

I'll send a note to Justin Hibbits (with
you cc'd) asking if he knows the status.
Comment 20 Gunther Nikl 2025-11-14 11:35:36 UTC
(In reply to Jessica Clarke from comment #11)
> The __packed on the fields means the uint64_t has alignment 1 not 8 as should
> be the case for the ABI. It is likely there to bodge the structure layout
> for freebsd32 on amd64, as i386's ABI only uses 4-byte alignment for 8-byte
> long long.
This diagnosis seems to be correct, since the commit referenced in "Fixes" had the structure in a COMPAT_IA32 block. Thus the "Fixes" reference in the recent commits is wrong.
Comment 21 Robert Clausecker freebsd_committer freebsd_triage 2025-11-14 12:58:40 UTC
(In reply to Gunther Nikl from comment #20)

The __packed attribute is there to have the struct conform to the i386 alignment when building the kernel for amd64.  No such alignment difference exists on the other platform pairs with freebsd32 support (that is, powerpc64/powerpc and aarch64/armv7), so the __packed attribute is wrong there.  This is what my commit fixes.

The linked commit (base 3858a1f4f501d00000447309aae14029f8133946) is the one where the structures in question were introduced.
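
The general shape of such a fix can be sketched as a packing attribute that is applied only when the emulated 32-bit ABI is i386. This is not the text of the committed patch: MODEL_COMPAT_I386, MODEL_OFF_PACKED, the struct, and the helper are all hypothetical names chosen for illustration, and GCC/Clang attribute syntax is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical macro, not the name used in the actual commit: pack the
 * 64-bit field only when the 32-bit ABI being emulated is i386.
 */
#if defined(MODEL_COMPAT_I386)
#define	MODEL_OFF_PACKED	__attribute__((packed))
#else
#define	MODEL_OFF_PACKED
#endif

struct model_fixed_aiocb32_head {
	int32_t  aio_fildes;
	uint64_t aio_offset MODEL_OFF_PACKED;	/* natural alignment unless i386 */
	uint32_t aio_buf;
	uint32_t aio_nbytes;
};

/* Hypothetical helper: where the 64-bit field lands in this build. */
static size_t
model_offset_of_aio_offset(void)
{
	return (offsetof(struct model_fixed_aiocb32_head, aio_offset));
}
```

Built without MODEL_COMPAT_I386 on a host where uint64_t is 8-byte aligned, the field lands at offset 8, as in the native armv7 listing in comment #8. Built with it, the field lands at offset 4, the layout i386 binaries expect.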
Comment 22 Mark Millard 2025-11-14 14:04:17 UTC
(In reply to Robert Clausecker from comment #21)

I think the reference is just indicating that 3858a1f4f501
being from:

committer	John Baldwin <jhb@FreeBSD.org>	2008-12-10 20:56:19 +0000

may be so old that both aarch64/armv7 and powerpc64/powerpc
might not have been part of the criteria at the time. The
error may have been from a later update that first tried to
generalize things beyond amd64/i386.

Looking . . .

committer	Nathan Whitehorn <nwhitehorn@FreeBSD.org>	2010-03-11 14:49:06 +0000
commit	841c0c7ec75bef3c9920cd811270f9f84791ee04 (patch)
. . .
Provide groundwork for 32-bit binary compatibility on non-x86 platforms,
Comment 23 Gunther Nikl 2025-11-14 14:52:12 UTC
(In reply to Robert Clausecker from comment #21)
> [...] __packed attribute is wrong there.  This is what my commit fixes.
Yes, I do not dispute that.

> The linked commit (base 3858a1f4f501d00000447309aae14029f8133946) is the one
> where the structures in question were introduced.
Correct as well (thank you for providing the reference!), *but* it was solely for COMPAT_IA32 aka i386 meaning the commit was correct for its purpose at that time. A later change is the real culprit since now the code is wrapped with COMPAT_FREEBSD32. I believe that commit is fixed here.
Comment 24 Gunther Nikl 2025-11-14 14:58:09 UTC
(In reply to Mark Millard from comment #22)
> commit	841c0c7ec75bef3c9920cd811270f9f84791ee04 (patch)
Yeah, that is the commit fixed now.
Given how old it is, the issue appears to happen rarely. It is no surprise that you stumbled upon it ;-)