Bug 267788 - Go testsuite fails in armv7 jail on arm64 host, but not on armv7 host
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm
Version: 13.1-RELEASE
Hardware: arm64 Any
Importance: --- Affects Only Me
Assignee: Olivier Houchard
URL: https://github.com/golang/go/issues/5...
Reported: 2022-11-15 15:59 UTC by Robert Clausecker
Modified: 2023-10-19 23:09 UTC (History)
Description Robert Clausecker freebsd_committer freebsd_triage 2022-11-15 15:59:23 UTC
Running the Go 1.19.3 test suite (all.bash in the Go distribution) in an armv7 jail
on arm64 FreeBSD 13.1 (RPi 4B), I get weird test suite failures.  The same
failures do not occur when testing natively on an armv7 machine (RPi 2B) with
the same OS version.

Could this be a kernel bug?

--- FAIL: TestDCT (0.06s)
    dct_test.go:78: i=35: FDCT
        src
        {
                0x0000, 0x0000, 0x0000, 0x00a6, 0x0000, 0x0000, 0x0064, 0x0000, 
                0x00f4, 0x0000, 0x0044, 0x0046, 0x00ed, 0x0072, 0x0000, 0x0000, 
                0x0000, 0x0000, 0x0096, 0x0000, 0x0000, 0x0000, 0x0038, 0x0000, 
                0x0000, 0x0062, 0x0000, 0x00d3, 0x004e, 0x0000, 0x004b, 0x0000, 
                0x00d0, 0x0000, 0x0000, 0x0000, 0x0000, 0x002b, 0x0000, 0x0000, 
                0x0000, 0x00ec, 0x006a, 0x0023, 0x0000, 0x004b, 0x0063, 0x0000, 
                0x002e, 0x0000, 0x0000, 0x0000, 0x0000, 0x001d, 0x0000, 0x0000, 
                0x0000, 0x0086, 0x0000, 0x0000, 0x00b2, 0x0000, 0x000c, 0x00a4, 
        }
        got
        {
                0xebd8, 0x0292, 0xfee1, 0x0118, 0x00f2, 0x005c, 0xfe31, 0x0052, 
                0x00fd, 0x00ad, 0xfce9, 0x00fd, 0x01b4, 0x051e, 0x00fc, 0x00a6, 
                0x0051, 0xfd6f, 0xff67, 0xffe6, 0x022c, 0xfdc6, 0xffb9, 0x0106, 
                0xff7e, 0x0169, 0x0154, 0x013d, 0xfdaf, 0x0298, 0xff94, 0xfd54, 
                0xff9e, 0xfe51, 0x000c, 0xfef1, 0x034c, 0x0071, 0xfcdf, 0xfdca, 
                0xfc5a, 0xfdfe, 0xfdfe, 0xfbda, 0xfdc4, 0x02fc, 0xfd01, 0xfd2c, 
                0xffd4, 0xfea3, 0x007d, 0xfab7, 0xfa7c, 0xfee3, 0xfdb5, 0xffb1, 
                0xfb03, 0xffc9, 0x02ee, 0x00a8, 0x004f, 0x0262, 0x041b, 0x019a, 
        }
        want
        {
                0xebd9, 0x0292, 0xfee2, 0x0117, 0x00f2, 0x005c, 0xfe32, 0x0052, 
                0x00fd, 0x00ad, 0xfcea, 0x00fd, 0x01b4, 0x051e, 0x00fc, 0x00a5, 
                0x0051, 0xfd70, 0xff67, 0xffe7, 0x022c, 0xfdc7, 0xffba, 0x0106, 
                0xff7f, 0x0169, 0x0154, 0x013d, 0xfdb0, 0x0297, 0xff95, 0xfd56, 
                0xff9f, 0xfe52, 0x000c, 0xfef2, 0x034c, 0x0071, 0xfcdf, 0xfdcb, 
                0xfc5b, 0xfdff, 0xfdff, 0xfbdb, 0xfdc5, 0x02fc, 0xfd03, 0xfd2d, 
                0xffd5, 0xfea4, 0xffa3, 0xfab8, 0xfa7d, 0xfee4, 0xfdb7, 0xffb2, 
                0xfb04, 0xffc9, 0x02ee, 0x00a8, 0x004f, 0x0262, 0x041b, 0x019a, 
        }
FAIL
FAIL    image/jpeg      26.861s
--- FAIL: TestMUDTracking (0.18s)
    mud_test.go:82: inverse(30) = 0.5591381724253374, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5586624103885134, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5469651579616419, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5466046095998363, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5433567836763477, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.532762368527474, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5300878709399219, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5170200550184978, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5170200550184978, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5170200550184978, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5107044797157085, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5072570054458773, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5060522326208803, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5060522326208803, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5039603713275813, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.5039244467792984, not ∈ [0.27734375, 0.2783203125)
    mud_test.go:82: inverse(30) = 0.4934037493992722, not ∈ [0.27734375, 0.2783203125)
FAIL
FAIL    internal/trace  0.359s
stddev NaN != 73.90083445627211 (allowed error 0.0072168783648703235, 0.005773502691896259)
stddev NaN != 73.90083445627211 (allowed error 0.0072168783648703235, 0.005773502691896259)
stddev NaN != 73.90083445627211 (allowed error 0.0072168783648703235, 0.005773502691896259)
--- FAIL: TestReadUniformity (0.27s)
    rand_test.go:395: stddev NaN != 73.90083445627211 (allowed error 0.0072168783648703235, 0.005773502691896259)
    rand_test.go:395: stddev NaN != 73.90083445627211 (allowed error 0.0072168783648703235, 0.005773502691896259)
    rand_test.go:395: stddev NaN != 73.90083445627211 (allowed error 0.0072168783648703235, 0.005773502691896259)
FAIL
FAIL    math/rand       0.815s
(...)
FAIL    net/http        1080.370s
2022/11/14 14:11:46 http: TLS handshake error from 217.197.83.6:54231: remote error: tls: bad certificate
2022/11/14 14:11:46 http: TLS handshake error from 217.197.83.6:54238: read tcp 217.197.83.6:54237->217.197.83.6:54238: use of closed network connection
--- FAIL: TestServer (0.26s)
    --- FAIL: TestServer/NewTLSServer (0.10s)
        --- FAIL: TestServer/NewTLSServer/ServerClient (0.09s)
            server_test.go:154: Get "https://217.197.83.6:54230": x509: certificate is valid for 127.0.0.1, ::1, not 217.197.83.6
    --- FAIL: TestServer/NewTLSServerManual (0.09s)
        --- FAIL: TestServer/NewTLSServerManual/ServerClient (0.09s)
            server_test.go:154: Get "https://217.197.83.6:54237": x509: certificate is valid for 127.0.0.1, ::1, not 217.197.83.6
2022/11/14 14:11:46 http: TLS handshake error from 217.197.83.6:54249: remote error: tls: bad certificate
--- FAIL: TestTLSServerWithHTTP2 (0.13s)
    --- FAIL: TestTLSServerWithHTTP2/http2 (0.12s)
        server_test.go:287: Failed to make request: Get "https://217.197.83.6:54248": x509: certificate is valid for 127.0.0.1, ::1, not 217.197.83.6
2022/11/14 14:11:47 Get "https://217.197.83.6:54256": x509: certificate is valid for 127.0.0.1, ::1, not 217.197.83.6
FAIL    net/http/httptest       0.902s
Comment 1 Olivier Houchard freebsd_committer freebsd_triage 2022-11-15 16:12:40 UTC
Yeah, that is quite probably a kernel bug. I wish there were an easier reproducer than having to use the Go test suite, though :)
Comment 2 Warner Losh freebsd_committer freebsd_triage 2022-11-15 17:53:14 UTC
There's a known issue with alignment of control messages being 64-bit instead of 32-bit for 32-bit binaries running on 64-bit hosts. Maybe this is the same or similar?
Comment 3 Robert Clausecker freebsd_committer freebsd_triage 2022-11-15 21:31:01 UTC
(In reply to Warner Losh from comment #2)

What's a control message?
Comment 4 Robert Clausecker freebsd_committer freebsd_triage 2022-12-04 19:43:12 UTC
The failure mode differs between single and multi-threaded execution.

$ go test math/rand
stddev 6.485239797081586 != 4 (allowed error 0.4, 0.32)
stddev 8.278192731513668 != 4 (allowed error 0.4, 0.32)
stddev 14.106526765488345 != 4 (allowed error 0.4, 0.32)
--- FAIL: TestNonStandardNormalValues (0.39s)
    rand_test.go:125: stddev 6.485239797081586 != 4 (allowed error 0.4, 0.32)
    rand_test.go:128: stddev 8.278192731513668 != 4 (allowed error 0.4, 0.32)
    rand_test.go:131: stddev 14.106526765488345 != 4 (allowed error 0.4, 0.32)
FAIL
FAIL	math/rand	13.529s
FAIL
$ GOMAXPROCS=1 go test math/rand
stddev NaN != 16 (allowed error 1.6, 1.28)
--- FAIL: TestNonStandardNormalValues (0.39s)
    rand_test.go:125: stddev NaN != 16 (allowed error 1.6, 1.28)
FAIL
FAIL	math/rand	12.905s
FAIL

To reproduce this more easily, you can produce a test binary using

$ go test -c math/rand

which you can then dissect as desired.
Comment 5 Robert Clausecker freebsd_committer freebsd_triage 2022-12-04 19:45:36 UTC
(In reply to Robert Clausecker from comment #4)

Also note that the math/rand test suite sometimes succeeds.  You can execute

$ go test -count=1 math/rand

to force it to rerun.  Could uninitialised memory in the kernel be in play?
Comment 6 commit-hook freebsd_committer freebsd_triage 2023-10-16 20:30:17 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ccd0f34d8585cba727dd17a381309855af655b82

commit ccd0f34d8585cba727dd17a381309855af655b82
Author:     Olivier Houchard <cognet@FreeBSD.org>
AuthorDate: 2023-10-16 20:18:24 +0000
Commit:     Olivier Houchard <cognet@FreeBSD.org>
CommitDate: 2023-10-16 20:29:06 +0000

    arm64/compat32: Fix handling of 32bits FP registers.

    We must consider the aarch32 FP registers as 16 128bits registers, and store
    that as the first 16 aarch64 FP registers.

    PR: 267788
    MFC After: 1 week

 sys/arm64/arm64/freebsd32_machdep.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
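
The register layout the commit message describes can be modelled as follows. This is only an illustrative Go sketch, not the kernel change itself: the actual fix is C code in sys/arm64/arm64/freebsd32_machdep.c and its diff is not shown here. The type vreg and the function packAArch32 are hypothetical names. The sketch relies on the architectural view that each pair of 64-bit AArch32 registers D(2n) and D(2n+1) forms the low and high half of one 128-bit register, and that those 16 registers occupy the first 16 AArch64 FP registers.

```go
package main

import "fmt"

// vreg models one 128-bit AArch64 SIMD/FP register as two 64-bit halves.
// (Hypothetical type, for illustration only.)
type vreg struct {
	lo, hi uint64
}

// packAArch32 places the 32 64-bit AArch32 D registers into the first 16
// of 32 AArch64 registers: D(2n) becomes the low half of V(n) and
// D(2n+1) the high half. V16..V31 are left untouched (zero here).
func packAArch32(d [32]uint64) [32]vreg {
	var v [32]vreg
	for i, val := range d {
		if i%2 == 0 {
			v[i/2].lo = val
		} else {
			v[i/2].hi = val
		}
	}
	return v
}

func main() {
	var d [32]uint64
	for i := range d {
		d[i] = uint64(0xD0 + i) // recognisable dummy values
	}
	v := packAArch32(d)
	fmt.Printf("V0  = {lo: %#x, hi: %#x}\n", v[0].lo, v[0].hi)
	fmt.Printf("V15 = {lo: %#x, hi: %#x}\n", v[15].lo, v[15].hi)
}
```

In this model, treating the 32-bit state as anything other than 16 packed 128-bit registers would put some D registers in the wrong place, which would be consistent with the wrong and NaN floating-point results reported above.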
Comment 7 commit-hook freebsd_committer freebsd_triage 2023-10-19 22:46:29 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0e0a03c792542a2509702378559622efafc86548

commit 0e0a03c792542a2509702378559622efafc86548
Author:     Olivier Houchard <cognet@FreeBSD.org>
AuthorDate: 2023-10-16 20:18:24 +0000
Commit:     Glen Barber <gjb@FreeBSD.org>
CommitDate: 2023-10-19 22:45:17 +0000

    arm64/compat32: Fix handling of 32bits FP registers.

    We must consider the aarch32 FP registers as 16 128bits registers, and store
    that as the first 16 aarch64 FP registers.

    PR: 267788

    (cherry picked from commit ccd0f34d8585cba727dd17a381309855af655b82)

 sys/arm64/arm64/freebsd32_machdep.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
Comment 8 commit-hook freebsd_committer freebsd_triage 2023-10-19 23:05:33 UTC
A commit in branch releng/14.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=665838d939f3d32ca851e815506989d80a207f52

commit 665838d939f3d32ca851e815506989d80a207f52
Author:     Olivier Houchard <cognet@FreeBSD.org>
AuthorDate: 2023-10-16 20:18:24 +0000
Commit:     Olivier Houchard <cognet@FreeBSD.org>
CommitDate: 2023-10-19 23:04:16 +0000

    arm64/compat32: Fix handling of 32bits FP registers.

    We must consider the aarch32 FP registers as 16 128bits registers, and store
    that as the first 16 aarch64 FP registers.

    PR: 267788

    (cherry picked from commit ccd0f34d8585cba727dd17a381309855af655b82)
    (cherry picked from commit 0e0a03c792542a2509702378559622efafc86548)
    Approved by: re (cperciva)

 sys/arm64/arm64/freebsd32_machdep.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
Comment 9 commit-hook freebsd_committer freebsd_triage 2023-10-19 23:06:34 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=51be8675416df4aa066e823fec37ea6667494e63

commit 51be8675416df4aa066e823fec37ea6667494e63
Author:     Olivier Houchard <cognet@FreeBSD.org>
AuthorDate: 2023-10-16 20:18:24 +0000
Commit:     Olivier Houchard <cognet@FreeBSD.org>
CommitDate: 2023-10-19 23:05:26 +0000

    arm64/compat32: Fix handling of 32bits FP registers.

    We must consider the aarch32 FP registers as 16 128bits registers, and store
    that as the first 16 aarch64 FP registers.

    PR: 267788

    (cherry picked from commit ccd0f34d8585cba727dd17a381309855af655b82)
    (cherry picked from commit 0e0a03c792542a2509702378559622efafc86548)

 sys/arm64/arm64/freebsd32_machdep.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)