Bug 279354

Summary: New test kern/unix_seqpacket_test:random_eor_and_waitall reliably fails
Product: Base System Reporter: Ryan Libby <rlibby>
Component: tests Assignee: Gleb Smirnoff <glebius>
Status: New ---    
Severity: Affects Some People CC: emaste, glebius, imp
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Ryan Libby freebsd_committer freebsd_triage 2024-05-27 18:36:03 UTC
The new test kern/unix_seqpacket_test:random_eor_and_waitall reliably
fails in both CI and in my manual testing on an amd64 GENERIC VM.

The test was added here:
https://cgit.freebsd.org/src/commit/?id=eb338e2370b4644382e6404d7402bc05eef13e54
eb338e2370b4 tests/unix_seqpacket: provide random data pumping test with MSG_EOR

Here is a failure in CI from April 9, the first run I could find after
the test was committed:
https://ci.freebsd.org/job/FreeBSD-main-amd64-test/25066/testReport/sys.kern/unix_seqpacket_test/random_eor_and_waitall/
It is still failing as of the latest run on May 19:
https://ci.freebsd.org/job/FreeBSD-main-amd64-test/25240/testReport/sys.kern/unix_seqpacket_test/random_eor_and_waitall/

It fails for me every time when run as follows on an amd64 GENERIC VM:
kyua debug -k /usr/tests/sys/Kyuafile kern/unix_seqpacket_test:random_eor_and_waitall

I've seen it fail in a few different ways:

> % for i in {1..10}; do kyua debug -k /usr/tests/sys/Kyuafile kern/unix_seqpacket_test:random_eor_and_waitall; done
> Using seed: 0x41fd, 0xd11e, 0x7725, 0xadf8, 0xe04f, 0x1d61,
> *** Check failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1255: len != iov.iov_len: recvmsg(MSG_WAITALL): 1132, expected 3141
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x8fa8, 0xdbe5, 0x1403, 0xb14d, 0x84f8, 0xfbd0,
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x5b73, 0xc363, 0x39d7, 0xc52a, 0xfa9d, 0x15ab,
> *** Check failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1255: len != iov.iov_len: recvmsg(MSG_WAITALL): 4484, expected 27917
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1269: data corruption past 4923
> Using seed: 0xaa47, 0x3831, 0xb603, 0x97df, 0xb839, 0x0109,
> *** Check failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1255: len != iov.iov_len: recvmsg(MSG_WAITALL): 4525, expected 9299
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x679a, 0xc263, 0xa25f, 0x348c, 0x2d3a, 0x0cd2,
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0xaa1e, 0x7317, 0x2dde, 0xe299, 0x1139, 0xf8d8,
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x6b58, 0x2247, 0x5d93, 0x9c57, 0x326d, 0x1614,
> *** Check failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1255: len != iov.iov_len: recvmsg(MSG_WAITALL): 682, expected 24997
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x8eca, 0xc5bd, 0xc09d, 0xe15e, 0xe7c3, 0xfdad,
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x2859, 0x311b, 0x69d4, 0xd44c, 0xce3d, 0xe01b,
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
> Using seed: 0x0329, 0x992f, 0x6937, 0x766c, 0x47e5, 0x5270,
> kern/unix_seqpacket_test:random_eor_and_waitall  ->  failed: /usr/src/freebsd/tests/sys/kern/unix_seqpacket_test.c:1182: send(params->sock, &params->sendbuf[off], len, flags) == len not met
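
For readers unfamiliar with the "Check failed" lines above: they report a
recvmsg(MSG_WAITALL) call that returned fewer bytes than the iovec asked
for. A minimal sketch of that kind of check follows. The helper name
recv_exact() is hypothetical and this is not the code from
unix_seqpacket_test.c; it only illustrates comparing the return value
against iov.iov_len.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the test source): receive with
 * MSG_WAITALL and verify that the whole buffer was filled.
 * Returns 0 when exactly len bytes arrived, -1 otherwise.
 */
static int
recv_exact(int sock, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;
	ssize_t n;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	n = recvmsg(sock, &msg, MSG_WAITALL);
	if (n < 0 || (size_t)n != iov.iov_len) {
		/* Same shape as the message in the log above. */
		fprintf(stderr, "recvmsg(MSG_WAITALL): %zd, expected %zu\n",
		    n, iov.iov_len);
		return (-1);
	}
	return (0);
}
```

A short read here means the receiver got less than it asked for, which is
what the 1132 vs. 3141 and similar pairs in the log show.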

I traced the errno for the send() failure as EMSGSIZE.
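
One way to capture the errno from a failing send() is a small wrapper
like the sketch below. The name send_and_report() is hypothetical and the
wrapper is not part of the test; it assumes only the standard send(2)
interface and saves errno before any other call can overwrite it.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the test source): perform one send()
 * and report why it failed.  Returns 0 on a full write, the errno
 * value on failure, or -1 on a short write.
 */
static int
send_and_report(int sock, const void *buf, size_t len, int flags)
{
	ssize_t n;
	int error;

	n = send(sock, buf, len, flags);
	if (n == -1) {
		error = errno;	/* save before fprintf() can change it */
		fprintf(stderr, "send(%zu bytes): %s (errno %d)\n",
		    len, strerror(error), error);
		return (error);
	}
	if ((size_t)n != len) {
		fprintf(stderr, "send: short write, %zd of %zu bytes\n",
		    n, len);
		return (-1);
	}
	return (0);
}
```

With a wrapper of this kind, the failing call at line 1182 would be
reported with its errno (EMSGSIZE in my runs) instead of only the
"== len not met" message.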

The test should be adjusted not to produce a failure.
Comment 1 Gleb Smirnoff freebsd_committer freebsd_triage 2024-05-28 03:12:29 UTC
That's because SOCK_SEQPACKET is indeed buggy. The test was committed
together with the new implementation, d80a97def9a1db6f07f5d2e68f7ad62b27918947.
With that revision the test reliably passes.  However, the new implementation
had three issues: aio(9) incompatibility, lack of sendfile(2) support, and
finally krpc(9) incompatibility.  In my private branch I have already
covered all except krpc.  That one is really tough.  Anyway, the plan is
that the new implementation finally gets back into the main branch and
won't be reverted.

You can assign this bug to me or just close it.
Comment 2 Warner Losh freebsd_committer freebsd_triage 2024-06-26 15:17:20 UTC
*** Bug 279994 has been marked as a duplicate of this bug. ***
Comment 3 commit-hook freebsd_committer freebsd_triage 2024-06-26 15:19:34 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0c00c3d75b27de4a367cfbf7299c000ed0e62486

commit 0c00c3d75b27de4a367cfbf7299c000ed0e62486
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2024-06-26 15:18:03 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-06-26 15:18:50 +0000

    test: Change bug number

    There was already a bug on this, so change to old bug

    PR: 279354
    Sponsored by:           Netflix

 tests/sys/kern/unix_seqpacket_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 4 Warner Losh freebsd_committer freebsd_triage 2024-06-26 15:20:30 UTC
Assigned to gleb, disabled the test in CI since it sounds like it won't be fixed "shortly"