Following KTLS test can fail randomly - no matter if any HW OCF backend is registered or not: ktls_test:ktls_receive_aes128_gcm_1_3_bad_major -> failed: /usr/src/tests/sys/kern/ktls_test.c:258: close(sv[1]) == 0 not met [0.022s] ktls_test:ktls_receive_aes128_gcm_1_3_bad_minor -> failed: /usr/src/tests/sys/kern/ktls_test.c:258: close(sv[1]) == 0 not met [0.018s] ktls_test:ktls_receive_aes128_gcm_1_3_bad_type -> failed: /usr/src/tests/sys/kern/ktls_test.c:258: close(sv[1]) == 0 not met [0.020s] ktls_test:ktls_receive_aes256_cbc_1_2_sha256_bad_minor -> failed: /usr/src/tests/sys/kern/ktls_test.c:258: close(sv[1]) == 0 not met [0.017s] ktls_test:ktls_receive_chacha20_poly1305_1_3_bad_major -> failed: /usr/src/tests/sys/kern/ktls_test.c:258: close(sv[1]) == 0 not met [0.017s] Adding a few ms delay between sending request to ktls socket and calling rcvmsg makes code much more robust (10+ consecutive runs without any radom fail). usleep has been added here: (ktls_test.c) diff --git a/tests/sys/kern/ktls_test.c b/tests/sys/kern/ktls_test.c index 09fb96ed11e..8a7652a6645 100644 --- a/tests/sys/kern/ktls_test.c +++ b/tests/sys/kern/ktls_test.c @@ -1506,6 +1506,7 @@ ktls_receive_tls_error(int fd, int expected_error) msg.msg_iov = &iov; msg.msg_iovlen = 1; + usleep(5 * 1000); // 5ms ATF_REQUIRE(recvmsg(fd, &msg, 0) == -1); if (expected_error != 0) ATF_REQUIRE(errno == expected_error);
I have a series of reviews for this already. I haven't yet patched it to permit ECONNRESET errors from close(). However, it's curious that usleep() makes a difference. I would suspect that would not help as my understanding of the race is that the receiving end of the socket pair notices that the error and drops the connection before the sending end calls close() hence close() returning ECONNRESET. The usleep() should only make that worse as the error detection and drop is triggered by receiving the packets, not by the call to recvmsg(). https://github.com/CTSRD-CHERI/cheribsd/issues/1566 is the first review in the series, though I haven't yet uploaded a workaround for spurious close() errors. The third patch deals with spurious errors from shutdown().
I haven't looked at the code, but things like this are sometimes improved via IPC mechanisms like pipe(2), sem_wait(3), etc.
(In reply to John Baldwin from comment #1) Sigh, somehow the wrong link was pasted. https://reviews.freebsd.org/D37717 is the patch to ignore spurious close() errors, but you can see an earlier commit in the stack for dealing with spurious shutdown() errors as well.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3845cce70dfea11775f7ccff5290e67ade2f13aa commit 3845cce70dfea11775f7ccff5290e67ade2f13aa Author: John Baldwin <jhb@FreeBSD.org> AuthorDate: 2022-12-20 19:38:28 +0000 Commit: John Baldwin <jhb@FreeBSD.org> CommitDate: 2022-12-20 19:38:28 +0000 ktls_tests: Ignore errors from close for receive error tests. For tests that send invalid data to a TLS socket to trigger read errors the kernel may end up dropping the connection before close is called at the conclusion of the test resulting in spurious ECONNRESET errors from close. Ignore any errors from close for these tests. PR: 268390 Reported by: olivier, Michal Gulbicki <michalx.gulbicki@intel.com> Reviewed by: markj Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D37717 tests/sys/kern/ktls_test.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)