Bug 260547 - sys.netmap.ctrl-api-test.main fails in CI
Summary: sys.netmap.ctrl-api-test.main fails in CI
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: tests (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: Vincenzo Maffione
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-12-19 16:43 UTC by Li-Wen Hsu
Modified: 2023-04-05 20:59 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Li-Wen Hsu freebsd_committer freebsd_triage 2021-12-19 16:43:27 UTC
Earliest failure can trace back to https://ci.freebsd.org/job/FreeBSD-main-amd64-test/20115/testReport/junit/sys.netmap/ctrl-api-test/main/ (base 6f1c8908272f3c0a6631e001bd2eb50a5b69261d) but does not necessarily have direct relationship with it.
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2023-03-11 00:37:41 UTC
I hit a different test failure in ctrl-api-test.  The problem is in the "infinite options" test.

Since commit 253b2ec199b831cacc022b58cb38c3e3c29c1a8f, nmreq_copyin() does some extra validation of the option linked list, but that appears to break the test and prevents an infinite loop only by accident.

In particular, if multiple options have the same req type, the loop just skips to the next option.  But after the first iteration, opt->nro_next will be a kernel pointer not a user point, and then the subsequent copyin() fails because netmap is trying to copyin from a kernel address.  This causes the test to fail since the ioctl returns EFAULT instead of EMSGSIZE.  If not for that bug, I think there would be an infinite loop.

There is a second bug there in that "error" is set in some of those error cases, but the next iteration can set "error" back to 0.

If I disable the infinite_options test, everything else passes for me.  So it'd be nice to fix the kernel and re-enable the netmap tests in CI.
Comment 3 Vincenzo Maffione freebsd_committer freebsd_triage 2023-03-11 16:25:42 UTC
Ack.
I've noticed this some time ago, but did not have the chance to fix it.
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2023-03-11 17:05:37 UTC
(In reply to Vincenzo Maffione from comment #3)
If you have a suggestion for how to approach it, I'd be happy to work on this.

I do wonder why we keep processing the linked list after duplicate options are encountered.  If netmap just returned an error at that point, there would be no problem, I think.
Comment 5 Vincenzo Maffione freebsd_committer freebsd_triage 2023-03-14 21:35:53 UTC
I see there is a duplicate_extmem_options() test that checks for duplicate options...
So I guess we would make an API change if we returned error on duplicate options.
I'm not against that, since I also do not see reasonable use cases for duplicate options and I doubt there is people relyin on that in the wild (modulo possible errors).
I'd better include Giuseppe Lettieri who implemented the option support, to see if he has suggestions.
Comment 6 Vincenzo Maffione freebsd_committer freebsd_triage 2023-03-14 21:45:19 UTC
Sorry, I overlooked that!
Duplicated options are not meant to be accepted, indeed.
Comment 7 Vincenzo Maffione freebsd_committer freebsd_triage 2023-03-15 22:23:11 UTC
We are fixing this. You'll see a commit soon.
Comment 8 giuseppe.lettieri@unipi.it 2023-03-18 08:35:09 UTC
Hi, I have discussed the issue with Vincenzo and implemented some of the suggestions in commit c16e3ce2ffee on github. Thanks for pointing out the problems.
Comment 9 Vincenzo Maffione freebsd_committer freebsd_triage 2023-03-19 17:35:56 UTC
Thanks Giuseppe. I'll commit the fix on the FreeBSD repo.
Comment 10 commit-hook freebsd_committer freebsd_triage 2023-03-21 23:28:09 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e2a431a0ffb6894220bdf5d8fc2ca2d0ca316e85

commit e2a431a0ffb6894220bdf5d8fc2ca2d0ca316e85
Author:     Vincenzo Maffione <vmaffione@FreeBSD.org>
AuthorDate: 2023-03-21 23:23:18 +0000
Commit:     Vincenzo Maffione <vmaffione@FreeBSD.org>
CommitDate: 2023-03-21 23:23:18 +0000

    netmap: fix copyin/copyout of nmreq options list

    The previous code unsuccesfully attempted to report a precise error for
    each option in the user list. Moreover, commit 253b2ec199b broke some
    ctrl-api-test (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260547).

    With this patch we bail out as soon as an unrecoverable error is detected and
    we properly check for copy boundaries. EOPNOTSUPP no longer immediately
    returns an error, so that any other option in the list may be examined
    by the caller code and a precise report of the (un)supported options can
    be returned to the user.

    With this patch, all ctrl-api-test unit tests pass again.

    PR:                     260547
    Submitted by:           giuseppe.lettieri@unipi.it
    Reviewed by:            vmaffione
    MFC after:              14 days

 sys/dev/netmap/netmap.c          | 69 ++++++++++++++++++++++++----------------
 tests/sys/netmap/ctrl-api-test.c | 23 ++++++++++++--
 2 files changed, 63 insertions(+), 29 deletions(-)
Comment 11 Vincenzo Maffione freebsd_committer freebsd_triage 2023-03-21 23:29:23 UTC
The issue should be gone now.
We should be able to reenable netmap tests in CI.
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2023-03-21 23:45:40 UTC
(In reply to Vincenzo Maffione from comment #11)
Nice, thanks!

Here's a PR to re-enable the tests: https://github.com/freebsd/freebsd-ci/pull/162
Comment 13 commit-hook freebsd_committer freebsd_triage 2023-04-05 20:42:41 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b74063f03a834e9f22fb46f8e989a9df19823ff0

commit b74063f03a834e9f22fb46f8e989a9df19823ff0
Author:     Vincenzo Maffione <vmaffione@FreeBSD.org>
AuthorDate: 2023-03-21 23:23:18 +0000
Commit:     Vincenzo Maffione <vmaffione@FreeBSD.org>
CommitDate: 2023-04-05 20:41:55 +0000

    netmap: fix copyin/copyout of nmreq options list

    The previous code unsuccesfully attempted to report a precise error for
    each option in the user list. Moreover, commit 253b2ec199b broke some
    ctrl-api-test (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260547).

    With this patch we bail out as soon as an unrecoverable error is detected and
    we properly check for copy boundaries. EOPNOTSUPP no longer immediately
    returns an error, so that any other option in the list may be examined
    by the caller code and a precise report of the (un)supported options can
    be returned to the user.

    With this patch, all ctrl-api-test unit tests pass again.

    PR:                     260547
    Submitted by:           giuseppe.lettieri@unipi.it
    Reviewed by:            vmaffione
    MFC after:              14 days

    (cherry picked from commit e2a431a0ffb6894220bdf5d8fc2ca2d0ca316e85)

 sys/dev/netmap/netmap.c          | 69 ++++++++++++++++++++++++----------------
 tests/sys/netmap/ctrl-api-test.c | 23 ++++++++++++--
 2 files changed, 63 insertions(+), 29 deletions(-)
Comment 14 Mark Johnston freebsd_committer freebsd_triage 2023-04-05 20:59:33 UTC
Thanks again.