It seems that there is a bug in kqueue support code of netmap which can cause kernel panic. And I can reproduce the kernel panic with this simple program on -HEAD: #include <stdio.h> #include <assert.h> #include <sys/types.h> #include <sys/event.h> #include <sys/time.h> #ifndef NETMAP_WITH_LIBS #define NETMAP_WITH_LIBS #endif #include <net/netmap_user.h> static int kq_add(int kq, int fd) { struct kevent changes[1]; int ret; EV_SET(&changes[0], fd, EVFILT_READ, EV_ADD, 0, 0, NULL); ret = kevent(kq, changes, 1, NULL, 0, NULL); assert(ret != -1); printf("[%s] success\n", __func__); return (ret); } static void kq_wait(int kq) { struct kevent events[1]; int ret; ret = kevent(kq, NULL, 0, events, 1, NULL); assert(ret != -1); printf("[%s] success\n", __func__); } int main(void) { const char *ifname = "vale0:vm1"; struct nm_desc *d; int kq; d = nm_open(ifname, NULL, 0, 0); assert(d != NULL); kq = kqueue(); assert(kq != -1); kq_add(kq, d->fd); kq_wait(kq); return (0); } And below is part of the crash summary: freebsd dumped core - see /var/crash/vmcore.0 Fri Jan 8 22:19:39 CST 2016 FreeBSD freebsd 11.0-CURRENT FreeBSD 11.0-CURRENT #2 d1f6105(master)-dirty: Fri Jan 8 21:55:33 CST 2016 btw@freebsd:/usr/obj/root/freebsd/sys/GENERIC amd64 ...... panic: mutex nm_kn_lock not owned at /root/freebsd/sys/kern/kern_event.c:2073 cpuid = 1 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe011744c420 vpanic() at vpanic+0x182/frame 0xfffffe011744c4a0 panic() at panic+0x43/frame 0xfffffe011744c500 __mtx_assert() at __mtx_assert+0xbf/frame 0xfffffe011744c510 knlist_add() at knlist_add+0x20/frame 0xfffffe011744c540 netmap_kqfilter() at netmap_kqfilter+0x101/frame 0xfffffe011744c580 devfs_kqfilter_f() at devfs_kqfilter_f+0x81/frame 0xfffffe011744c5d0 kqueue_register() at kqueue_register+0x5bf/frame 0xfffffe011744c670 kqueue_kevent() at kqueue_kevent+0xc8/frame 0xfffffe011744c840 kern_kevent_fp() at kern_kevent_fp+0x99/frame 0xfffffe011744c890 kern_kevent() at kern_kevent+0x9f/frame 0xfffffe011744c8f0 sys_kevent() at sys_kevent+0x11c/frame 0xfffffe011744c9a0 amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe011744cab0 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe011744cab0 --- syscall (363, FreeBSD ELF64, sys_kevent), rip = 0x80095bc7a, rsp = 0x7fffffffe988, rbp = 0x7fffffffe9e0 --- KDB: enter: panic ......
I don't know enough to verify as a real bug, adding some folks who might have a grasp of what's going on here.
(In reply to Sean Bruno from comment #1) I think replacing 1 with 0 in the sys/dev/netmap/netmap_freebsd.c:netmap_kqfilter(), the call to knlist_add() would fix the issue at hand. But the panic probably means that netmap authors never ever run the code with INVARIANTS and WITNESS. Also, there are other comments in the netmap_freebsd.c, esp. about 'not even munmap() on close()' which are, together with conslusions made, at least strange.
Logged as netmap issue #521: https://github.com/luigirizzo/netmap/issues/521
(In reply to Konstantin Belousov from comment #2) I've verified that the change suggested by kib@ does, indeed, avoid the panic. I've submitted a pull request upstream to make that change. I don't know if we want a local fix in the meantime or not. Adding mmacy@ for an opinion as the commit log indicates he has been syncing with upstream recently.
This bug has been fixed in upstream (netmap PR #520).
(In reply to David Bright from comment #5) This fix was (accidentally) committed in base r337812 and incorporated into the upcoming 12.0-RELEASE; I just noticed that now when I went to MFC that commit to stable/11. Noting the commit now, considerably after the fact. Also, flagging this as MFC-ed to stable/12 as it actually was committed before 12.0-RELEASE.
A commit references this bug: Author: dab Date: Fri Nov 30 02:06:32 UTC 2018 New revision: 341275 URL: https://svnweb.freebsd.org/changeset/base/341275 Log: MFC r337812,r337814,r337820,r341068: Fix several memory leaks (r337812 & r337814). The libkqueue tests have several places that leak memory by using an idiom like: puts(kevent_to_str(kevp)); Rework to save the pointer returned from kevent_to_str() and then free() it after it has been used. r337812 also fixed a bug in the netmap kevent code. The inclusion of that fix was an oversight that I didn't notice until this MFC. Reference the code review and PR here in the MFC for completeness. r337820 & r341068 were white-space only changes as a follow-up to r337812 & r337814: After r337820, which "corrected" some spaces-instead-of-tab whitespace issues in the libkqueue tests, jmg@ pointed out that these files were originally space-based, not tab-spaced, and so the correction should have been to get rid of the tabs that had been introduced in previous changes, not the spaces. This change does that. This is a whitespace only change; no functional change is intended. PR: 206053 Differential Revision: https://reviews.freebsd.org/D16531 Sponsored by: Dell EMC Isilon Changes: _U stable/11/ stable/11/sys/dev/netmap/netmap_freebsd.c stable/11/tests/sys/kqueue/libkqueue/common.h stable/11/tests/sys/kqueue/libkqueue/main.c stable/11/tests/sys/kqueue/libkqueue/proc.c stable/11/tests/sys/kqueue/libkqueue/signal.c stable/11/tests/sys/kqueue/libkqueue/timer.c stable/11/tests/sys/kqueue/libkqueue/user.c stable/11/tests/sys/kqueue/libkqueue/vnode.c
I'm going to close this as fixed in 11 & 12. I don't intend to try to MFC to 10.