Summary: | panic: mutex pipe mutex not owned at ../../../kern/sys_pipe.c:1748 | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Peter Holm <pho> | ||||
Component: | kern | Assignee: | Mark Johnston <markj> | ||||
Status: | Closed FIXED | ||||||
Severity: | Affects Only Me | CC: | cem, markj | ||||
Priority: | --- | Keywords: | crash, stress2 | ||||
Version: | CURRENT | ||||||
Hardware: | Any | ||||||
OS: | Any | ||||||
Attachments: |
|
Description
Peter Holm
![]() ![]() Could you provide a copy of k6.sh? I don't see it in stress2. (In reply to Mark Johnston from comment #1) Test scenario: misc/kevent13.sh I just updated the test scenario to include VM pressure. root@t2:~pho/stress2/misc # uname -a FreeBSD t2.osted.lan 13.0-CURRENT FreeBSD 13.0-CURRENT #2 r343705M: Sun Feb 10 19:11:22 CET 2019 pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO amd64 root@t2:~pho/stress2/misc # ./kevent13.sh panic: mutex pipe mutex not owned at ../../../kern/sys_pipe.c:1748 cpuid = 14 time = 1549896871 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d70b33d0 vpanic() at vpanic+0x1b4/frame 0xfffffe00d70b3430 panic() at panic+0x43/frame 0xfffffe00d70b3490 __mtx_assert() at __mtx_assert+0xb4/frame 0xfffffe00d70b34a0 filt_pipewrite() at filt_pipewrite+0x30/frame 0xfffffe00d70b34c0 kqueue_register() at kqueue_register+0xa03/frame 0xfffffe00d70b3550 kqueue_kevent() at kqueue_kevent+0xa2/frame 0xfffffe00d70b3820 kern_kevent_fp() at kern_kevent_fp+0x97/frame 0xfffffe00d70b3870 kern_kevent() at kern_kevent+0x9f/frame 0xfffffe00d70b38d0 kern_kevent_generic() at kern_kevent_generic+0x70/frame 0xfffffe00d70b3930 sys_kevent() at sys_kevent+0x61/frame 0xfffffe00d70b3990 amd64_syscall() at amd64_syscall+0x291/frame 0xfffffe00d70b3ab0 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00d70b3ab0 --- syscall (560, FreeBSD ELF64, sys_kevent), rip = 0x80043055a, rsp = 0x7fffdfffde88, rbp = 0x7fffdfffded0 --- KDB: enter: panic [ thread pid 1369 tid 101232 ] Stopped at kdb_enter+0x3b: movq $0,kdb_why db> This is caused by a race between one end of a pipe being closed and the registration of an EVFILT_WRITE knote for the other end of the pipe. Suppose f_attach (pipe_kqfilter()) succeeds, but before kn_list_lock() is called, the other end of the pipe is closed. pipeclose() removes the EVFILT_WRITE knote from that end's knlist, so kn_list_lock() returns NULL, and we call filt_pipewrite() without the pipe's mutex held. I think this simply means that the assertion in filt_pipewrite() is wrong. We can only assert that the pipe lock is held if wpipe->pipe_present == PIPE_ACTIVE. Created attachment 202077 [details]
proposed patch
Peter, could you test the attached patch?
BTW, I think kevent13.sh is slightly buggy: it uses EV_SET twice to initialize the same event, and declares that the change list has length 2 when it is in fact 1. For the purpose of reproducing the bug, only the EVFILT_WRITE event is needed. We'd get better coverage by registering both events and randomizing the order in which do_thread2() closes the pipe ends.
(In reply to Mark Johnston from comment #4) Sure, I'll test the patch. I'll work on adding a new test scenario. (In reply to Mark Johnston from comment #4) I ran tests for 8 hours including a buildworld. LGTM. A commit references this bug: Author: markj Date: Tue Feb 19 15:46:43 UTC 2019 New revision: 344278 URL: https://svnweb.freebsd.org/changeset/base/344278 Log: Move a racy assertion in filt_pipewrite(). EVFILT_WRITE knotes for pipes live on the knlist for the other end of the pipe. Since they do not hold a reference on the corresponding file structure, they may be removed from the knlist by pipeclose() while still remaining active. In this case, there is no knlist lock acquired before filt_pipewrite() is called, so the assertion fails. Fix the problem by first checking whether that end of the pipe has been closed. These checks are memory safe since the knote holds a reference on one end of the pipe, and the pipe structure is not freed until both ends are closed. The checks are not racy since PIPE_EOF is never cleared after being set, and pipe_present is never set back to PIPE_ACTIVE after pipeclose() has been called. PR: 235640 Reported and tested by: pho Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19224 Changes: head/sys/kern/sys_pipe.c A commit references this bug: Author: markj Date: Tue Mar 5 19:37:07 UTC 2019 New revision: 344819 URL: https://svnweb.freebsd.org/changeset/base/344819 Log: MFC r344278: Move a racy assertion in filt_pipewrite(). PR: 235640 Changes: _U stable/12/ stable/12/sys/kern/sys_pipe.c A commit references this bug: Author: markj Date: Tue Mar 5 19:39:14 UTC 2019 New revision: 344820 URL: https://svnweb.freebsd.org/changeset/base/344820 Log: MFC r344278: Move a racy assertion in filt_pipewrite(). PR: 235640 Changes: _U stable/11/ stable/11/sys/kern/sys_pipe.c |