Bug 291005 - knote_fork() NOTE_TRACK path calls kn_fop->f_event() without knlist lock
Summary: knote_fork() NOTE_TRACK path calls kn_fop->f_event() without knlist lock
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 14.3-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-11-14 07:35 UTC by Qiu-ji Chen
Modified: 2025-11-25 14:12 UTC (History)
1 user (show)

See Also:
markj: mfc-stable15+
markj: mfc-stable14+


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Qiu-ji Chen 2025-11-14 07:35:29 UTC
This follows the earlier bug 289120 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289120). In that discussion, the GPIO maintainer confirmed that gpioc relies on kqueue calling f_event with the knlist lock held, because the driver reads shared data under the assumption that the callback is always invoked while locked. The kqueue(9) man page also states that "the lock will be held over f_event calls", so this expectation is part of the documented interface.

While re-reading sys/kern/kern_event.c (14.3-RELEASE), I noticed that knote_fork() violates this expectation in the NOTE_TRACK path.

In the non-NOTE_TRACK path, f_event is called while the knlist lock is held:

    if ((kn->kn_sfflags & NOTE_TRACK) == 0) {
        if (kn->kn_fop->f_event(kn, NOTE_FORK))
            KNOTE_ACTIVATE(kn, 1);
        KQ_UNLOCK(kq);
        continue;
    }

But in the NOTE_TRACK case, the code explicitly drops both locks before calling f_event:

    kn_enter_flux(kn);
    KQ_UNLOCK(kq);
    list->kl_unlock(list->kl_lockarg);

    ... two kqueue_register() calls ...

    if (kn->kn_fop->f_event(kn, NOTE_FORK)) /* f_event called unlocked */
        KNOTE_ACTIVATE(kn, 0);
    list->kl_lock(list->kl_lockarg);
    KQ_LOCK(kq);
    kn_leave_flux(kn);
    KQ_UNLOCK_FLUX(kq);

This is inconsistent with the locking pattern used everywhere else and breaks drivers that rely on "f_event is always invoked with knlist lock held".

Suggested fix:
Re-acquire both locks before calling f_event in the NOTE_TRACK tail, and switch KNOTE_ACTIVATE to islock = 1 since kq_lock will already be held:

    list->kl_lock(list->kl_lockarg);   /* moved up */
    KQ_LOCK(kq);                       /* moved up */
    if (kn->kn_fop->f_event(kn, NOTE_FORK))
        KNOTE_ACTIVATE(kn, 1);         /* already locked */
    kn_leave_flux(kn);
    KQ_UNLOCK_FLUX(kq);

This makes the NOTE_TRACK path follow the same locking rule as the non-NOTE_TRACK path and avoids unexpected unlocked f_event calls.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2025-11-14 17:12:22 UTC
I think this is mostly harmless, but yes, we might as well acquire the proc lock (i.e., knlist lock) earlier.  It only needs to be dropped while calling kqueue_register(), everything after that can be locked.
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2025-11-14 17:13:55 UTC
I don't see any reason to move the kqueue lock up, BTW.
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2025-11-14 17:51:38 UTC
https://reviews.freebsd.org/D53762
Comment 4 Qiu-ji Chen 2025-11-14 18:27:30 UTC
(In reply to Mark Johnston from comment #2)

Thanks, the current fix looks good to me.

My earlier suggestion to move KQ_LOCK up was only for consistency with the non-NOTE_TRACK path and to avoid the extra lock/unlock done by KNOTE_ACTIVATE(kn, 0). It's a small detail, but removing one lock cycle might be slightly cleaner and marginally cheaper. Otherwise the patch is fine as-is.
Comment 5 commit-hook freebsd_committer freebsd_triage 2025-11-18 16:25:22 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d795c753e262b97a93dc353aa66b858e1b1969d1

commit d795c753e262b97a93dc353aa66b858e1b1969d1
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-11-18 14:22:04 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-11-18 16:24:21 +0000

    kevent: Hold the knlist mutex when invoking f_event(NOTE_FORK)

    In general f_event is supposed to be called with the knlist mutex held,
    so lock it earlier to follow this protocol.  Also make sure that the
    update to kn_fflags is synchronized.

    Lock the kqueue itself earlier in the case where the knote is activated,
    to avoid locking and unlocking the kqueue twice.

    PR:             291005
    Reported by:    Qiu-ji Chen <chenqiuji666@gmail.com>
    Reviewed by:    kib
    MFC after:      1 week
    Differential Revision:  https://reviews.freebsd.org/D53762

 sys/kern/kern_event.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
Comment 6 Qiu-ji Chen 2025-11-20 07:30:22 UTC
(In reply to Mark Johnston from comment #3)

I think this commit solves this problem, thank you.
Comment 7 commit-hook freebsd_committer freebsd_triage 2025-11-25 14:03:13 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f8bf6f81d6c2b73b065befedddac3bf29f2963de

commit f8bf6f81d6c2b73b065befedddac3bf29f2963de
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-11-18 14:22:04 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-11-25 13:57:14 +0000

    kevent: Hold the knlist mutex when invoking f_event(NOTE_FORK)

    In general f_event is supposed to be called with the knlist mutex held,
    so lock it earlier to follow this protocol.  Also make sure that the
    update to kn_fflags is synchronized.

    Lock the kqueue itself earlier in the case where the knote is activated,
    to avoid locking and unlocking the kqueue twice.

    PR:             291005
    Reported by:    Qiu-ji Chen <chenqiuji666@gmail.com>
    Reviewed by:    kib
    MFC after:      1 week
    Differential Revision:  https://reviews.freebsd.org/D53762

    (cherry picked from commit d795c753e262b97a93dc353aa66b858e1b1969d1)

 sys/kern/kern_event.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
Comment 8 commit-hook freebsd_committer freebsd_triage 2025-11-25 14:03:13 UTC
A commit in branch stable/15 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=23ddcd227ab71b1d687f9ea298696f7d082c9cd5

commit 23ddcd227ab71b1d687f9ea298696f7d082c9cd5
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-11-18 14:22:04 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-11-25 14:02:20 +0000

    kevent: Hold the knlist mutex when invoking f_event(NOTE_FORK)

    In general f_event is supposed to be called with the knlist mutex held,
    so lock it earlier to follow this protocol.  Also make sure that the
    update to kn_fflags is synchronized.

    Lock the kqueue itself earlier in the case where the knote is activated,
    to avoid locking and unlocking the kqueue twice.

    PR:             291005
    Reported by:    Qiu-ji Chen <chenqiuji666@gmail.com>
    Reviewed by:    kib
    MFC after:      1 week
    Differential Revision:  https://reviews.freebsd.org/D53762

    (cherry picked from commit d795c753e262b97a93dc353aa66b858e1b1969d1)

 sys/kern/kern_event.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2025-11-25 14:12:41 UTC
Thank you for the report.