reproduce the issue on the latest 15.0-CURRENT ``` [root@freebsd-main ~]# uname -a FreeBSD freebsd-main 15.0-CURRENT FreeBSD 15.0-CURRENT #13 main-n269920-7929aeebbde1: Mon May 6 20:44:10 CST 2024 root@freebsd-main:/usr/obj/root/workspace/freebsd-src/amd64.amd64/sys/GENERIC amd64 ``` test steps: select hpet as timecounter hardware ``` [root@freebsd-main ~]# sysctl kern.timecounter.hardware=HPET kern.timecounter.hardware: TSC -> HPET ``` when HPET is chosen as timecounter, libc's VDSO implementation will map `/dev/hpet0` into process's mmap, then we could observe `cdev->si_refcount` leakage occurs ``` [root@freebsd-main ~]# dtrace -n 'fbt::dev_ref:entry {printf("[%s]: invoke dev_ref: %s, refcount:%d", execname, args[0]->si_name, args[0]->si_refcount)}' dtrace: description 'fbt::dev_ref:entry ' matched 1 probe CPU ID FUNCTION:NAME 1 43845 dev_ref:entry [sshd]: invoke dev_ref: hpet0, refcount:11 0 43845 dev_ref:entry [sshd]: invoke dev_ref: hpet0, refcount:12 0 43845 dev_ref:entry [bash]: invoke dev_ref: hpet0, refcount:13 1 43845 dev_ref:entry [resizewin]: invoke dev_ref: hpet0, refcount:14 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:15 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:16 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:17 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:18 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:19 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:20 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:21 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:22 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:23 1 43845 dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:24 1 43845 dev_ref:entry [sh]: invoke dev_ref: hpet0, refcount:25 1 43845 dev_ref:entry [atrun]: invoke dev_ref: hpet0, refcount:26 ``` this cdev->si_refcount leak might have kernel panic risk if enable KASSERT(), see dev_rel() ``` void dev_rel(struct cdev *dev) { int flag = 0; dev_lock_assert_unlocked(); dev_lock(); dev->si_refcount--; KASSERT(dev->si_refcount >= 0, ("dev_rel(%s) gave negative count", devtoname(dev))); if (dev->si_devsw == NULL && dev->si_refcount == 0) { LIST_REMOVE(dev, si_list); flag = 1; } dev_unlock(); if (flag) devfs_free(dev); } ```
Why do you claim that this is a leakage?
(In reply to Konstantin Belousov from comment #1) When a process exits, it's expected to release /dev/hpet0 via munmap, which should decrease the si_refcount accordingly. However, it appears this decrement does not occur. If the count isn't properly reduced, it could erroneously overflow and potentially turn negative. That's why I believe there is a leakage. Please correct me if I've misunderstood the scenario.
No, si_refcount is not a reference existing during the mapping lifetime. The si_refcount is the transient refcount taken while a kernel thread operates on the cdev. For instance, in case of d_mmap, it is taken by dev_refthread() in old_dev_pager_fault() around the call to csw->d_mmap, and released by the nearby call(s) to dev_relthread().
Normally, when a process exits, the si_refcount associated with /dev/hpet0 should return to its expected value, assuming it’s incremented and decremented correctly during cdev ops. However, we can inflate the si_refcount using a simple bash script below: while true; do uname >/dev/null;done This unusually high count suggests that the si_refcount isn't being managed properly ``` [root@freebsd-main ~]# while true; do uname >/dev/null;done ^C [root@freebsd-main ~]# dtrace -n 'fbt::dev_ref:entry {printf("[%s]: invoke dev_ref: %s, refcount:%d", execname, args[0]->si_name, args[0]->si_refcount)}' dtrace: description 'fbt::dev_ref:entry ' matched 1 probe CPU ID FUNCTION:NAME 1 43845 dev_ref:entry [uname]: invoke dev_ref: hpet0, refcount:42439 ```
What I see, even with your script running, on a machine with the timecounter set to HPET: (gdb) p/x (((struct hpet_softc *)((char *)timecounter - (unsigned long)(&(((struct hpet_softc *)0))->tc)))->pdev)->si_threadcount $16 = 0x0
(In reply to Konstantin Belousov from comment #5) will you also check si_refcount? I believe si_threadcount is correct but si_refcount is not. below is my dtrace output: [root@freebsd-main ~]# dtrace -n 'fbt::dev_ref:entry {printf("[%s]: invoke dev_ref: %s, refcount:%d threadcount:%d", execname, args[0]->si_name, args[0]->si_refcount, args[0]->si_threadcount)}' dtrace: description 'fbt::dev_ref:entry ' matched 1 probe CPU ID FUNCTION:NAME 1 43845 dev_ref:entry [uname]: invoke dev_ref: hpet0, refcount:42494 threadcount:2 1 43845 dev_ref:entry [sh]: invoke dev_ref: hpet0, refcount:42495 threadcount:2 1 43845 dev_ref:entry [atrun]: invoke dev_ref: hpet0, refcount:42496 threadcount:2
I see, thanks for the persistence. https://reviews.freebsd.org/D45113
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=e93404065177d6c909cd64bf7d74fe0d8df35edf commit e93404065177d6c909cd64bf7d74fe0d8df35edf Author: Konstantin Belousov <kib@FreeBSD.org> AuthorDate: 2024-05-07 13:23:28 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2024-05-12 01:13:00 +0000 cdev_pager_allocate(): ensure that the cdev_pager_ops ctr is called only once per allocated vm_object. Otherwise, since constructors are not idempotent, we e.g. leak device reference in case of non-managed pager. PR: 278826 Reported by: Austin Zhang <austin.zhang@dell.com> Reviewed by: alc, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D45113 sys/vm/device_pager.c | 70 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 51 insertions(+), 19 deletions(-)
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=4018bcdea8e1934eedba4b800e6feb2099b1091d commit 4018bcdea8e1934eedba4b800e6feb2099b1091d Author: Konstantin Belousov <kib@FreeBSD.org> AuthorDate: 2024-05-07 13:23:28 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2024-05-19 00:57:54 +0000 cdev_pager_allocate(): ensure that the cdev_pager_ops ctr is called only once PR: 278826 (cherry picked from commit e93404065177d6c909cd64bf7d74fe0d8df35edf) sys/vm/device_pager.c | 70 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 51 insertions(+), 19 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=2eeb0e9fc1306f40ec684af1ea56a8871a9a3684 commit 2eeb0e9fc1306f40ec684af1ea56a8871a9a3684 Author: Konstantin Belousov <kib@FreeBSD.org> AuthorDate: 2024-05-07 13:23:28 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2024-05-19 00:59:13 +0000 cdev_pager_allocate(): ensure that the cdev_pager_ops ctr is called only once PR: 278826 (cherry picked from commit e93404065177d6c909cd64bf7d74fe0d8df35edf) sys/vm/device_pager.c | 70 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 51 insertions(+), 19 deletions(-)