Bug 278826 - [hpet] cdev->si_refcount leakage when enable hpet as timecounter hardware
Summary: [hpet] cdev->si_refcount leakage when enable hpet as timecounter hardware
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 15.0-CURRENT
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-07 05:19 UTC by Austin Zhang
Modified: 2024-05-19 01:01 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Austin Zhang 2024-05-07 05:19:57 UTC
reproduce the issue on the latest 15.0-CURRENT
```
[root@freebsd-main ~]# uname -a
FreeBSD freebsd-main 15.0-CURRENT FreeBSD 15.0-CURRENT #13 main-n269920-7929aeebbde1: Mon May  6 20:44:10 CST 2024     root@freebsd-main:/usr/obj/root/workspace/freebsd-src/amd64.amd64/sys/GENERIC amd64
```

test steps:
select hpet as timecounter hardware
```
[root@freebsd-main ~]# sysctl kern.timecounter.hardware=HPET
kern.timecounter.hardware: TSC -> HPET
```
when HPET is chosen as timecounter, libc's VDSO implementation will map `/dev/hpet0` into process's mmap, then we could observe `cdev->si_refcount` leakage occurs

```
[root@freebsd-main ~]# dtrace -n 'fbt::dev_ref:entry {printf("[%s]: invoke dev_ref: %s, refcount:%d", execname, args[0]->si_name, args[0]->si_refcount)}'
dtrace: description 'fbt::dev_ref:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  1  43845                    dev_ref:entry [sshd]: invoke dev_ref: hpet0, refcount:11
  0  43845                    dev_ref:entry [sshd]: invoke dev_ref: hpet0, refcount:12
  0  43845                    dev_ref:entry [bash]: invoke dev_ref: hpet0, refcount:13
  1  43845                    dev_ref:entry [resizewin]: invoke dev_ref: hpet0, refcount:14
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:15
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:16
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:17
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:18
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:19
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:20
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:21
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:22
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:23
  1  43845                    dev_ref:entry [sysctl]: invoke dev_ref: hpet0, refcount:24
  1  43845                    dev_ref:entry [sh]: invoke dev_ref: hpet0, refcount:25
  1  43845                    dev_ref:entry [atrun]: invoke dev_ref: hpet0, refcount:26
```

this cdev->si_refcount leak might have kernel panic risk if enable KASSERT(), see dev_rel()
```
void
dev_rel(struct cdev *dev)
{
	int flag = 0;

	dev_lock_assert_unlocked();
	dev_lock();
	dev->si_refcount--;
	KASSERT(dev->si_refcount >= 0,
	    ("dev_rel(%s) gave negative count", devtoname(dev)));
	if (dev->si_devsw == NULL && dev->si_refcount == 0) {
		LIST_REMOVE(dev, si_list);
		flag = 1;
	}
	dev_unlock();
	if (flag)
		devfs_free(dev);
}
```
Comment 1 Konstantin Belousov freebsd_committer freebsd_triage 2024-05-07 09:00:25 UTC
Why do you claim that this is a leakage?
Comment 2 Austin Zhang 2024-05-07 09:14:48 UTC
(In reply to Konstantin Belousov from comment #1)

When a process exits, it's expected to release /dev/hpet0 via munmap, which should decrease the si_refcount accordingly. However, it appears this decrement does not occur. If the count isn't properly reduced, it could erroneously overflow and potentially turn negative. That's why I believe there is a leakage.
Please correct me if I've misunderstood the scenario.
Comment 3 Konstantin Belousov freebsd_committer freebsd_triage 2024-05-07 09:42:45 UTC
No, si_refcount is not a reference existing during the mapping lifetime.
The si_refcount is the transient refcount taken while a kernel thread operates
on the cdev.  For instance, in case of d_mmap, it is taken by dev_refthread()
in old_dev_pager_fault() around the call to csw->d_mmap, and released by the
nearby call(s) to dev_relthread().
Comment 4 Austin Zhang 2024-05-07 11:47:34 UTC
Normally, when a process exits, the si_refcount associated with /dev/hpet0 should return to its expected value, assuming it’s incremented and decremented correctly during cdev ops. However, we can inflate the si_refcount using a simple bash script below:

while true; do uname >/dev/null;done


This unusually high count suggests that the si_refcount isn't being managed properly
```
[root@freebsd-main ~]# while true; do uname >/dev/null;done
^C
[root@freebsd-main ~]# dtrace -n 'fbt::dev_ref:entry {printf("[%s]: invoke dev_ref: %s, refcount:%d", execname, args[0]->si_name, args[0]->si_refcount)}'
dtrace: description 'fbt::dev_ref:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  1  43845                    dev_ref:entry [uname]: invoke dev_ref: hpet0, refcount:42439
```
Comment 5 Konstantin Belousov freebsd_committer freebsd_triage 2024-05-07 12:20:46 UTC
What I see, even with your script running, on a machine with the timecounter
set to HPET:

(gdb) p/x (((struct hpet_softc *)((char *)timecounter - (unsigned long)(&(((struct hpet_softc *)0))->tc)))->pdev)->si_threadcount
$16 = 0x0
Comment 6 Austin Zhang 2024-05-07 12:28:54 UTC
(In reply to Konstantin Belousov from comment #5)

will you also check si_refcount? I believe si_threadcount is correct but si_refcount is not.

below is my dtrace output:
[root@freebsd-main ~]# dtrace -n 'fbt::dev_ref:entry {printf("[%s]: invoke dev_ref: %s, refcount:%d threadcount:%d", execname, args[0]->si_name, args[0]->si_refcount, args[0]->si_threadcount)}'
dtrace: description 'fbt::dev_ref:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  1  43845                    dev_ref:entry [uname]: invoke dev_ref: hpet0, refcount:42494 threadcount:2
  1  43845                    dev_ref:entry [sh]: invoke dev_ref: hpet0, refcount:42495 threadcount:2
  1  43845                    dev_ref:entry [atrun]: invoke dev_ref: hpet0, refcount:42496 threadcount:2
Comment 7 Konstantin Belousov freebsd_committer freebsd_triage 2024-05-07 13:28:13 UTC
I see, thanks for the persistence.

https://reviews.freebsd.org/D45113
Comment 8 commit-hook freebsd_committer freebsd_triage 2024-05-12 01:14:19 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e93404065177d6c909cd64bf7d74fe0d8df35edf

commit e93404065177d6c909cd64bf7d74fe0d8df35edf
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-05-07 13:23:28 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-05-12 01:13:00 +0000

    cdev_pager_allocate(): ensure that the cdev_pager_ops ctr is called only once

    per allocated vm_object.  Otherwise, since constructors are not
    idempotent, we e.g. leak device reference in case of non-managed pager.

    PR:     278826
    Reported by:    Austin Zhang <austin.zhang@dell.com>
    Reviewed by:    alc, markj
    Tested by:      pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D45113

 sys/vm/device_pager.c | 70 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 19 deletions(-)
Comment 9 commit-hook freebsd_committer freebsd_triage 2024-05-19 00:58:32 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4018bcdea8e1934eedba4b800e6feb2099b1091d

commit 4018bcdea8e1934eedba4b800e6feb2099b1091d
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-05-07 13:23:28 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-05-19 00:57:54 +0000

    cdev_pager_allocate(): ensure that the cdev_pager_ops ctr is called only once

    PR:     278826

    (cherry picked from commit e93404065177d6c909cd64bf7d74fe0d8df35edf)

 sys/vm/device_pager.c | 70 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 19 deletions(-)
Comment 10 commit-hook freebsd_committer freebsd_triage 2024-05-19 01:00:35 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2eeb0e9fc1306f40ec684af1ea56a8871a9a3684

commit 2eeb0e9fc1306f40ec684af1ea56a8871a9a3684
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-05-07 13:23:28 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-05-19 00:59:13 +0000

    cdev_pager_allocate(): ensure that the cdev_pager_ops ctr is called only once

    PR:     278826

    (cherry picked from commit e93404065177d6c909cd64bf7d74fe0d8df35edf)

 sys/vm/device_pager.c | 70 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 19 deletions(-)