281825 – SDT tracepoints are not cleaned up when a module is unloaded

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	Unspecified
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	John Baldwin

URL:	https://reviews.freebsd.org/D46890

Reported:	2024-10-02 20:39 UTC by John Baldwin
Modified:	2024-10-26 13:01 UTC
CC List:	3 users

Description John Baldwin freebsd_committer

2024-10-02 20:39:52 UTC

Kernel modules may reference SDT probes defined in another module (or in the kernel itself).  A specific example is the set of mbuf probes in <sys/mbuf.h> for inline functions like m_get().  A kernel module that uses these inline functions contains a tracepoint that is registered during kldload in sdt_kld_load_probes().  However, sdt_kldunload_try() doesn't clean up any of the state initialized in sdt_kld_load_probes(), only the state initialized in sdt_kld_load_providers().  As a result, dangling pointers (e.g. in tp->probe->tracepoint_list) can be left behind when a module is unloaded.
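
The missing cleanup can be illustrated with a small userspace sketch.  This is not the code in sys/cddl/dev/sdt/sdt.c or the change in D46890: the structures and the functions probe_add_tracepoint(), probe_remove_owner() and probe_count() are hypothetical, a plain singly-linked list stands in for the probe's tracepoint list, and malloc()/free() stand in for module memory that disappears at kldunload.  The point is only that every tracepoint a module linked onto a probe has to be unlinked before that module's memory goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the structures used by sdt.c. */
struct tracepoint {
	int owner;			/* id of the module that registered it */
	struct tracepoint *next;
};

struct probe {
	struct tracepoint *head;	/* tracepoints that reference this probe */
};

/* "Module load": link a tracepoint owned by 'owner' onto the probe. */
static int
probe_add_tracepoint(struct probe *p, int owner)
{
	struct tracepoint *tp;

	tp = malloc(sizeof(*tp));
	if (tp == NULL)
		return (-1);
	tp->owner = owner;
	tp->next = p->head;
	p->head = tp;
	return (0);
}

/*
 * "Module unload": unlink and free every tracepoint owned by 'owner'.
 * Skipping this step is the analogue of the bug: the list would keep
 * pointing at memory that no longer exists.  Returns the number removed.
 */
static int
probe_remove_owner(struct probe *p, int owner)
{
	struct tracepoint **link, *tp;
	int removed;

	removed = 0;
	link = &p->head;
	while ((tp = *link) != NULL) {
		if (tp->owner == owner) {
			*link = tp->next;	/* unlink before freeing */
			free(tp);
			removed++;
		} else {
			link = &tp->next;
		}
	}
	return (removed);
}

/* Count the tracepoints currently linked onto the probe. */
static int
probe_count(const struct probe *p)
{
	const struct tracepoint *tp;
	int n;

	n = 0;
	for (tp = p->head; tp != NULL; tp = tp->next)
		n++;
	return (n);
}
```

In this model, a later walk of the list (the analogue of enabling the probe) or a later insertion (the analogue of re-loading the module) only touches live entries because the unload step ran first.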

The panic I've seen is when re-loading a previously-unloaded module that crashes in sdt_kld_load_probes() when it walks off an invalid pointer in the STAILQ_INSERT_TAIL of the tracepoint_list.  However, that panic is a bit finicky and not easy to reproduce.  A simpler reproducer is below:

kldload sdt
kldload nvmft
kldunload nvmft
dtrace -n m-get

Panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address   = 0xffffffff8283d008
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff82816b96
stack pointer           = 0x28:0xfffffe00dc1e9730
frame pointer           = 0x28:0xfffffe00dc1e9740
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1115 (dtrace)
rdi: 0000000000000001 rsi: ffffffff80f3a4fc rdx: 000000000000000f
rcx: 0000000080040033  r8: 0000000000000016  r9: 00000000000f4240
rax: 0000000080050033 rbx: fffffe00dc1e98e8 rbp: fffffe00dc1e9740
r10: 0000000000000000 r11: 0000000000000000 r12: 0000000000000000
r13: ffffffff82816b20 r14: ffffffff8283d000 r15: 0000000000000000
trap number             = 12
panic: page fault
cpuid = 6
time = 1727901518
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc1e9400
vpanic() at vpanic+0x13f/frame 0xfffffe00dc1e9530
panic() at panic+0x43/frame 0xfffffe00dc1e9590
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00dc1e95f0
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe00dc1e9660
calltrap() at calltrap+0x8/frame 0xfffffe00dc1e9660
--- trap 0xc, rip = 0xffffffff82816b96, rsp = 0xfffffe00dc1e9730, rbp = 0xfffffe00dc1e9740 ---
sdt_probe_update_cb() at sdt_probe_update_cb+0x76/frame 0xfffffe00dc1e9740
smp_rendezvous_action() at smp_rendezvous_action+0x9d/frame 0xfffffe00dc1e9780
smp_rendezvous_cpus() at smp_rendezvous_cpus+0x145/frame 0xfffffe00dc1e9840
smp_rendezvous() at smp_rendezvous+0x34/frame 0xfffffe00dc1e98d0
sdt_enable() at sdt_enable+0xae/frame 0xfffffe00dc1e9910
dtrace_ecb_create_enable() at dtrace_ecb_create_enable+0xee8/frame 0xfffffe00dc1e99a0
dtrace_match() at dtrace_match+0x444/frame 0xfffffe00dc1e9a80
dtrace_enabling_match() at dtrace_enabling_match+0xc8/frame 0xfffffe00dc1e9b10
dtrace_ioctl() at dtrace_ioctl+0x178b/frame 0xfffffe00dc1e9c00
devfs_ioctl() at devfs_ioctl+0xd1/frame 0xfffffe00dc1e9c50
vn_ioctl() at vn_ioctl+0xbc/frame 0xfffffe00dc1e9cc0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00dc1e9ce0
kern_ioctl() at kern_ioctl+0x286/frame 0xfffffe00dc1e9d40
sys_ioctl() at sys_ioctl+0x12d/frame 0xfffffe00dc1e9e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00dc1e9f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00dc1e9f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0xc9cf811a9fa, rsp = 0xc9ced0c5c28, rbp = 0xc9ced0c5c70 ---

Comment 1 John Baldwin freebsd_committer

2024-10-03 18:06:13 UTC

This seems to reproduce the original panic reported to me:

kldload nvmft
kldload dtraceall
kldunload nvmft
kldload ctl
kldload nvmft

Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address   = 0xffffffff8281d078
fault code              = supervisor write data, protection violation
instruction pointer     = 0x20:0xffffffff828f4761
stack pointer           = 0x28:0xfffffe00dc30b8f0
frame pointer           = 0x28:0xfffffe00dc30ba80
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1084 (kldload)
rdi: ffffffff8285e70e rsi: ffffffff8285ed86 rdx: 0000000000000000
rcx: ffffffff8281d078  r8: 0000000000000004  r9: 0000000000000000
rax: ffffffff82865018 rbx: ffffffff82865000 rbp: fffffe00dc30ba80
r10: 0000000000010000 r11: 0000000000000001 r12: fffff80008085c00
r13: fffff8013dd2a220 r14: fffff8003cee6628 r15: fffff80003f37000
trap number             = 12
panic: page fault
cpuid = 11
time = 1727978657
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc30b5c0
vpanic() at vpanic+0x13f/frame 0xfffffe00dc30b6f0
panic() at panic+0x43/frame 0xfffffe00dc30b750
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00dc30b7b0
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe00dc30b820
calltrap() at calltrap+0x8/frame 0xfffffe00dc30b820
--- trap 0xc, rip = 0xffffffff828f4761, rsp = 0xfffffe00dc30b8f0, rbp = 0xfffffe00dc30ba80 ---
sdt_kld_load_probes() at sdt_kld_load_probes+0x3c1/frame 0xfffffe00dc30ba80
linker_load_module() at linker_load_module+0xe90/frame 0xfffffe00dc30bd80
kern_kldload() at kern_kldload+0x16e/frame 0xfffffe00dc30bdd0
sys_kldload() at sys_kldload+0x5c/frame 0xfffffe00dc30be00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00dc30bf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00dc30bf30
--- syscall (304, FreeBSD ELF64, kldload), rip = 0x1085b13898da, rsp = 0x1085aeefe008, rbp = 0x1085aeefe580 ---

Comment 2 John Baldwin freebsd_committer

2024-10-03 18:07:58 UTC

The patch in review (https://reviews.freebsd.org/D46890) fixes both panics for me.

Comment 3 commit-hook freebsd_committer

2024-10-16 17:52:01 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=47f49dd4bbb4a72e53d31046964ce3c111ee0d12

commit 47f49dd4bbb4a72e53d31046964ce3c111ee0d12
Author:     John Baldwin <jhb@FreeBSD.org>
AuthorDate: 2024-10-16 17:50:37 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2024-10-16 17:50:37 +0000

    sdt: Tear down probes in kernel modules during kldunload

    Previously only providers in kernel modules were removed leaving
    dangling pointers to tracepoints, etc. in unloaded kernel modules.

    PR:             281825
    Reported by:    Sony Arpita Das <sonyarpitad@chelsio.com>
    Reviewed by:    markj
    Fixes:          ddf0ed09bd8f sdt: Implement SDT probes using hot-patching
    Sponsored by:   Chelsio Communications
    Differential Revision:  https://reviews.freebsd.org/D46890

 sys/cddl/dev/sdt/sdt.c | 111 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 102 insertions(+), 9 deletions(-)

Comment 4 Mark Johnston freebsd_committer

2024-10-26 13:01:53 UTC

I believe this can be closed, as the commit which introduced the bug was not MFCed.