|Summary:||Kernel cannot fork new process after calling pmc_detach|
|Product:||Base System||Reporter:||Dom <dom>|
|Component:||kern||Assignee:||freebsd-bugs (Nobody) <bugs>|
|Severity:||Affects Some People||CC:||cem, dom, emaste|
Description Dom 2018-03-28 17:16:12 UTC
When the kernel has the hwpmc module loaded (and likely when compiled with hwpmc support too), calling pmc_detach with a pid of 0 (or NULL) followed by calling pmc_release prevents the OS from forking any new processes for any user afterwards. Existing processes seem to continue to run, but the system won't even exec "reboot". Nothing is printed to the console or logs.

The manpage for pmc_attach(3) states that:

"Function pmc_detach() is used to detach a process scope PMC specified by argument pmcid from a process specified by argument pid. Argument pid may be zero to denote the current process."

This behaviour seems to be fine for pmc_attach, but not for pmc_detach.

If security.bsd.unprivileged_proc_debug is non-zero (the default?) this can be triggered from a userland process.

Tested on FreeBSD 11.1-RELEASE-p8 running on amd64 with hwpmc loaded at runtime, but it probably applies to other versions and architectures.

Reproducer at https://github.com/domodwyer/pmc-crash/blob/master/pmc-crash.c
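For readers without access to the linked file, the call sequence described above might look roughly like the following. This is a sketch written for illustration, not the reporter's pmc-crash.c: the event specifier is taken from the command line because the report does not name one, the counting mode PMC_MODE_TC is an assumption, and the pmc_read() call is included because a later comment notes that at least one read is needed. The pmc_allocate() call uses the five-argument form from FreeBSD 11.x; newer libpmc versions take an additional trailing count argument. Running this on an affected system may hang it.

```c
/*
 * Illustrative sketch only, NOT the reporter's pmc-crash.c.
 * Build on FreeBSD with: cc sketch.c -lpmc
 * WARNING: on an affected system this may hang the machine.
 */
#include <sys/types.h>
#include <err.h>
#include <pmc.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
	pmc_id_t pmcid;
	pmc_value_t value;

	/* The event specifier is CPU dependent, so the caller supplies it. */
	if (argc != 2)
		errx(EXIT_FAILURE, "usage: %s <event-spec>", argv[0]);

	if (pmc_init() < 0)
		err(EXIT_FAILURE, "pmc_init");

	/*
	 * Assumption: a process-scope counting PMC (PMC_MODE_TC).
	 * FreeBSD 11.x signature; newer releases add a trailing count argument.
	 */
	if (pmc_allocate(argv[1], PMC_MODE_TC, 0, PMC_CPU_ANY, &pmcid) < 0)
		err(EXIT_FAILURE, "pmc_allocate");

	/* pid 0 denotes the current process, per pmc_attach(3). */
	if (pmc_attach(pmcid, 0) < 0)
		err(EXIT_FAILURE, "pmc_attach");

	if (pmc_start(pmcid) < 0)
		err(EXIT_FAILURE, "pmc_start");

	if (pmc_read(pmcid, &value) < 0)
		err(EXIT_FAILURE, "pmc_read");
	printf("counter value: %ju\n", (uintmax_t)value);

	/* The detach followed by release is the step the report blames. */
	if (pmc_detach(pmcid, 0) < 0)
		err(EXIT_FAILURE, "pmc_detach");

	if (pmc_release(pmcid) < 0)
		err(EXIT_FAILURE, "pmc_release");

	return (EXIT_SUCCESS);
}
```

Whether the PMC should be stopped with pmc_stop() before detaching is not stated in the report, so the sketch leaves that call out.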
Comment 1 Dom 2018-03-28 17:29:05 UTC
Sorry, "triggered from a userland process" should obviously be "triggered from an unprivileged process". It also seems this bug is triggered when using *any* pid, not just 0.
Comment 2 Conrad Meyer 2018-03-28 23:21:08 UTC
The situation described sounds like a deadlock or livelock. If you reproduce it with an INVARIANTS+WITNESS kernel, do you get a LOR warning? Does the pmc-crash program return/complete?

Basic investigation: The userspace libpmc functions pmc_detach() and pmc_release() translate pretty directly into the (gigantic) kernel syscall pmc_syscall_handler(), as PMC_OP_PMCDETACH and PMC_OP_PMCRELEASE. If 0 is passed as pid, the current thread's pid is substituted. pfind() acquires the proc lock after the pmc sx xlock. The proc lock is dropped and then the process is detached via pmc_detach_process().

Are you sure pmc_release() is required? It doesn't look like it does anything special with locking. I have not investigated deeply.
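The pid substitution mentioned above can be modelled in a few lines of user-space C. This is a hypothetical helper written for illustration, not the code in pmc_syscall_handler(); the name resolve_target_pid and its parameters are assumptions.

```c
#include <sys/types.h>

/*
 * Hypothetical user-space model of the substitution described above:
 * a requested pid of 0 means "the calling process", any other value
 * is used as given. This is not kernel code.
 */
static pid_t
resolve_target_pid(pid_t requested, pid_t caller)
{
	return (requested == 0 ? caller : requested);
}
```

With this model, a caller with pid 1234 that passes 0 targets itself, which matches the pmc_attach(3) wording quoted in the description.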
Comment 3 Dom 2018-03-29 14:35:03 UTC
Hi Conrad, thanks for the quick reply.

I can't see any LORs when reproducing this issue and I can't seem to dtrace my way to a culprit either.

If pmc_release() is called, the system livelocks immediately every time. If it's left out, the first run of pmc-crash does not crash, and the second run will either force an immediate reboot (again with nothing in the console) or run successfully, in which case attempting to unload hwpmc livelocks. If pmc-crash exits successfully, pmc_allocate() returns EINVAL on subsequent runs.

At least one pmc_read() must be performed for either of these livelocks to occur. After either, pressing the power button starts to cleanly power off but deadlocks after geli detaches my encrypted swap.

I've also discovered two almost definitely unrelated LORs:
- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227065
- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=196799
Comment 4 Conrad Meyer 2018-03-29 14:52:33 UTC
Comment 5 Dom 2021-06-20 21:07:57 UTC
I can confirm this is still reproducible on 13.0-RELEASE-p2.