Summary: | pthread_cancel() doesn't cancel a thread that's currently in pause() | | |
---|---|---|---|
Product: | Base System | Reporter: | наб <nabijaczleweli> |
Component: | threads | Assignee: | freebsd-threads (Nobody) <threads> |
Status: | New --- | | |
Severity: | Affects Some People | CC: | kib, nabijaczleweli, vedad |
Priority: | --- | | |
Version: | 13.3-RELEASE | | |
Hardware: | amd64 | | |
OS: | Any | | |
Description
наб
2024-12-03 16:17:52 UTC
Sending the process SIGUSR1 also seems to make it unstuck.

There must be something else. Or the bug is fixed in 13.4+. I tried the proposed test case on stable/13 (several weeks old), stable/14 (two weeks old), and HEAD (fresh). In all cases cancellation worked.

I don't think I made a solid enough point of this in the OP, but it does work most of the time. The described deadlock only happens about once every week or two on average, according to the original reporter.

(In reply to наб from comment #4) Are you stating that you observed the proposed test program hanging after approximately a week when run in a loop? Or do you think that the proposed program is similar to the code that hangs, but you did not check the standalone test?

No, the original reporter observed his jobs using zfs send hanging around once a week. I haven't tested the minimised version from that angle. I believe the minimised version is equivalent (to the precision of the sleep(1); and fprintf(stderr, "unpaused\n"); calls) to what zfs send does.

Hi, I'm one of the reporters. This is occurring on 14.1-RELEASE, on a QEMU-powered VPS, roughly once a week during a sequential `zfs send` of 30 datasets, scheduled to run once per hour. I never had issues with a similar replication schedule on other servers running 14.1-RELEASE (including two other QEMU-powered and 8 bare-metal), nor on previous releases.
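For readers following the thread, here is a rough sketch of the kind of minimised test being discussed. It is not the reporter's actual program: the names `paused_worker` and `cancel_paused_thread` and the `REPRO_STANDALONE` macro are hypothetical, and the only details taken from the discussion are a thread blocked in pause(), a sleep(1), and an fprintf(stderr, "unpaused\n") call. It assumes the default cancellation settings (enabled, deferred) and relies on pause() being a cancellation point.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical worker: blocks in pause(), which POSIX lists as a
 * cancellation point, so a pending cancel should be acted on there. */
static void *paused_worker(void *arg)
{
    (void)arg;
    pause();
    /* Reached only if pause() returns, e.g. after a caught signal. */
    fprintf(stderr, "unpaused\n");
    return NULL;
}

/* Hypothetical helper. Returns 0 if the worker was cancelled,
 * 1 if it exited some other way, -1 on a pthread error.
 * In the hang described in this report, pthread_join() would not return. */
int cancel_paused_thread(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, paused_worker, NULL) != 0)
        return -1;

    sleep(1); /* crude: give the worker time to reach pause() */

    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;

    return result == PTHREAD_CANCELED ? 0 : 1;
}

#ifdef REPRO_STANDALONE
/* Optional driver: repeat the attempt until one run is not cancelled.
 * A hang inside cancel_paused_thread() would show up as no further output. */
int main(void)
{
    for (unsigned long i = 0;; i++) {
        int rc = cancel_paused_thread();
        if (rc != 0) {
            fprintf(stderr, "iteration %lu: unexpected result %d\n", i, rc);
            return 1;
        }
        fprintf(stderr, "iteration %lu: cancelled\n", i);
    }
}
#endif
```

Building with something like `cc -pthread -DREPRO_STANDALONE` gives a loop that could be left running. Since the hang is reported at roughly once a week under real `zfs send` workloads, a clean run of this sketch would not by itself show that the bug is absent.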