Bug 283101 - pthread_cancel() doesn't cancel a thread that's currently in pause()
Summary: pthread_cancel() doesn't cancel a thread that's currently in pause()
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: threads (show other bugs)
Version: 13.3-RELEASE
Hardware: amd64 Any
: --- Affects Some People
Assignee: freebsd-threads (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-12-03 16:17 UTC by наб
Modified: 2025-01-04 20:42 UTC (History)
5 users (show)

See Also:


Attachments
Reproducing pthread_cancel() deadlock (1.37 KB, text/plain)
2024-12-16 21:11 UTC, vedad
no flags Details
Runner for the pthread.c (193 bytes, application/x-shellscript)
2024-12-16 21:12 UTC, vedad
no flags Details
Debug SIGCANCEL delivery (943 bytes, patch)
2024-12-19 16:39 UTC, Konstantin Belousov
no flags Details | Diff
Supposed fix (1.21 KB, patch)
2024-12-24 14:35 UTC, Konstantin Belousov
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description наб 2024-12-03 16:17:52 UTC
Downstream bug: https://github.com/openzfs/zfs/issues/16731

On FreeBSD, zfs send sometimes hangs in a deadlock after pthread_cancel(tid); pthread_join(tid, NULL); while the thread is in pause().

The program looks something like this:

#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static void send_progress_thread_act(int sig, siginfo_t *info, void *ucontext) {}

static void timer_delete_cleanup(void *timer) {}

void *send_progress_thread(void *arg1) {
  const struct sigaction signal_action =
      {.sa_sigaction = send_progress_thread_act, .sa_flags = SA_SIGINFO};
  sigaction(SIGUSR1, &signal_action, NULL);

  pthread_cleanup_push(timer_delete_cleanup, NULL);

  for(;;) {
    pause();  // <- thread blocked here
    // zfs_send_progress(), time(), localtime_r(), setproctitle(), fprintf()
    fprintf(stderr, "unpaused\n");
  }

  pthread_cleanup_pop(true);
  return NULL;
}

int main() {
  pthread_t tid;
  pthread_create(&tid, NULL, send_progress_thread, NULL);

  sigset_t new;
  sigemptyset(&new);
  sigaddset(&new, SIGUSR1);
  pthread_sigmask(SIG_BLOCK, &new, NULL);

  sleep(1);  // some sort of work

  pthread_cancel(tid);
  pthread_join(tid, NULL);  // <- thread blocked here
}

(In reality there's also a timer that may send SIGUSR1s and SIGINFO is also consumed but this is largely irrelevant.)

The backtrace looks like this:

Thread 2 (LWP 259843 of process 95307):
#0  0x000000082872c6aa in _sigsuspend () from /lib/libc.so.7
#1  0x0000000825f47fb6 in ?? () from /lib/libthr.so.3
#2  0x00000008286a2a65 in pause () from /lib/libc.so.7
#3  0x0000000820e3d7d5 in send_progress_thread (arg=0x8208cf2c0) at lib/libzfs/libzfs_sendrecv.c:1018
#4  0x0000000825f3eb05 in ?? () from /lib/libthr.so.3
#5  0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x82ce7a000

Thread 1 (LWP 100403 of process 95307):
#0  0x0000000825f3be2c in ?? () from /lib/libthr.so.3
#1  0x0000000825f40a2e in ?? () from /lib/libthr.so.3
#2  0x0000000820e3da4f in send_progress_thread_exit (hdl=0x349d6143c000, ptid=0x349d61412700, oldmask=oldmask@entry=0x8208cf1b0) at lib/libzfs/libzfs_sendrecv.c:1074
#3  0x0000000820e40367 in dump_snapshot (zhp=0x349d6145a500, arg=arg@entry=0x8208d0ec8) at lib/libzfs/libzfs_sendrecv.c:1291
#4  0x0000000820e3014a in zfs_iter_snapshots_sorted_v2 (zhp=zhp@entry=0x349d6145a000, flags=flags@entry=0, callback=0x820e3ff30 <dump_snapshot>, data=data@entry=0x8208d0ec8, min_txg=8889024, max_txg=8890857) at lib/libzfs/libzfs_iter.c:352
#5  0x0000000820e3fd34 in dump_filesystem (zhp=0x349d6145a000, sdd=0x8208d0ec8) at lib/libzfs/libzfs_sendrecv.c:1364
#6  0x0000000820e3ec99 in dump_filesystems (rzhp=0x349d61412700, rzhp@entry=0x349d6145a000, sdd=0x2, sdd@entry=0x8208d0ec8) at lib/libzfs/libzfs_sendrecv.c:1432
#7  0x0000000820e3e413 in zfs_send_cb_impl (zhp=0x349d6145a000, fromsnap=<optimized out>, tosnap=0x349d61419106 "zpair-02749186397145403432-cartman-674eaa54-537c2abe", flags=0x8208d16f0, outfd=1, filter_func=0x0, cb_arg=0x8208d16d0, debugnvp=0x0) at lib/libzfs/libzfs_sendrecv.c:2506
#8  0x0000000820e3c26e in zfs_send_cb (outfd=0, arg=<optimized out>) at lib/libzfs/libzfs_sendrecv.c:2559
#9  0x0000000820e3c237 in zfs_send (zhp=zhp@entry=0x349d6145a000, fromsnap=fromsnap@entry=0x349d614190c0 "zpair-02749186397145403432-cartman-674e8e34-16208f9f", tosnap=tosnap@entry=0x349d61419106 "zpair-02749186397145403432-cartman-674eaa54-537c2abe", flags=0x0, flags@entry=0x8208d16f0, outfd=0, outfd@entry=1, filter_func=0x0, cb_arg=0x8208d16d0, debugnvp=0x0) at lib/libzfs/libzfs_sendrecv.c:2577
#10 0x0000000000215754 in zfs_do_send (argc=<optimized out>, argv=<optimized out>) at cmd/zfs/zfs_main.c:5099
#11 0x000000000020fd18 in main (argc=6, argv=0x8208d1850) at cmd/zfs/zfs_main.c:9249

This is obviously wrong, since the thread is /sleeping in a cancellation point/. 
Adding pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL); to the top of the thread doesn't change anything. In this case, it's obviously wrong because pthread_cancel()... didn't cancel the thread.
Comment 1 наб 2024-12-03 16:30:49 UTC
Sending the process SIGUSR1 does also seem to make it unstuck.
Comment 2 наб 2024-12-03 16:31:15 UTC
Sending the process SIGUSR1 does also seem to make it unstuck, so.
Comment 3 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-03 19:51:11 UTC
There must be something else.  Or the bug is fixed in 13.4+.

I tried the proposed test case on stable/13 (several weeks old), stable/14
(two weeks old), and HEAD (fresh).  In all cases cancellation worked.
Comment 4 наб 2024-12-03 20:12:02 UTC
I don't think I made a solid enough point of this in the OP, but it does work most of the time. The described deadlock only happens like once a week or two on average, accd'g to the original reporter.
Comment 5 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-03 20:35:24 UTC
(In reply to наб from comment #4)
Are you stating that you observed the proposed test program hanging approx. in
a week if running in loop?
Or that you think that the proposed program is similar to the code that hangs,
but you did not checked the standalone test?
Comment 6 наб 2024-12-03 20:47:42 UTC
No, the original reporter observed his jobs using zfs send hanging around 1/wk.
I haven't tested the minimised version under that angle. I believe the minimised version is equivalent (to the precision of the sleep(1); and fprintf(stderr, "unpaused\n"); calls to what zfs send does.
Comment 7 vedad 2024-12-03 21:40:15 UTC
Hi, I'm one of the reporters.

This is occurring on 14.1-RELEASE, on a QEMU-powered vps, roughly once a week on a sequential `zfs send` x 30 datasets, scheduled to run once per hour.

Never had issues with a similar replication schedule on other servers running 14.1-RELEASE (including two other QEMU-powered & 8 bare-metal), nor previous releases.
Comment 8 vedad 2024-12-16 21:11:20 UTC
Created attachment 255901 [details]
Reproducing pthread_cancel() deadlock
Comment 9 vedad 2024-12-16 21:12:32 UTC
Created attachment 255902 [details]
Runner for the pthread.c
Comment 10 vedad 2024-12-16 21:14:44 UTC
Hi, I'm able to reproduce this using the attached pthread.c and run shell script.

Reproduced on:
* QEMU
* Intel(R) Celeron(R) CPU G3900T
* Intel(R) Xeon(R) CPU E5-2470 v2

Instructions:

// For quick results (usually under 10 minutes), invoke `./run` from a dozen
// consoles or GNU screen windows in parallel.
Comment 11 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-16 22:03:03 UTC
If you have an easy to reproduce setup, please rebuild system libraries
with full debug info, like this:
# for x in lib/libsys liexec/rtld-elf lib/libc lib/libthr ; do make -C /usr/src/$x DEBUG_FLAGS=-g clean all install; done
then reproduce it and show the backtraces from both threads.  Also useful are
the printouts of 'struct thread' for them.
Comment 12 vedad 2024-12-19 09:50:39 UTC
(In reply to Konstantin Belousov from comment #11)

(gdb) thread apply all bt full

Thread 2 (LWP 215869 of process 56582):
#0  _sigsuspend () at _sigsuspend.S:4
No locals.
#1  0x000000082268cfb6 in __thr_sigsuspend (set=0x8269d3f90) at /usr/src/lib/libthr/thread/thr_sig.c:688
        newset = {__bits = {0, 0, 0, 0}}
        curthread = 0x3a2768013500
        old = 0
        ret = <optimized out>
#2  0x0000000823a09a65 in __pause () at /usr/src/lib/libc/gen/pause.c:51
        oset = {__bits = {0, 0, 0, 0}}
#3  0x0000000000201ac5 in thread ()
No symbol table info available.
#4  0x0000000822683b05 in thread_start (curthread=0x3a2768013500) at /usr/src/lib/libthr/thread/thr_create.c:289
        set = {__bits = {0, 0, 660918232, 8}}
#5  0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x8269d4000

Thread 1 (LWP 100557 of process 56582):
#0  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:38
No locals.
#1  0x0000000822690cdb in _thr_umtx_wait (mtx=0x3a2768013500, id=215869, timeout=0x0) at /usr/src/lib/libthr/thread/thr_umtx.c:198
No locals.
#2  0x0000000822685a2e in join_common (pthread=0x3a2768013500, thread_return=0x820817e20, abstime=0x0, peek=<optimized out>) at /usr/src/lib/libthr/thread/thr_join.c:144
        __cup = {prev = 0x0, routine = 0x822685ba0 <backout_join>, routine_arg = 0x3a2768013500, onheap = 0}
        ts = {tv_sec = 34905095632, tv_nsec = 34937006492}
        ts2 = {tv_sec = 100557, tv_nsec = 63940923039744}
        curthread = 0x3a2768012000
        ret = <optimized out>
        tid = 215869
        tsp = 0x34b3d
        tmp = <optimized out>
#3  0x0000000000201a4a in main ()
No symbol table info available.
(gdb) frame 2
#2  0x0000000822685a2e in join_common (pthread=0x3a2768013500, thread_return=0x820817e20, abstime=0x0, peek=<optimized out>) at /usr/src/lib/libthr/thread/thr_join.c:144
144                     ret = _thr_umtx_wait(&pthread->tid, tid, tsp);
(gdb) print *pthread
$2 = {tid = 215869, lock = {m_owner = 0, m_flags = 0, m_ceilings = {0, 0}, m_rb_lnk = 0, m_spare = {0, 0}}, cycle = 0, locklevel = 0, critical_count = 0, sigblock = 0, fsigblock = 0, tle = {tqe_next = 0x3a2768012000, tqe_prev = 0x822694580 <_thread_list>}, gcle = {
    tqe_next = 0x0, tqe_prev = 0x0}, hle = {le_next = 0x0, le_prev = 0x822694e58}, wle = {tqe_next = 0x0, tqe_prev = 0x0}, refcount = 1, start_routine = 0x201ab0 <thread>, arg = 0x0, attr = {sched_policy = 2, sched_inherit = 4, prio = 0, suspend = 0, flags = 2,
    stackaddr_attr = 0x8267d4000, stacksize_attr = 2097152, guardsize_attr = 4096, cpuset = 0x0, cpusetsize = 0}, cancel_enable = 1, cancel_pending = 1, cancel_point = 1, no_cancel = 0, cancel_async = 0, cancelling = 0, sigmask = {__bits = {0, 0, 0, 0}},
  unblock_sigcancel = 0, in_sigsuspend = 1, deferred_siginfo = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {
        _trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, deferred_sigmask = {__bits = {0, 0, 0, 0}}, deferred_sigact = {
    __sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0}, sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}}, deferred_run = 0, force_exit = 0, state = PS_RUNNING, error = 0, joiner = 0x3a2768012000, flags = 0, tlflags = 2, mq = {{tqh_first = 0x0,
      tqh_last = 0x3a27680136a8}, {tqh_first = 0x0, tqh_last = 0x3a27680136b8}, {tqh_first = 0x0, tqh_last = 0x3a27680136c8}, {tqh_first = 0x0, tqh_last = 0x3a27680136d8}, {tqh_first = 0x0, tqh_last = 0x3a27680136e8}, {tqh_first = 0x0, tqh_last = 0x3a27680136f8}},
  ret = 0x0, specific = 0x0, specific_data_count = 0, rdlock_count = 0, rtld_bits = 0, tcb = 0x3a276860b120, cleanup = 0x0, ex = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, unwind_stackend = 0x8269d4000, unwind_disabled = 0,
  magic = 3499860245, report_events = 0, event_mask = 0, event_buf = {event = TD_EVENT_NONE, th_p = 0, data = 0}, wchan = 0x0, mutex_obj = 0x0, will_sleep = 0, nwaiter_defer = 0, robust_inited = 0, robust_list = 0, priv_robust_list = 0, inact_mtx = 0, defer_waiters = {
    0x0 <repeats 50 times>}, wake_addr = 0x3a2768600030, sleepqueue = 0x3a27680190c0, name = 0x0, dlerror_msg = '\000' <repeats 511 times>, dlerror_seen = 0}
Comment 13 vedad 2024-12-19 09:57:52 UTC
Local *curthread from the frame #2:

(gdb) print *curthread
$3 = {tid = 100557, lock = {m_owner = 0, m_flags = 0, m_ceilings = {0, 0}, m_rb_lnk = 0, m_spare = {0, 0}}, cycle = 0, locklevel = 0, critical_count = 0, sigblock = 0, fsigblock = 0, tle = {tqe_next = 0x0, tqe_prev = 0x3a2768013540}, gcle = {tqe_next = 0x0,
    tqe_prev = 0x0}, hle = {le_next = 0x0, le_prev = 0x822694db0}, wle = {tqe_next = 0x0, tqe_prev = 0x0}, refcount = 0, start_routine = 0x0, arg = 0x0, attr = {sched_policy = 2, sched_inherit = 4, prio = 0, suspend = 0, flags = 258, stackaddr_attr = 0x800819000,
    stacksize_attr = 536870912, guardsize_attr = 4096, cpuset = 0x0, cpusetsize = 0}, cancel_enable = 1, cancel_pending = 0, cancel_point = 1, no_cancel = 0, cancel_async = 0, cancelling = 0, sigmask = {__bits = {0, 0, 0, 0}}, unblock_sigcancel = 0, in_sigsuspend = 0,
  deferred_siginfo = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0},
      _mesgq = {_mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, deferred_sigmask = {__bits = {0, 0, 0, 0}}, deferred_sigact = {__sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0},
    sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}}, deferred_run = 0, force_exit = 0, state = PS_RUNNING, error = 0, joiner = 0x0, flags = 0, tlflags = 2, mq = {{tqh_first = 0x0, tqh_last = 0x3a27680121a8}, {tqh_first = 0x0, tqh_last = 0x3a27680121b8}, {
      tqh_first = 0x0, tqh_last = 0x3a27680121c8}, {tqh_first = 0x0, tqh_last = 0x3a27680121d8}, {tqh_first = 0x0, tqh_last = 0x3a27680121e8}, {tqh_first = 0x0, tqh_last = 0x3a27680121f8}}, ret = 0x0, specific = 0x0, specific_data_count = 0, rdlock_count = 0,
  rtld_bits = 0, tcb = 0x3a2767c1c120, cleanup = 0x820817da0, ex = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, unwind_stackend = 0x820819000, unwind_disabled = 0, magic = 3499860245, report_events = 0, event_mask = 0, event_buf = {
    event = TD_EVENT_NONE, th_p = 0, data = 0}, wchan = 0x0, mutex_obj = 0x0, will_sleep = 0, nwaiter_defer = 0, robust_inited = 0, robust_list = 0, priv_robust_list = 0, inact_mtx = 0, defer_waiters = {0x0 <repeats 50 times>}, wake_addr = 0x8226950e0,
  sleepqueue = 0x3a2768019000, name = 0x0, dlerror_msg = "Invalid shared object handle 0x822694bf0", '\000' <repeats 471 times>, dlerror_seen = 1}
Comment 14 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-19 16:39:13 UTC
So the victim (would be cancelled) thread has cancel_pending = 1, which mean
that the main thread successfully executed _thr_cancel() against it.  The
actions of _thr_cancel() are setting cancel_pending, and sending SIGCANCEL
to the target using thr_kill(2).

Since the victim is happily sleeping in sigsuspend(), either SIGCANCEL was not
delivered at all, or delivered but ignored due to convoluted threaded handler.
I tried to find any point where SIGCANCEL could be lost, but so far I do not
see it.

Can you please apply the debugging patch below and repeat the same data
gathering?  I want to see if the victim thread' userspace got SIGCANCEL at all.
Comment 15 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-19 16:39:41 UTC
Created attachment 255955 [details]
Debug SIGCANCEL delivery
Comment 16 vedad 2024-12-20 11:10:13 UTC
(In reply to Konstantin Belousov from comment #14)

Here it is:

(gdb) print *pthread
$1 = {tid = 394413, lock = {m_owner = 0, m_flags = 0, m_ceilings = {0, 0}, m_rb_lnk = 0, m_spare = {0, 0}}, cycle = 0, locklevel = 0, critical_count = 0, sigblock = 0, fsigblock = 0, tle = {tqe_next = 0xf1713c12000, tqe_prev = 0x821f39590 <_thread_list>}, gcle = {
    tqe_next = 0x0, tqe_prev = 0x0}, hle = {le_next = 0x0, le_prev = 0x821f39e20}, wle = {tqe_next = 0x0, tqe_prev = 0x0}, refcount = 1, start_routine = 0x201a70 <thread>, arg = 0x0, attr = {sched_policy = 2, sched_inherit = 4, prio = 0, suspend = 0, flags = 2,
    stackaddr_attr = 0x823a8f000, stacksize_attr = 2097152, guardsize_attr = 4096, cpuset = 0x0, cpusetsize = 0}, sigcancel_cnt = 1, sigcancel_crit_cnt = 1, cancel_enable = 1, cancel_pending = 1, cancel_point = 1, no_cancel = 0, cancel_async = 0, cancelling = 0,
  sigmask = {__bits = {0, 0, 0, 0}}, unblock_sigcancel = 0, in_sigsuspend = 1, deferred_siginfo = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0,
      sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, deferred_sigmask = {__bits = {0,
      0, 0, 0}}, deferred_sigact = {__sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0}, sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}}, deferred_run = 0, force_exit = 0, state = PS_RUNNING, error = 0, joiner = 0xf1713c12000, flags = 0, tlflags = 2, mq = {{
      tqh_first = 0x0, tqh_last = 0xf1713c12fb0}, {tqh_first = 0x0, tqh_last = 0xf1713c12fc0}, {tqh_first = 0x0, tqh_last = 0xf1713c12fd0}, {tqh_first = 0x0, tqh_last = 0xf1713c12fe0}, {tqh_first = 0x0, tqh_last = 0xf1713c12ff0}, {tqh_first = 0x0,
      tqh_last = 0xf1713c13000}}, ret = 0x0, specific = 0x0, specific_data_count = 0, rdlock_count = 0, rtld_bits = 0, tcb = 0xf171420f120, cleanup = 0x0, ex = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, unwind_stackend = 0x823c8f000,
  unwind_disabled = 0, magic = 3499860245, report_events = 0, event_mask = 0, event_buf = {event = TD_EVENT_NONE, th_p = 0, data = 0}, wchan = 0x0, mutex_obj = 0x0, will_sleep = 0, nwaiter_defer = 0, robust_inited = 0, robust_list = 0, priv_robust_list = 0,
  inact_mtx = 0, defer_waiters = {0x0 <repeats 50 times>}, wake_addr = 0xf1714200018, sleepqueue = 0xf1713c19080, name = 0x0, dlerror_msg = '\000' <repeats 511 times>, dlerror_seen = 0}
(gdb) print *curthread
$2 = {tid = 100589, lock = {m_owner = 0, m_flags = 0, m_ceilings = {0, 0}, m_rb_lnk = 0, m_spare = {0, 0}}, cycle = 0, locklevel = 0, critical_count = 0, sigblock = 0, fsigblock = 0, tle = {tqe_next = 0x0, tqe_prev = 0xf1713c12e40}, gcle = {tqe_next = 0x0,
    tqe_prev = 0x0}, hle = {le_next = 0x0, le_prev = 0x821f39db0}, wle = {tqe_next = 0x0, tqe_prev = 0x0}, refcount = 0, start_routine = 0x0, arg = 0x0, attr = {sched_policy = 2, sched_inherit = 4, prio = 0, suspend = 0, flags = 258, stackaddr_attr = 0x8003a5000,
    stacksize_attr = 536870912, guardsize_attr = 4096, cpuset = 0x0, cpusetsize = 0}, sigcancel_cnt = 0, sigcancel_crit_cnt = 0, cancel_enable = 1, cancel_pending = 0, cancel_point = 1, no_cancel = 0, cancel_async = 0, cancelling = 0, sigmask = {__bits = {0, 0, 0, 0}},
  unblock_sigcancel = 0, in_sigsuspend = 0, deferred_siginfo = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {
        _trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, deferred_sigmask = {__bits = {0, 0, 0, 0}}, deferred_sigact = {
    __sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0}, sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}}, deferred_run = 0, force_exit = 0, state = PS_RUNNING, error = 0, joiner = 0x0, flags = 0, tlflags = 2, mq = {{tqh_first = 0x0, tqh_last = 0xf1713c121b0}, {
      tqh_first = 0x0, tqh_last = 0xf1713c121c0}, {tqh_first = 0x0, tqh_last = 0xf1713c121d0}, {tqh_first = 0x0, tqh_last = 0xf1713c121e0}, {tqh_first = 0x0, tqh_last = 0xf1713c121f0}, {tqh_first = 0x0, tqh_last = 0xf1713c12200}}, ret = 0x0, specific = 0x0,
  specific_data_count = 0, rdlock_count = 0, rtld_bits = 0, tcb = 0xf171381c120, cleanup = 0x8203a39f0, ex = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, unwind_stackend = 0x8203a5000, unwind_disabled = 0, magic = 3499860245,
  report_events = 0, event_mask = 0, event_buf = {event = TD_EVENT_NONE, th_p = 0, data = 0}, wchan = 0x0, mutex_obj = 0x0, will_sleep = 0, nwaiter_defer = 0, robust_inited = 0, robust_list = 0, priv_robust_list = 0, inact_mtx = 0, defer_waiters = {
    0x0 <repeats 50 times>}, wake_addr = 0x821f3a0e0, sleepqueue = 0xf1713c19000, name = 0x0, dlerror_msg = "Invalid shared object handle 0x821f39bf0", '\000' <repeats 471 times>, dlerror_seen = 1}
Comment 17 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-20 17:52:40 UTC
(In reply to vedad from comment #16)
We seems to getting close, but still not quite.  The victim thread was
caught by SIGCANCEL inside the critical section.  Then, returning from
the section should have caused cancellation to trigger.  Since it does
not, and I verified manually all places where we leave critical counters,
lets try with more debugging to see what happens.

Please provide me with both the backtrace and the p *curthread on the
victim thread.

diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c
index b953c430158c..9f7fcbe33566 100644
--- a/lib/libthr/thread/thr_sig.c
+++ b/lib/libthr/thread/thr_sig.c
@@ -191,8 +191,13 @@ sigcancel_handler(int sig __unused,
        struct pthread *curthread = _get_curthread();
        int err;
 
-       if (THR_IN_CRITICAL(curthread))
+       if (THR_IN_CRITICAL(curthread)) {
+               void *p[1];
+
+               p[0] = curthread;
+               abort2("thr_in_critical %p", 1, p);
                return;
+       }
        err = errno;
        check_suspend(curthread);
        check_cancel(curthread, ucp);
Comment 18 vedad 2024-12-21 11:15:35 UTC
(In reply to Konstantin Belousov from comment #17)

Your latest patch didn't apply over the previous one, so I git-checkout'd first.

> PID 33078, iteration 340...pthread_cancel() succeeded
> OK
> Abort trap (core dumped)

I guess the abort() occurred at iteration 341, whose stdout wasn't flushed.

dmesg:
> pthread(pid 33078 uid 0) aborted: thr_in_critical %p(0x1fc231a12700)


(gdb) thread apply all bt full

Thread 2 (LWP 337498):
#0  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:38
No locals.
#1  0x0000000821978d4b in _thr_umtx_wait (mtx=0x1fc231a12700, id=688152, timeout=0x0) at /usr/src/lib/libthr/thread/thr_umtx.c:198
No locals.
#2  0x000000082196da7e in join_common (pthread=0x1fc231a12700, thread_return=0x82043ebc0, abstime=0x0, peek=<optimized out>) at /usr/src/lib/libthr/thread/thr_join.c:144
        __cup = {prev = 0x0, routine = 0x82196dbf0 <backout_join>, routine_arg = 0x1fc231a12700, onheap = 0}
        ts = {tv_sec = 34901060464, tv_nsec = 34923276780}
        ts2 = {tv_sec = 0, tv_nsec = 34918916759552}
        curthread = 0x1fc231a12000
        ret = <optimized out>
        tid = 688152
        tsp = 0xa8018
        tmp = <optimized out>
#3  0x0000000000201aee in thread_cancel_and_join (ptid=0x1fc231a12700) at pthread.c:28
        status = 0x0
        error = 0
#4  0x0000000000201a2c in main (argc=1, argv=0x82043ec98) at pthread.c:73
        ptid = 0x1fc231a12700
        err = 0
        iteration = 341
        pid = 33078

Thread 1 (LWP 688152):
#0  abort2 () at abort2.S:4
No locals.
#1  0x0000000821974b0e in sigcancel_handler (sig=<optimized out>, info=<optimized out>, ucp=<optimized out>) at /usr/src/lib/libthr/thread/thr_sig.c:198
        p = {0x1fc231a12700}
        curthread = 0x1fc231a12700
        err = <optimized out>
#2  <signal handler called>
No symbol table info available.
#3  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:38
No locals.
#4  0x0000000821978a30 in __thr_umutex_lock (mtx=0x1fc231a12708, id=688152) at /usr/src/lib/libthr/thread/thr_umtx.c:79
        owner = <optimized out>
#5  0x000000082196c30a in _thr_umutex_lock (mtx=0x1fc231a12708, id=17) at /usr/src/lib/libthr/thread/thr_umtx.h:123
No locals.
#6  exit_thread () at /usr/src/lib/libthr/thread/thr_exit.c:301
        curthread = 0x1fc231a12700
#7  0x000000082196c1fe in _pthread_exit_mask (status=status@entry=0x1, mask=mask@entry=0x0) at /usr/src/lib/libthr/thread/thr_exit.c:263
        curthread = 0x1fc231a12700
        cleanup = <optimized out>
#8  0x000000082196c06b in _Tthr_exit (status=0x1fc231a12708, status@entry=0x1) at /usr/src/lib/libthr/thread/thr_exit.c:206
No locals.
#9  0x000000082196a400 in testcancel (curthread=<optimized out>) at /usr/src/lib/libthr/thread/thr_cancel.c:51
No locals.
#10 _thr_cancel_enter (curthread=curthread@entry=0x1fc231a12700) at /usr/src/lib/libthr/thread/thr_cancel.c:149
No locals.
#11 0x0000000821975002 in __thr_sigsuspend (set=0x822ca5f80) at /usr/src/lib/libthr/thread/thr_sig.c:692
        newset = {__bits = {583688064, 8, 0, 0}}
        curthread = 0x1fc231a12700
        old = 0
        ret = <optimized out>
#12 0x0000000821fcea65 in __pause () at /usr/src/lib/libc/gen/pause.c:51
        oset = {__bits = {0, 0, 0, 0}}
#13 0x0000000000201a81 in thread (arg=0x0) at pthread.c:12
No locals.
#14 0x000000082196bb55 in thread_start (curthread=0x1fc231a12700) at /usr/src/lib/libthr/thread/thr_create.c:289
        set = {__bits = {0, 0, 596938712, 8}}
#15 0x0000000000000000 in ?? ()


(gdb) frame 1
#1  0x0000000821974b0e in sigcancel_handler (sig=<optimized out>, info=<optimized out>, ucp=<optimized out>) at /usr/src/lib/libthr/thread/thr_sig.c:198
198                     abort2("thr_in_critical %p", 1, p);
(gdb) print *curthread
$1 = {tid = 688152, lock = {m_owner = 0, m_flags = 0, m_ceilings = {0, 0}, m_rb_lnk = 0, m_spare = {0, 0}}, cycle = 0, locklevel = 1, critical_count = 0, sigblock = 0, fsigblock = 0, tle = {tqe_next = 0x1fc231a12000, tqe_prev = 0x82197c600 <_thread_list>}, gcle = {
    tqe_next = 0x0, tqe_prev = 0x0}, hle = {le_next = 0x0, le_prev = 0x82197ce68}, wle = {tqe_next = 0x0, tqe_prev = 0x0}, refcount = 1, start_routine = 0x201a70 <thread>, arg = 0x0, attr = {sched_policy = 2, sched_inherit = 4, prio = 0, suspend = 0, flags = 2,
    stackaddr_attr = 0x822aa6000, stacksize_attr = 2097152, guardsize_attr = 4096, cpuset = 0x0, cpusetsize = 0}, cancel_enable = 1, cancel_pending = 1, cancel_point = 0, no_cancel = 1, cancel_async = 0, cancelling = 1, sigmask = {__bits = {0, 0, 0, 0}},
  unblock_sigcancel = 0, in_sigsuspend = 1, deferred_siginfo = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {
        _trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, deferred_sigmask = {__bits = {0, 0, 0, 0}}, deferred_sigact = {
    __sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0}, sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}}, deferred_run = 0, force_exit = 0, state = PS_RUNNING, error = 0, joiner = 0x1fc231a12000, flags = 0, tlflags = 2, mq = {{tqh_first = 0x0,
      tqh_last = 0x1fc231a128a8}, {tqh_first = 0x0, tqh_last = 0x1fc231a128b8}, {tqh_first = 0x0, tqh_last = 0x1fc231a128c8}, {tqh_first = 0x0, tqh_last = 0x1fc231a128d8}, {tqh_first = 0x0, tqh_last = 0x1fc231a128e8}, {tqh_first = 0x0, tqh_last = 0x1fc231a128f8}},
  ret = 0x1, specific = 0x0, specific_data_count = 0, rdlock_count = 0, rtld_bits = 0, tcb = 0x1fc232003120, cleanup = 0x0, ex = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, unwind_stackend = 0x822ca6000, unwind_disabled = 0,
  magic = 3499860245, report_events = 0, event_mask = 0, event_buf = {event = TD_EVENT_NONE, th_p = 0, data = 0}, wchan = 0x0, mutex_obj = 0x0, will_sleep = 0, nwaiter_defer = 0, robust_inited = 0, robust_list = 0, priv_robust_list = 0, inact_mtx = 0, defer_waiters = {
    0x0 <repeats 50 times>}, wake_addr = 0x1fc232000000, sleepqueue = 0x1fc231a19040, name = 0x0, dlerror_msg = "Undefined symbol \"_Unwind_ForcedUnwind\"", '\000' <repeats 472 times>, dlerror_seen = 0}
Comment 19 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-21 18:43:20 UTC
(In reply to vedad from comment #18)
Hm, no, this case is not interesting: there is THR_LOCK() call in the thread
exit path, and the debugging patch was too rough to avoid this case.  When
the victim exiting, the deadlock cannot occur, since we are already past
the pause()/sigsuspend() sleep.

I slightly improved the detector part of the patch.  Also, it is available
at https://kib.kiev.ua/git/deviant3.git branch cancel.  Lets try again.

diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c
index b953c430158c..a393e51b746a 100644
--- a/lib/libthr/thread/thr_sig.c
+++ b/lib/libthr/thread/thr_sig.c
@@ -191,8 +191,15 @@ sigcancel_handler(int sig __unused,
 	struct pthread *curthread = _get_curthread();
 	int err;
 
-	if (THR_IN_CRITICAL(curthread))
+	if (THR_IN_CRITICAL(curthread)) {
+		if (!curthread->cancelling) {
+			void *p[1];
+			
+			p[0] = curthread;
+			abort2("thr_in_critical %p", 1, p);
+		}
 		return;
+	}
 	err = errno;
 	check_suspend(curthread);
 	check_cancel(curthread, ucp);
Comment 20 vedad 2024-12-22 00:04:07 UTC
(In reply to Konstantin Belousov from comment #19)

(gdb) thread apply all bt full

Thread 2 (LWP 100391):
#0  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:38
No locals.
#1  0x00000008215fdd4b in _thr_umtx_wait (mtx=0xd7be813c00, id=707144, timeout=0x0) at /usr/src/lib/libthr/thread/thr_umtx.c:198
No locals.
#2  0x00000008215f2a7e in join_common (pthread=0xd7be813c00, thread_return=0x820da5ca0, abstime=0x0, peek=<optimized out>) at /usr/src/lib/libthr/thread/thr_join.c:144
        __cup = {prev = 0x0, routine = 0x8215f2bf0 <backout_join>, routine_arg = 0xd7be813c00, onheap = 0}
        ts = {tv_sec = 34910919760, tv_nsec = 34919627244}
        ts2 = {tv_sec = 0, tv_nsec = 926614102016}
        curthread = 0xd7be812000
        ret = <optimized out>
        tid = 707144
        tsp = 0xaca48
        tmp = <optimized out>
#3  0x0000000000201aee in thread_cancel_and_join (ptid=0xd7be813c00) at pthread.c:28
        status = 0x0
        error = 0
#4  0x0000000000201a2c in main (argc=1, argv=0x820da5d78) at pthread.c:73
        ptid = 0xd7be813c00
        err = 0
        iteration = 499
        pid = 31870

Thread 1 (LWP 707144):
#0  abort2 () at abort2.S:4
No locals.
#1  0x00000008215f9b0d in sigcancel_handler (sig=<optimized out>, info=<optimized out>, ucp=<optimized out>) at /usr/src/lib/libthr/thread/thr_sig.c:199
        p = {0xd7be813c00}
        curthread = 0xd7be813c00
        err = <optimized out>
#2  <signal handler called>
No symbol table info available.
#3  _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:38
No locals.
#4  0x00000008215fda30 in __thr_umutex_lock (mtx=0xd7be813c08, id=707144) at /usr/src/lib/libthr/thread/thr_umtx.c:79
        owner = <optimized out>
#5  0x00000008215f0aa4 in _thr_umutex_lock (mtx=0xd7be813c08, id=17) at /usr/src/lib/libthr/thread/thr_umtx.h:123
No locals.
#6  thread_start (curthread=0xd7be813c00) at /usr/src/lib/libthr/thread/thr_create.c:264
        set = {__bits = {0, 0, 617922520, 8}}
#7  0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x82452e000
(gdb) frame 1
#1  0x00000008215f9b0d in sigcancel_handler (sig=<optimized out>, info=<optimized out>, ucp=<optimized out>) at /usr/src/lib/libthr/thread/thr_sig.c:199
199                             abort2("thr_in_critical %p", 1, p);
(gdb) print *curthread
$1 = {tid = 707144, lock = {m_owner = 0, m_flags = 0, m_ceilings = {0, 0}, m_rb_lnk = 0, m_spare = {0, 0}}, cycle = 0, locklevel = 1, critical_count = 0, sigblock = 0, fsigblock = 0, tle = {tqe_next = 0xd7be812000, tqe_prev = 0x821601600 <_thread_list>}, gcle = {
    tqe_next = 0x0, tqe_prev = 0x0}, hle = {le_next = 0x0, le_prev = 0x821601f10}, wle = {tqe_next = 0x0, tqe_prev = 0x0}, refcount = 1, start_routine = 0x201a70 <thread>, arg = 0x0, attr = {sched_policy = 2, sched_inherit = 4, prio = 0, suspend = 0, flags = 2,
    stackaddr_attr = 0x82432e000, stacksize_attr = 2097152, guardsize_attr = 4096, cpuset = 0x0, cpusetsize = 0}, cancel_enable = 1, cancel_pending = 1, cancel_point = 0, no_cancel = 0, cancel_async = 0, cancelling = 0, sigmask = {__bits = {0, 0, 0, 0}},
  unblock_sigcancel = 0, in_sigsuspend = 0, deferred_siginfo = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {
        _trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, deferred_sigmask = {__bits = {0, 0, 0, 0}}, deferred_sigact = {
    __sigaction_u = {__sa_handler = 0x0, __sa_sigaction = 0x0}, sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}}, deferred_run = 0, force_exit = 0, state = PS_RUNNING, error = 0, joiner = 0xd7be812000, flags = 0, tlflags = 2, mq = {{tqh_first = 0x0,
      tqh_last = 0xd7be813da8}, {tqh_first = 0x0, tqh_last = 0xd7be813db8}, {tqh_first = 0x0, tqh_last = 0xd7be813dc8}, {tqh_first = 0x0, tqh_last = 0xd7be813dd8}, {tqh_first = 0x0, tqh_last = 0xd7be813de8}, {tqh_first = 0x0, tqh_last = 0xd7be813df8}}, ret = 0x0,
  specific = 0x0, specific_data_count = 0, rdlock_count = 0, rtld_bits = 0, tcb = 0xd7bee07120, cleanup = 0x0, ex = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, unwind_stackend = 0x0, unwind_disabled = 0, magic = 3499860245,
  report_events = 0, event_mask = 0, event_buf = {event = TD_EVENT_NONE, th_p = 0, data = 0}, wchan = 0x0, mutex_obj = 0x0, will_sleep = 0, nwaiter_defer = 0, robust_inited = 0, robust_list = 0, priv_robust_list = 0, inact_mtx = 0, defer_waiters = {
    0x0 <repeats 50 times>}, wake_addr = 0xd7bee00048, sleepqueue = 0xd7be819100, name = 0x0, dlerror_msg = '\000' <repeats 511 times>, dlerror_seen = 0}
Comment 21 Konstantin Belousov freebsd_committer freebsd_triage 2024-12-24 14:35:05 UTC
Created attachment 256100 [details]
Supposed fix

Please try this (no need to apply any previous debugging patches)
Comment 22 vedad 2024-12-25 19:30:34 UTC
(In reply to Konstantin Belousov from comment #21)

Great, no lock-ups after ~24 hours. This seems to have fixed it.
Comment 23 Paul Floyd 2024-12-25 19:52:40 UTC
This might also fix

https://github.com/paulfloyd/freebsd_valgrind/issues/199

If I have time I'll try it over the holidays.
Comment 24 commit-hook freebsd_committer freebsd_triage 2024-12-28 17:21:17 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9f78c837d94f73f1485e2ce012a8e64197dae9d5

commit 9f78c837d94f73f1485e2ce012a8e64197dae9d5
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-12-24 00:08:17 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-12-28 17:01:58 +0000

    check_cancel: when in_sigsuspend, send SIGCANCEL unconditionally

    not only when check_cancel is called from the signal handler.
    _thr_wake() is not enough to get the sigsuspend(2) syscall out of the
    loop, since it only terminates sleep when an unblocked signal is
    delivered.

    PR:     283101
    In collaboration with:  pho
    Reviewed by:    markj
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D48200

 lib/libthr/thread/thr_sig.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
Comment 25 commit-hook freebsd_committer freebsd_triage 2025-01-04 03:58:10 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=85671cefd93fff9baf5245b6aae0b44d30621b94

commit 85671cefd93fff9baf5245b6aae0b44d30621b94
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-12-24 00:08:17 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2025-01-04 03:57:24 +0000

    check_cancel: when in_sigsuspend, send SIGCANCEL unconditionally

    PR:     283101

    (cherry picked from commit 9f78c837d94f73f1485e2ce012a8e64197dae9d5)

 lib/libthr/thread/thr_sig.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
Comment 26 Paul Floyd 2025-01-04 20:42:07 UTC
(In reply to Paul Floyd from comment #23)

No it doesn't. Oh well.