The bug was introduced in r349951 r350478 which was also MFCed to 12.x as r351789 Previously, any unexpected cases will lead to just exiting syscall, leaving the decision to the userspace sem wrapper (which was done that correctly). After these patches, non-zero sem->_has_waiters will lead to going straightly to umtxq_sleep() without checking if sem->_count is non-zero. Non-zero sem->_has_waiters may happen spuriously due to the same code: it doesn't clear this flag after cancelling sleep due to non-zero sem->_count. 1. make sem_wait() and sem_post() race: - sem_wait() check _count in userspace, sees it is zero, calls kernel -> do_sem_wait() - sem_post() sets _count to 1 and calls do_sem_wake() which does nothing because waiter-thread still not in queue - do_sem_wait() sets _has_waiters=1, then sees _count==1 and exits - userspace sem_wait() decrements _count back to 0 _has_waiters left non-zero 2. make them race again: - sem_wait() check _count in userspace, sees it is zero, calls kernel -> do_sem_wait() - sem_post() sets _count to 1 again, not touching _has_waiters, and calls do_sem_wake() which does nothing again, because waiter-thread not in queue - do_sem_wait() fails to set _has_waiters 0->1 because it already non-zero, and then goes to umtxq_sleep() - umtxq_sleep() has no chance to wake up without another sem_post(), despite the fact that _count==1
Differential revision: https://reviews.freebsd.org/D36272
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=768f6373eb3d60e346d3bfa495e04315aeed8ff9 commit 768f6373eb3d60e346d3bfa495e04315aeed8ff9 Author: firk <firk@cantconnect.ru> AuthorDate: 2022-08-26 08:05:56 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2022-08-26 17:34:29 +0000 Fix compat10 semaphore interface race Wrong has-waiters and missing unconditional _count==0 check may cause infinite waiting with already non-zero count. 1) properly clear _has_waiters flag when waiting failed to start 2) always check _count before start waiting PR: 265997 Reviewed by: kib MFC after: 1 week Differential revision: https://reviews.freebsd.org/D36272 sys/kern/kern_umtx.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7e8fa3c9a2102cff1a76cc74c4c739e039c59d23 commit 7e8fa3c9a2102cff1a76cc74c4c739e039c59d23 Author: firk <firk@cantconnect.ru> AuthorDate: 2022-08-26 08:05:56 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2022-08-31 01:20:29 +0000 Fix compat10 semaphore interface race PR: 265997 (cherry picked from commit 768f6373eb3d60e346d3bfa495e04315aeed8ff9) sys/kern/kern_umtx.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=3e39cbb10e0d4cf9db17ac38a0cca39aa3c5e403 commit 3e39cbb10e0d4cf9db17ac38a0cca39aa3c5e403 Author: firk <firk@cantconnect.ru> AuthorDate: 2022-08-26 08:05:56 +0000 Commit: Konstantin Belousov <kib@FreeBSD.org> CommitDate: 2022-08-31 01:33:38 +0000 Fix compat10 semaphore interface race PR: 265997 (cherry picked from commit 768f6373eb3d60e346d3bfa495e04315aeed8ff9) sys/kern/kern_umtx.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)