Bug 237195 - pthread_mutex_unlock crash as unlocked mutex destroyed by signaled thread
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: threads
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-threads (Nobody)
Keywords: crash, needs-qa
Reported: 2019-04-11 09:12 UTC by freebsd
Modified: 2019-04-19 14:27 UTC (History)

Flags: koobs: mfc-stable12?, koobs: mfc-stable11?


Attachments
A simple program to reproduce the issue. (14.72 KB, text/plain)
2019-04-11 09:12 UTC, freebsd
Do not access mutex memory after unlock. (1.02 KB, patch)
2019-04-11 09:37 UTC, Konstantin Belousov
various gdb dumps of the issue (6.72 KB, text/plain)
2019-04-18 14:20 UTC, freebsd

Description freebsd 2019-04-11 09:12:08 UTC
Created attachment 203579
A simple program to reproduce the issue.

I have a program in which N threads communicate with a single thread through messages.
Some of these messages are used for synchronisation (e.g. flush).
Each message contains a mutex and a condition variable.
When tested on FreeBSD 12.0, the program crashed randomly.
Every clue indicated that the mutex was being destroyed while it was still being unlocked.
Because that program is large, I wrote a small, simplified program with similar behaviour, and it crashes the same way.

#0  0x000000080065c93a in ?? () from /lib/libthr.so.3
#1  0x0000000000401dd8 in mcv_progress (mcv=0x801c1ac00) at mutex_test.c:365 (calling pthread_mutex_unlock)
#2  0x0000000000401f46 in read_thread (arg=0x7fffffffdac0) at mutex_test.c:412
#3  0x0000000800654776 in ?? () from /lib/libthr.so.3

I suspect what happens looks something like this:

client/writer                   server/reader

1) queues message
2) locks message.mutex          1) dequeues message
                                2) processes message
3) waits for message.condition
                                3) locks message.mutex
                                4) signals message.condition
4) unlocks message.mutex        5) unlocks message.mutex
5) destroys message memory      6) somehow still inside pthread_mutex_unlock
6) frees message memory            after the client has regained control;
                                   maybe accesses the mutex contents and
                                   crashes because they have been trashed

The same program works fine under any load and parameters on several Linux versions and OSX 10.14.4.

If this hypothesis holds, I suppose it only affects programs that rapidly create and destroy mutexes (and possibly condition variables).
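
The suspected sequence above can be sketched roughly as follows. This is a minimal illustration, not the attached test program: the names `struct message`, `done`, `server_thread` and `message_round_trip` are hypothetical, and a single round trip stands in for the real message queue. The client destroys and frees the message as soon as it has unlocked the mutex, which is the point at which the server may still be returning from its own pthread_mutex_unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical message: one mutex and one condition variable per message. */
struct message {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             done;   /* predicate protected by mutex */
};

/* Server/reader: "process" the message, then wake the waiting client. */
static void *server_thread(void *arg)
{
    struct message *m = arg;

    pthread_mutex_lock(&m->mutex);
    m->done = 1;
    pthread_cond_signal(&m->cond);
    pthread_mutex_unlock(&m->mutex);   /* last use of *m by the server */
    return NULL;
}

/* Client/writer: one round trip. Returns 0 on success, -1 on setup failure. */
static int message_round_trip(void)
{
    struct message *m = malloc(sizeof(*m));
    pthread_t tid;

    if (m == NULL)
        return -1;
    if (pthread_mutex_init(&m->mutex, NULL) != 0) {
        free(m);
        return -1;
    }
    if (pthread_cond_init(&m->cond, NULL) != 0) {
        pthread_mutex_destroy(&m->mutex);
        free(m);
        return -1;
    }
    m->done = 0;

    /* Lock before the server can run, so the signal cannot be missed. */
    pthread_mutex_lock(&m->mutex);
    if (pthread_create(&tid, NULL, server_thread, m) != 0) {
        pthread_mutex_unlock(&m->mutex);
        pthread_cond_destroy(&m->cond);
        pthread_mutex_destroy(&m->mutex);
        free(m);
        return -1;
    }
    while (!m->done)
        pthread_cond_wait(&m->cond, &m->mutex);
    assert(m->done == 1);
    pthread_mutex_unlock(&m->mutex);

    /*
     * The server may still be inside its pthread_mutex_unlock() call here.
     * Destroying and freeing now is where the reported crash would occur
     * if the library touched the mutex memory after releasing it.
     */
    pthread_cond_destroy(&m->cond);
    pthread_mutex_destroy(&m->mutex);
    free(m);

    pthread_join(tid, NULL);   /* the server does not touch *m after unlocking */
    return 0;
}
```

Calling `message_round_trip()` repeatedly from several client threads would approximate the create/destroy churn described above. Whether it actually triggers the crash depends on timing and on the libthr version in use.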

Please find attached the test program.

To reproduce the issue, I interrupt the program with Ctrl-C after 5 seconds if it has not crashed, then restart it immediately.  Four or five attempts are usually enough.
Comment 1 Konstantin Belousov freebsd_committer 2019-04-11 09:37:18 UTC
I was unable to reproduce the issue with your test program.

Regardless, there is indeed an access to the mutex memory after the unlock.  Please try the attached patch and report back.  Note that the seemingly similar check in _mutex_leave_robust() only compares addresses.
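
As an illustration only, the general idea of not touching lock memory after releasing it might look like the sketch below. This is neither the attached patch nor libthr code: `struct toy_mutex`, `TOY_NOTIFY`, `toy_trylock` and `toy_unlock` are hypothetical names for a toy lock built on C11 atomics. Anything the unlock path needs from the structure is copied into a local variable while the lock is still held, and only that copy is used once the lock word has been cleared.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NOTIFY 0x1   /* hypothetical flag: caller has post-unlock work to do */

/* Hypothetical toy lock; this is NOT the layout of a libthr mutex. */
struct toy_mutex {
    atomic_int locked;   /* 0 = free, 1 = held */
    int        flags;    /* read-only after initialisation in this sketch */
};

/* Try to take the lock; returns true on success. */
static bool toy_trylock(struct toy_mutex *m)
{
    int expected = 0;

    return atomic_compare_exchange_strong_explicit(&m->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Release the lock. Returns true if the caller should perform the
 * (hypothetical) post-unlock notification.
 *
 * The unsafe shape would be:
 *     store 0 to m->locked;  then read m->flags;
 * because another thread may lock, unlock, destroy and free *m
 * between those two steps.
 */
static bool toy_unlock(struct toy_mutex *m)
{
    /* Precondition of this sketch: the caller holds the lock. */
    assert(atomic_load_explicit(&m->locked, memory_order_relaxed) == 1);

    /* Copy what is needed while the lock is still owned. */
    int flags = m->flags;

    /* From this store onward, *m must be treated as possibly gone. */
    atomic_store_explicit(&m->locked, 0, memory_order_release);

    /* Only the local copy is consulted after the release. */
    return (flags & TOY_NOTIFY) != 0;
}
```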
Comment 2 Konstantin Belousov freebsd_committer 2019-04-11 09:37:54 UTC
Created attachment 203581
Do not access mutex memory after unlock.
Comment 3 commit-hook freebsd_committer 2019-04-12 17:28:09 UTC
A commit references this bug:

Author: kib
Date: Fri Apr 12 17:27:19 UTC 2019
New revision: 346158
URL: https://svnweb.freebsd.org/changeset/base/346158

Log:
  Do not access mutex memory after unlock.

  PR:	237195
  Reported by:	freebsd@hurrikhan.eu
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week

Changes:
  head/lib/libthr/thread/thr_mutex.c
Comment 4 freebsd 2019-04-18 14:20:58 UTC
Created attachment 203764
various gdb dumps of the issue
Comment 5 freebsd 2019-04-18 14:21:17 UTC
Thanks for the patch.

I was able to apply and test it.
This specific issue appears to be fixed, but now the other side seems to have a problem.
I either get an error saying the mutex is invalid, or the program crashes during the log.

I'm attaching the stack trace, disassembly and register dump right after this message.
Comment 6 commit-hook freebsd_committer 2019-04-19 12:30:46 UTC
A commit references this bug:

Author: kib
Date: Fri Apr 19 12:30:15 UTC 2019
New revision: 346371
URL: https://svnweb.freebsd.org/changeset/base/346371

Log:
  MFC r346158:
  Do not access mutex memory after unlock.

  PR:	237195

Changes:
_U  stable/12/
  stable/12/lib/libthr/thread/thr_mutex.c
Comment 7 commit-hook freebsd_committer 2019-04-19 12:31:53 UTC
A commit references this bug:

Author: kib
Date: Fri Apr 19 12:31:17 UTC 2019
New revision: 346372
URL: https://svnweb.freebsd.org/changeset/base/346372

Log:
  MFC r346158:
  Do not access mutex memory after unlock.

  PR:	237195

Changes:
_U  stable/11/
  stable/11/lib/libthr/thread/thr_mutex.c
Comment 8 Konstantin Belousov freebsd_committer 2019-04-19 14:27:05 UTC
(In reply to freebsd from comment #5)
It sounds like an access to freed memory.

Anyway, reading disassembly to track down your issue is somewhat labor-intensive, and since you did not disassemble the whole function bodies, it is impossible.
Compile libc, libthr and rtld with debugging enabled:
 make -C lib/libc DEBUG_FLAGS=-g all install
and do the same for lib/libthr and libexec/rtld-elf.  Then reproduce the issue and show
the backtrace with debugging symbols, so that the source line numbers are
easily seen.