We are seeing many TrueNAS (previously FreeNAS) users hit core dumps in the main middlewared process (python) starting with our version 12.0 release.

Relevant OS information:
12.2-RELEASE-p6 FreeBSD 12.2-RELEASE-p6 f2858df162b(HEAD) TRUENAS amd64

Python versions that experience the core dump:
Python 3.8.7
Python 3.9.4

When initially researching this, I did find a regression with threading and Python 3.8 on FreeBSD and was able to resolve that particular problem by backporting these commits:

https://github.com/python/cpython/commit/4d96b4635aeff1b8ad41d41422ce808ce0b971c8
https://github.com/python/cpython/commit/9ad58acbe8b90b4d0f2d2e139e38bb5aa32b7fb6

I backported those commits because all of the core dumps that I've analyzed are panic'ing in the same spot (or very close to it). For example, here are 2 backtraces showing a null-ptr dereference.

Core was generated by `python3.8: middlewared'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  cond_signal_common (cond=<optimized out>) at /truenas-releng/freenas/_BE/os/lib/libthr/thread/thr_cond.c:457
warning: Source file is more recent than executable.
457             mp = td->mutex_obj;
[Current thread is 1 (LWP 100733)]
(gdb) list
452             _sleepq_unlock(cvp);
453             return (0);
454     }
455
456     td = _sleepq_first(sq);
457     mp = td->mutex_obj;
458     cvp->__has_user_waiters = _sleepq_remove(sq, td);
459     if (PMUTEX_OWNER_ID(mp) == TID(curthread)) {
460             if (curthread->nwaiter_defer >= MAX_DEFER_WAITERS) {
461                     _thr_wake_all(curthread->defer_waiters,
(gdb) p *td
Cannot access memory at address 0x0

and another one:

Core was generated by `python3.8: middlewared'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  cond_signal_common (cond=<optimized out>) at /truenas-releng/freenas/_BE/os/lib/libthr/thread/thr_cond.c:459
warning: Source file is more recent than executable.
459     if (PMUTEX_OWNER_ID(mp) == TID(curthread)) {
[Current thread is 1 (LWP 101105)]
(gdb) list
454     }
455
456     td = _sleepq_first(sq);
457     mp = td->mutex_obj;
458     cvp->__has_user_waiters = _sleepq_remove(sq, td);
459     if (PMUTEX_OWNER_ID(mp) == TID(curthread)) {
460             if (curthread->nwaiter_defer >= MAX_DEFER_WAITERS) {
461                     _thr_wake_all(curthread->defer_waiters,
462                             curthread->nwaiter_defer);
463                     curthread->nwaiter_defer = 0;
(gdb) p *mp
Cannot access memory at address 0x0

I'm trying to instrument a program to "stress" test threading (tearing down and recreating threads, etc.), but I've been unsuccessful at tickling this particular problem. The end users that have seen this core dump sometimes go a month or more without a problem. I'm hoping someone more knowledgeable can at least give me a pointer or help me figure this one out. I have access to my VM that has all the relevant core dumps available, so if someone needs remote access to it to "poke" around, please let me know. You can reach me at caleb [at] ixsystems.com
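
For what it's worth, the stress loop I've been trying looks roughly like this (a bare-bones sketch; the names and workload are made up and this is not the actual middlewared code):

import threading
import time
from concurrent.futures import ThreadPoolExecutor

def busy_work(n):
    # trivial CPU work so threads contend on the GIL and condition variables
    return sum(i * i for i in range(n))

def churn_once():
    # repeatedly spin up and tear down a thread pool
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(busy_work, 10_000) for _ in range(64)]
        for fut in futures:
            fut.result()
    # also create and join a handful of short-lived daemon threads
    threads = [threading.Thread(target=busy_work, args=(1_000,), daemon=True)
               for _ in range(16)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    while True:
        churn_once()
        time.sleep(0.01)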
(In reply to yocalebo@gmail.com from comment #0)
Thanks for reporting this. To narrow down the cause and to verify the fix, do you think it's possible to:

- Provide a minimal test case that reproduces the issue.
- Test it on vanilla FreeBSD, here 12.2-RELEASE; it would be nice if it could also be tested on 13.0-RELEASE and -CURRENT.

I know that debugging threading issues is hard and that you're already working on the first item, but we first need to reproduce this on an unmodified FreeBSD. Also, you mentioned backporting the two commits: does that help?
(In reply to Li-Wen Hsu from comment #1)
Hi Li-Wen Hsu, thanks for the feedback. I will definitely use a stock GENERIC FreeBSD 12.2 kernel. The hard part is instrumenting a program that causes the SIGSEGV :(. I can, however, confirm that stock 12.2 has a regression with threads on py3.8.7. You can use this script to test it:

from multiprocessing.pool import ThreadPool
from datetime import datetime

class Broken(object):
    def __init__(self):
        self.pool = ThreadPool()

    def __del__(self):
        self.pool.close()
        self.pool.join()

ohno = Broken()
date = datetime.now()
print(str(date), ohno)

Running that 52 times caused an indefinite hang ~19.2% of the time. After cherry-picking the 2 commits I linked above, I confirmed that doing so resolved the "hangs". It never did reproduce a core dump. Anyway, I'll keep trying to write a program that tickles the problem; a rough sketch of the driver loop I use is below. If you or anyone reading this has any ideas, I'm all ears and will happily try anything at this point :)
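
For completeness, this is roughly the driver I use to run the reproducer repeatedly and count hangs (a sketch; /tmp/broken_pool.py is just a hypothetical path for the script above, and python3.8 is assumed to be on PATH):

import subprocess

RUNS = 52
TIMEOUT = 30  # seconds; treat anything slower than this as "hung"

hung = 0
for i in range(RUNS):
    try:
        # run the reproducer in a fresh interpreter each time
        subprocess.run(["python3.8", "/tmp/broken_pool.py"],
                       timeout=TIMEOUT, check=False)
    except subprocess.TimeoutExpired:
        hung += 1
        print(f"run {i}: hang (killed after {TIMEOUT}s)")

print(f"{hung}/{RUNS} runs hung ({100 * hung / RUNS:.1f}%)")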
(In reply to yocalebo@gmail.com from comment #2)
> Running that 52 times caused an indefinite hang ~19.2% of the time. After cherry-picking the 2 commits I linked above, I confirmed that it resolved the "hangs". It never did reproduce a core dump.

If this is the case, I think we need to backport this to the current Python 3.8 (and 3.9?) ports. Has this been done upstream? We can do it locally, but we also need to check that the patches won't cause other regressions. It would be nice to work with upstream to reduce the possible regressions and to get the next version of 3.8 to include the fix.
(In reply to Li-Wen Hsu from comment #3)
Upstream Python 3.9 already includes these patches by default; they were never backported to 3.8. I'm fairly confident there are no regressions because:
1. they were committed to upstream Python for version 3.9
2. as of yesterday, there are 21,736 users running py3.8.7 with those 2 backported commits and we've seen no further reports of regressions.
(In reply to yocalebo@gmail.com from comment #4)
Thanks for the numbers; it sounds like we should have those patches included. Do you think you can ask whether upstream also wants to backport them to 3.8, or do you mean they have already been asked but don't want to do so?
(In reply to Li-Wen Hsu from comment #5)
Unfortunately, I'm not sure why they did not include the threading patch in Python 3.8. I suspect it's because the use case is relatively small (daemon=True threads and how they're destroyed/cleaned up by the calling thread) and they were very late in the "release" process for pushing out 3.8? I'm not sure; it's hard to know.
(In reply to yocalebo@gmail.com from comment #4)
Upstream has a "latest minor plus the two previous minors" merge policy. If you can identify the issues and/or PRs associated with those two missing merge commits, I can take care of liaising with upstream to get them merged in the long term.

Thank you for the excellent report, assessment and upstream research.
(In reply to yocalebo@gmail.com from comment #6)
I run FreeBSD 12.x and CURRENT for upstream CPython, which tests all non-EoL branches (latest and back 2 versions), so I'm surprised by this report. This would appear to hint at something else contributing to the issue.
(In reply to Kubilay Kocak from comment #7)
Hi Kubilay, thanks for adding Victor Stinner to this. Victor will more than likely be able to give a more definitive reason why those 2 commits did not get backported to python3.8, since Victor is the author of those commits :) Anyway, here are the upstream Python bugs for those 2 commits:

https://github.com/python/cpython/commit/9ad58acbe8b90b4d0f2d2e139e38bb5aa32b7fb6 links to https://bugs.python.org/issue19466
https://github.com/python/cpython/commit/4d96b4635aeff1b8ad41d41422ce808ce0b971c8 links to https://bugs.python.org/issue39511

Without those 2 commits, if you take the Python script I posted earlier and simply run it repeatedly in a while-true loop in the shell, you will notice that it eventually hangs indefinitely. With the 2 commits, the while-true loop runs forever as expected.
(In reply to Kubilay Kocak from comment #8)
I was quite surprised at the regression with py3.8 as well. But that was before I started looking at some of the changes from py3.7 to 3.8. Some of the changes are very intrusive. For example, here is a quote from one of the release note entries:

"The ``PyGC_Head`` struct has changed completely. All code that touched the struct member should be rewritten. (See :issue:`33597`.)"

That struct is the main struct for Python's garbage collection objects (AFAICT). I understand that it's a very "low-level" change and shouldn't affect too much, but it seems strange to me to make (what I perceive to be) such an intrusive change in a minor release. I will admit, though, that I do not follow the Python release process very closely, and this very well could be par for the course. :)
Update on the matter. I was able to analyze a python3.9.4 core dump, and I've narrowed things down to the fact that the thread on CPU that crashed is in the concurrent.futures._base.Executor class.

Core was generated by `python3.9: middlewared'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  pymalloc_alloc (ctx=<optimized out>, nbytes=56) at Objects/obmalloc.c:1621
warning: Source file is more recent than executable.
1621            if (UNLIKELY((pool->freeblock = *(block **)bp) == NULL)) {
[Current thread is 1 (LWP 100893)]
(gdb) info locals
size = 3
pool = 0x83757b000
bp = 0x0
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/local/lib/python3.9/xml/etree/ElementPath.py", line 282, in select
    yield elem
  File "/usr/local/lib/python3.9/xml/etree/ElementPath.py", line 389, in find
    return next(iterfind(elem, path, namespaces), None)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/disk_/disk_info_freebsd.py", line 96, in label_to_disk
    part = geom.class_by_name('PART').xml.find(f'.//provider[name="{dev}"]/../name')
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1521, in call_sync
    )
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 416, in transform_topology
    disk = self.middleware.call_sync('disk.label_to_disk', *args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 435, in transform_topology
    x[i] = self.transform_topology(x[i], dict(options, geom_scan=False))
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 432, in transform_topology
    x[key] = self.transform_topology(x[key], dict(options, geom_scan=False))
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 435, in transform_topology
    x[i] = self.transform_topology(x[i], dict(options, geom_scan=False))
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 432, in transform_topology
    x[key] = self.transform_topology(x[key], dict(options, geom_scan=False))
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 469, in pool_extend
    'topology': self.transform_topology(zpool['groups']),
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/io_thread_pool_executor.py", line 308, in _target
  File "/usr/local/lib/python3.9/threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.9/threading.py", line 954, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 912, in _bootstrap
    self._bootstrap_inner()
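
For context, the lookup in the label_to_disk frame boils down to something like the following (a standalone toy version using the stdlib ElementTree against a dummy geom-style XML, not the real kern.geom.confxml output):

import xml.etree.ElementTree as ET

# dummy, cut-down geom-style XML for illustration only
dummy_confxml = """
<mesh>
  <class>
    <name>PART</name>
    <geom>
      <name>ada0</name>
      <provider><name>ada0p1</name></provider>
      <provider><name>ada0p2</name></provider>
    </geom>
  </class>
</mesh>
"""

root = ET.fromstring(dummy_confxml)
dev = "ada0p2"
# same shape of query as label_to_disk(): find the geom that owns a given provider
part = root.find(f".//provider[name='{dev}']/../name")
print(part.text if part is not None else None)  # -> ada0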
Created attachment 224534 [details]
python38-fix-threading-v1.patch

At the outset: thanks for the report and the detailed descriptions.

The threading issue with Python 3.8 (tested with 3.8.9 from the Ports tree) can be reproduced via the script from comment #2. The following vanilla releases of FreeBSD were tested:

- 11.4-RELEASE-p9 [amd64/i386]
- 12.2-RELEASE-p6 [amd64/i386]
- 13.0-RELEASE-p0 [amd64]
- 14.0-CURRENT main-n246037-ca6e1fa3ce8 [amd64]

With the attached patch, which contains the two backported commits noted in comment #0 (BPO-19466 + BPO-39511), the regression doesn't occur on any platform listed above.

A side note: if a VM has only one vCPU, the threading issue doesn't seem to be reproducible with an unpatched Python 3.8.9.
Maybe some progress... not sure, but I'm cautiously optimistic. I analyzed 7 core dumps again to see if I spotted anything interesting. 6 of them were py3.8 and 1 was py3.9. 100% of them have a common pattern: the thread's *tstate has lxml.etree._ParserDictionaryContext in its frames, either in the thread that was on CPU when it core dumped or in some other thread. Since this is seemingly memory corruption, lxml has become my suspect, since it uses its own C bindings for the libxml2 and libxslt libraries. I've instrumented a somewhat complicated script that daemonizes, creates a concurrent.futures._base.Executor class and calls methods that use the lxml library to parse geom XML information in a while-true loop; a trimmed-down sketch is below. Am I looking at a red herring? I don't know, but this grabbed my attention, so I'm running the script to see if it'll tickle the problem. Maybe the version of py-lxml we're using has a subtle issue with py3.8+. (Cue my overly dramatic rant about the py3.8 changes to the PyGC_Head struct.)
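
For reference, the trimmed-down sketch of that stress script (it assumes the lxml package and FreeBSD's kern.geom.confxml sysctl; it skips the daemonizing part and the helper names are made up):

import subprocess
from concurrent.futures import ThreadPoolExecutor
from lxml import etree

def fetch_confxml():
    # grab the geom configuration XML from the kernel
    out = subprocess.run(["sysctl", "-n", "kern.geom.confxml"],
                         capture_output=True, check=True)
    return out.stdout

def parse_providers(raw_xml):
    # parse with lxml and pull out all provider names
    root = etree.fromstring(raw_xml)
    return [p.text for p in root.findall(".//provider/name")]

def hammer(iterations=100):
    for _ in range(iterations):
        parse_providers(fetch_confxml())

if __name__ == "__main__":
    # keep several executor threads parsing geom XML concurrently, forever
    with ThreadPoolExecutor(max_workers=8) as pool:
        while True:
            futures = [pool.submit(hammer) for _ in range(8)]
            for fut in futures:
                fut.result()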
Using the script from comment #2, I cannot reproduce the hang with Python 3.9.4 on FreeBSD 13. I used:

for i in $(seq 1 100); do python3.9 script.py || break; done
(In reply to Victor Stinner from comment #14) Hi Victor, thanks for testing it out. Your results are expected since the 2 commits that fix the problem were included with python3.9. Have you tried using python3.8?
I can reproduce the issue with Python 3.8 on Linux.

Well, first, don't rely on implicit resource management: don't rely on the __del__() destructor. You should use "with pool:" and then call pool.join() (a minimal sketch is at the end of this comment).

--

I'm not sure if it's related, but daemon threads are evil and should be avoided if possible. They're another variant of "implicit resource management" (don't join the thread, just expect it to disappear magically at exit). I fixed many race conditions involving daemon threads:

* https://vstinner.github.io/daemon-threads-python-finalization-python32.html
* https://vstinner.github.io/threading-shutdown-race-condition.html
* https://vstinner.github.io/gil-bugfixes-daemon-threads-python39.html

Python 3.9 avoids daemon threads in concurrent.futures: https://bugs.python.org/issue39812

But multiprocessing still uses daemon threads.

--

I also fixed multiple bugs in Python finalization to make it more reliable. My notes on that topic: https://pythondev.readthedocs.io/finalization.html
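
To make the first point concrete, here is how the reproducer from comment #2 could manage the pool explicitly instead of relying on __del__() (a minimal sketch):

from multiprocessing.pool import ThreadPool
from datetime import datetime

def main():
    pool = ThreadPool()
    with pool:                 # terminates the workers when the block exits
        results = pool.map(str, range(8))
    pool.join()                # explicitly wait for the worker threads to finish
    print(datetime.now(), results)

if __name__ == "__main__":
    main()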
Backporting "bpo-19466: Py_Finalize() clears daemon threads earlier (GH-18848)" sounds dangerous. This fix relies on many other fixes. Without the other fixes, clearing daemon threads earlier can hang or crash. I suggest you to not rely on implicit resource management. If you really have to, use Python 3.9 which seems to work for your issue.
(In reply to Victor Stinner from comment #17)
Thanks for the tips and the links to your notes. However, the panics are still happening on py3.9.4. Unfortunately, I've analyzed 11 core dumps at this point: 10 from py3.8 and 1 from py3.9.4. All of their tstates have a reference to lxml.etree._ParserDictionaryContext. I actually just analyzed a multiprocessing core dump and it is panic'ing during garbage collection (visit_decref), but the story is the same: _ParserDictionaryContext is in the thread at the time of the dump.
(In reply to Victor Stinner from comment #17)
Thank you for the feedback, Victor. Does this also mean (in your opinion) that carrying local patches for these commits in FreeBSD Ports is not a wise resolution?