Bug 235158 - lang/lua53 no longer linked against pthread
Summary: lang/lua53 no longer linked against pthread
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: David Naylor
URL: https://reviews.freebsd.org/D18939
Reported: 2019-01-23 14:32 UTC by i+fbsd
Modified: 2019-06-17 06:28 UTC
Flags: bugzilla: maintainer-feedback? (russ.haley)

Description i+fbsd 2019-01-23 14:32:28 UTC
Previously, when I built lua53 for my own package repository, it was apparently linked against pthreads. Might changes in the FreeBSD 12 ports system have caused this to stop?

The issue is that it *was* previously by default linked against pthreads, so when I did an upgrade to FreeBSD 12 and rebuilt the package repo and upgraded all the packages, all of a sudden some of my lua code stopped working.

If it is intended to not by default link against pthreads, then perhaps a package message notifying of this change would be a good idea so people dependent upon the old behaviour will know they need to build their own lua if they depended on anything that might create new lua states in threads.

If this was unintended, then fixing the patch files in the port so that they again patch the makefile with '-pthread' in ldflags would likely be sufficient to mitigate this issue.

For now, the only workaround is to build your own lua.

When lua is not linked against pthreads, and a thread is started, and a new lua state is created in a thread, lua will hang completely.
Comment 1 i+fbsd 2019-01-23 15:05:17 UTC
Additional notes:

This bug is applicable to the FreeBSD pkg repository, not just personally built repositories.
Comment 2 Russell Haley 2019-01-23 20:14:20 UTC
Hi, 

I'm the Lua53 package maintainer. I'm relatively new to maintaining a package (and not great at C to be honest) so you'll have to bear with me. 

I have notes in the review about the decision to remove -pthread: https://reviews.freebsd.org/D13690

I did a fair bit of research, though obviously it wasn't complete: 
- Conversations on the Lua mailing list indicated that there should be no need to include pthread as Lua itself has no requirement for it. The Lua interpreter is not involved in the threading, it's simply a client library.
- I found a conversation on a Fedora list where they removed pthread which seemed to validate the conversation above. From memory, I believe the Fedora user in question changed their linking strategy to accommodate threading.
- As noted in the review, the -pthread issue seemed to be with a FreeBSD bug and my testing indicated that bug was no longer an issue so pthread was removed (details in the review noted above).

In the end, I felt that requiring pthread was outside the use case for a general package and was non-standard enough that it should be removed and left to a developer to build their own. 

I also assumed that if it becomes an issue, someone would speak up, so here you are!

I see a couple of avenues available:
1) Perhaps you should ask on the lua mailing list about your specific use case and if pthread *should* be required. That would validate this as a FreeBSD issue.
2) A deep dive on a mailing list or IRC to determine what about FreeBSD is causing the problem.
3) If required, an option/flag could be added to the port to toggle pthread. 

If a deeper understanding of the problem is required, there is another user that may be able to help but I'll hold off on volunteering him until we get more clarity. 

I'm happy to help in any way I can. Thoughts?

Russ
Comment 3 i+fbsd 2019-01-24 02:02:22 UTC
Hiya,

For as long as I can remember, back before I came to FreeBSD around 2012, I was dealing with having to compile my own lua to link against pthreads to stop some pet projects from hanging on creating a new thread.

I'm still reading the review you linked but figured it would be okay to go ahead and reply with some history and an idea of the issue and how I came to the conclusion that caused me to report this.

I don't think any ports in the FreeBSD repos use lua with threading, nor do any lua libraries in ports, so it would be a fairly difficult issue to catch. Of course, I may not be looking hard enough because I've only looked for lua lanes, llthreads2, and cqueues, which are not in ports that I can see.

As to requiring pthread when building...
cqueues and llthreads2 both hang on thread creation with lua53 built from ports, or installed from FreeBSD package repos. I only have one example test for cqueues. The way I came to this conclusion was to bang my head at it all night, make tons of changes to the lua source and the cqueues source, sprinkle in fprintfs, and find that luaL_newstate failed. When I started editing the lua source code I put the fprintfs around the alloc and found that it printed before the allocation in the new thread, but never after, so it was hung at that point. I didn't realize that night that not being linked against pthreads was the issue.

Then this morning I started looking at the bugzilla for lua and pthreads and so forth, then the svnweb to see if it still had the pthreads link flag in the patch and couldn't see it in the patch so felt that this might be the problem. I then built lua, with -pthread, and ran the same test, and it worked, then proceeded to try against my pet project and found it worked correctly again instead of hanging after several log lines.

Debugging malloc is out of my depth; I would definitely be at a loss there. If you have the time and want to try duplicating the issue, install lua from ports, and install cqueues (not using luarocks, it'll break it). For cqueues you need either openssl111 or base openssl (libressl and other openssl yield a broken cqueues). If you have any other ssl library you can build cqueues with 'ALL_LDFLAGS="-L/usr/lib" make all' and that seems to make it select base openssl. At that point copy the following into a file and run "lua53 $file". It'll hang, so have another terminal open to pkill -9 lua53. To see it work, get the lua source, build it with "make MYLIBS=-pthread freebsd", and then run "src/lua $file"; it'll finish fine.

file contents below:
  local cqueues = require 'cqueues'
  local thread = require 'cqueues.thread'
  
  local function print_data(sock, data) print(data) end
  local function start() thread.start(print_data, 'data') end
  
  local loop = cqueues.new()
  loop:wrap(start)
  loop:loop()

While building without pthreads enabled may work on other operating systems, I'd be interested to see whether "ldd `which lua`" there shows lua linked against a threading library (libthr here) or not (I'll see if I can get hold of some people to check this on their non-FreeBSD systems, or see about spinning up a VM here or there to check that out too).
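
A small sketch for that check, assuming only that ldd output looks like the FreeBSD examples in this report. The name thr_status is hypothetical. It reads ldd output on stdin and reports whether a threading library is listed. The default pattern matches libthr (FreeBSD); on Linux the library would be libpthread instead, so the pattern can be overridden.

```shell
# thr_status [PATTERN]: read `ldd` output on stdin and report whether a
# threading library matching PATTERN (default: libthr.so) is listed.
# Hypothetical helper, not part of any port or of Lua itself.
thr_status() {
    pattern=${1:-libthr[.]so}
    if grep -q -- "$pattern"; then
        echo "linked against threading library"
    else
        echo "not linked against threading library"
    fi
}

# Usage (not run here): ldd "$(which lua53)" | thr_status
```

Reading from stdin rather than calling ldd inside the function means the same helper works on output pasted from another machine.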

If more is needed in the form of a way to test this outside of what I've listed above, or perhaps a test for llthreads2, I can work towards that as well. If you need a test to run to validate things in the future as to threads and lua I can work with you on that, though I might need some input and probably would need to update the test at each lua version (or just depend on a test for lua 5.3 giving the results needed for knowing if such still needs to be done for other lua versions). If you have any other ideas on how to look deeper into this I'll help where I can. I don't currently possess the knowledge to debug malloc and friends but with some guidance can probably look although I doubt I'd be useful in that regard.

Let me know if testing on non-FreeBSD systems (checking ldd, and running the test) is unnecessary so I don't venture too far into that and find that I've done it for naught; otherwise, I'll update this report with that info as soon as I can.

Thanks
Comment 4 Russell Haley 2019-01-24 05:57:45 UTC
Hi, 

I've added pthread as a port option:

https://reviews.freebsd.org/D18939

The test as you have shown passes on FreeBSD 11.1. Does this patch satisfy your use case? 

Thanks for reporting this bug, cqueues is one of my favourite libraries and I use lua-http all the time too. 

Cheers, 
Russ
Comment 5 Andrew "RhodiumToad" Gierth 2019-01-24 07:48:35 UTC
(In reply to Russell Haley from comment #4)

This needs a proper investigation, not just blindly adding the pthread option back, because it's important that the lua-5.3.so not have a dependency on pthreads.
Comment 6 Andrew "RhodiumToad" Gierth 2019-01-24 07:57:17 UTC
(In reply to i+fbsd from comment #3)

Hi,

What is the output of ldd on your _cqueues.so?

When you say that luarocks breaks cqueues, what specifically do you mean? If you built cqueues yourself, how exactly did you do so?

Your test program works on my (freebsd 11) system (on which cqueues _was_ installed via luarocks) without any errors or hangs.
Comment 7 i+fbsd 2019-01-24 12:04:30 UTC
(comment #6)
ldd /usr/local/lib/lua/5.3/_cqueues.so
/usr/local/lib/lua/5.3/_cqueues.so:
        libssl.so.111 => /usr/lib/libssl.so.111 (0x8006be000)
        libcrypto.so.111 => /lib/libcrypto.so.111 (0x800e00000)
        libthr.so.3 => /lib/libthr.so.3 (0x800753000)
        libm.so.5 => /lib/libm.so.5 (0x80077e000)
        libc.so.7 => /lib/libc.so.7 (0x800248000)

On a FreeBSD 12-RELEASE system, luarocks installed from source (the latest release on their page) has an issue where unzip fails. Installing unzip from ports didn't seem to fix it, and I can't remember what I actually did to fix it. I only ran into this on one machine; on the other machine I think I took 3.0.1 and went with it to avoid the issue with luarocks outright.

cqueues fails to build properly with libressl and openssl from ports (when installed from a repo that has set them as the default SSL library). cqueues was built from source (not luarocks) for each of these tests. For libressl and openssl it complained of an undefined symbol. For libressl the undefined symbol was CRYPTO_THREAD_run_once. For openssl I sadly can't remember at this moment, and would likely have to rebuild ports against it again just to get the error back. The thing is, cqueues would build (make, make install), but when 'require("cqueues")' is run it would complain about the aforementioned symbols.

cqueues did build against base ssl or openssl111 when built with luarocks. When loaded by lua, however, it complained of missing 'inotify' symbols. That could be as simple as a badly configured luarocks, but I'm not sure, since I've never before seen luarocks consider itself to be running on linux, direct things to be built against linux interfaces, and have that build succeed and only fail at runtime with an undefined symbol.

On a FreeBSD 11.2-RELEASE jail, luarocks+cqueues (albeit an older version of luarocks) work and generate a working cqueues seamlessly (recently tested). This is actually with a lua53 from FreeBSD pkg repos, where "ldd lua" does not show libthr. So consequently, it would seem that -pthread *is not needed* on FreeBSD 11.2-RELEASE? So we can perhaps say that this bug only affects FreeBSD 12-RELEASE.

With a system where libressl is the default ssl under FreeBSD 12-RELEASE, I built cqueues like so: "ALL_LDFLAGS=-L/usr/lib make all" then "make install".

With a system where the ssl was base or openssl111 I just did "make all" then "make install". There was of course a fetch of the release and an untar somewhere in there. The "make install" was run as root. 

When luarocks attempts to fetch cqueues for lua-http which I also use, I let it, then tell luarocks to "remove --force --local cqueues" and let luarocks complain about dependencies in the future. I basically get things working and use luarocks where I can, and where I can't, install myself if possible. I came up with this method when I saw a pull request still in the works in regards to libressl but never merged over at wahern/cqueues and I had systems that were using libressl as default.

(comment #4)
I'll need to patch ports and do a poudriere run, but from a quick glimpse at the diff I'm guessing it will work. I'll update with information as to this after patching ports and generating a new package.

(comment #5)
I'll naively ask why it's important that lua-5.3.so not have a dependency on pthreads? I can't exactly think of a case where it would degrade performance or cause an issue, but I'd like to understand the reasoning behind that. That said, the review linked gives the option of linking against pthreads. I would argue for it to be built against pthreads by default (the default option), since the FreeBSD pkg repos will use the defaults in the port, and without it as default, everyone who uses lua at this point with threads will need to build lua from ports or from source. That won't affect me (I'll just build with pthreads on my own pkg repos) but would affect others. Also I would assume the test code given in the review probably would work under lua 5.1 and 5.2. I'm pretty sure about 5.2, but I've actually actively avoided lua 5.1 and came to lua using 5.2, so wouldn't know for sure.
Comment 8 i+fbsd 2019-01-24 13:04:56 UTC
(comment #4)
Patched ports, poudriere ran, repo updated, installed new lua53, cqueues test in review passed. FreeBSD 12-RELEASE. Prior under same FreeBSD version failed. 

That will fix my use case since I build my own packages with my own options set. Without pthread as the default option, users of FreeBSD's default package repos will be left hanging.

Thanks, that'll fix it for me.
Comment 9 Russell Haley 2019-01-25 05:52:43 UTC
>>! In D18939#404544, @andrew_tao173.riddles.org.uk wrote:
> Russ, did you check whether that test actually failed without the patch? Because it works fine for me without changing a thing.

Well this is fun. I have an 11.1 jail that has only ever had lua53 installed from pkg (on my server, where I tested the bug initially). The test fails in that jail every time. 

The VirtualBox VM on my laptop is 11.1 as well; it's my dev computer. The test failed when I ran Lua53 from ports prior to building with this patch. I noted after installing the patched version, the test *failed* the first time I ran lua with the test script but *succeeded* every time after that. I assumed that since "running a script is deterministic" I had clearly done something weird. Now check the two different shell outputs:


```
russellh@g1 ~/P/n/b/lua> lua53
fish: Unknown command 'lua53'
russellh@g1 ~/P/n/b/lua> sudo pkg install lua53
Updating GhostBSD repository catalogue...
GhostBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
	lua53: 5.3.5

Number of packages to be installed: 1

The process will require 1 MiB more space.

Proceed with this action? [y/N]: y
[1/1] Installing lua53-5.3.5...
[1/1] Extracting lua53-5.3.5: 100%
russellh@g1 ~/P/n/b/lua> lua53
Lua 5.3.5  Copyright (C) 1994-2018 Lua.org, PUC-Rio
> ⏎
russellh@g1 ~/P/n/b/lua> cd /tmp
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
data
russellh@g1 /tmp> lua53 pth-test.lua 
data
russellh@g1 /tmp> 
```

I (naturally) expected this to fail every time because the installation is from pkg (with no -pthread)? I then removed the pkg installation and built using the patched make file, but without -pthread. Here is my link step for proof:


```
> --- liblua-5.3.so ---
> cc -o liblua-5.3.so -Wall -Wextra -DLUA_COMPAT_5_2 -DLUA_USE_POSIX -DLUA_USE_DLOPEN -DLUA_USE_READLINE_DL -isystem /usr/local/include -O2 -pipe  -fPIC -fstack-protector -isystem /usr/local/include  -fstack-protector -fstack-protector -shared -Wl,-soname=liblua-5.3.so lapi.o lcode.o lctype.o ldebug.o ldo.o ldump.o lfunc.o lgc.o llex.o lmem.o lobject.o lopcodes.o lparser.o lstate.o lstring.o ltable.o ltm.o lundump.o lvm.o lzio.o lauxlib.o lbaselib.o lbitlib.o lcorolib.o ldblib.o liolib.o lmathlib.o loslib.o lstrlib.o ltablib.o lutf8lib.o loadlib.o linit.o -lm
> --- liblua-5.3.a ---
> ar -crD liblua-5.3.a lapi.o lcode.o lctype.o ldebug.o ldo.o ldump.o lfunc.o lgc.o llex.o  lmem.o lobject.o lopcodes.o lparser.o lstate.o lstring.o ltable.o  ltm.o lundump.o lvm.o lzio.o lauxlib.o lbaselib.o lbitlib.o lcorolib.o ldblib.o liolib.o  lmathlib.o loslib.o lstrlib.o ltablib.o lutf8lib.o loadlib.o linit.o 
> ranlib liblua-5.3.a
> --- lua53 ---
> cc -o lua53  -fstack-protector lua.o liblua-5.3.a -lm -Wl,-E -L/usr/local/lib
> --- luac53 ---
> cc -o luac53  -fstack-protector luac.o liblua-5.3.a -lm -Wl,-E -L/usr/local/lib
> 

```


Now look at the following tests: 


```
russellh@g1 /tmp> lua53 pth-test.lua 
data
russellh@g1 /tmp> lua53 pth-test.lua 
data
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
data
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
russellh@g1 /tmp> lua53 pth-test.lua 
data


``` 

uuuh, what? For counterpoint, here's my server that has only had lua from pkg and has never linked to -pthread:


```
russellh@freebird:~/lua_bug$ uname -a
FreeBSD freebird 11.1-RELEASE FreeBSD 11.1-RELEASE #0: Sat Jun  2 22:49:42 PDT 2018     russellh@sylvester:/usr/obj/usr/src/sys/GENERIC  amd64
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$ lua53 bugtest.lua
russellh@freebird:~/lua_bug$

```
Comment 10 Andrew "RhodiumToad" Gierth 2019-01-25 05:54:24 UTC
(In reply to i+fbsd from comment #7)

You have to understand that this is not a bug in the lua port, but rather a bug in the base system; changing the lua port is a *workaround*, not a fix.

It's important for lua-5.3.so (as distinct from the lua53 binary) not to be linked against pthreads because of exactly the problem you're reporting. If it were true that linking the .so against pthreads did not cause any issue, then your bug report would not exist; if you link the .so against pthreads as part of a workaround to fix *your* bug, then it breaks anyone *else* who is trying to use Lua in dynamic plugins in non-threaded programs. (Lua itself has no interaction with pthreads whatsoever.)

(Or to put it another way: if it's safe to link the .so against pthreads then it is also unnecessary to do so, while if it's not safe then obviously it must not be done. Therefore, either way you don't do it.)

Would you be prepared to try out some non-Lua test cases for me to try and reproduce the underlying bug?
Comment 11 Russell Haley 2019-01-25 06:10:56 UTC
(In reply to andrew from comment #5)
Thanks for stepping in. :)
Comment 12 Andrew "RhodiumToad" Gierth 2019-01-25 06:19:52 UTC
(In reply to Russell Haley from comment #9)

The result's not deterministic because the program never actually waits for the thread to run, so it can exit the main program even before the thread gets going.

Here is a corrected test:

local cqueues = require 'cqueues'
local thread = require 'cqueues.thread'

local function print_data(sock, data) print(data) end
local function start()
	local thr,sock = thread.start(print_data, 'data')
	thr:join()
end

local loop = cqueues.new()
loop:wrap(start)
assert(loop:loop())
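
Since a single run proved misleading, it can help to repeat the test and count outcomes. A minimal sketch, assuming the script prints exactly `data` on success and that timeout(1) is available; count_passes and TIMEOUT_SECS are hypothetical names. SIGKILL is used because the hang reportedly needs a kill -9.

```shell
# count_passes RUNS EXPECTED CMD [ARGS...]
# Hypothetical helper: run CMD RUNS times, killing any run that exceeds
# TIMEOUT_SECS (default 10), and print how many runs produced exactly
# EXPECTED on stdout, as "passes/runs".
count_passes() {
    runs=$1
    expected=$2
    shift 2
    secs=${TIMEOUT_SECS:-10}
    passes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # A hung or failed run yields different (usually empty) output.
        out=$(timeout -s KILL "$secs" "$@" 2>/dev/null)
        if [ "$out" = "$expected" ]; then
            passes=$((passes + 1))
        fi
        i=$((i + 1))
    done
    echo "$passes/$runs"
}

# Usage (not run here): count_passes 20 data lua53 pth-test.lua
```

A result below 20/20 with the joined version of the test would point at a real hang rather than at the race in the original script.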
Comment 13 Andrew "RhodiumToad" Gierth 2019-01-25 10:30:26 UTC
(In reply to i+fbsd from comment #3)

Can you run your test (using the original unpatched lua) using

MALLOC_CONF="utrace:true" ktrace -t +w lua53 ...

If/when it hangs, then kill it with -9

Then make the output of  kdump -RH  available (it'll probably be a couple of megabytes).

And just to check, can you make sure that adding the thr:join() call as in my version of the test script doesn't change the result for you.
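
The tracing steps could be wrapped up roughly as follows. This is only a sketch: trace_hang is a hypothetical name, it assumes timeout(1) is available to deliver the kill -9, and it relies on ktrace and kdump both using their default ktrace.out file in the current directory.

```shell
# trace_hang SECONDS OUTFILE CMD [ARGS...]
# Hypothetical wrapper: run CMD under ktrace with malloc utrace records
# enabled, SIGKILL it if it is still running after SECONDS, then dump the
# trace with kdump -RH into OUTFILE.
trace_hang() {
    secs=$1
    outfile=$2
    shift 2
    if ! command -v ktrace >/dev/null 2>&1; then
        echo "trace_hang: ktrace not found (this sketch targets FreeBSD)" >&2
        return 127
    fi
    status=0
    # Same invocation as requested above, with timeout(1) standing in for
    # the manual kill -9.
    MALLOC_CONF="utrace:true" timeout -s KILL "$secs" ktrace -t +w "$@" || status=$?
    # kdump reads ./ktrace.out by default.
    kdump -RH > "$outfile"
    echo "exit status $status; trace written to $outfile"
}

# Usage (not run here): trace_hang 30 lua_trace.txt lua53 pthr_test.lua
```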
Comment 14 i+fbsd 2019-01-25 13:26:04 UTC
First thing:

Could we be sure to test this against FreeBSD 12 as well? I'm unsure as to why it apparently worked just fine in an 11.2 jail under 12, but it did. I probably can bring up an 11.1 jail too. If you need someone to help when checking against 12, I volunteer (I keep seeing you both ref 11.* and it sort of seems as if neither of you have a 12 system to easily test against). That said, tell me what exactly to test and how, and I'll jump to and get back to you with results ASAP.

(comment #9)
Sorry as to the test, I apparently depended on a race condition. Glad you found a way to make it fully deterministic. Threading is hard. Creating a threading test is harder.

(comment #10)
I really would like at least a workaround for the bug in base in the port until base is fixed, otherwise we'll be waiting on a patch release or FreeBSD 13 no?

I can understand the reason as to lua-5.3.so now. Is there perhaps a way to build the lua cli only with pthread, and the rest without?

I would be more than willing to try out any test cases you wish. Do note I'm running FreeBSD 12 as a host system and can likely get any 11.* jail up you wish (latest patch only though). So if you wish I can run the tests under *each* release you deem needed so long as you specify.

(comment #13)
I'll run that test soon and get you the results. I will by default run tests you specify under FreeBSD 12 only. If you specify other FreeBSD releases to attempt (probably 11+; I might even be able to do 10, but no guarantee there) I'll make every effort to get you the results ASAP. Also, please note if something needs to be done outside of a jail (in which case only FreeBSD 12 is available to me at this time, although I might be able to get a VM of another release going; that would just delay things).
Comment 15 i+fbsd 2019-01-25 14:24:02 UTC
# original lua (unpatched makefile, pkg repo == mine) proof
# below
ldd /usr/local/bin/lua53
/usr/local/bin/lua53:
        libm.so.5 => /lib/libm.so.5 (0x80027e000)
        libc.so.7 => /lib/libc.so.7 (0x8002b0000)
-----------------------------------------------------------------------------
# Test = pthread, lua (unpatched makefile, pkg repo == mine)
lua53 pthr_test.lua
Killed
-----------------------------------------------------------------------------
# Test = MALLOC_CONF, lua (unpatched makefile, pkg repo = mine)
MALLOC_CONF="utrace:true" ktrace -t +w lua53 pthr_test.lua
Killed
-----------------------------------------------------------------------------
# ran, kdump -RH output linked
# disallows attachments over 1000KB, file is 1606KB

https://builds.amlegion.org/freebsd_test/malloc_conf_test_unpatched_lua.txt
-----------------------------------------------------------------------------
Comment 16 Andrew "RhodiumToad" Gierth 2019-01-25 17:13:02 UTC
(In reply to i+fbsd from comment #15)

I put up a little test prog at https://github.com/RhodiumToad/dynthr-test

Can you try it and see if it hangs the same way? If it doesn't, try removing some of the progress messages or adjusting the size/number of mallocs. (No need to test it on versions other than 12)

If it does hang, can you ktrace it the same way as before?

(This is an attempt to reproduce the issue without involving lua at all)
Comment 17 Andrew "RhodiumToad" Gierth 2019-01-25 17:18:06 UTC
(In reply to i+fbsd from comment #14)

Getting the lua53 binary linked against libthr is I think the way to go at this time - it doesn't even need to be optional since (unlike with the .so) there's no hazard to non-threaded code to worry about. I've already mentioned this on the review.
Comment 18 i+fbsd 2019-01-25 19:30:15 UTC
(In reply to andrew from comment #16)
https://builds.amlegion.org/freebsd_test/dynthr_ktrace.txt
That is the ktrace as per the method specified when we did the lua ktrace.

This does not hang in exactly the same way as the lua test case, but that's because it doesn't mask all signals like cqueues does when creating a thread (if I remember correctly). I do know that cqueues had the signals masked, and that's why, when it hung, even a CTRL+C (SIGINT) didn't kill it and a kill -9 was required.

The failure exhibited itself with no changes to the code whatsoever. The SIGALRM takes the place of having to CTRL+C, and with no signals masked one can still get this to exit even though the thread is hung. No messages about thrashing were ever displayed, and they should have been, either from the main thread or the child thread.
Comment 19 Andrew "RhodiumToad" Gierth 2019-01-26 05:24:25 UTC
(In reply to i+fbsd from comment #18)

OK. There are some small differences in the ktrace between the test program and the lua example, but the test program is sufficient to demonstrate the existence of a base system bug.

Can you try the test again with RTLD_LOCAL changed to RTLD_LOCAL|RTLD_NOW ?

And if that still hangs, then fetch the latest version of the test and try #defining PRELOAD_LIBTHR and see if that makes any difference.

Would you be prepared to report this against the base system? I would do it myself except that, as you deduced, I do not have a working 12 system yet (and the bug doesn't manifest on 11). It's most likely an issue in the interaction between libthr and malloc, or in the interaction between libthr and rtld. (The test with RTLD_LOCAL changed to RTLD_LOCAL|RTLD_NOW might show which of these it is.)

(When libthr is loaded into a process, it has to ensure a bunch of symbol references are pre-resolved by rtld because otherwise they would cause deadlocks. I'm wondering if it is simply that some reference of this kind has been omitted by mistake.)
Comment 20 Andrew "RhodiumToad" Gierth 2019-01-26 05:55:46 UTC
(In reply to i+fbsd from comment #18)

Don't bother with my request to file a base system bug - kevans already did that for us (thanks).
Comment 21 i+fbsd 2019-01-26 12:34:29 UTC
Fails with both RTLD_NOW and PRELOAD:

ktrace = RTLD_NOW: http://builds.amlegion.org/freebsd_test/dynthr_ktrace_rtld_now.txt

ktrace = PRELOAD:
http://builds.amlegion.org/freebsd_test/dynthr_ktrace_preload.txt

Awesome that the report has already been filed.
Comment 22 Russell Haley 2019-01-27 06:24:45 UTC
(In reply to i+fbsd from comment #14)
"Could we be sure to test this against FreeBSD 12 as well?"

I've got a 12-RELEASE vm up now.
Comment 23 Russell Haley 2019-04-28 23:53:20 UTC
This should probably be closed. The patch was committed in r496471.
Comment 24 Kubilay Kocak 2019-06-17 06:28:26 UTC
Assign to committer that resolved