Summary: | bhyve: Guest fails to run: boot/userboot.so: Undefined symbol "getsecs" with WITH_BIND_NOW=yes | ||
---|---|---|---|
Product: | Base System | Reporter: | tech-lists |
Component: | bhyve | Assignee: | Kyle Evans <kevans> |
Status: | Closed FIXED | ||
Severity: | Affects Some People | CC: | dim, editor, imp, jhb, kevans, manu, toolchain, virtualization, warlock |
Priority: | --- | Keywords: | needs-qa, regression |
Version: | 13.1-RELEASE | Flags: | koobs: mfc-stable13?, koobs: mfc-stable12- |
Hardware: | amd64 | ||
OS: | Any | ||
URL: | https://reviews.freebsd.org/D34758 |
Description
tech-lists
2022-03-30 06:22:59 UTC
The latest change involving getsecs in the timeframe is https://cgit.freebsd.org/src/commit/?h=releng/13.1&id=4003cdd81b8776cb451395ffa53423ad52328bc9
I don't see how my commit could have caused this, as getsecs() was called before too; I've just reduced the number of calls.

(In reply to Emmanuel Vadot from comment #2)
I don't know how to debug this further. According to cgit, userboot.so hasn't changed since 2019, in https://cgit.freebsd.org/src/commit/?id=68861a62f5363e6984ba96efe6463e882a9c4896
bhyvectl hasn't changed since the beginning of 2019, in https://cgit.freebsd.org/src/commit/?id=62c47c7f6cdc6defa1dcd6a2ef39abb34a4c0351
The same goes for bhyve. In 13.0-p6, getsecs isn't present in userboot.so and bhyve works as expected. Your change to getsecs is the only documented one in the timeframe 13.0 => 13.1. Where else to look?

I'm happy to update sources to the latest stable/13, remove your change, enable WITH_BIND_NOW= in /etc/src.conf, build a new world and kernel, and reboot, but I don't know enough about git to revert just your change while leaving everything else as it is. If you could tell me how to do this, I'll do it, test, then get back to you.

In case it wasn't clear in the initial report, the problem can be sidestepped by *not* enabling WITH_BIND_NOW in /etc/src.conf and rebuilding. This might not be an issue with getsecs at all, but solely with WITH_BIND_NOW= and how it interacts with other libraries. If it is, I'm happy to make another PR.

The fact that it's historically worked is the shocking part. libsa has used getsecs for ages, and it hasn't been implemented in userboot for ages. Maybe we used to end up clobbering BIND_NOW's flags for stand and have stopped doing so.

From the bhyve call: Are your sources, userland, kernel, and ports in sync, and have you cleanly booted to everything?
(In reply to Michael Dexter from comment #5)
That's not really relevant, the current implementation of userboot simply cannot work with BIND_NOW because of how the option works -- the symbol is clearly not defined, BIND_NOW is working as expected and blowing up in rtld. CC toolchain@, maybe.

I suspect the networking bits of libsa were all optimized out of userboot before (it can't do network bits), and a toolchain change might've stopped that. I suspect we should just add a bogus getsecs() to userboot that panic()s. I considered a weak getsecs() in libsa, but I think that would hide it from presenting as a compile-time issue in all of the parts of libsa that aren't linked as shared objects. It's not really worth the bhyveload change to implement it properly for something that isn't going to be used. CC imp@ and jhb@ for a second opinion.

(In reply to Kyle Evans from comment #7)
s/all the parts of libsa/all the parts of stand (other loaders)/

If I dump userboot.so with readelf, I see more than one UND symbol:

    Symbol table '.dynsym' contains 1094 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND memmem
         2: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND time
         3: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND getsecs

Shouldn't these be defined somewhere in libsa? E.g. stuff like strcpy() and such are defined in one of the utility .a files that get statically linked into userboot.so. Not sure why this worked before, but then again, this is a question that seems to come up often... :)

(In reply to Kyle Evans from comment #7)
Would it be helpful if I reverted the hash and rebuilt userboot? [1] Or is the problem more extensive than simply that?

[1] if true, please tell me exactly how, and I'll report back with results

(In reply to Dimitry Andric from comment #9)
Right, memmem and time end up resolving from libc in the bhyveload that dlopen's it.
That's a good point, though; we essentially have time() available and could implement a proper getsecs().

(In reply to tech-lists from comment #10)
No, we've no idea which commit in particular causes this failure without a proper bisect. It doesn't really matter, though. I suspect WITH_BIND_NOW=yes is just not as typical a use case and that this has probably been broken for a while. An implementation of getsecs in userboot is probably the simplest fix.

I didn't see any relevant review opened, so I'll take it -- see https://reviews.freebsd.org/D34758, which fixes our lua test harness with WITH_BIND_NOW.

(In reply to Kyle Evans from comment #13)
Hi, I'll be able to test this in the next couple of days and I'll get back to you. thanks,

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=660c1892d5c90500d37f98185326c6287b2b61be

commit 660c1892d5c90500d37f98185326c6287b2b61be
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2022-04-13 00:29:54 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2022-04-13 00:33:54 +0000

    loader: userboot: provide a getsecs() implementation

    We don't need it for userboot, but it avoids issues with BIND_NOW, so just provide it. time(3) isn't defined but ends up being provided by libc linked into the host process, which is generally fine.

    PR: 262920
    Reviewed by: imp, jhb
    MFC after: 3 days
    Diferential Revision: https://reviews.freebsd.org/D34758

 stand/userboot/userboot/main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

It looks like the MFC is after 3 days, so it should have landed around 4/16, but I don't see it in stable/13 or releng/13.1 yet. Is it going to make it into 13.1?
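The fix described in the commit above is small enough to sketch. The following is a minimal illustration, not the committed code: it assumes getsecs() returns whole seconds as a time_t and simply defers to time(3), which the commit message says is supplied by the libc of the host process that loads userboot.so.

```c
#include <time.h>

/*
 * Illustrative stand-in for a userboot getsecs(): report the current
 * time in whole seconds by deferring to the host's time(3).
 * The return type is an assumption made for this sketch.
 */
time_t
getsecs(void)
{
	return (time(NULL));
}
```

With a definition like this linked into the shared object, getsecs no longer appears as an undefined dynamic symbol, so an eager (BIND_NOW) load has nothing left to fail on for that name.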
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=53fefea53f12f13ef53b639c7d2073ffc84523ab

commit 53fefea53f12f13ef53b639c7d2073ffc84523ab
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2022-04-13 00:29:54 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2022-04-21 22:33:21 +0000

    loader: userboot: provide a getsecs() implementation

    We don't need it for userboot, but it avoids issues with BIND_NOW, so just provide it. time(3) isn't defined but ends up being provided by libc linked into the host process, which is generally fine.

    PR: 262920
    Reviewed by: imp, jhb

    (cherry picked from commit 660c1892d5c90500d37f98185326c6287b2b61be)

 stand/userboot/userboot/main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c85cf4929417ce6e11a84d1dfed13654b14c6ae7

commit c85cf4929417ce6e11a84d1dfed13654b14c6ae7
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2022-04-13 00:29:54 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2022-04-21 22:35:01 +0000

    loader: userboot: provide a getsecs() implementation

    We don't need it for userboot, but it avoids issues with BIND_NOW, so just provide it. time(3) isn't defined but ends up being provided by libc linked into the host process, which is generally fine.

    PR: 262920
    Reviewed by: imp, jhb

    (cherry picked from commit 660c1892d5c90500d37f98185326c6287b2b61be)

 stand/userboot/userboot/main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
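The observation earlier in the thread, that time resolves from the libc of the process that dlopen's userboot.so while getsecs had no provider, can be checked from any host program. This is a rough sketch under assumptions: host_provides() is a hypothetical helper name, and it uses only the standard dlopen(3)/dlsym(3) interface to ask whether the running process already exports a given symbol. On some systems it may need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the running process (the main
 * program plus the libraries loaded with it, such as libc) exports
 * a symbol with the given name, 0 otherwise.
 */
int
host_provides(const char *name)
{
	void *self;
	int found;

	/* A NULL path yields a handle for the main program itself. */
	self = dlopen(NULL, RTLD_LAZY);
	if (self == NULL)
		return (0);
	found = (dlsym(self, name) != NULL);
	dlclose(self);
	return (found);
}
```

In an ordinary dynamically linked program, host_provides("time") is expected to succeed because libc exports it, while host_provides("getsecs") is expected to fail unless something in the process defines that name. A loaded object that leaves getsecs undefined therefore only works for as long as nothing forces the symbol to be bound.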