Bug 262920 - bhyve: Guest fails to run: boot/userboot.so: Undefined symbol "getsecs" with WITH_BIND_NOW=yes
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: 13.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Kyle Evans
URL: https://reviews.freebsd.org/D34758
Keywords: needs-qa, regression
Depends on:
Blocks:
 
Reported: 2022-03-30 06:22 UTC by tech-lists
Modified: 2024-01-20 22:13 UTC

See Also:
koobs: mfc-stable13?
koobs: mfc-stable12-


Description tech-lists 2022-03-30 06:22:59 UTC
Hi,

Upgraded to stable/13-n250147-60338b80693

This server is host to several freebsd VMs.

They're started from a shell script that runs like this:

[...]
sh vmrun.sh -c 2 -m 32768M -t tap2 -d /data/freebsdvm.img freebsdvm
[...]

I upgraded (source upgrade) from 13.0-stable to 13.1-stable and rebooted, then tried to run a VM. The system hosts five FreeBSD VMs, and every one of them fails with this error:

 /boot/userboot.so: Undefined symbol "getsecs"

strings showed getsecs was in userboot.so but not in bhyve

The fix was to build world and kernel *without* this:

WITH_BIND_NOW=yes
Comment 1 tech-lists 2022-03-30 21:12:39 UTC
The latest change involving getsecs in the timeframe is https://cgit.freebsd.org/src/commit/?h=releng/13.1&id=4003cdd81b8776cb451395ffa53423ad52328bc9
Comment 2 Emmanuel Vadot freebsd_committer freebsd_triage 2022-03-31 08:55:56 UTC
I don't see how my commit could have caused this as getsecs() was called before too, I've just reduced the number of calls.
Comment 3 tech-lists 2022-03-31 13:50:55 UTC
(In reply to Emmanuel Vadot from comment #2)

I don't know how to debug this further.

userboot.so hasn't changed according to cgit since 2019 in https://cgit.freebsd.org/src/commit/?id=68861a62f5363e6984ba96efe6463e882a9c4896

bhyvectl hasn't changed since the beginning of 2019 in https://cgit.freebsd.org/src/commit/?id=62c47c7f6cdc6defa1dcd6a2ef39abb34a4c0351

the same goes for bhyve

in 13.0-p6, getsecs isn't present in userboot.so and bhyve works as expected.
Your change to getsecs is the only documented one in the timeframe 13.0 => 13.1

Where else to look?

I'm happy to update sources to the latest stable/13, remove your change, enable WITH_BIND_NOW= in /etc/src.conf, build a new world and kernel, and reboot, but I don't know enough about git to revert just your change while leaving everything else as it is.

If you could tell me how to do this, i'll do it, test then get back to you.

In case it wasn't clear in the initial report, the problem can be sidestepped by *not* enabling WITH_BIND_NOW in /etc/src.conf and rebuilding.

This might not be an issue with getsecs at all, but solely with WITH_BIND_NOW= and how it interacts with other libraries.

If it is, I'm happy to make another PR.
Comment 4 Kyle Evans freebsd_committer freebsd_triage 2022-03-31 15:14:46 UTC
The fact that it's historically worked is the shocking part. libsa has used getsecs for ages, and it hasn't been implemented in userboot for ages. Maybe we used to end up clobbering BIND_NOW's flags for stand and have stopped doing so.
Comment 5 Michael Dexter freebsd_triage 2022-03-31 17:45:53 UTC
From the bhyve call: Are your sources, userland, kernel, ports in sync, and you have cleanly booted to everything?
Comment 6 Kyle Evans freebsd_committer freebsd_triage 2022-03-31 17:53:56 UTC
(In reply to Michael Dexter from comment #5)

That's not really relevant, the current implementation of userboot simply cannot work with BIND_NOW because of how the option works -- the symbol is clearly not defined, BIND_NOW is working as expected and blowing up in rtld.
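
For readers unfamiliar with the option, here is a minimal sketch of the behaviour being described. It is not bhyveload's code: `try_load` is a hypothetical helper, and passing `RTLD_NOW` to `dlopen(3)` is used here only as a stand-in for the eager resolution that a BIND_NOW link requests for the object. With lazy binding an undefined function such as getsecs only matters if it is actually called; with eager binding the loader must resolve it when the object is loaded, and the load fails if it cannot.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: try to load a shared object either eagerly
 * (all symbols resolved at load time) or lazily (function symbols
 * resolved on first call).  Returns 0 on success, -1 on failure
 * after printing the dynamic linker's error message.
 */
int
try_load(const char *path, int eager)
{
	void *handle;

	handle = dlopen(path, eager ? RTLD_NOW : RTLD_LAZY);
	if (handle == NULL) {
		fprintf(stderr, "%s\n", dlerror());
		return (-1);
	}
	dlclose(handle);
	return (0);
}
```

Calling `try_load()` on an object with an unresolvable function reference would be expected to fail in eager mode while still succeeding in lazy mode.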
Comment 7 Kyle Evans freebsd_committer freebsd_triage 2022-03-31 18:28:59 UTC
CC toolchain@, maybe. I suspect the networking bits of libsa were all optimized out of userboot before (it can't do network bits), and a toolchain change might've stopped that.

I suspect we should just add a bogus getsecs() to userboot that panic()s. I considered a weak getsecs() in libsa, but I think that would hide it from presenting as a compile-time issue in all of the parts of libsa that aren't linked as shared objects. It's not really worth changing bhyveload to implement it properly for something that isn't going to be used. CC imp@ and jhb@ for a second opinion.
Comment 8 Kyle Evans freebsd_committer freebsd_triage 2022-03-31 18:29:35 UTC
(In reply to Kyle Evans from comment #7)

s/all the parts of libsa/all the parts of stand (other loaders)/
Comment 9 Dimitry Andric freebsd_committer freebsd_triage 2022-03-31 19:06:59 UTC
If I dump userboot.so with readelf, I see more than one UND symbol:

Symbol table '.dynsym' contains 1094 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND memmem
     2: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND time
     3: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND getsecs

Shouldn't these be defined somewhere in libsa? E.g. stuff like strcpy() and such are defined in one of the utility .a files that get statically linked into userboot.so.

Not sure why this worked before, but then again, this is a question that seems to come up often... :)
Comment 10 tech-lists 2022-03-31 21:15:17 UTC
(In reply to Kyle Evans from comment #7)

Would it be helpful if I reverted that commit and rebuilt userboot? [1]

Or is the problem more extensive than simply that?

[1] if true, please tell me exactly how, and i'll report back with results
Comment 11 Kyle Evans freebsd_committer freebsd_triage 2022-03-31 21:26:03 UTC
(In reply to Dimitry Andric from comment #9)

Right, memmem and time end up resolving from libc in the bhyveload that dlopen's it. That's a good point, though; we essentially have time() available and could implement a proper getsecs().

(In reply to tech-lists from comment #10)

No, we've no idea what commit in particular causes this failure without a proper bisect. It doesn't really matter, though.
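
The point about memmem and time resolving from the host process can be checked with a small sketch like the one below. `host_provides` is a hypothetical helper, not part of bhyveload; it simply asks the dynamic linker whether a name is visible in the global scope of the running process, which is where an undefined symbol in a dlopen'd object would be looked up.

```c
#define _GNU_SOURCE	/* needed for RTLD_DEFAULT on glibc; harmless elsewhere */
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the running process (including its
 * libc) exports a symbol with the given name, 0 otherwise.
 */
int
host_provides(const char *name)
{
	return (dlsym(RTLD_DEFAULT, name) != NULL);
}
```

In a dynamically linked host process, `host_provides("time")` should be true because libc exports it, while `host_provides("getsecs")` should be false unless something in the process defines it.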
Comment 12 John Baldwin freebsd_committer freebsd_triage 2022-04-01 18:49:49 UTC
I suspect WITH_BIND_NOW=yes is just not as typical a use case and that this has probably been broken for a while.  An implementation of getsecs in userboot is probably the simplest fix.
Comment 13 Kyle Evans freebsd_committer freebsd_triage 2022-04-03 01:53:53 UTC
I didn't see any relevant review opened, so I'll take it -- see: https://reviews.freebsd.org/D34758, this fixes our lua test harness WITH_BIND_NOW.
Comment 14 tech-lists 2022-04-12 23:30:25 UTC
(In reply to Kyle Evans from comment #13)

Hi, I'll be able to test this in the next couple of days and I'll get back to you.

thanks,
Comment 15 commit-hook freebsd_committer freebsd_triage 2022-04-13 00:34:26 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=660c1892d5c90500d37f98185326c6287b2b61be

commit 660c1892d5c90500d37f98185326c6287b2b61be
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2022-04-13 00:29:54 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2022-04-13 00:33:54 +0000

    loader: userboot: provide a getsecs() implementation

    We don't need it for userboot, but it avoids issues with BIND_NOW, so
    just provide it.  time(3) isn't defined but ends up being provided by
    libc linked into the host process, which is generally fine.

    PR:     262920
    Reviewed by:    imp, jhb
    MFC after:      3 days
    Diferential Revision:   https://reviews.freebsd.org/D34758

 stand/userboot/userboot/main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
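
As an illustration of what the commit message describes, a getsecs() built on time(3) could look roughly like this. This is a sketch, not the committed code (see the commit URL above for the real change), and the `time_t` return type is an assumption about how libsa declares getsecs().

```c
#include <time.h>

/* Assumed prototype; libsa is believed to declare getsecs() this way. */
time_t getsecs(void);

/*
 * Return the current time in seconds.  userboot runs inside a normal
 * host process, so time(3) can be resolved from the libc that the
 * host process is linked against.
 */
time_t
getsecs(void)
{
	return (time(NULL));
}
```

With a definition like this present in userboot.so, the getsecs reference no longer needs to be resolved from outside the object, so eager binding has nothing left to fail on for that symbol.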
Comment 16 John Kennedy 2022-04-21 14:15:22 UTC
The commit says "MFC after: 3 days", so it should have landed around 4/16, but I don't see it in stable/13 or releng/13.1 yet. Is it going to make it into 13.1?
Comment 17 commit-hook freebsd_committer freebsd_triage 2022-04-21 22:35:39 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=53fefea53f12f13ef53b639c7d2073ffc84523ab

commit 53fefea53f12f13ef53b639c7d2073ffc84523ab
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2022-04-13 00:29:54 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2022-04-21 22:33:21 +0000

    loader: userboot: provide a getsecs() implementation

    We don't need it for userboot, but it avoids issues with BIND_NOW, so
    just provide it.  time(3) isn't defined but ends up being provided by
    libc linked into the host process, which is generally fine.

    PR:     262920
    Reviewed by:    imp, jhb

    (cherry picked from commit 660c1892d5c90500d37f98185326c6287b2b61be)

 stand/userboot/userboot/main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
Comment 18 commit-hook freebsd_committer freebsd_triage 2022-04-21 22:38:41 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c85cf4929417ce6e11a84d1dfed13654b14c6ae7

commit c85cf4929417ce6e11a84d1dfed13654b14c6ae7
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2022-04-13 00:29:54 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2022-04-21 22:35:01 +0000

    loader: userboot: provide a getsecs() implementation

    We don't need it for userboot, but it avoids issues with BIND_NOW, so
    just provide it.  time(3) isn't defined but ends up being provided by
    libc linked into the host process, which is generally fine.

    PR:     262920
    Reviewed by:    imp, jhb

    (cherry picked from commit 660c1892d5c90500d37f98185326c6287b2b61be)

 stand/userboot/userboot/main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)