To support Capsicum, rtld currently offers the environment variable LD_LIBRARY_PATH_FDS to specify a list of directory file descriptors. That works for shared libraries, but not for plugins, and plugins shouldn't be mixed with shared libraries. An extra environment variable could map plugin directory path names to file descriptors (e.g. fd 4 mapping to /usr/local/lib/gawk). In that case, a dlopen() call against /usr/local/lib/gawk would use fd 4. I need this because, in my scenario, dlopen() already executes untrusted code, and for a certain piece of software I want to do this in Capsicum mode.
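To make the proposed mapping concrete, here is a minimal sketch of the lookup it implies. Nothing here exists in rtld: the struct, the function name, and the longest-prefix rule are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: a plugin directory and its descriptor. */
struct plugin_dir_map {
    const char *dir;    /* e.g. "/usr/local/lib/gawk", no trailing slash */
    int         fd;     /* directory fd opened before entering capability mode */
};

/*
 * Find the entry whose directory is the longest prefix of path.
 * On success, return its fd and store the remainder of the path
 * (relative to that directory) in *rel. Return -1 when nothing
 * matches; *rel is left untouched in that case.
 */
int
plugin_dir_lookup(const struct plugin_dir_map *map, size_t n,
    const char *path, const char **rel)
{
    size_t best_len = 0;
    int best_fd = -1;

    assert(path != NULL && rel != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(map[i].dir);

        /* Match whole components only: "/a/b" must not match "/a/bc/x". */
        if (len > best_len && strncmp(path, map[i].dir, len) == 0 &&
            path[len] == '/') {
            best_len = len;
            best_fd = map[i].fd;
            *rel = path + len + 1;
        }
    }
    return (best_fd);
}
```

With an entry { "/usr/local/lib/gawk", 4 }, a lookup of "/usr/local/lib/gawk/example.so" (an illustrative file name) would yield fd 4 and the relative name "example.so", which could then be opened with openat().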
That's an interesting case. Do you have a ready test case for your scenario? If not, I will prepare it myself.
> Do you have a ready test case for your scenario? If not, I will prepare it myself.
Not really a test case. I have a project for Lua programmers that implements the actor model and can spawn each actor in a Capsicum sandbox. Actors start empty, so plugins already loaded in the main process won't affect new actors. Therefore new actors will try to load the necessary plugins again: https://gitlab.com/emilua/emilua/-/blob/v0.6.0/src/state.ypp#L479 (dll::import() will just call dlopen()). I've been experimenting with sandboxing technologies a lot and opened other issues in FreeBSD's Bugzilla that were fixed already. rtld is the new challenge for my experiments. Here's an introduction if you need context on my use case: https://docs.emilua.org/api/0.6/tutorial/sandboxes.html
It seems to me this use case may be well served by fdlopen() - you would hoist path mapping and directory fd logic from rtld into the application. We could provide utility routines (in libcasper?) to aid in this.
> It seems to me this use case may be well served by fdlopen()
I'll have to think about it (give me a week?). I wasn't aware of fdlopen(). I reserve the right to change my mind, but here are my initial thoughts:
* Does fdlopen() avoid duplicates as well? How is that done? Does it compare inode numbers using fstat()?
* This change isn't automatic and it'd require changes across every piece of software that makes use of dlopen() (e.g. GAWK, Python, ...). I'm maintaining my own software so I can cope with any changes necessary to support Capsicum, but we should balance the weights as a whole nonetheless.
* Right now a few of my plugins load other plugins for whatever reason that must be done. For instance, a plugin linked against Qt will be split in two, where one half has no dependencies on the Qt libraries and then creates a thread to load the plugin that will bring Qt code into the process: <https://gitlab.com/emilua/qt6/-/blob/v1.0.1/src/qt_handle.cpp#L29> (this is done because Qt only works in the "main thread" and we need to lie to Qt about which one is the main thread, as global variables constructed at load time will remember the initializing thread). LD_LIBRARY_PATH_FDS will help with the paths for the shared Qt libraries here, but I still have to think about how to handle this scenario.
* I'll have to think about how to deal with cycles (unless fdlopen() already takes care of that for me).
Give me some time to think.
(In reply to vini.ipsmaker from comment #4)
Both dlopen and fdlopen call into load_object, and both paths do indeed compare device and inode:
    if (fstat(fd, &sb) == -1) {
        _rtld_error("Cannot fstat \"%s\"", printable_path(path));
        close(fd);
        free(path);
        return (NULL);
    }
    TAILQ_FOREACH(obj, &obj_list, next) {
        if (obj->marker || obj->doomed)
            continue;
        if (obj->ino == sb.st_ino && obj->dev == sb.st_dev)
            break;
    }
However, it looks like the dedup indeed requires the name:
    if (obj != NULL && name != NULL) {
        object_add_name(obj, name);
        free(path);
        close(fd);
        return (obj);
    }
so it indeed seems like there's some more investigation needed here still.
I don't object to having a new env var akin to LD_LIBRARY_PATH_FDS, and it is likely an easier/faster way to address existing software. fdlopen (perhaps with a dedup change in FreeBSD?) may be a conceptually cleaner path for new software or where platform-specific support is reasonable.
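The device/inode comparison quoted above can be tried outside rtld with plain fstat(). This is only a sketch of the same test; the function names are made up for illustration and are not rtld code.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return 1 if both descriptors refer to the same file (same device and
 * inode), 0 if they differ, and -1 if fstat() fails on either one.
 */
int
fd_same_file(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) == -1 || fstat(fd_b, &b) == -1)
        return (-1);
    return (a.st_dev == b.st_dev && a.st_ino == b.st_ino);
}

/* Convenience wrapper: open both paths and compare the resulting fds. */
int
paths_same_file(const char *p1, const char *p2)
{
    int fd1, fd2, rc = -1;

    assert(p1 != NULL && p2 != NULL);
    fd1 = open(p1, O_RDONLY);
    fd2 = open(p2, O_RDONLY);
    if (fd1 != -1 && fd2 != -1)
        rc = fd_same_file(fd1, fd2);
    if (fd1 != -1)
        close(fd1);
    if (fd2 != -1)
        close(fd2);
    return (rc);
}
```

Because the comparison ignores names, a symlink and its target, or two hard links to one file, compare as the same object.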
(In reply to Ed Maste from comment #5) https://reviews.freebsd.org/D44019
A commit in branch main references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=452c5e9995ab4cd6c7ea230cffe0c53bfa65c1ab
commit 452c5e9995ab4cd6c7ea230cffe0c53bfa65c1ab
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-02-22 01:18:06 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-02-22 01:27:09 +0000
    fdlopen(3): do not create a new object mapping if already loaded
    This is expected behavior for both dlopen(3) and fdlopen(3).
    PR: 277169
    Reviewed by: emaste
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D44019
 libexec/rtld-elf/rtld.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
A commit in branch stable/14 references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=11137153ab90c38cc5e9993f5c65491c47705024
commit 11137153ab90c38cc5e9993f5c65491c47705024
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-02-22 01:18:06 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-02-29 00:24:06 +0000
    fdlopen(3): do not create a new object mapping if already loaded
    PR: 277169
    (cherry picked from commit 452c5e9995ab4cd6c7ea230cffe0c53bfa65c1ab)
 libexec/rtld-elf/rtld.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
A commit in branch stable/13 references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=1545d8732d2e076d085c08a13d3b6fc981b7f7fb
commit 1545d8732d2e076d085c08a13d3b6fc981b7f7fb
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-02-22 01:18:06 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2024-02-29 00:24:43 +0000
    fdlopen(3): do not create a new object mapping if already loaded
    PR: 277169
    (cherry picked from commit 452c5e9995ab4cd6c7ea230cffe0c53bfa65c1ab)
 libexec/rtld-elf/rtld.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
> It seems to me this use case may be well served by fdlopen()
I gave it some thought, and I think fdlopen() should work perfectly for me. I'll proceed with it.
Do you still see value in an env var for dlopen path-fd mapping? Or, shall we close this bug report (now that kib has fixed the bug found as a side-effect of the investigation)? I've opened a review to add a Capsicum mention in fdlopen's man page: https://reviews.freebsd.org/D45108
A commit in branch main references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=d84fd89ecd404ffbf629381d2dde14fd79b39402
commit d84fd89ecd404ffbf629381d2dde14fd79b39402
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-05-07 01:45:50 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-05-07 13:09:59 +0000
    dlopen(3): mention fdlopen for capsicum(4)
    Capsicum-sandboxed applications generally cannot use dlopen, as absolute and cwd-relative paths cannot be accessed. Mention that fdlopen is useful for sandboxed applications.
    PR: 277169
    Reviewed by: markj, oshogbo
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D45108
 lib/libc/gen/dlopen.3 | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
> Do you still see value in an env var for dlopen path-fd mapping? Or, shall we close this bug report (now that kib has fixed the bug found as a side-effect of the investigation)?
You can close the bug report. I prefer to rely on fdlopen().
Closing as requested. The dlopen/fdlopen man page now carries a note that will hopefully help someone else with a similar question, and the investigation also turned up a bug that has since been fixed.
A commit in branch stable/14 references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=f2109683ce37927d02965fc97e5757761caf89ed
commit f2109683ce37927d02965fc97e5757761caf89ed
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-05-07 01:45:50 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-05-13 14:22:34 +0000
    dlopen(3): mention fdlopen for capsicum(4)
    Capsicum-sandboxed applications generally cannot use dlopen, as absolute and cwd-relative paths cannot be accessed. Mention that fdlopen is useful for sandboxed applications.
    PR: 277169
    Reviewed by: markj, oshogbo
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D45108
    (cherry picked from commit d84fd89ecd404ffbf629381d2dde14fd79b39402)
 lib/libc/gen/dlopen.3 | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
A commit in branch stable/13 references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=3c05a91910d7ee809f884ca6ed46944bfda260e4
commit 3c05a91910d7ee809f884ca6ed46944bfda260e4
Author:     Ed Maste <emaste@FreeBSD.org>
AuthorDate: 2024-05-07 01:45:50 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2024-05-13 14:24:57 +0000
    dlopen(3): mention fdlopen for capsicum(4)
    Capsicum-sandboxed applications generally cannot use dlopen, as absolute and cwd-relative paths cannot be accessed. Mention that fdlopen is useful for sandboxed applications.
    PR: 277169
    Reviewed by: markj, oshogbo
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D45108
    (cherry picked from commit d84fd89ecd404ffbf629381d2dde14fd79b39402)
    (cherry picked from commit f2109683ce37927d02965fc97e5757761caf89ed)
 lib/libc/gen/dlopen.3 | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
I've been using fdlopen() successfully under Capsicum. It works. However, if a plugin depends on libraries that are not yet loaded, they fail to load (as expected). I can fill LD_LIBRARY_PATH_FDS easily before a call to fdlopen(), but I cannot easily find the library search paths for the current process. Is there a way to query which paths (including builtin search paths) rtld will use for the currently running process? I'd then manually open these directories and fill LD_LIBRARY_PATH_FDS. If not, can we have a new function to do this query in CAPSICUM_HELPERS(3)?
dlinfo(RTLD_DI_SERINFO) returns the configured search path. See dlinfo(3), which also contains a usage example.