Bug 281440 - devel/llvm18: -Wl,--version-script failing in 'configure' script tests / silent unexpected ABI changes
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Brooks Davis
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-09-11 14:10 UTC by John Hein
Modified: 2024-09-11 20:57 UTC
CC List: 2 users

See Also:
bugzilla: maintainer-feedback? (brooks)


Attachments
[patch] address newer lld changes which choke on undefined symbols listed in linker scripts (--version-script) (4.28 KB, patch)
2024-09-11 18:27 UTC, John Hein

Description John Hein 2024-09-11 14:10:05 UTC
Some ports whose configure scripts test the linker for --version-script support are now producing libraries that no longer have symbol versioning enabled.

some examples: security/libtasn1, security/p11-kit

After updating a FreeBSD system from 13.2 to 13.4, I started seeing linking failures due to libraries that USED to have symbol versioning information that now do not.  13.2's base C compiler was llvm15 and 13.4 has llvm18.

Here is (basically) what the configure script is doing:

% cat > conftest.map << eof
VERS_1 {
        global: sym;
};

VERS_2 {
        global: sym;
} VERS_1;
eof
% cat > conftest.c << eof
int main(void) { return 0; }
eof
% cc -Wl,--version-script=conftest.map -o dummy conftest.c


clang15 had no problem with that, but clang18 does not like the undefined symbols:

ld: error: version script assignment of 'VERS_1' to symbol 'sym' failed: symbol not defined
ld: error: version script assignment of 'VERS_2' to symbol 'sym' failed: symbol not defined
cc: error: linker command failed with exit code 1 (use -v to see invocation)


As a result, the configure script disables symbol versioning, and the libraries it produces no longer have the version definitions listed in the linker script that would have been used if symbol versioning were enabled.

This was noticed when ports were rebuilt without rebuilding their dependent ports when no ABI SHOULD have changed (e.g., changing man/ to share/man in security/libtasn1).  The new lib that was built no longer had the version definitions from the linker script.  And dependent libraries broke due to the unexpected ABI change as a result.  For example:

ld-elf.so.1: /usr/local/lib/libtasn1.so.6: version LIBTASN1_0_3 required by /usr/local/lib/libwebkit2gtk-4.0.so.37

One fix for this is to add -Wl,--undefined-version when doing the configure test (see for instance the recent heimdal linker failure in bug 275979).

But I don't know all the ports that are affected.  It was noticed (so far) for security/p11-kit and security/libtasn1.  If it is just those two, maybe it's manageable.

It's worse (from a detection problem point of view) if the affected port doesn't even have a PORTREVISION bump but gets rebuilt for some other reason that causes the new versions of the libraries to get the ABI change without the linker script symbols.

Maybe the autoconf m4 scripts should be updated to produce a test that includes using --undefined-version for the linker (or adds an attempt to try with --undefined-version when the first attempt fails without it) on a port by port basis.
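
The retry idea above could be sketched in shell roughly as follows. This is only an illustration, not the gnulib macro or the attached patch: `check_version_script` and `VS_EXTRA_LDFLAGS` are hypothetical names, and the compiler driver is assumed to come from `$CC` (defaulting to `cc`).

```shell
#!/bin/sh
# Sketch only: check_version_script and VS_EXTRA_LDFLAGS are hypothetical
# names, not part of autoconf or gnulib.
: "${CC:=cc}"

check_version_script() {
    # Work in a scratch directory so no conftest files are left behind.
    tmpdir=$(mktemp -d) || return 2
    cat > "$tmpdir/conftest.map" << 'EOF'
VERS_1 {
        global: sym;
};

VERS_2 {
        global: sym;
} VERS_1;
EOF
    printf 'int main(void) { return 0; }\n' > "$tmpdir/conftest.c"

    VS_EXTRA_LDFLAGS=
    status=1
    # Try the plain probe first; only if it fails, retry with
    # --undefined-version added.
    for extra in "" "-Wl,--undefined-version"; do
        # $extra is left unquoted so the empty case adds no argument.
        if $CC $extra -Wl,--version-script="$tmpdir/conftest.map" \
               -o "$tmpdir/conftest" "$tmpdir/conftest.c" 2>/dev/null; then
            VS_EXTRA_LDFLAGS=$extra
            status=0
            break
        fi
    done
    rm -rf "$tmpdir"
    return $status
}
```

Because the plain attempt runs first, a linker that already accepts the undefined symbol never sees --undefined-version, which matters for linkers too old to know that flag. On success, `VS_EXTRA_LDFLAGS` holds whatever extra flag was needed (possibly nothing) so the real link step could reuse it.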

It'd be nice to come up with a way to find ports that are now vulnerable this way, but don't fail to build (later failing at dynamic linking time).  But I can't think of any particularly effective way to find such vulnerable ports.

Maybe what would be helpful would be an ABI before/after check that ports maintainers (or even automation in stage-qa) could use to check for unexpected ABI changes when doing a port update.  Maybe something already exists in ports/ land?  This is really a larger ABI compatibility assessment problem when updating a port / ports.
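
One small piece of such a before/after check might look like the sketch below. It is not an existing ports or stage-qa facility: `missing_versions` is a hypothetical helper, and it assumes the version-definition names of the old and new library have already been written to two text files, one name per line.

```shell
#!/bin/sh
# Sketch only: missing_versions is a hypothetical helper.
# $1 = file listing version names defined by the old library
# $2 = file listing version names defined by the new library
# Prints every version name present in the old list but absent from the new.
missing_versions() {
    old_sorted=$(mktemp) || return 2
    new_sorted=$(mktemp) || return 2
    # comm needs sorted input; use the C locale for both steps so the
    # ordering is consistent.
    LC_ALL=C sort -u "$1" > "$old_sorted"
    LC_ALL=C sort -u "$2" > "$new_sorted"
    # -23 suppresses lines unique to the new list and lines common to both,
    # leaving only the names that disappeared.
    LC_ALL=C comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

Any output would mean version definitions disappeared between builds, which is the situation described for libtasn1 above. Producing the name lists is left out: they could presumably be taken from the version-definition section that `readelf -V` prints, but the output format differs between readelf implementations, so that step would need per-tool parsing.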

Apologies for classifying this as a devel/llvm18 problem - it's not really.  It would help if I had a better grasp of the affected ports.  I'll probably reclassify this after thinking some more (and getting some helpful thoughts on the subject).

If the audience that responds here has ideas about potential ways to address this so users are less likely to be blindsided by ABI changes, please offer suggestions.  I guess I'm thinking the best solution is to come up with an automated way to detect ABI changes so at least it is easier to determine when dependent ports need a PORTREVISION bump due to a potential breaking ABI change.
Comment 1 John Hein 2024-09-11 18:27:54 UTC
Created attachment 253502
[patch] address newer lld changes which choke on undefined symbols listed in linker scripts (--version-script)

Here's an initial patch I have for security/libtasn1 which changes the check for --version-script to try with --undefined-version if it fails the first attempt.  

It might need a PORTREVISION bump in case existing packages have been built that fail the --version-script check (and thus don't provide versioning info, so technically this would be a change to the generated package).  It may be a good idea to bump dependent ports that link with libtasn1 as well, but perhaps not necessary.

I'll open a separate bug for libtasn1, but I'm including it here so readers can see it as an example of the required fix (which could be fed upstream to the libtasn1 project).
Comment 2 Brooks Davis freebsd_committer freebsd_triage 2024-09-11 20:57:17 UTC
It looks like gnulib needs to be updated to detect support for --undefined-version and use it when testing for --version-script (or this test needs to be altered to build an object containing the referenced symbols).  It's unfortunate that --undefined-version is only about two years old in the BFD linker so it would be quite hard to argue for using it unconditionally.
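
The second option (a test program that defines the referenced symbol) might look something like this sketch. It is an assumption about how such a probe could be written, not the actual gnulib test: `check_version_script_defined_sym` is a hypothetical name, the compiler is taken from `$CC`, and the map is reduced to a single version node to keep the example small.

```shell
#!/bin/sh
# Sketch only: check_version_script_defined_sym is a hypothetical name.
: "${CC:=cc}"

check_version_script_defined_sym() {
    tmpdir=$(mktemp -d) || return 2
    # Simplified map with one version node (the original probe uses two).
    cat > "$tmpdir/conftest.map" << 'EOF'
VERS_1 {
        global: sym;
};
EOF
    # Unlike the original probe, the test program defines sym, so the
    # version script no longer names an undefined symbol.
    cat > "$tmpdir/conftest.c" << 'EOF'
int sym(void) { return 0; }
int main(void) { return sym(); }
EOF
    $CC -Wl,--version-script="$tmpdir/conftest.map" \
        -o "$tmpdir/conftest" "$tmpdir/conftest.c" 2>/dev/null
    status=$?
    rm -rf "$tmpdir"
    return $status
}
```

This form does not depend on --undefined-version at all, so it would sidestep the question of how old the linker is.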