Bug 276035 - net/mpich: ld: error: undefined reference due to --no-allow-shlib-undefined with clang-17
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-toolchain (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-31 11:31 UTC by Thierry Thomas
Modified: 2024-01-27 17:07 UTC (History)
Description Thierry Thomas freebsd_committer freebsd_triage 2023-12-31 11:31:27 UTC
On -Current, i.e. with clang-17, and without the work-around introduced in eb36006fdb70, mpich fails to link with the following error:

Making all in .
/bin/sh ./libtool  --tag=CC    --mode=link cc -fvisibility=hidden
-I/usr/local/include/json-c -I/usr/local/include/gcc12 -O2 -pipe
-fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing
-L/usr/local/lib -o src/env/mpichversion src/env/mpichversion.o lib/libmpi.la
-L/usr/local/lib -lepoll-shim -ljson-c -lm
libtool: link: cc -fvisibility=hidden -I/usr/local/include/json-c
-I/usr/local/include/gcc12 -O2 -pipe -fstack-protector-strong -isystem
/usr/local/include -fno-strict-aliasing -o src/env/.libs/mpichversion
src/env/mpichversion.o  -L/usr/local/lib lib/.libs/libmpi.so -lhwloc -lfabric
-lrdmacm -libverbs -lexecinfo -lze_loader -lpthread -lepoll-shim -ljson-c -lm
-Wl,-rpath -Wl,/usr/local/lib
ld: error: undefined reference due to --no-allow-shlib-undefined: __addtf3
>>> referenced by lib/.libs/libmpi.so

ld: error: undefined reference due to --no-allow-shlib-undefined: __gttf2
>>> referenced by lib/.libs/libmpi.so

ld: error: undefined reference due to --no-allow-shlib-undefined: __lttf2
>>> referenced by lib/.libs/libmpi.so

ld: error: undefined reference due to --no-allow-shlib-undefined: __multf3
>>> referenced by lib/.libs/libmpi.so

ld: error: undefined reference due to --no-allow-shlib-undefined: __extendxftf2
>>> referenced by lib/.libs/libmpi.so

ld: error: undefined reference due to --no-allow-shlib-undefined: __trunctfxf2
>>> referenced by lib/.libs/libmpi.so
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** Error code 1

Remark: previously, with clang-16, everything was fine.
Comment 1 commit-hook freebsd_committer freebsd_triage 2023-12-31 11:41:09 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=d12f454e3692d7fbb8e95e4da0cb84f2d7c753a5

commit d12f454e3692d7fbb8e95e4da0cb84f2d7c753a5
Author:     Thierry Thomas <thierry@FreeBSD.org>
AuthorDate: 2023-12-31 11:34:41 +0000
Commit:     Thierry Thomas <thierry@FreeBSD.org>
CommitDate: 2023-12-31 11:40:31 +0000

    net/mpich: work-around to build the dependencies on -CURRENT

    Since the switch from clang-16 to 17.0.6 on -CURRENT, MPICH encounters
    linker errors.

    A work-around to force the usage of clang <= 16 has been introduced in
    eb36006fdb70, but when the compiler has been forced to build, it must be
    kept to build the dependencies.

    PR:             276035

 net/mpich/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 2 Dimitry Andric freebsd_committer freebsd_triage 2023-12-31 16:51:02 UTC
What happens is that clang 17 partially supports __float128 on x86_64, while clang 16 did not.

With clang 16 and earlier (the log below was produced with clang15), the configure script shows:

configure:43831: checking size of __float128
configure:43836: clang15 -o conftest -I/usr/local/include/json-c -I/usr/local/include/gcc12 -O2 -pipe  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing    -isystem /usr/local/include -I/usr/local/include -DNETMOD_INLINE=__netmod_inline_ofi__ -I/wrkdirs/usr/ports/net/mpich/work/mpich-4.1.2/src/mpl/include -D_REENTRANT -I/wrkdirs/usr/ports/net/mpich/work/mpich-4.1.2/src/mpi/romio/include -I/wrkdirs/usr/ports/net/mpich/work/mpich-4.1.2/src/pmi/include  -L/usr/local/lib conftest.c -L/usr/local/lib -lepoll-shim -ljson-c -lm >&5
conftest.c:127:57: error: __float128 is not supported on this target
static long int longval () { return (long int) (sizeof (__float128)); }
                                                        ^
conftest.c:128:67: error: __float128 is not supported on this target
static unsigned long int ulongval () { return (long int) (sizeof (__float128)); }
                                                                  ^
conftest.c:138:28: error: __float128 is not supported on this target
  if (((long int) (sizeof (__float128))) < 0)
                           ^
conftest.c:141:37: error: __float128 is not supported on this target
      if (i != ((long int) (sizeof (__float128))))
                                    ^
conftest.c:148:37: error: __float128 is not supported on this target
      if (i != ((long int) (sizeof (__float128))))
                                    ^
5 errors generated.
configure:43836: $? = 1
configure: program exited with status 1

while with clang 17 you get:

configure:43831: checking size of __float128
configure:43836: cc -o conftest -I/usr/local/include/json-c -I/usr/local/include/gcc12 -O2 -pipe  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing    -isystem /usr/local/include -I/usr/local/include -DNETMOD_INLINE=__netmod_inline_ofi__ -I/wrkdirs/usr/ports/net/mpich/work/mpich-4.1.2/src/mpl/include -D_REENTRANT -I/wrkdirs/usr/ports/net/mpich/work/mpich-4.1.2/src/mpi/romio/include -I/wrkdirs/usr/ports/net/mpich/work/mpich-4.1.2/src/pmi/include  -L/usr/local/lib conftest.c -L/usr/local/lib -lepoll-shim -ljson-c -lm >&5
configure:43836: $? = 0
configure:43836: ./conftest
configure:43836: $? = 0
configure:43850: result: 16
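
A note on why this probe succeeds even though the later link fails: sizeof is evaluated at compile time, so the test program needs no runtime support routines. The following is a minimal C sketch of that point, not the actual conftest.c; the probe_type and probe_sizeof names and the long double fallback are assumptions added for illustration.

```c
#include <stddef.h>

/* Illustration only: fall back to long double where the compiler
 * does not advertise __float128, so the sketch builds everywhere. */
#if defined(__SIZEOF_FLOAT128__) || defined(__FLOAT128__)
typedef __float128 probe_type;
#else
typedef long double probe_type;
#endif

/* sizeof is a compile-time constant: no float128 arithmetic is
 * performed, so no libgcc/compiler-rt helper is referenced. */
size_t probe_sizeof(void)
{
    return sizeof(probe_type);
}
```

A compiler that accepts the type therefore passes the check whether or not its runtime library can actually do float128 arithmetic.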

So in src/include/mpichconf.h you then get:

/* Define if __float128 is supported */
#define HAVE_FLOAT128 1

which causes src/mpi/coll/op/opsum.c to emit calls to libgcc support functions for float128 types, in particular:

* __addtf3
* __gttf2
* __lttf2
* __multf3
* __extendxftf2
* __trunctfxf2

Unfortunately, not all of these functions are available yet in compiler-rt. They will be included when llvm-18 is imported.
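
As a rough illustration of where these symbols come from, the C sketch below pairs each kind of operation with the helper a compiler typically calls for it on x86_64. It is not taken from opsum.c; the quad_t typedef, the function names and the long double fallback are assumptions made for illustration.

```c
/* Illustration only: use long double where __float128 is not
 * advertised, so the sketch still builds on other targets. */
#if defined(__SIZEOF_FLOAT128__) || defined(__FLOAT128__)
typedef __float128 quad_t;
#else
typedef long double quad_t;
#endif

/* Addition: typically lowered to a call to __addtf3. */
quad_t quad_sum(quad_t a, quad_t b) { return a + b; }

/* Multiplication: typically lowered to a call to __multf3. */
quad_t quad_prod(quad_t a, quad_t b) { return a * b; }

/* Ordered comparisons: typically __gttf2 and __lttf2. */
int quad_gt(quad_t a, quad_t b) { return a > b; }
int quad_lt(quad_t a, quad_t b) { return a < b; }

/* x87 long double to float128: typically __extendxftf2. */
quad_t quad_from_long_double(long double x) { return (quad_t)x; }

/* float128 back to x87 long double: typically __trunctfxf2. */
long double quad_to_long_double(quad_t q) { return (long double)q; }
```

If the runtime library linked into libmpi.so lacks any of these helpers, the references stay undefined in the shared object, which is what ld reports in the description.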

For now, it is probably easiest to suppress float128 detection in the configure script, for example by adding:

CONFIGURE_ENV+= ac_cv_sizeof___float128=0

just below CONFIGURE_ARGS in the Makefile.
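
For context, here is a hypothetical sketch of how a HAVE_FLOAT128 guard typically selects code paths. This is not MPICH's actual source, and the widest_float and sum_widest names are made up. Assuming the override above leaves HAVE_FLOAT128 undefined, only the long double branch is compiled, so no float128 helper functions are referenced.

```c
#include <stddef.h>

/* Hypothetical guard: HAVE_FLOAT128 would come from the generated
 * configuration header when the configure probe succeeds. */
#ifdef HAVE_FLOAT128
typedef __float128 widest_float;
#else
typedef long double widest_float;
#endif

/* Sum an array using the widest floating type that was selected. */
widest_float sum_widest(const widest_float *values, size_t count)
{
    widest_float total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```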
Comment 3 Thierry Thomas freebsd_committer freebsd_triage 2024-01-01 11:35:51 UTC
(In reply to Dimitry Andric from comment #2)

Committed, thanks!
Comment 4 commit-hook freebsd_committer freebsd_triage 2024-01-01 11:36:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=3eba17d8241bcf4c0ebc213ba246f6111f1770b3

commit 3eba17d8241bcf4c0ebc213ba246f6111f1770b3
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2024-01-01 11:29:41 +0000
Commit:     Thierry Thomas <thierry@FreeBSD.org>
CommitDate: 2024-01-01 11:29:41 +0000

    net/mpich: real fix for clang-17

    This replaces the previous work-around and the need of an external
    compiler.

    For detailed explanations, see PR 276035.

    PR:             276035

 net/mpich/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 5 Peter Much 2024-01-21 22:04:06 UTC
For the record: I just hit this issue on

13.2-STABLE stable/13-n257097-1165116ada35

# cc --version
FreeBSD clang version 17.0.6

Please verify, maybe this needs a little improvement before 13.3 gets rolling?
Comment 6 Dimitry Andric freebsd_committer freebsd_triage 2024-01-21 22:11:11 UTC
(In reply to Peter Much from comment #5)
Do you have commit 3eba17d8241bcf4c0ebc213ba246f6111f1770b3 in your ports tree, and aren't you using the quarterly branch?

E.g. double-check whether you have the ac_cv_sizeof___float128=0 line in your Makefile.
Comment 7 Peter Much 2024-01-21 22:46:36 UTC
(In reply to Dimitry Andric from comment #6)

Yes I have it:

root@dzhn:/usr/ports/net/mpich # git log -- Makefile 
commit 3eba17d8241bcf4c0ebc213ba246f6111f1770b3
Author: Dimitry Andric <dim@FreeBSD.org>
Date:   Mon Jan 1 12:29:41 2024 +0100

But the patch checks for ${OSVERSION} >= 1500005 while this here is:

root@dzhn:/usr/ports/net/mpich # make -V OSVERSION
1302510

That LLVM was apparently MFC'd - sorry for that...
Comment 8 Dimitry Andric freebsd_committer freebsd_triage 2024-01-21 23:32:32 UTC
(In reply to Peter Much from comment #7)
Ah, I had completely missed the OSVERSION check. I think it can simply go away, as we'll only have full float128 support when clang 18 lands (and then only for some platforms).

Thierry, do you object to removing that OSVERSION check, and MFHing it?
Comment 9 Thierry Thomas freebsd_committer freebsd_triage 2024-01-22 09:20:14 UTC
(In reply to Dimitry Andric from comment #8)

No problem: I was about to check the compiler and its version, something like

.if ${COMPILER_TYPE} == clang && ${COMPILER_VERSION} == 170
Comment 10 Thierry Thomas freebsd_committer freebsd_triage 2024-01-23 19:10:00 UTC
Done. Thanks for the report!
Comment 11 commit-hook freebsd_committer freebsd_triage 2024-01-23 19:10:35 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=22ba7292a1c9956f39d2004c0513bede7e2cbcee

commit 22ba7292a1c9956f39d2004c0513bede7e2cbcee
Author:     Thierry Thomas <thierry@FreeBSD.org>
AuthorDate: 2024-01-23 18:59:12 +0000
Commit:     Thierry Thomas <thierry@FreeBSD.org>
CommitDate: 2024-01-23 19:09:49 +0000

    net/mpich: apply the fix for clang 17 on all OSVERSION

    clang-17 has been MFC’ed: do not check OSVERSION but rather COMPILER_VERSION.

    PR:             276035
    Reported by:    pmc (at) citylink.dinoex.sub.org

 net/mpich/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)