Created attachment 178761
Example program to reproduce incorrect clang -ffast-math behavior on armv6

Originally discussed on freebsd-arm@:
https://lists.freebsd.org/pipermail/freebsd-arm/2017-January/015318.html

On armv6/12-current as of base r311687, the clang command "cc -O1 -ffast-math" optimizes adjacent calls to sin() and cos() into a call to the nonexistent function sincos(), resulting in the linker error "undefined reference to `sincos'". Example program sincos.c is attached.

% uname -a
FreeBSD rpi2 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r311687: Tue Jan 10 21:36:16 CST 2017 jsli@4cbsd:/personal/freebsd/obj/x64/arm.armv6/personal/freebsd/fbsdsrc/sys/RPI2 arm
% cc --version
FreeBSD clang version 3.9.1 (tags/RELEASE_391/final 289601) (based on LLVM 3.9.1)
Target: armv6--freebsd12.0-gnueabihf
Thread model: posix
InstalledDir: /usr/bin
% cc -O1 -ffast-math -lm sincos.c
/tmp/sincos-767f23.o: In function `main':
sincos.c:(.text+0x2c): undefined reference to `sincos'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
% cc -O1 -fno-fast-math -lm sincos.c
% cc -O0 -ffast-math -lm sincos.c
%
I have a proposed patch upstream at https://reviews.llvm.org/D28570
I have a better patch. I've had it since 2011. My libm has sincos, sincosf, and sincosl.
(In reply to sgk from comment #2)
> I have a better patch.

Is it available somewhere?
As Andrew pointed out, the problem is caused by an unfortunate combination of us incorrectly (or at least misleadingly) having "gnu" in our triple, while we are definitely not GNU, and upstream annoyingly using that to derive that it is targeting glibc with sincos. That said, having sincos in libm would probably be nice. I'm unsure how many applications will be able to make use of this optimization, though.
The GNU in GNUEABI used to mean the variant where enums are a fixed width. This seemed to originate on Linux, hence GNU. It seems LLVM has redefined its meaning to be something else.
(In reply to Andrew Turner from comment #5)
> The GNU in GNUEABI used to mean the variant where enums are a fixed width.

Hmm, it would be good to mention that in https://reviews.llvm.org/D28570, as Renato asks there:

'Why do you use "gnueabihf" if you don't use GLIBC? Why not just EABIHF, which would *also* work with GLIBC and GCC, but not have the idiosyncrasies of GLIBC'

> This seemed to originate on Linux, hence GNU. It seems LLVM has redefined
> its meaning to be something else.

It's just their isGNUEnvironment() function, which assumes GNU (or more specifically, glibc) if the last part of the target triple has "gnu" in it.
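To make the described behavior concrete, here is a small sketch of a check that treats a triple as GNU when its last dash-separated component contains "gnu". This only mirrors the description in the comment above. It is not LLVM's code, and the helper name triple_env_looks_gnu is made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not LLVM's isGNUEnvironment().
 * Returns true when the last dash-separated component of the
 * target triple contains the substring "gnu".
 */
bool triple_env_looks_gnu(const char *triple)
{
    const char *env = strrchr(triple, '-');

    /* With no dash at all, treat the whole string as the component. */
    env = (env != NULL) ? env + 1 : triple;
    return strstr(env, "gnu") != NULL;
}
```

With this check, the triple reported above, armv6--freebsd12.0-gnueabihf, is classified as GNU even though the operating system component is freebsd12.0, which is the mismatch being discussed.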
(In reply to Dimitry Andric from comment #6)
> Hmm, it would be good to mention that in https://reviews.llvm.org/D28570, as Renato
> asks there: 'Why do you use "gnueabihf" if you don't use GLIBC? Why not just EABIHF,
> which would *also* work with GLIBC and GCC, but not have the idiosyncrasies of GLIBC'

Because EABI would imply short enums: "The type of the storage container for an enumerated type is the smallest integer type that can contain all of its enumerated values". We could force long enums on EABI on FreeBSD; however, this would introduce an inconsistency between clang and gcc.
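The enum size difference can be observed directly. The sketch below uses a made-up enum (small_range) whose values all fit in one byte. Under the short-enum rule quoted above its storage could be a single byte, while under the fixed-width variant it would be expected to match int. The actual result depends on the target ABI and compiler flags, so no specific size is claimed here.

```c
#include <stddef.h>

/* Hypothetical enum whose values all fit in a single byte. */
enum small_range { SMALL_A, SMALL_B, SMALL_C };

/*
 * Reports the storage size the compiler chose for the enum.
 * The value is ABI dependent: possibly 1 with short enums,
 * typically sizeof(int) with fixed-width enums.
 */
size_t small_enum_size(void)
{
    return sizeof(enum small_range);
}
```

Object files built with different answers to this question disagree on the layout of any struct that embeds such an enum, which is why consistency between clang and gcc matters here.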
I think I've just hit this bug trying to compile math/fftw3 on 11-STABLE/armv6. I think it would be good to have a libm that has sincos, sincosf, and sincosl.

[...]
libtool: link: cc -D_THREAD_SAFE -pthread -O -pipe -O3 -ffast-math -fstrict-aliasing -fomit-frame-pointer -o .libs/bench bench-bench.o bench-hook.o bench-fftw-bench.o ../threads/.libs/libfftw3_threads.so ../.libs/libfftw3.so ../libbench2/libbench2.a -lm -pthread -Wl,-rpath -Wl,/usr/local/lib
../libbench2/libbench2.a(verify-lib.o): In function `aphase_shift':
verify-lib.c:(.text+0x578): undefined reference to `sincos'
../libbench2/libbench2.a(verify-lib.o): In function `tf_shift':
verify-lib.c:(.text+0x1380): undefined reference to `sincos'
verify-lib.c:(.text+0x1574): undefined reference to `sincos'
verify-lib.c:(.text+0x1834): undefined reference to `sincos'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
gmake[3]: *** [Makefile:400: bench] Error 1
gmake[3]: Leaving directory '/construction/xports/math/fftw3/work/fftw-3.3.6-pl1/tests'
gmake[2]: *** [Makefile:683: all-recursive] Error 1
gmake[2]: Leaving directory '/construction/xports/math/fftw3/work/fftw-3.3.6-pl1'
gmake[1]: *** [Makefile:548: all] Error 2
gmake[1]: Leaving directory '/construction/xports/math/fftw3/work/fftw-3.3.6-pl1'
(In reply to Jonathan Chen from comment #8)

I posted implementations of sincos[fl] to some mailing list 6 years ago. That seems to not have moved too far. I've now created a bug report so the code can slowly rot there as well.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218300
In review at https://reviews.freebsd.org/D10765
A commit references this bug:

Author: mmel
Date: Sun May 28 06:13:40 UTC 2017
New revision: 319047
URL: https://svnweb.freebsd.org/changeset/base/319047

Log:
  Implement sincos, sincosf, and sincosl.

  The primary benefit of these functions is that argument reduction is done once instead of twice in independent calls to sin() and cos().

  * lib/msun/Makefile:
    . Add s_sincos[fl].c to the build.
    . Add sincos.3 documentation.
    . Add appropriate MLINKS.

  * lib/msun/Symbol.map:
    . Expose sincos[fl] symbols in dynamic libm.so.

  * lib/msun/man/sincos.3:
    . Documentation for sincos[fl].

  * lib/msun/src/k_sincos.h:
    . Kernel for sincos() function. This merges the individual kernels for sin() and cos(). The merger offered an opportunity to re-arrange the individual kernels for better performance.

  * lib/msun/src/k_sincosf.h:
    . Kernel for sincosf() function. This merges the individual kernels for sinf() and cosf(). The merger offered an opportunity to re-arrange the individual kernels for better performance.

  * lib/msun/src/k_sincosl.h:
    . Kernel for sincosl() function. This merges the individual kernels for sinl() and cosl(). The merger offered an opportunity to re-arrange the individual kernels for better performance.

  * lib/msun/src/math.h:
    . Add prototypes for sincos[fl]().

  * lib/msun/src/math_private.h:
    . Add RETURNV macros. This is needed to reset fpsetprec on I386 hardware for a function with type void.

  * lib/msun/src/s_sincos.c:
    . Implementation of sincos() where sin() and cos() were merged into one routine and possibly re-arranged for better performance.

  * lib/msun/src/s_sincosf.c:
    . Implementation of sincosf() where sinf() and cosf() were merged into one routine and possibly re-arranged for better performance.

  * lib/msun/src/s_sincosl.c:
    . Implementation of sincosl() where sinl() and cosl() were merged into one routine and possibly re-arranged for better performance.

  PR: 215977, 218300
  Submitted by: Steven G. Kargl <sgk@troutmask.apl.washington.edu>
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D10765

Changes:
  head/lib/msun/Makefile
  head/lib/msun/Symbol.map
  head/lib/msun/man/sincos.3
  head/lib/msun/src/k_sincos.h
  head/lib/msun/src/k_sincosf.h
  head/lib/msun/src/k_sincosl.h
  head/lib/msun/src/math.h
  head/lib/msun/src/math_private.h
  head/lib/msun/src/s_sincos.c
  head/lib/msun/src/s_sincosf.c
  head/lib/msun/src/s_sincosl.c
A commit references this bug:

Author: dim
Date: Tue Sep 26 09:02:00 UTC 2017
New revision: 324006
URL: https://svnweb.freebsd.org/changeset/base/324006

Log:
  Synchronize most of libm with head as of r323004. This excludes a few arch-specific updates for powerpcspe, mips and riscv, for which support has not been merged yet.

  Bump __FreeBSD_version for the addition of cacoshl, cacosl, casinhl, casinl, catanl, catanhl, sincos, sincosf, and sincosl.

  MFC r305382 (by bde):

  Add asm versions of fmod(), fmodf() and fmodl() on amd64. Add asm versions of fmodf() and fmodl() on i387.

  fmod is similar to remainder, and the C versions are 3 to 9 times slower than the asm versions on x86 for both, but we had the strange mixture of all 6 variants of remainder in asm and only 1 of 6 variants of fmod in asm.

  MFC r305384 (by bde):

  Disconnect the "optimized" asm variants of cos(), sin() and tan() from the build on i386. Leave them in the source tree for regression tests.

  The asm functions were always much less accurate (by a factor of more than 10**18 in the worst case). They were faster on old CPUs. But with each new generation of CPUs they get relatively slower. The double precision C version's average advantage is about a factor of 2 on Haswell.

  The asm functions were already intentionally avoided in float and long double precision on i386 and in all precisions on amd64. Float precision and amd64 give larger advantages to the C version. The long double precision C code and compilers' understanding of long double precision are not so good, so the i387 is still slightly faster for long double precision, except for the unimportant subcase of huge args where the sub-optimal C code now somehow beats the i387 by about a factor of 2.

  MFC r305385 (by bde):

  Oops, the previous i386 version of e_fmodf.S and e_fmodl.S was actually the amd64 version.

  MFC r306409 (by emaste):

  libm: fix some unused variable (rcsid) and dangling else warnings

  s_{fabs,fmax,logb,scalb}{,f,l}.c may be built elsewhere with a higher WARNS setting.

  Reviewed by: ed
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D8061

  MFC r306410 (by emaste):

  libm: simplify i387 subdir logic with make's :S substitution

  MFC r306527 (by emaste):

  libm: remove unused variables for LDBL_MANT_DIG != 113

  Sponsored by: The FreeBSD Foundation

  MFC r306709 (by emaste):

  libm: remove unused variables

  Sponsored by: The FreeBSD Foundation

  MFC r307066 (by br):

  Don't use fmaxl/fminl on platforms with no long double support, use fmax/fmin instead.

  This fixes fmaxmin test failure on MIPS64.

  Reviewed by: emaste
  Sponsored by: DARPA, AFRL
  Sponsored by: HEIF5
  Differential Revision: https://reviews.freebsd.org/D8216

  MFC r308172 (by emaste):

  libm: add braces around initialization of subobjects

  This cleans up a warning when building libm at higher WARNS levels and makes the intent more clear. By the C standard the values are assigned to subobject members in order so this change introduces no functional change. (6.7.9 20)

  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D8333

  MFC r313761 (by mmokhi):

  Add casinl() cacosl() catanl() casinhl() cacoshl() catanhl() APIs to msun to improve C11 conformance.

  PR: 216850 216851 216852 216856 216857 216858
  Submitted by: mmokhi
  Reported by: sgk@troutmask.apl.washington.edu
  Reviewed by: bde, mat, theraven
  Approved by: bde (src committer), mat (mentor)
  Differential Revision: https://reviews.freebsd.org/D9491

  MFC r313863 (by mmokhi):

  Fix building of r313761 on platforms that `long double` is alias of `double` (MIPS, etc)

  PR: 216850 216851 216852 216856 216857 216858
  Reported by: emsate
  Reviewed by: bde emaste hselasky
  Approved by: bde emaste hselasky
  Differential Revision: https://reviews.freebsd.org/D9491

  MFC r313864 (by mmokhi):

  Add documentations related to new APIs of r313761

  PR: 216850 216851 216852 216856 216857 216858
  Submitted by: sgk@troutmask.apl.washington.edu
  Reported by: sgk@troutmask.apl.washington.edu
  Reviewed by: bde emaste hselasky
  Approved by: bde emaste hselasky
  Differential Revision: https://reviews.freebsd.org/D9491

  MFC r314950 (by ngie):

  Don't expect :test_large_inputs to fail with i386 anymore

  Recent changes (maybe a side-effect of the ATF-ification in r314649) invalidate the failure expectation.

  PR: 205446
  Sponsored by: Dell EMC Isilon

  MFC r317349 (by pfg):

  msun: Remove trailing space in Sunsoft copyright statement.

  Submitted by: kargl

  MFC r319047 (by mmel):

  Implement sincos, sincosf, and sincosl.

  The primary benefit of these functions is that argument reduction is done once instead of twice in independent calls to sin() and cos().

  * lib/msun/Makefile:
    . Add s_sincos[fl].c to the build.
    . Add sincos.3 documentation.
    . Add appropriate MLINKS.

  * lib/msun/Symbol.map:
    . Expose sincos[fl] symbols in dynamic libm.so.

  * lib/msun/man/sincos.3:
    . Documentation for sincos[fl].

  * lib/msun/src/k_sincos.h:
    . Kernel for sincos() function. This merges the individual kernels for sin() and cos(). The merger offered an opportunity to re-arrange the individual kernels for better performance.

  * lib/msun/src/k_sincosf.h:
    . Kernel for sincosf() function. This merges the individual kernels for sinf() and cosf(). The merger offered an opportunity to re-arrange the individual kernels for better performance.

  * lib/msun/src/k_sincosl.h:
    . Kernel for sincosl() function. This merges the individual kernels for sinl() and cosl(). The merger offered an opportunity to re-arrange the individual kernels for better performance.

  * lib/msun/src/math.h:
    . Add prototypes for sincos[fl]().

  * lib/msun/src/math_private.h:
    . Add RETURNV macros. This is needed to reset fpsetprec on I386 hardware for a function with type void.

  * lib/msun/src/s_sincos.c:
    . Implementation of sincos() where sin() and cos() were merged into one routine and possibly re-arranged for better performance.

  * lib/msun/src/s_sincosf.c:
    . Implementation of sincosf() where sinf() and cosf() were merged into one routine and possibly re-arranged for better performance.

  * lib/msun/src/s_sincosl.c:
    . Implementation of sincosl() where sinl() and cosl() were merged into one routine and possibly re-arranged for better performance.

  PR: 215977, 218300
  Submitted by: Steven G. Kargl <sgk@troutmask.apl.washington.edu>
  Differential Revision: https://reviews.freebsd.org/D10765

  MFC r321457 (by ngie):

  Mark :reduction as an expected failure

  It fails with clang 5.0+.

  PR: 220989
  Reported by: Jenkins

  MFC r322418 (by rlibby):

  lib/msun: avoid referring to broken LDBL_MAX

  LDBL_MAX is broken on i386:
  https://lists.freebsd.org/pipermail/freebsd-numerics/2012-September/000288.html

  Gcc has produced +Infinity for LDBL_MAX on i386 and amd64 with -m32 for some time, and newer versions of gcc are now warning that the "floating constant exceeds range of 'long double'".

  Avoid this by referring to half the value of LDBL_MAX instead.

  Reviewed by: bde
  Approved by: markj (mentor)
  Sponsored by: Dell EMC Isilon

  MFC r322435 (by rlibby):

  Revert r322418, LDBL_MAX_EXP unsuitable for macro pasting on some arches

  Either need a different way to spell HALF_LDBL_MAX, or a different way to spell LDBL_MAX_EXP, or a different approach.

  Reported by: ian

  MFC r322921 (by ngie):

  Revert r321457

  It doesn't fail after ^/head@r322855 (the releng_50 clang merge).

  PR: 220989

Changes:
_U  stable/11/
  stable/11/lib/msun/Makefile
  stable/11/lib/msun/Symbol.map
  stable/11/lib/msun/amd64/Makefile.inc
  stable/11/lib/msun/amd64/e_fmod.S
  stable/11/lib/msun/amd64/e_fmodf.S
  stable/11/lib/msun/amd64/e_fmodl.S
  stable/11/lib/msun/i387/Makefile.inc
  stable/11/lib/msun/i387/e_fmodf.S
  stable/11/lib/msun/i387/e_fmodl.S
  stable/11/lib/msun/ld80/e_lgammal_r.c
  stable/11/lib/msun/ld80/k_expl.h
  stable/11/lib/msun/ld80/s_logl.c
  stable/11/lib/msun/man/cacos.3
  stable/11/lib/msun/man/sincos.3
  stable/11/lib/msun/src/catrig.c
  stable/11/lib/msun/src/catrigl.c
  stable/11/lib/msun/src/e_asin.c
  stable/11/lib/msun/src/e_coshl.c
  stable/11/lib/msun/src/e_lgammaf_r.c
  stable/11/lib/msun/src/e_sinhl.c
  stable/11/lib/msun/src/k_sincos.h
  stable/11/lib/msun/src/k_sincosf.h
  stable/11/lib/msun/src/k_sincosl.h
  stable/11/lib/msun/src/math.h
  stable/11/lib/msun/src/math_private.h
  stable/11/lib/msun/src/s_fabs.c
  stable/11/lib/msun/src/s_fmax.c
  stable/11/lib/msun/src/s_fmin.c
  stable/11/lib/msun/src/s_logbl.c
  stable/11/lib/msun/src/s_scalbn.c
  stable/11/lib/msun/src/s_scalbnf.c
  stable/11/lib/msun/src/s_scalbnl.c
  stable/11/lib/msun/src/s_sincos.c
  stable/11/lib/msun/src/s_sincosf.c
  stable/11/lib/msun/src/s_sincosl.c
  stable/11/lib/msun/src/s_tanhl.c
  stable/11/lib/msun/tests/ctrig_test.c
  stable/11/sys/sys/param.h