Summary: | [LIBM] optimization for cexp and cexpf | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Steve Kargl <kargl> | ||||||||
Component: | bin | Assignee: | freebsd-bugs (Nobody) <bugs> | ||||||||
Status: | Closed Overcome By Events | ||||||||||
Severity: | Affects Only Me | CC: | emaste, lwhsu | ||||||||
Priority: | --- | Keywords: | patch | ||||||||
Version: | CURRENT | ||||||||||
Hardware: | Any | ||||||||||
OS: | Any | ||||||||||
Bug Depends on: | |||||||||||
Bug Blocks: | 216862 | ||||||||||
Attachments: |
|
Created attachment 202398 [details]
New patch with cexpl implementation included
The new patch cexpl.diff supersedes the old patch. It includes the changes in the old patch as well as implementations for long double complex cexpl(). This patch depends on https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236063
A suitable commit message is
* lib/msun/src/math_private.h:
  . Add an EXTRACT_LDBL80_WORDS2() macro to get access to the high and low words of a 64-bit significand as well as the expsign.
  . Add prototype for __ldexp_expl().
  . Add prototype for __ldexp_cexpl().
* lib/msun/src/s_cexp.c:
  . Include float.h to get LDBL_MANT_DIG.
  . Add c and s to declaration, and sort.
  . Move x = 0 case to be the first case tested. This is in preparation for fixing constant folding in GCC.
  . Use sincos() instead of a call to sin() and to cos().
  . Add a weak reference for LDBL_MANT_DIG == 53.
* lib/msun/src/s_cexpf.c:
  . Add c and s to declaration, and sort.
  . Move x = 0 case to be the first case tested. This is in preparation for fixing constant folding in GCC.
  . Use sincosf() instead of a call to sinf() and to cosf().
* lib/msun/src/k_exp.c:
  . Add c and s to declaration, and sort.
  . Use sincos() instead of a call to sin() and to cos().
* lib/msun/src/k_expf.c:
  . Add c and s to declaration, and sort.
  . Use sincosf() instead of a call to sinf() and to cosf().
* lib/msun/ld128/k_cexpl.c:
  . Copy src/k_exp.c to here. #if 0 ... #endif all code and have functions return NaN. These functions are currently unused, and await someone who cares.
  . Issue a compile-time warning about the sloppiness.
* lib/msun/ld128/s_cexpl.c:
  . Copy src/s_cexp.c to here.
  . Convert "double complex" to "long double complex" without use of bit-twiddling.
  . Add compile-time warning about the sloppiness.
  . Add run-time warning about the sloppiness.
* lib/msun/ld80/k_cexpl.c:
  . Copy src/k_exp.c to here.
  . Convert "double complex" to "long double complex". Use bit-twiddling.
* lib/msun/ld80/s_cexpl.c:
  . Copy src/s_cexp.c to here.
  . Convert "double complex" to "long double complex". Use bit-twiddling where bits are grabbed with new EXTRACT_LDBL80_WORDS2() macro.
* lib/msun/man/cexp.3:
  . Document the addition of cexpl.
* include/complex.h:
  . Add prototype for cexpl().
Argh. bde has pointed out some silly mistakes in the code. He would also like to see the bit-twiddling reworked. So, the current patch should be considered informative at best.
Created attachment 208609 [details]
Updated patch
The updated patch includes the optimization for cexpf, cexp,
and new implementations for ld80/cexpl and ld128/cexpl. The
ld80/cexpl implementation uses bit twiddling and has been
tested on i586 and amd64 systems. The ld128/cexpl is a stub
implementation that has been neither compiled nor tested.
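To illustrate the kind of bit twiddling the ld80 code relies on, here is a minimal sketch of pulling the sign/exponent word and the two 32-bit halves of the 64-bit significand out of an 80-bit long double. The helper name ldbl80_words2() is hypothetical and is not the macro from the patch; the sketch assumes the little-endian x87 layout and reports failure on any other long double format.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the macro from the patch.  Splits an x87
 * 80-bit long double into its sign+exponent word and the high and low
 * 32-bit halves of the 64-bit significand.  Returns 0 on success and
 * -1 when long double is not the x87 extended format.
 */
static int
ldbl80_words2(long double x, uint16_t *expsign, uint32_t *hi, uint32_t *lo)
{
#if LDBL_MANT_DIG == 64 && (defined(__i386__) || defined(__x86_64__))
	unsigned char raw[10];
	uint64_t man;

	/*
	 * Little-endian x87 layout: 8 bytes of significand followed by
	 * 2 bytes holding the sign bit and the 15-bit biased exponent.
	 */
	memcpy(raw, &x, sizeof(raw));
	memcpy(&man, raw, sizeof(man));
	memcpy(expsign, raw + 8, sizeof(*expsign));
	*hi = (uint32_t)(man >> 32);
	*lo = (uint32_t)man;
	return (0);
#else
	/* Unsupported format: leave the outputs untouched. */
	(void)x;
	(void)expsign;
	(void)hi;
	(void)lo;
	return (-1);
#endif
}
```

Going through memcpy() instead of a union keeps the sketch free of layout declarations; the actual msun code may well do this differently.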
New commit log for cexpl_new.diff.
* include/complex.h:
  . Prototype for cexpl().
* lib/msun/Makefile:
  . Add s_cexpl.c to the build.
  . Add MLINK for cexpl.3 to cexp.3.
* lib/msun/Symbol.map:
  . Add a new section for FreeBSD 13 symbols.
  . Add cexpl to the symbol table.
* lib/msun/src/math_private.h:
  . Change ENTERI() to toggle FPU precision for ld80 long double.
  . Introduce LEAVEI() to toggle FPU precision back to double.
  . Update RETURNI(x) to use __typeof(x) to set return type, use LEAVEI(), and use RETURNF(x) for the actual return.
  . Add prototype for __ldexp_expl().
* lib/msun/man/cexp.3:
  . Document cexpl().
* lib/msun/src/k_exp.c:
* lib/msun/src/k_expf.c:
  . Add declarations for c and s, and sort.
  . Use sincos[f]() instead of a call to sin[f]() and a call to cos[f]().
* lib/msun/src/s_cexp.c:
  . Include float.h to get access to LDBL_MANT_DIG.
  . Add declarations for c and s, and sort.
  . Use sincos() instead of a call to sin() and a call to cos().
  . Use __weak_reference(cexp, cexpl) on LDBL_MANT_DIG == 53.
* lib/msun/src/s_cexpf.c:
  . Add declarations for c and s, and sort.
  . Use sincosf() instead of a call to sinf() and a call to cosf().
* lib/msun/ld80/k_cexpl.c:
  . Implementations of __frexp_expl(), __ldexp_expl(), and __ldexp_cexpl(), which are computation kernels used by cexpl().
* lib/msun/ld80/k_expl.h:
  . Add 'L' suffix to a long double literal constant.
  . Fix typo in computation of scale2.
  . Use sincosl() instead of a call to sinl() and a call to cosl().
* lib/msun/ld80/s_cexpl.c:
  . Implementation of cexpl() for ld80 hardware.
* lib/msun/ld128/k_cexpl.c:
  . Stub for __frexp_expl(), __ldexp_expl(), and __ldexp_cexpl().
* lib/msun/ld128/k_expl.h:
  . Add 'L' suffix to a long double literal constant.
  . Fix typo in computation of scale2.
  . Use sincosl() instead of a call to sinl() and a call to cosl().
* lib/msun/ld128/s_cexpl.c:
  . Naive, but better than some might know, implementation of cexpl() for 113-bit long double hardware.
Submitter timeout. Nobody cares. |
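The weak-reference entry for LDBL_MANT_DIG == 53 in the commit log can be pictured with the following sketch. When long double has the same format as double, the double routine is exported under a second, weak name instead of being reimplemented. Everything here is illustrative: twice() and twicel() are hypothetical stand-ins for cexp() and cexpl(), and the WEAK_REFERENCE() macro body is an assumption for ELF targets built with GCC or Clang, similar in spirit to FreeBSD's __weak_reference().

```c
#include <float.h>

/*
 * Assumed ELF-style macro (GCC or Clang): make `alias` a weak second
 * name for the already-defined symbol `sym`.
 */
#define	WEAK_REFERENCE(sym, alias)	\
	__asm__(".weak " #alias);	\
	__asm__(".equ " #alias ", " #sym)

long double twicel(long double);

/* Hypothetical stand-in for the double-precision routine. */
double
twice(double x)
{
	return (2 * x);
}

#if LDBL_MANT_DIG == 53
/* long double and double share a format: reuse the double code. */
WEAK_REFERENCE(twice, twicel);
#else
/* Wider long double: a separate implementation is needed. */
long double
twicel(long double x)
{
	return (2 * x);
}
#endif
```

On x86 and other platforms with a wider long double only the #else branch is compiled, which is why the patch also carries separate ld80 and ld128 sources.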
Created attachment 201620 [details]
patch
The attached patch utilizes sincos[f] in the computation of cexp[f]. For 20 million random z=x+Iy drawn in the box defined by x,y in [0,MAX_EXP*LN2], this amounts to an 8.7% and 11.4% speed improvement over computing sin[f] and cos[f] individually in cexp[f].