Bug 235413

Summary:

[LIBM] optimization for cexp and cexpf

Product:

Base System

Reporter:

Steve Kargl <kargl>

Component:

bin

Assignee:

freebsd-bugs (Nobody) <bugs>

Status:

Closed Overcome By Events

Severity:

Affects Only Me

CC:

emaste, lwhsu

Priority:

---

Keywords:

patch

Version:

CURRENT

Hardware:

Any

OS:

Any

Bug Depends on:

Bug Blocks:

216862

Attachments:

Description	Flags
patch	none
New patch with cexpl implementation included	none
Updated patch	none

Description Steve Kargl freebsd_committer

2019-02-02 00:28:44 UTC

Created attachment 201620 [details]
patch

The attached patch uses sincos[f] in the computation
of cexp[f].  For 20 million random z=x+Iy drawn in the
box defined by x,y in [0,MAX_EXP*LN2], this amounts to
an 8.7% and 11.4% speed improvement over computing sin[f]
and cos[f] individually in cexp[f].

Comment 1 Steve Kargl freebsd_committer

2019-02-27 01:56:55 UTC

Created attachment 202398 [details]
New patch with cexpl implementation included

The new patch cexpl.diff supersedes the old patch.  It includes the changes in the old patch as well as implementations of the long double complex function cexpl().  This patch depends on https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236063

A suitable commit message is:

* lib/msun/src/math_private.h:
  . Add an EXTRACT_LDBL80_WORDS2() macro to get access to the high and
    low word of a 64-bit significand as well as the expsign.
  . Add prototype for __ldexp_expl().
  . Add prototype for __ldexp_cexpl().

* lib/msun/src/s_cexp.c:
  . Include float.h to get LDBL_MANT_DIG.
  . Add c and s to declaration, and sort.
  . Move x = 0 case to be the first case tested.  This is in preparation
    for fixing constant folding in GCC.
  . Use sincos() instead of a call to sin() and to cos().
  . Add a __weak_reference for LDBL_MANT_DIG == 53.

* lib/msun/src/s_cexpf.c:
  . Add c and s to declaration, and sort.
  . Move x = 0 case to be the first case tested.  This is in preparation
    for fixing constant folding in GCC.
  . Use sincosf() instead of a call to sinf() and to cosf().

* lib/msun/src/k_exp.c:
  . Add c and s to declaration, and sort.
  . Use sincos() instead of a call to sin() and to cos().
  
* lib/msun/src/k_expf.c:
  . Add c and s to declaration, and sort.
  . Use sincosf() instead of a call to sinf() and to cosf().

* lib/msun/ld128/k_cexpl.c:
  . Copy src/k_exp.c to here.  #if 0 ... #endif all code and have
    functions return NaN.  These functions are currently unused,
    and await someone who cares.
  . Issue a compile-time warning about the sloppiness.

* lib/msun/ld128/s_cexpl.c:
  . Copy src/s_cexp.c to here.
  . Convert "double complex" to "long double complex" without use of
    bit-twiddling.
  . Add compile-time warning about the sloppiness.
  . Add run-time warning about the sloppiness.

* lib/msun/ld80/k_cexpl.c:
  . Copy src/k_exp.c to here.
  . Convert "double complex" to "long double complex".  Use bit-twiddling.

* lib/msun/ld80/s_cexpl.c:
  . Copy src/s_cexp.c to here.
  . Convert "double complex" to "long double complex".  Use bit-twiddling
    where bits are grabbed with new EXTRACT_LDBL80_WORDS2() macro.

* lib/msun/man/cexp.3:
  . Document the addition of cexpl.

* include/complex.h:
  . Add prototype for cexpl().
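
The math_private.h entry above describes EXTRACT_LDBL80_WORDS2() as giving access to the expsign and to the high and low words of the 64-bit significand. The following is only a sketch of that idea, not the macro from the patch: the function name and argument order are hypothetical, and it assumes the little-endian x87 layout (significand in bytes 0-7, sign and exponent in bytes 8-9).

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

#if LDBL_MANT_DIG == 64
/*
 * Hypothetical stand-in for EXTRACT_LDBL80_WORDS2(); assumes the
 * little-endian x87 extended format.
 */
static void
extract_ldbl80_words2(long double x, uint16_t *expsign, uint32_t *hi,
    uint32_t *lo)
{
	unsigned char raw[sizeof(long double)];
	uint64_t man;

	memcpy(raw, &x, sizeof(raw));
	memcpy(&man, raw, sizeof(man));			/* 64-bit significand */
	memcpy(expsign, raw + 8, sizeof(*expsign));	/* sign + exponent */
	*hi = (uint32_t)(man >> 32);
	*lo = (uint32_t)man;
}
#endif
```

Under these assumptions, 1.0L yields expsign 0x3fff, a high word of 0x80000000 (the explicit integer bit), and a low word of 0.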

Comment 2 Steve Kargl freebsd_committer

2019-02-27 18:23:03 UTC

Argh.  bde has pointed out some silly mistakes in the code.
He would also like to see the bit-twiddling reworked.  So,
the current patch should be considered informative at best.

Comment 3 Steve Kargl freebsd_committer

2019-10-26 16:32:28 UTC

Created attachment 208609 [details]
Updated patch

The updated patch includes the optimization for cexpf, cexp,
and new implementations for ld80/cexpl and ld128/cexpl.  The
ld80/cexpl implementation uses bit twiddling and has been
tested on i586 and amd64 systems.  The ld128/cexpl is a stub
implementation that has been neither compiled nor tested.

Comment 4 Steve Kargl freebsd_committer

2019-10-26 16:33:09 UTC

New commit log for cexpl_new.diff.

* include/complex.h:
  . Prototype for cexpl().

* lib/msun/Makefile:
  . Add s_cexpl.c to the build.
  . Add MLINK for cexpl.3 to cexp.3

* lib/msun/Symbol.map:
  . Add a new section for FreeBSD 13 symbols.
  . Add cexpl to the symbol table.

* lib/msun/src/math_private.h:
  . Change ENTERI() to toggle FPU precision for ld80 long double.
  . Introduce LEAVEI() to toggle FPU precision back to double.
  . Update RETURNI(x) to use __typeof(x) to set return type, use
    LEAVEI(), and use RETURNF(x) for the actual return.
  . Add prototype for __ldexp_expl();

* lib/msun/man/cexp.3:
  . Document cexpl().

* lib/msun/src/k_exp.c:
* lib/msun/src/k_expf.c:
  . Add declarations for c and s, and sort.
  . Use sincos[f]() instead of a call to sin[f]() and a call to cos[f]().

* lib/msun/src/s_cexp.c:
  . Include float.h to get access to LDBL_MANT_DIG.
  . Add declarations for c and s, and sort.
  . Use sincos() instead of a call to sin() and a call to cos().
  . Use __weak_reference(cexp, cexpl) on LDBL_MANT_DIG == 53.

* lib/msun/src/s_cexpf.c:
  . Add declarations for c and s, and sort.
  . Use sincosf() instead of a call to sinf() and a call to cosf().

* lib/msun/ld80/k_cexpl.c:
  . Implementations of __frexp_expl(), __ldexp_expl(),  and __ldexp_cexpl(),
    which are computation kernels used by cexpl().

* lib/msun/ld80/k_expl.h:
  . Add 'L' suffix to a long double literal constant.
  . Fix typo in computation of scale2.
  . Use sincosl() instead of a call to sinl() and a call to cosl().

* lib/msun/ld80/s_cexpl.c:
  . Implementation of cexpl() for ld80 hardware.

* lib/msun/ld128/k_cexpl.c:
  . Stub for __frexp_expl(), __ldexp_expl(), and __ldexp_cexpl().

* lib/msun/ld128/k_expl.h:
  . Add 'L' suffix to a long double literal constant.
  . Fix typo in computation of scale2.
  . Use sincosl() instead of a call to sinl() and a call to cosl().

* lib/msun/ld128/s_cexpl.c:
  . Naive, but better than some might know, implementation of cexpl()
    for 113-bit long double hardware.
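
The s_cexp.c entry above uses __weak_reference(cexp, cexpl) when LDBL_MANT_DIG == 53, so that cexpl() needs no separate implementation on platforms where long double has the same format as double. The sketch below shows the general aliasing idea with the GCC/Clang weak-alias attribute on an ELF target; the function names are placeholders, and this is not FreeBSD's macro definition.

```c
/* Placeholder implementation standing in for the double version. */
double
twice(double x)
{
	return (2.0 * x);
}

/*
 * A second, weak symbol name bound to the same code.  Assumes a
 * GCC-compatible compiler targeting ELF.
 */
extern __typeof(twice) twice_alias
    __attribute__((__weak__, __alias__("twice")));
```

Calling twice_alias(x) runs the body of twice(); a strong definition of twice_alias elsewhere in the link would take precedence over the weak one.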

Comment 5 Steve Kargl freebsd_committer

2023-06-30 20:32:03 UTC

Submitter timeout.  Nobody cares.