Summary:   [msun] [patch] complex arcsinh, log, etc.
Product:   Base System
Reporter:  Stephen Montgomery-Smith <stephen>
Component: bin
Assignee:  freebsd-numerics (Nobody) <numerics>
Status:    Closed FIXED
Severity:  Affects Only Me
Priority:  Normal
Version:   8.3-STABLE
Hardware:  Any
OS:        Any
Description
Stephen Montgomery-Smith
Oops - error in the comments for catanh:

    catanh(z) = 0.5 * log((1+z)/(1-z))

On Wed, 25 Jul 2012, Stephen Montgomery-Smith wrote:
> This function seems to be able to compute clog with a worst case relative
> error of 4 or 5 ULP.
> ...
I lost your previous reply about this after reading just the first part.
Please resend if interested.
First part recovered by vidcontrol:
VC> > I'm still working on testing and fixing clog. Haven't got near the more
VC> > complex functions.
VC> >
VC> > For clog, the worst case that I've found so far has x^2+y^2-1 ~= 1e-47:
VC> >
VC> > x = 0.999999999999999555910790149937383830547332763671875000000000
VC> > y =
VC> > 0.0000000298023223876953091912775497878893005143652317201485857367516
VC> > (need high precision decimal or these rounded to 53 bits binary)
VC> > x^2+y^2-1 = 1.0947644252537633366591637369e-47
VC>
VC> That is exactly 2^(-156). So maybe triple quad precision really is enough.
Hmm. But you need 53 more value bits after the 156. Quadruple precision
gives 3 to spare. I didn't notice that this number was exactly a power
of 2, but just added 15-17 for the value bits in decimal to 47 to get over
60.
VC> > so it needs more than tripled double precision for a brute force
VC> > evaluation, and the general case is probably worse. I'm working
VC> > on a rearrangement so that doubled double precision works in the
VC> > general case. Both your version and my version get this case right,
VC> > but mess up different much easier cases. It takes insanely great
VC> > accuracy to get even 1 bit in this case right, but now that we
Tripled double precision is enough for this because -1 cancels with
leading terms, giving almost quadrupled double precision:
% hm1 = -1;
% for (i=0;i<12;i++) hm1 += val[i];
% return (cpack(0.5 * log1p(hm1), atan2(y, x)));
It is the trailing terms that I think don't work right here. You sort
them and add from high to low, but normally it is necessary to add
from low to high (consider terms [1, DBL_EPSILON/2, DBL_EPSILON/4]).
Adding from high to low cancels with the -1 term, but then only
particular values work right. Also, I don't see how adding the low
terms without extra precision preserves enough precision.
Bruce
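A small standalone C program (not part of the patch under discussion) makes
the ordering effect concrete for the terms [1, DBL_EPSILON/2, DBL_EPSILON/4]
mentioned above, assuming the default round-to-nearest mode:

    #include <float.h>
    #include <stdio.h>

    int
    main(void)
    {
            volatile double hi2lo, lo2hi;

            /* High to low: 1 + DE/2 is a half-way case and rounds back to 1
             * (ties to even), and the DE/4 term is then lost as well. */
            hi2lo = 1.0;
            hi2lo += DBL_EPSILON / 2;
            hi2lo += DBL_EPSILON / 4;

            /* Low to high: DE/4 + DE/2 = 0.75*DE is exact, and 1 + 0.75*DE
             * then rounds up to 1 + DE, the correctly rounded sum. */
            lo2hi = DBL_EPSILON / 4;
            lo2hi += DBL_EPSILON / 2;
            lo2hi += 1.0;

            printf("high to low: %a\n", hi2lo - 1);   /* 0x0p+0 */
            printf("low to high: %a\n", lo2hi - 1);   /* 0x1p-52 = DBL_EPSILON */
            return (0);
    }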
On 07/27/2012 09:26 AM, Bruce Evans wrote:
> On Wed, 25 Jul 2012, Stephen Montgomery-Smith wrote:
>> This function seems to be able to compute clog with a worst case
>> relative error of 4 or 5 ULP.
>> ...
>
> ...
> VC> > For clog, the worst case that I've found so far has x^2+y^2-1 ~= 1e-47:
> VC> > x^2+y^2-1 = 1.0947644252537633366591637369e-47
> VC> That is exactly 2^(-156).  So maybe triple quad precision really is
> VC> enough.
>
> Hmm.  But you need 53 more value bits after the 156.  Quadruple precision
> gives 3 to spare.  I didn't notice that this number was exactly a power
> of 2, but just added 15-17 for the value bits in decimal to 47 to get over
> 60.

I think one should be able to prove mathematically that if the number is as
small as 1e-47, only the first one or two bits of the mantissa will be
non-zero.  I think that if more than triple double precision is needed, it
is only one or two bits more than triple double precision.

On 07/27/2012 09:26 AM, Bruce Evans wrote:
> % hm1 = -1;
> % for (i=0;i<12;i++) hm1 += val[i];
> % return (cpack(0.5 * log1p(hm1), atan2(y, x)));
>
> It is the trailing terms that I think don't work right here.  You sort
> them and add from high to low, but normally it is necessary to add
> from low to high (consider terms [1, DBL_EPSILON/2, DBL_EPSILON/4]).
> Adding from high to low cancels with the -1 term, but then only
> particular values work right.  Also, I don't see how adding the low
> terms without extra precision preserves enough precision.

I understand what you are saying.  But in this case adding in order of
smallest to largest (adding -1 last) won't work.  If all the signs were in
the same direction, it would work.  But -1 has the wrong sign.

I can also tell you that I haven't thought my algorithm through every
special case.  It seems to work in all the examples I have tried, but I
don't have a mathematical proof.

On 07/27/2012 09:26 AM, Bruce Evans wrote:
> VC> > For clog, the worst case that I've found so far has x^2+y^2-1 ~=
> 1e-47:
> VC> >
> VC> > x =
> 0.999999999999999555910790149937383830547332763671875000000000
> VC> > y =
> VC> > 0.0000000298023223876953091912775497878893005143652317201485857367516
> VC> > (need high precision decimal or these rounded to 53 bits
> binary)
> VC> > x^2+y^2-1 = 1.0947644252537633366591637369e-47
> VC> VC> That is exactly 2^(-156). So maybe triple quad precision really
> is enough.
Furthermore, if you use the computation (x-1)*(x+1)+y*y (assuming as you
do x>y>0), only double precision is necessary. This is proved in the
paper "Implementing Complex Elementary Functions Using Exception
Handling" by Hull, Fairgrieve, Tang, ACM Transactions on Mathematical
Software, Vol 20, No 2, 1994. They give a bound on the error, which I
think can be interpreted as being around 3.9 ULP.
And I think you will see that your example does not contradict their
theorem. Because in your example, x-1 will be rather small.
So to get reasonable ULP (reasonable meaning 4 rather than 1), double
precision is all you need.
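A minimal sketch of the computation Stephen refers to (the helper name is
made up for illustration; it is not from the patch):

    #include <math.h>

    /*
     * Sketch of the rearrangement from the Hull/Fairgrieve/Tang paper cited
     * above: for x > y > 0 with x near 1, x - 1 is exact (Sterbenz), so the
     * large cancellation in x*x - 1 happens in an exact subtraction instead
     * of after two roundings.  This is only the plain double precision
     * variant being discussed, not the doubled-precision code in the patch.
     */
    static double
    xxpyym1(double x, double y)
    {
            return ((x - 1) * (x + 1) + y * y);
    }

    /* The real part of clog(z) would then be 0.5 * log1p(xxpyym1(x, y)). */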
On Fri, 27 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/27/2012 09:26 AM, Bruce Evans wrote:
>> ...
>
> Furthermore, if you use the computation (x-1)*(x+1)+y*y (assuming as you
> do x>y>0), only double precision is necessary.  This is proved in the
> paper "Implementing Complex Elementary Functions Using Exception Handling"
> by Hull, Fairgrieve, Tang, ACM Transactions on Mathematical Software,
> Vol 20, No 2, 1994.  They give a bound on the error, which I think can be
> interpreted as being around 3.9 ULP.

I'm using x*x-1+y*y in doubled precision, which I believe is better.

I'm now thinking of the following refinement: suppose x is close to 1 and y
is close to 0 (other cases are easier to get right accidentally, but harder
to analyze).  Then u = x-1 in non-doubled precision is exact and cancels
most low bits.  So u*u is exact in non-doubled precision.  Thus x*x-1 can be
evaluated in mostly-non-doubled precision as u*u+2*u.  2*u is exact, and u*u
is a tiny correction term (less than about half an ulp relative to 2*u).  If
we just wanted to pass this result to log1p(), it would round to 2*u and no
doubled precision would be necessary.  But we need to retain it to add to
y*y.  The necessary extra precision is much easier for addition than for
multiplication.  If I'm right that u*u is less than half an ulp, then
(2*u, u*u) is already in my normal form for doubled precision.

Oops, this is essentially the same as (x-1)*(x+1).  x-1 is u, and x+1 is
u+2, so the product is u*u+2*u grouped in a numerically bad way (it either
needs extra precision for x+1 and then for the multiplication, or loses
accuracy starting with x+1).  Did you mean doubled precision, not double
precision?

This also avoids the following complication: double precision has an odd
number of bits, so it can't be split in half for calculating x*x and y*y.
The usual 26+27 split would give an error of half an ulp in doubled double
precision.  The above avoids this for x*x in some critical cases.  I hope
in all critical cases.

Cases where x*x and y*y are both nearly 0.5 have other interesting points.
If x*x >= 0.5, then x*x-1 is exact in doubled precision.  When x*x < 0.5,
x*x-1 is not necessarily exact in doubled precision.  I handle these cases
by writing the expression as (x*x-0.5)+(y*y-0.5).  When x*x >= 0.5 and
y*y >= 0.25, both methods give exact subtractions and I think they give the
same result.

> And I think you will see that your example does not contradict their
> theorem.  Because in your example, x-1 will be rather small.
>
> So to get reasonable ULP (reasonable meaning 4 rather than 1), double
> precision is all you need.

I think you mean doubled precision.  4 in undoubled precision is mediocre,
but 4 in doubled precision is many more than needed; I hope I get
0.5+epsilon.  The result starting with double precision would then be
accurate with 53-4 or (54-0.5-epsilon) extra bits if the log1p() of it were
taken with infinite precision.
But log1p() has finite precision, and I'm seeing the effects of slightly
more than half a double precision bit being lost on conversion of the
doubled double precision x*x+y*y-1 when passed to log1p(), and then another
slightly more than half [...] lost by imperfect rounding of log1p().  So one
of my tests is to remove the log1p() source of inaccuracy by replacing it
with log1pl().  In float precision, exhaustive testing is possible though
not complete; all cases tested with |z| as close as possible to 1 were
perfectly rounded.

Bruce

On Fri, 27 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/27/2012 09:26 AM, Bruce Evans wrote:
>> ...
>> It is the trailing terms that I think don't work right here.  You sort
>> them and add from high to low, but normally it is necessary to add
>> from low to high (consider terms [1, DBL_EPSILON/2, DBL_EPSILON/4]).
>
> I understand what you are saying.  But in this case adding in order of
> smallest to largest (adding -1 last) won't work.  If all the signs were in
> the same direction, it would work.  But -1 has the wrong sign.

No, even if all the signs are the same, adding from the highest to lowest
can lose precision.  Normally at most 1 ulp, while cancellation can lose
almost 2**MANT_DIG ulps.  Example:

    #define DE DBL_EPSILON  // for clarity

    (1) 1 + DE/2 = 1                  (half way case rounded down to even)
    (2) 1 + DE/2 + DE/2 = 1           (double rounding)
    (3) DE/2 + DE/2 + 1 = 1 + DE      (null rounding)

We want to add -1 to a value near 1 like the above.  Now a leading 1 in the
above will cancel with the -1, and the order in (3) becomes the inaccurate
one.  Modify the above by shifting the epsilons and adding another 1 to get
an example for our context:

    (2') -1 + 1 + DE + DE*DE/2 + DE*DE/2 = DE

(The leading +-1 are intentionally grouped and cancel.  The other terms are
(2) multiplied by DE, and suffer the same double rounding.)

    (3') -1 + 1 + DE*DE/2 + DE*DE/2 + DE = DE + DE*DE

(The leading +-1 are intentionally grouped and cancel as before.  The lower
terms must be added from low to high, as in (3).)

The right order is perhaps more transparently described as always from low
to high, with suitable and explicit grouping of terms using parentheses to
reduce cancellation errors.  But I don't like parentheses and prefer to
depend on left to right evaluation.  With some parentheses, the above
becomes:

    (3'') DE**2/2 + DE**2/2 + DE + (1 + -1)

(Full parentheses for the left to right order would be unreadable, so
although the order is critical, they shouldn't be used to emphasize it.)
Here the cancellation is exact, but in general it gives a nonzero term which
might need to be sorted into the other terms.

Strictly ordering all the terms is usually unnecessary and slow, and is
usually not done.  Neither is the analysis needed to prove that it is
unnecessary.  Even the above examples (3'*) are sloppy about this.  They
only work because the cancellation is exact.  In (3'), (-1 + 1) is added
first.  That is correct since it is lowest (0).  In (3''), (1 + -1) is added
last.  That is also correct, although the term is not the highest, since it
is 0.  Usually the cancellation won't be exact and gives a term that is far
from lowest.  Assuming that it is still highest is the best sloppy sorting.
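The (2') and (3') cases above can be checked directly; the following short C
fragment is only an illustration (it assumes round-to-nearest) and evaluates
both orders step by step:

    #include <float.h>
    #include <stdio.h>

    int
    main(void)
    {
            const double DE = DBL_EPSILON;
            volatile double v2, v3;

            /* (2'): -1 + 1 + DE + DE*DE/2 + DE*DE/2, left to right. */
            v2 = -1.0;
            v2 += 1.0;
            v2 += DE;
            v2 += DE * DE / 2;  /* half-way case, ties to even: stays DE */
            v2 += DE * DE / 2;  /* same again: the DE*DE contribution is lost */

            /* (3'): -1 + 1 + DE*DE/2 + DE*DE/2 + DE, left to right. */
            v3 = -1.0;
            v3 += 1.0;
            v3 += DE * DE / 2;
            v3 += DE * DE / 2;  /* exact: DE*DE */
            v3 += DE;           /* exact: DE + DE*DE is representable */

            printf("(2') = %a\n", v2);  /* 0x1p-52               (DE)       */
            printf("(3') = %a\n", v3);  /* 0x1.0000000000001p-52 (DE+DE*DE) */
            return (0);
    }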
Efficient evaluation of polynomials usually requires regrouping them in
numerically dangerous ways.  I do some analysis for this.

Bruce

Yes, everywhere I said "double precision" I meant "doubled precision."

I think the papers by Hull et al. were perfectly happy with a ULP of around
4.  I have been trying to do a little better, but like you I am noticing
that log1p isn't that good either.

I have tried some other things.  I am attaching this example, which gets a
ULP a little over 2.  I simulate high precision arithmetic by expanding
everything out into integers.  I certainly didn't aim for a speedy program.

On 07/28/2012 12:25 AM, Bruce Evans wrote:
> No, even if all the signs are the same, adding from the highest to lowest
> can lose precision.  Normally at most 1 ulp, while cancellation can lose
> almost 2**MANT_DIG ulps.  Example:
>
>     #define DE DBL_EPSILON  // for clarity
>
>     (1) 1 + DE/2 = 1                  (half way case rounded down to even)
>     (2) 1 + DE/2 + DE/2 = 1           (double rounding)
>     (3) DE/2 + DE/2 + 1 = 1 + DE      (null rounding)
>
> We want to add -1 to a value near 1 like the above.  Now a leading 1
> in the above will cancel with the -1, and the order in (3) becomes
> the inaccurate one.

Yes, but in my situation, I am rather sure that when I am adding from
highest to lowest this won't occur.  I am starting with -1, then adding
something close to 1, then adding lots of smaller terms.  And I find it very
plausible that the kind of situation you describe won't happen.  x0*x0 is
close to 1.  x0*x1 is at most sqrt(DE) times smaller.  And so on.  So I
think the kind of situation you describe should never happen.

As I said, I don't have a mathematical proof that the kind of thing you
describe can NEVER happen.  I just have never observed it happen.

On Sat, 28 Jul 2012, Stephen Montgomery-Smith wrote:

> Yes, but in my situation, I am rather sure that when I am adding from
> highest to lowest this won't occur.  I am starting with -1, then adding
> something close to 1, then adding lots of smaller terms.  And I find it
> very plausible that the kind of situation you describe won't happen.
> x0*x0 is close to 1.  x0*x1 is at most sqrt(DE) times smaller.  And so on.
> So I think the kind of situation you describe should never happen.

Ahem.
FP^2 space is not nearly as large as the metaverse (only 2^256 cases even
for sparc64), but it is large enough that almost everything that can happen
in it does happen in it.  You are right that problems are far away with the
x* terms (x0*x0 had better not be very close to 1 unless it is exactly 1,
since if it is too close then it will no longer be many times larger than
x0*x1 after subtracting 1 from it; the other cases for x* are simpler).  The
problem is with the additional y* terms.  x and y are independent, so for
many or most x, there are many y's with bits that cause half-way cases when
combined with x.  After splitting and squaring, the bits move around, so it
is hard to generate or control the offending y's.

> As I said, I don't have a mathematical proof that the kind of thing you
> describe can NEVER happen.  I just have never observed it happen.

There might be a measly 2^128 bad cases out of 2^256.  Then no one would
even observe them by chance :-).  But half-way cases are fairly common.

Bruce

On 07/28/2012 02:35 AM, Bruce Evans wrote:
> Ahem.  FP^2 space is not nearly as large as the metaverse (only 2^256
> cases even for sparc64), but it is large enough that almost everything
> that can happen in it does happen in it.
> ...
> There might be a measly 2^128 bad cases out of 2^256.  Then no one would
> even observe them by chance :-).  But half-way cases are fairly common.

I agree.  That is why I am sad that I don't have a mathematical proof.  And
the probability of picking out the bad example by chance is something like
the chance of being hit by a large asteroid, so we will never see it happen
in a random experiment.

OK.  This clog really seems to work.

x*x + y*y - 1 is computed with a ULP less than 0.8.  The rest of the errors
seem to be due to the implementation of log1p.  The ULP of the final answer
seems to be never bigger than a little over 2.
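Stephen's program simulates the extra precision with integers and is not
reproduced in this report, so the following is only a generic sketch of how
x*x + y*y - 1 can be accumulated with more than double precision, using the
standard Veltkamp split and Dekker product.  The helper names are invented,
and it assumes round-to-nearest and no FMA contraction of the compensation
terms:

    #include <math.h>

    /* Veltkamp split: x == *hi + *lo with *hi carrying the upper ~26 bits. */
    static void
    split(double x, double *hi, double *lo)
    {
            double t = x * 134217729;       /* 2^27 + 1 */

            *hi = t - (t - x);
            *lo = x - *hi;
    }

    /* Dekker product: x*y == *hi + *lo exactly (barring over/underflow). */
    static void
    two_prod(double x, double y, double *hi, double *lo)
    {
            double xh, xl, yh, yl;

            *hi = x * y;
            split(x, &xh, &xl);
            split(y, &yh, &yl);
            *lo = ((xh * yh - *hi) + xh * yl + xl * yh) + xl * yl;
    }

    /*
     * Sketch of x*x + y*y - 1 for sqrt(1/2) <= x <= 1 and 0 <= y <= x, so
     * that x*x >= 0.5 and subtracting 1 from the high product is exact.
     * The low parts are added last as a correction; the code being
     * discussed is considerably more careful about how the partial sums
     * are ordered and carried.
     */
    static double
    xxpyym1_extra(double x, double y)
    {
            double xxh, xxl, yyh, yyl;

            two_prod(x, x, &xxh, &xxl);
            two_prod(y, y, &yyh, &yyl);
            return (((xxh - 1) + yyh) + (xxl + yyl));
    }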
On 07/28/2012 10:46 AM, Stephen Montgomery-Smith wrote:
> OK.  This clog really seems to work.
>
> x*x + y*y - 1 is computed with a ULP less than 0.8.  The rest of the
> errors seem to be due to the implementation of log1p.  The ULP of the
> final answer seems to be never bigger than a little over 2.

Also, I don't think the problem is due to the implementation of log1p.  If
you do an error analysis of log(1+x) where x is about exp(-1)-1, and x is
correct to within 0.8 ULP, I suspect that about 2.5 ULP is the best you can
do for the final answer:

    relative_error(log(1+x)) = fabs(1/((1+x) log(1+x))) * relative_error(x)
                             = 1.58 * relative_error(x)

Given that log1p has itself a ULP of about 1, and the relative error in x is
0.8, and considering x=exp(-1)-1, this gives a ULP at around 1.58*0.8+1 =
2.3.  And that is what I observed.

(Here "=" means approximately equal to.)

Here are some diffs to catrig.c so that it completely passes Peter Jeremy's
program www.rulingia.com/~peter/ctest.c.  That is, it seems to get all the
infs and nans correct.

On 07/28/2012 11:15 AM, Stephen Montgomery-Smith wrote:
> Also, I don't think the problem is due to the implementation of log1p.
> ...
> Given that log1p has itself a ULP of about 1, and the relative error in x
> is 0.8, and considering x=exp(-1)-1, this gives a ULP at around
> 1.58*0.8+1 = 2.3.  And that is what I observed.
>
> (Here "=" means approximately equal to.)

And I should add that I just realized that ULP isn't quite the same as
relative error, so an extra factor of up to 2 could make its way into the
calculations.

On Sat, 28 Jul 2012, Stephen Montgomery-Smith wrote:
> Given that log1p has itself a ULP of about 1, and the relative error in x
> is 0.8, and considering x=exp(-1)-1, this gives a ULP at around
> 1.58*0.8+1 = 2.3.  And that is what I observed.

Not given:
- the error in log1p is close to 0.5 ulps in my version
- when the extra internal precision of my log1p is brought out, the error
  will be a maximum of about 1/2**7 ulps before rounding.  This will only
  save about half an ulp after rounding.  (There are accumulative errors of
  about half an ulp from the other 2+ steps.)  Extra precision for these is
  further off.

I made progress towards completing exhaustive testing of this point for all
cases near 1 for clogf.
All cases tested (at least 99% of all cases possible) were perfectly
rounded.  There were an amazingly (before doing the analysis below) large
number of half-way cases near (1 + ~1e-10*I).  This is an easy special case
for clogf() -- x is precisely 1, so it reduces immediately to
log1pf(y*y)/2 -- but when x is precisely 1 there are many half-way cases.
Due to numerical accidents, it turns out that many cases are correctly
rounded in float precision but not in double precision.

My testing is not quite complete because it doesn't verify that the
accidents aren't mostly due to my test methodology.  Part of the methodology
is:

- start with float args.  These have only 24 mantissa bits, so after
  promotion to double and long double precision they have many lower bits
  0.  This is what gives the many half-way cases that are unresolvable in
  long double precision after y is squared.  The magic arg values (y near
  1e-10) are what is needed for squaring to push the lowest 1 bit into an
  unfortunate place.

- in both clogf() and clog(), return log1pl((long double)y*y)/2 (rounded to
  float precision), so that we see the inherent inaccuracy of clogf() and
  not external inaccuracy from log1pf().  This is uninteresting for x = 1
  since there are no internal extra-precision calculations to verify for
  that case, but it is interesting for the general case to verify that the
  doubled float precision algorithm is exact.  I don't want the general case
  hidden by errors for this case, so I evaluate y*y in long double precision
  so as to minimise rounding errors in it, although the normal inaccurate
  y*y is part of clogf() (we could use doubled precision for it, but we
  don't, because we know that all the extra precision would be lost when it
  is passed to log1pf(); except when y*y is near underflow, we are more
  careful).

- it turns out that the extra precision of log1pl() is enough for perfect
  rounding in float precision but not for perfect rounding in double
  precision.  This is no accident.  log1pl() gives only 11 extra bits for
  double precision, but 40 extra for float precision.  A simple
  probabilistic argument shows that 40 is always enough unless we are
  unlucky.  The chance of a half-way case that is not resolvable in long
  double precision is about 2**-11 and 2**-40 in the 2 cases, respectively.
  There are less than 2**24 args of interest (all x between sqrt(2)-epsilon
  and 1, corresponding y near sqrt(1-x*x)).  We would have to be much more
  unlucky to hit a half-way case with so many fewer cases in float
  precision.  Assuming that the half-way cases are uniformly distributed,
  which they aren't, the probabilities for _not_ hitting a half-way case
  would be related to:

      (1-2**-40)**(2**24) = 0.999985 for float precision
      (1-2**-11)**(2**53) = <too small for pari>, but much smaller than
      (1-2**-11)**(2**39) = 3.315e-116608509 for double precision

  This calculation is certainly off by many powers of 2 even in float
  precision.  It's a bit surprising that the probability is so high for
  2**24 cases (no birthday paradox with uniform distribution).

Bruce

On 07/28/2012 06:12 PM, Stephen Montgomery-Smith wrote:
> On 07/28/2012 11:15 AM, Stephen Montgomery-Smith wrote:
>> On 07/28/2012 10:46 AM, Stephen Montgomery-Smith wrote:
>>> OK.  This clog really seems to work.
>>>
>>> x*x + y*y - 1 is computed with a ULP less than 0.8.  The rest of the
>>> errors seem to be due to the implementation of log1p.  The ULP of the
>>> final answer seems to be never bigger than a little over 2.
>>
>> Also, I don't think the problem is due to the implementation of log1p.
>> ...
>> Given that log1p has itself a ULP of about 1, and the relative error in x
>> is 0.8, and considering x=exp(-1)-1, this gives a ULP at around
>> 1.58*0.8+1 = 2.3.  And that is what I observed.
>
> And I should add that I just realized that ULP isn't quite the same as
> relative error, so an extra factor of up to 2 could make its way into
> the calculations.

In fact, I think I messed it up a bunch.

So let D(f(x)) denote the absolute error in f(x).  D(f(x)) = f'(x) Dx.  So
D(log(1+x)) = Dx/(1+x).

Suppose x is a bit bigger than exp(-1)-1 = -0.63, which has ilogb = -1.  If
the ULP error in calculating x is around 0.8, then Dx = 0.8 * 2^(-d-1),
where d is the number of bits in the mantissa.  So D(log(1+x)) = Dx/0.37.
Since log(1+x) is a little bit bigger than -1, ilogb(log(1+x)) = -1, and so

    ULP(log(1+x)) = Dx/0.37 * 2^(d+1) = 0.8/0.37 = 2.2

Now add 1 for the ULP in calculating log1p, and this gives a ULP of 3.2.  So
the observed bound is actually better than expected.  If one could get the
ULP of log1p to be as good as possible (0.5), the best ULP one could get is
2.7.  We still do a bit better than that.

On 07/28/2012 04:46 PM, Stephen Montgomery-Smith wrote:
> Here are some diffs to catrig.c so that it completely passes Peter
> Jeremy's program www.rulingia.com/~peter/ctest.c. That is, it seems to
> get all the infs and nans correct.
And I think I messed up these diffs as well. Can we try this instead?
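To put rough numbers on the D(log(1+x)) = Dx/(1+x) estimate above, the
following is just a numeric restatement of the corrected calculation (it
treats the ulp scales of x and log(1+x) as equal, as the derivation does;
nothing here is from the patch):

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
            double x = expm1(-1.0);            /* about -0.63, the case above */
            double magnify = 1.0 / (1.0 + x);  /* d log(1+x) / dx, about 2.7  */

            printf("absolute-error magnification: %.2f\n", magnify);
            printf("0.8 ulp in x -> about %.1f ulp in log1p(x)\n",
                0.8 * magnify);                /* about 2.2 */
            printf("plus ~1 ulp from log1p itself -> about %.1f ulp\n",
                0.8 * magnify + 1);            /* about 3.2 */
            return (0);
    }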
On Sat, 28 Jul 2012, Stephen Montgomery-Smith wrote:
> On 07/28/2012 11:15 AM, Stephen Montgomery-Smith wrote:
>> On 07/28/2012 10:46 AM, Stephen Montgomery-Smith wrote:
>>> OK. This clog really seems to work.
>>>
>>> x*x + y*y - 1 is computed with a ULP less than 0.8. The rest of the
>>> errors seem to be due to the implementation of log1p. The ULP of the
>>> final answer seems to be never bigger than a little over 2.
> ...
> And I should add that I just realized that ULP isn't quite the same as
> relative error, so an extra factor of up to 2 could make its way into the
> calculations.
Yes, this is tricky. For denormals, it is easy to be off by a factor of
2**MANT_DIG or infinity instead of only 2.
For normals, the most interesting cases are near powers of 2 (say 1).
One ulp is twice as large for values in [1, 2) as it is for values
in [0.5, 1). Even to determine which one to use, you need to know
if the infinitely precise result is >= 1 or < 1, else you may be
off by a factor of 2 in the error checking. If the factor is 1/2,
then it hides errors, and if it is 2 then it gives unexpected errors.
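That factor-of-2 jump in the ulp at 1 is easy to see directly; a two-line
check (not from any test program mentioned here):

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
            /* The gap just above 1 is twice the gap just below 1. */
            printf("%a\n", nextafter(1.0, 2.0) - 1.0);  /* 0x1p-52, DBL_EPSILON   */
            printf("%a\n", 1.0 - nextafter(1.0, 0.0));  /* 0x1p-53, DBL_EPSILON/2 */
            return (0);
    }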
For denormals, the easiest case to understand is when the correctly
rounded case is the smallest strictly positive denormal. Then the
size of an ulp is the same as the value of this denormal. A rounding
error of < 1 ulp (but > 0.5 ulps) may give a result of 0 ulps or 2 ulps.
Such errors are to be expected. But relatively, they are infinity and
2, respectively. Normally you expect rounding errors of near
2**-MANT_DIG, and infinity and 2 are much larger. The relative error
should be scaled (like the size of an ulp should be) so that you don't
see such large errors.
You might not care about denormals, but you should check a few of them
and then you don't want errors in the checking software for them hiding
errors for normals. Without denormals, there would be no gradual
underflow, and underflow from the smallest normal to 0 really would be
essentially infinity (it would be about 2**MANT_DIG in ulps).
My checking software scales for denormals, but I think I got it wrong
and am off by a factor of 2. For normals near a power of 2, it's just
impossible to determine the right scale without a reference function
with enough extra precision to determine which side it is on. Even
the correct definition of an ulp is unclear near powers of 2 (perhaps
other cases). I want my checking program to match my definition, which
is currently just what the checking program does :-) -- don't worry
about the reference function not being precise enough. There is the
same uncertainty about the size of an ulp for a result of 0 (and
more -- maybe the result should be -0 :-).
Recently I started checking extra internal precision for some functions.
This gives relative errors of < 2**(-MANT_DIG+N), where N is the extra
precision. When the relative error is near 2**-MANT_DIG, it is hard
to tell if it is smaller than 1 ulp, since the ulp scale may be hard
to determine and the relative error doesn't map simply to ulps. When
N >= 2, there is a factor of 2 to spare and final errors (after
rounding) that are certain to be < 1 ulp are easy to ignore. Even
functions that don't have much extra internal precision but which
meet my design goal of a final error < 1 ulp should have N at
least 1, so their non-errors should be easy to not see too, but I have
used this mainly for functions with N about 7. The internal errors
for the recently committed logl() are < 2**-120 relative on sparc64,
except for denormal results. This is obviously enough for 113-bit
precision to 0.5+epsilon ulps. But rounding would cluster the final
errors near 2**-113 or 2**-114 and it wouldn't be obvious how this
maps to ulps.
Bruce
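For readers following the ulp discussion, one common way to implement the
scaling Bruce describes is to take the ulp at the reference value.  This is
not Bruce's checking program (which is not shown in this report), only a
sketch of that convention:

    #include <float.h>
    #include <math.h>

    /*
     * Error of a computed double against a higher-precision reference,
     * measured in ulps of the reference.  Results just below a power of 2
     * get the smaller ulp, and everything in the denormal range (including
     * a reference of 0) uses the fixed denormal ulp DBL_MIN * DBL_EPSILON,
     * so a computed 0 versus a tiny nonzero reference scores a few ulps
     * rather than infinity.  Exactly at powers of 2 the choice of ulp is
     * still a convention, as noted above.
     */
    static long double
    ulp_error(double computed, long double ref)
    {
            long double ulp;

            if (fabsl(ref) < (long double)DBL_MIN)
                    ulp = (long double)DBL_MIN * DBL_EPSILON;
            else
                    ulp = scalbnl(1.0L, ilogbl(ref) - (DBL_MANT_DIG - 1));
            return (fabsl((long double)computed - ref) / ulp);
    }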
I found a bug in catanh when |z| is very large.  I believe I have corrected
the bug in catrig.c.  I made some other small changes also, mostly style and
comments.

I'm going to stop posting updates here, and keep the latest version here:
http://people.freebsd.org/~stephen/

Responsible Changed From-To: freebsd-bugs->freebsd-numerics
Redirect to -numerics.

Batch change: For bugs that match the following
- Status Is In progress AND
- Untouched since 2018-01-01 AND
- Affects Base System OR Documentation
DO: Reset to open status.

Note: I did a quick pass, but if you are getting this email it might be
worthwhile to double check to see if this bug ought to be closed.

(In reply to Eitan Adler from comment #23)

Yes, I'll go ahead and close it.  It has been mostly committed.