
I did my PhD on measuring floating-point error in large numerical simulations. In my experience, most of the time, if 64-bit floats are not enough, then trying to increase precision further is unlikely to help much (there are exceptions).

In general, a well-placed compensated (or exact) sum or dot-product algorithm will solve most of your problems at low cost. Most people are now familiar with Kahan summation (and related tricks like sorting the numbers), but they do not realize that it often helps less than switching to 128-bit floats would, and that the state of the art for these algorithms has moved quite a bit since. Nowadays, you can ensure that your sum/BLAS operation produces a result within one ulp of the exact result!
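
To make this concrete, here is a minimal sketch (Python, purely as an illustration) of classic Kahan summation next to the naive loop and math.fsum, which implements Shewchuk's exact algorithm and returns the correctly rounded sum, i.e. the "within one ulp" behaviour mentioned above. The data set is contrived so the effect is visible:

    import math

    def kahan_sum(values):
        total = 0.0
        c = 0.0                  # running compensation for lost low-order bits
        for v in values:
            y = v - c            # apply the correction carried from previous steps
            t = total + y        # low-order bits of y may be lost here...
            c = (t - total) - y  # ...but are recovered algebraically into c
            total = t
        return total

    # One large term followed by many terms that individually vanish against it.
    data = [1.0] + [1e-16] * 1_000_000

    print(sum(data))         # naive: 1.0, every small term is rounded away
    print(kahan_sum(data))   # compensated: ~1.0000000001
    print(math.fsum(data))   # exact (correctly rounded): 1.0000000001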

I highly recommend perusing the Handbook of Floating-Point Arithmetic (the Accurate Algorithms online page [0] also looks fine) to get an idea of what can be done. Implementations of these ideas can be found in most languages used for numerical computation (a quick Google search gave me options for C++/Python [1], Rust [2], etc.).

[0]: https://accurate-algorithms.readthedocs.io/en/latest/index.h...

[1]: https://github.com/yafshar/xsum

[2]: https://github.com/bsteinb/accurate



Here's an example where higher precision could be useful. You need to compute the expectation of some function over a distribution, and all you have is the moment generating function (MGF) of that distribution. The moments are the derivatives of the MGF, but a rough rule of thumb is that if the precision of a function is epsilon, then the precision of its n-th derivative is epsilon^(1/n). If you have 12 significant digits for the MGF, you can count on only 1 significant digit for the 12th derivative. Let's say that is simply unacceptable for you: you need 4 digits, so you can only use the first 3 derivatives. If you double the precision, you can use 6 derivatives, so you can match 6 moments, which might be enough for your problem.

Of course, you can find better algorithms: use cumulants instead of moments, devise a Gaussian quadrature method (via the Golub-Welsch algorithm) that uses them, etc. But that requires doing a lot of hard math (with the inevitable initial bugs), while plugging in a higher-precision library gives you results immediately.
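
A quick back-of-the-envelope check of that rule of thumb, just to spell out the arithmetic (Python, with the digit counts from the paragraph above):

    # epsilon**(1/n) means the n-th derivative keeps roughly digits/n of the
    # MGF's significant digits, so requiring 4 digits caps the usable moments.
    for mgf_digits in (12, 24):
        usable = max(n for n in range(1, 50) if mgf_digits / n >= 4)
        print(f"{mgf_digits}-digit MGF -> {usable} derivatives with at least 4 digits")
        # prints 3 for 12 digits, 6 for 24 digits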


Yes! And no!

You will indeed need higher precision if you need a very large number of digits (more than what double precision gives you).

However, even with high precision, the algorithms you use to compute your moments matter: you might not get as many digits as you hope for. Variance, for example, is famously numerically unstable when computed naively (sometimes even coming out negative). A stable summation algorithm plugged into a variance computation (even a carefully written one) can sometimes give you six additional digits of precision that you would not get by doubling the precision. It depends on how large a cancellation your operations produced: too large, and doubling the precision will not be enough to recover, while a compensated summation would do fine.
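
To illustrate that cancellation (a sketch only, with numpy and synthetic data standing in): the textbook one-pass formula E[x^2] - E[x]^2 can lose every digit for data with a large mean, while a two-pass computation whose residuals go through an exact accumulator does not.

    import math
    import numpy as np

    rng = np.random.default_rng(0)
    x = 1e9 + rng.standard_normal(100_000)       # mean ~1e9, true variance ~1

    # Naive one-pass formula: mean of squares minus square of the mean.
    naive = float(np.sum(x * x) / x.size - (np.sum(x) / x.size) ** 2)

    # Two-pass formula with an exact (compensated) accumulator for the residuals.
    mean = math.fsum(x) / x.size
    stable = math.fsum((xi - mean) ** 2 for xi in x) / x.size

    print(naive)    # typically has no correct digits, and can even be negative
    print(stable)   # close to the true variance of about 1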

I have seen this play out once in a clustering algorithm that would only work on some data when using a compensated summation to compute the variance.


Well, my point is that it's not as easy as saying "just pick a smarter algorithm". Not all problems needing high precision can resort to a smart summation. Sometimes that smart algorithm just doesn't exist yet.

You are also right that, if higher precision is what solves your problem, just quad precision is unlikely to be enough. Take the inverse Laplace transform, for example: in some cases you need thousands of digits. The cases where double precision is not enough but quad precision is are probably rare. Still, it's worth a try.
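
For what it's worth, this is the kind of place where an arbitrary-precision library earns its keep. A sketch with mpmath (assuming a recent version, which ships an invertlaplace routine); the working precision is set far beyond double because methods like Gaver-Stehfest shed digits rapidly as you add terms:

    from mpmath import mp, invertlaplace, exp

    mp.dps = 100                   # 100 decimal digits of working precision

    F = lambda s: 1 / (s + 1)      # Laplace transform of f(t) = exp(-t)

    t = mp.mpf(1)
    print(invertlaplace(F, t, method='talbot'))
    print(exp(-t))                 # reference value for comparison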


That’s what I was taught about numerics while doing my PhD in the 1990s: you shouldn’t even use doubles if you could get away with singles.


That's also my experience.

It's interesting that 64 bits are often necessary (as opposed to 32) but 128 bits so rarely are.


Yes! I suspect that it boils down to 32 bits not being that much (you can enumerate them to test a function), while 64 bits is a lot. By that point, if you are having problems, then you are likely hitting a wall that will not be crossed by throwing additional bits at the problem.
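
As an aside, "enumerate them" is quite literal: all 2^32 bit patterns can be walked in chunks and a single-precision function compared against a double-precision reference. A rough sketch (numpy assumed; the reciprocal square root is just a placeholder for whatever function you want to test, and the helper name is made up):

    import numpy as np

    def max_rel_error_float32(f32, f64, chunk=1 << 24):
        worst = 0.0
        for start in range(0, 1 << 32, chunk):
            bits = np.arange(start, start + chunk, dtype=np.int64).astype(np.uint32)
            x = bits.view(np.float32)
            x = x[np.isfinite(x) & (x > 0)]   # finite, positive inputs only here
            if x.size == 0:
                continue
            ref = f64(x.astype(np.float64))
            rel = np.abs(f32(x).astype(np.float64) - ref) / np.abs(ref)
            worst = max(worst, float(rel.max()))
        return worst

    # Example: float32 reciprocal square root against the float64 reference.
    print(max_rel_error_float32(lambda x: np.float32(1) / np.sqrt(x),
                                lambda x: 1.0 / np.sqrt(x)))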


Thanks, this is very helpful. Two questions:

1. What are the implications when parallelization is considered? I.e., are those accurate sum algorithms parallelizable, and how do they compare to a traditional parallelized sum?

2. What performance penalty should I be expecting?


1. Parallelization: Yes, they do parallelize! Conceptually, it is fairly easy to use an accurate algorithm locally, then reduce the local accumulators. A side effect of doing this is cutting (or at least reducing) the noise introduced in the output by the parallelization (not to be confused with deterministic summation/BLAS, which enforces the order of operations to achieve the same side effect but does not improve accuracy).
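
A sketch of that "accurate locally, then reduce" structure (Python's math.fsum standing in for the local exact accumulator; note the final result can still differ from an exact whole-array sum by a rounding or so, unless the workers exchange their full error terms rather than one rounded partial each):

    import math
    from concurrent.futures import ProcessPoolExecutor

    def accurate_parallel_sum(values, workers=4):
        chunks = [values[i::workers] for i in range(workers)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            partials = list(pool.map(math.fsum, chunks))
        return math.fsum(partials)          # reduce the per-worker accumulators

    if __name__ == "__main__":
        data = [1e16, 1.0, -1e16] * 1_000_000   # heavy cancellation on purpose
        print(sum(data))                        # naive sequential sum: 0.0
        print(accurate_parallel_sum(data))      # 1000000.0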

2. It varies wildly depending on the algorithm you use, the implementation of said algorithm, and the operations you perform outside of the summation (if you sum the results of a loop performing non-trivial operations, that work can easily overlap with the summation overhead, making it essentially free). The only rule of thumb I can give is that runtime cost increases with the desired accuracy. My recommendation would be to switch from a naive sum to an exact accumulator and observe the difference in the output value (telling you how impactful numerical errors in that sum were) and in the runtime (telling you how expensive the switch is for you), then dial it down if needed.
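
The crudest possible version of that measurement, as a sketch (adapt the data and the accurate accumulator to your own setup):

    import math
    import random
    import time

    data = [random.uniform(-1e12, 1e12) for _ in range(1_000_000)]

    for name, fn in (("naive sum", sum), ("exact fsum", math.fsum)):
        t0 = time.perf_counter()
        result = fn(data)
        dt = time.perf_counter() - t0
        print(f"{name}: {result!r}  ({dt * 1e3:.1f} ms)")

The gap between the two values tells you how much the rounding errors mattered; the runtime ratio tells you what the accuracy costs.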


Thanks! That’s very interesting! Now I need to find my next project to try this…



