> I've extracted a small subset of the data to graph in Excel
Thanks. If you still have the data (or if you can regenerate it) it would actually be possible to make a few small graphs that would cover the whole set:
Basically, the idea is to have as many x points as there are different exponents, which in 32-bit floats is at most 256. Then on the y axis one plots the number of bits of maximum distance between the really correct value of the mantissa and the calculated mantissa over the whole interval of one exponent. Those graphs would allow comparing the implementations of Intel and AMD separately, and they are what I'd be very interested to see. So the idea is to find the maximums within each interval, and there is then only a limited number of intervals. Only if such graphs match between AMD and Intel would it be interesting to compare inside the intervals; differences at the per-interval level are the ones I'd expect to be the obvious ones causing the most problems, where results like those in the article (full black instead of a shadow) wouldn't be surprising.
For that you don't have to juggle huge files; it's only 256 values per CPU in that pass that need to be compared.
I can re-generate the data files, it's a couple pages of code to copy-paste from my gist and compile, but I'm not sure how to implement what you wrote.
The results are not guaranteed to have the same exponent, e.g. 1 / 0.499999 can be either 1.999999 or 2.0000001; both are correct within the precision, but they have different exponents.
1) Use the binary representation of the numbers! To do so, cast the resulting float to an unsigned integer, then use bit masks and shifts to extract the exponent and mantissa. Note that in the IEEE format the leading 1 is implicit rather than explicit, unless it's a denormal number, so make it always explicit during the extraction.
2) Use the exponent of the correct result as the "interval" reference.
If the exponents are the same, subtract the smaller mantissa from the bigger one; that's the absolute "distance" between the two numbers -- the goal is to find the biggest absolute distance in each interval.
3) If one of the 2 values being compared has a different exponent, they can be brought to the same exponent by a bit shift: shift the mantissa of the one with the bigger exponent left accordingly. Again do the subtraction and use the result as the absolute distance, maintaining a maximum for each interval. A small sketch of all three steps follows below.
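Something like this minimal scalar sketch, assuming finite, same-sign values whose exponents differ by at most a few (the names are mine, just for illustration):

    #include <algorithm>
    #include <cstdint>
    #include <cstring>

    // Step 1: reinterpret the float's bits, pull out the biased exponent and
    // the mantissa, and make the implicit leading 1 explicit (non-denormals).
    static void extractBits(float f, uint32_t &exponent, uint32_t &mantissa)
    {
        uint32_t bits;
        std::memcpy(&bits, &f, sizeof(bits));
        exponent = (bits >> 23) & 0xFF;   // 8-bit biased exponent
        mantissa = bits & 0x7FFFFF;       // 23 stored mantissa bits
        if (exponent != 0)
            mantissa |= 0x800000;         // explicit leading 1
    }

    // Steps 2 and 3: the interval index is the biased exponent of the
    // *correct* result; if the exponents differ, shift the mantissa of the
    // value with the bigger exponent left, subtract, and keep the
    // per-interval maximum of the absolute distance.
    static void accumulate(float correct, float computed, uint32_t maxDist[256])
    {
        uint32_t expC, manC, expX, manX;
        extractBits(correct, expC, manC);
        extractBits(computed, expX, manX);

        if (expC > expX)
            manC <<= (expC - expX);
        else if (expX > expC)
            manX <<= (expX - expC);

        const uint32_t dist = (manC > manX) ? (manC - manX) : (manX - manC);
        maxDist[expC] = std::max(maxDist[expC], dist);
    }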
In short, think binary, not decimal, and measure using these values. The binary values are the only ones that matter; the decimal representation doesn't necessarily reflect the exact values of the bits.
Examples:
float 1.0 == unsigned 0x3f800000: here the exponent is 127 == 2^0 and the mantissa is 0, with the implicit 1 at the start, i.e. explicit: 0x800000
float 0.999999940395355224609375 == unsigned 0x3f7fffff: here the exponent is 126 == 2^-1 and the explicit mantissa is 0xffffff
The absolute distance between these two numbers is 1: adding one to the lowest bit of the mantissa of the smaller number gives the higher number, 0xffffff + 1 = 0x1000000, the latter being the mantissa of the higher number adjusted to the exponent of the smaller one (0x800000 << 1). If the "correct" number was 0x3f800000, then even though a shift was needed to calculate the absolute distance, the interval is still 0 (i.e. 0 is the x axis value, as the correct exponent was 2^0, biased 127), and the value to be plotted on y is 1 until a bigger distance occurs.
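If it helps, here's this worked example run through the hypothetical sketch above (same made-up names, indexing the array by the biased exponent):

    int main()
    {
        uint32_t maxDist[256] = {};
        accumulate(1.0f, 0.999999940395355224609375f, maxDist);
        // maxDist[127] == 1: mantissas 0x1000000 (0x800000 << 1) vs 0xffffff;
        // biased exponent 127 corresponds to the 2^0 interval, x value 0.
        return (maxDist[127] == 1) ? 0 : 1;
    }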
For more examples of the format you can play here:
Also note that a few exponents are special, meaning infinity or NaN. Whenever the "correct" answer is not a NaN or infinity but the "incorrect" is, that should be treated specially, if it actually happens.
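If it does happen, one possible way to detect it with the same bit layout as above (hypothetical helper, not part of any existing code):

    #include <cstdint>

    // Biased exponent 0xFF means infinity (stored mantissa 0) or NaN
    // (stored mantissa != 0); such results shouldn't go into the distance math.
    static bool isSpecial(uint32_t bits)
    {
        return ((bits >> 23) & 0xFF) == 0xFF;
    }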
The Total, exact, less, and greater columns contain counts of floats in a bucket. The sum of the "Total" column gives 2^32, the total count of unique floats.
Computing the highest bit that's different is too slow for this use case; neither SSE nor AVX has a vector version of the BSR instruction. Instead, I'm re-interpreting the floats as integers and computing the difference of the integers. The maxLess, maxGreater, and maxAbs columns have that maximum error, measured as a count of float values. A value of 4989 means something like the 12-13 lowest bits of the mantissa were incorrect.
Source code is here: https://github.com/Const-me/SimdPrecisionTest/blob/master/rc...
It's not particularly readable because I've used AVX2 and OpenMP, but this way it takes less than a second on a desktop, and maybe 1.5 seconds on a laptop, to process all of these floats.
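For anyone who doesn't want to dig through the AVX2 version, a plain scalar sketch of that metric could look roughly like this (my names, not the actual code from the repo); it assumes both values are positive and finite, so the integer ordering matches the float ordering:

    #include <cstdint>
    #include <cstring>

    // Error measured as the count of representable floats between the two
    // values: reinterpret each float as a 32-bit integer and subtract.
    static uint32_t floatDistance(float correct, float computed)
    {
        uint32_t a, b;
        std::memcpy(&a, &correct, sizeof(a));
        std::memcpy(&b, &computed, sizeof(b));
        return (a > b) ? (a - b) : (b - a);
    }

    // e.g. a maximum of 4989 falls between 2^12 = 4096 and 2^13 = 8192,
    // hence "the 12-13 lowest mantissa bits incorrect" mentioned above.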
Based on how I understand the numbers in the tables, it looks to me like both implementations behave the same at the critical points, and AMD obviously achieves a smaller distance from the "exact" value, but has some kind of truncating instead of rounding logic which changes the distribution of the approximations, and you discovered that by counting "less" and "greater." Congratulations!