    #if defined(__AVX512F__) || defined(__AVX2__)
    #include <immintrin.h> // Provides the MXCSR helper macros used below
    void configure_x86_denormals(void) {
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);         // Flush denormal results to zero
        _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON); // Treat denormal inputs as zero
    }
    #endif
- `f64` throughput grew from 0.2 to 8.2 TFLOPS.
- `f32` throughput grew from 0.6 to 15.1 TFLOPS.
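
To see what the two MXCSR flags change numerically, the following is a minimal sketch, assuming an x86-64 target where scalar `float` math goes through SSE. The helper names `scale_smallest_normal` and `flush_to_zero_takes_effect` are hypothetical and only serve the illustration. The sketch multiplies `FLT_MIN` by `0.3f`, which falls below the normal range, once with flushing off and once with it on, then restores the caller's MXCSR state.

```c
#include <float.h>     // FLT_MIN: smallest positive normal float
#include <immintrin.h> // _mm_getcsr, _mm_setcsr, and the FTZ/DAZ macros

// Hypothetical helper: produce a value below the normal float range.
// The volatile accesses keep the multiply at run time and pin it
// between the surrounding MXCSR updates.
static float scale_smallest_normal(void) {
    volatile float tiny = FLT_MIN;
    volatile float scale = 0.3f;
    volatile float product = tiny * scale;
    return product;
}

// Hypothetical helper: returns 1 if enabling FTZ/DAZ turns the
// subnormal product into exactly zero, 0 otherwise.
static int flush_to_zero_takes_effect(void) {
    unsigned int saved = _mm_getcsr(); // remember the caller's MXCSR state

    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_OFF);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_OFF);
    float before = scale_smallest_normal(); // expected: tiny but nonzero

    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    float after = scale_smallest_normal();  // expected: flushed to zero

    _mm_setcsr(saved); // restore the original rounding and flush settings
    return before > 0.0f && after == 0.0f;
}
```

MXCSR is per-thread state, so a setup function like `configure_x86_denormals` likely needs to run on every worker thread that performs the arithmetic, not just once on the main thread.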