And... it raises the very pointed question of WHY they are getting better performance when all the performance-critical code is written in C/assembler in the Intel library. It seems inconceivable that 75% of the CPU profile isn't being spent in the Intel crypto library. In which case, big fat so what?
The question is: are they cheating?
Could it possibly be that they have (somewhat suicidally) chosen to force the AVX-512 execution path, when more reasonable implementations have decided that it's not really worth risking halving the performance of EVERY OTHER TASK ON THE ENTIRE COMPUTER in order to use AVX-512 for a performance gain that isn't going to matter except in the very tiniest slice of use cases -- big iron running on the edge with dozens (hundreds?) of gazillion-bit/s network adapters, doing nothing but streaming TLS connections? Plus there's the fact that you'd have to lock your TLS encryption code to a particular CPU core on previous-generation CPUs, which is also a Really Bad Thing To Do for a TLS transfer.
I rather suspect it's entirely that.
Even on latest-generation Intel CPUs it's not clear whether using AVX-512 for TLS is a sensible choice. AVX-512 still drops the processor frequency by 10% on latest-gen CPUs. So every core on the entire CPU would have to be spending 80% (60%?) of its time running TLS crypto code in order to realize an actual benefit from using AVX-512 crypto code.
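To make that break-even argument concrete, here is a rough back-of-the-envelope sketch. It assumes a very simple model that is mine, not anything measured: the clock drop applies to all code on the core, and AVX-512 speeds up only the crypto portion by some per-clock factor. The function names and the `crypto_speedup` parameter are hypothetical, and the speedup value is a pure assumption you would have to measure yourself.

```python
def relative_runtime(crypto_fraction, crypto_speedup, clock_drop=0.10):
    """Runtime with AVX-512 relative to runtime without it (1.0 = no change).

    crypto_fraction: share of baseline time spent in crypto code (0..1).
    crypto_speedup:  assumed per-clock speedup of the crypto code (an assumption).
    clock_drop:      fractional frequency reduction, 0.10 = the 10% cited above.
    """
    # Cycles needed: crypto part shrinks, everything else is unchanged.
    cycles = crypto_fraction / crypto_speedup + (1.0 - crypto_fraction)
    # Every cycle now takes longer because the clock is slower.
    return cycles / (1.0 - clock_drop)


def break_even_fraction(crypto_speedup, clock_drop=0.10):
    """Crypto share of runtime above which AVX-512 is a net win.

    Returns None when the assumed speedup is too small to ever beat the clock drop.
    """
    gain = 1.0 - 1.0 / crypto_speedup  # cycles saved per unit of crypto work
    if gain <= clock_drop:
        return None
    return clock_drop / gain


if __name__ == "__main__":
    for speedup in (1.05, 1.14, 1.2, 2.0):
        print(speedup, break_even_fraction(speedup))
```

Under this model the threshold depends entirely on the speedup you assume: an 80% threshold corresponds to a per-clock crypto speedup of roughly 1.14x, 60% corresponds to 1.2x, and a larger speedup pushes the threshold lower. It also ignores the cost imposed on other tasks sharing the machine, which is the bigger complaint here.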