
> while with stuff like AVX there is this implicit hope that you just code as always and the compiler will take care of the rest via auto-vectorization and PhD level optimization algorithms

No. I recently could really have used the packed saturating integer arithmetic and horizontal addition in AVX2 (which my old machine doesn't support) and, even better, the same thing 512 bits wide with AVX-512. It would only have been 6 or 7 instructions, if that, but it was in an inner loop, and it mattered. Using compiler intrinsics would have been fine. I think you're looking at things too narrowly.
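To make "packed saturating arithmetic plus a horizontal add" concrete, here is a rough scalar sketch in portable C of what such an inner loop computes. The function names and the choice of unsigned 8-bit lanes are assumptions for illustration, not the actual code in question. With AVX2 intrinsics the element-wise step would map to something like `_mm256_adds_epu8` handling 32 bytes at a time, and whether a compiler auto-vectorizes the scalar version this way depends on the compiler and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating add of two unsigned bytes: clamps at 255 instead of wrapping.
 * This is the per-lane behaviour of a packed saturating add. */
static inline uint8_t sat_add_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
}

/* Hypothetical inner loop: saturating-add a[i] and b[i] element-wise,
 * then reduce all lanes into one total (the "horizontal" part). */
uint64_t sat_add_then_sum(const uint8_t *a, const uint8_t *b, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += sat_add_u8(a[i], b[i]);
    }
    return total;
}
```

For example, with a = {200, 10, 255, 0} and b = {100, 20, 1, 0}, the lanes become 255, 30, 255, 0 and the total is 540, where plain wrapping arithmetic would have given a different answer.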



I am looking at it from the point of view of the Joe/Jane developer who cannot tell head from tail regarding vector programming, doesn't even know what compiler intrinsics are for, and uses languages that don't expose them anyway.


Well, those people will never be getting the most out of their CPUs to begin with.


Which is the whole point of "this implicit hope that you just code as always and the compiler will take care of the rest via auto-vectorization and PhD level optimization algorithms", because not only do those people not get it, there is also a general decline in the use of languages that expose vector intrinsics, like C and C++, for regular LOB applications.



