
Because the CPU has to stream the model weights from memory for every token it generates, so you're spending most of the time on memory I/O rather than compute, and that offsets the processing.
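
A rough back-of-envelope sketch (my own illustrative numbers, not from the thread): for single-stream inference the weights have to be read once per generated token, so tokens/s is roughly memory bandwidth divided by model size.

  # Back-of-envelope: single-stream LLM decoding is memory-bandwidth bound,
  # because every generated token requires reading all the weights once.
  # Model size and bandwidth figures below are assumed, not measured.

  def tokens_per_second(model_bytes: float, bandwidth_bytes_s: float) -> float:
      """Upper bound on decode speed: one full weight read per token."""
      return bandwidth_bytes_s / model_bytes

  model_70b_q8 = 70e9 * 1.0   # ~70B params at 8-bit ≈ 70 GB (assumption)
  cpu_ddr5 = 80e9             # dual-channel DDR5: ~80 GB/s (assumption)
  gpu_hbm3 = 3.35e12          # H100 SXM HBM3: ~3.35 TB/s (assumption)

  print(f"CPU (DDR5): ~{tokens_per_second(model_70b_q8, cpu_ddr5):.1f} tok/s")
  print(f"GPU (HBM3): ~{tokens_per_second(model_70b_q8, gpu_hbm3):.1f} tok/s")

With those assumed figures you get roughly 1 tok/s on the CPU versus ~48 tok/s on the GPU, which is why the I/O cost dominates on commodity CPU memory.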

You're talking about completely different things here.

It's fine if you're doing a few requests at home, but if you're actually serving AI models, CUDA is the only reasonable choice other than ASICs.



My comment was about Intel having a starter project: get an enthusiastic response from devs, build network effects, and iterate from there. They need a way to threaten Nvidia, and focusing only on what they can't do won't get them there. There is one route where they can disrupt Nvidia's high end over time, and that's a cheap basic GPU with lots of RAM. It's like first-gen Ryzen, whose single-core performance was two generations behind Intel's, yet it trashed Intel by providing twice as many cores for cheap.


It would be a good idea to start with some basic understanding of GPUs, and realize why this can't easily be done.


That's a question the M3 Max with its integrated GPU has already answered. I've done enough HPC and CUDA work in the past not to be completely clueless about how GPUs work, though I haven't written those libraries myself.


What have you implemented in CUDA?



