
Yes, and inference is a huge market in itself, potentially larger than training (gut feeling, haven't run the numbers).

Keep NVIDIA for training and Intel/AMD/Cerebras/… for inference.



The funny thing about Cerebras is that it doesn't scale well at all for inference; if you talk to them in person, they're currently making all their money on training workloads.


Inference is still a lot faster on CUDA than on CPU. It's fine if you run it at home or on your laptop for privacy, but if you're serving those models at any scale, you're going to be using GPUs with CUDA.
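For concreteness, a minimal sketch of that fallback logic (assuming PyTorch; the Linear layer and tensor sizes are just illustrative stand-ins, not from the thread):

    import torch

    # Use CUDA when a GPU is present, otherwise fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Stand-in model; a real workload would load actual weights here.
    model = torch.nn.Linear(4096, 4096).to(device).eval()
    x = torch.randn(8, 4096, device=device)

    with torch.inference_mode():  # skip autograd bookkeeping for speed
        y = model(x)

    print(y.shape, device)

Same code either way; the gap is purely in throughput, which is why serving at scale lands on GPUs.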

Inference is also a much smaller market right now, but it will likely overtake training later, as far more people will be using the models than competing to train the best one.


NVIDIA Blackwell is not just a GPU. It's a rack with an interconnect over a custom NVIDIA network (NVLink).

And it needs liquid cooling.

You can't just plug in Intel cards 'out of the box'.



