The funny thing about Cerebras is that it doesn't scale well at all for inference, and if you talk to them in person, they'll tell you they're currently making all their money on training workloads.
Inference is still a lot faster on CUDA than on a CPU. Running a model on CPU is fine at home or on your laptop for privacy, but if you're serving those models at any scale, you're going to be using GPUs with CUDA.
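To make the point concrete, here's a minimal sketch (assuming PyTorch and Hugging Face transformers are installed; "gpt2" is just a placeholder model): the exact same inference code runs on CPU for private local use and automatically picks up CUDA when a GPU is present, which is the setup you'd want when serving at scale.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the GPU via CUDA when available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

inputs = tokenizer("Hello, world", return_tensors="pt").to(device)
with torch.no_grad():  # inference only, no gradients needed
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Same code either way; the gap is just that on a GPU this generates tokens far faster, which is why serving at scale stays on CUDA.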
Inference is also a much smaller market right now, but it will likely overtake training later, once far more people are using models than are competing to train the best one.
Keep NVIDIA for training and Intel/AMD/Cerebras/… for inference.