
“AI Developers Are Buying Gaming GPUs in Bulk”

https://www.extremetech.com/computing/here-we-go-again-ai-de...

Retail gaming GPUs are available and they’re MUCH cheaper than cloud GPUs.

It’s happening now.

AI software needs to make a choice too: does it run only on Nvidia, or does it make itself compatible with the mass market?

For 50 years software developers have chosen to make their software work on cheap mass market devices.



But retail gaming GPUs are overwhelmingly Nvidia? https://store.steampowered.com/hwsurvey/videocard/


Those cheap Nvidia gaming GPUs were overwhelmingly purchased for gaming use - they usually have low tensor compute capability and low VRAM, so they're less desirable even for running post-training-quantized generative large language models.
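
To put rough numbers on that, here's a back-of-envelope sketch (weights only, ignoring KV cache and activations; the parameter count and bit-widths are just illustrative):

    # Rough VRAM needed just to hold a quantized LLM's weights (illustrative).
    def weight_vram_gb(params_billions, bits_per_weight):
        bytes_per_weight = bits_per_weight / 8
        return params_billions * 1e9 * bytes_per_weight / 1024**3

    for bits in (16, 8, 4):
        print(f"7B model @ {bits}-bit: ~{weight_vram_gb(7, bits):.1f} GB of weights")
    # 16-bit ~13.0 GB, 8-bit ~6.5 GB, 4-bit ~3.3 GB -- an 8 GB gaming card only
    # fits the smaller quantized variants, before counting KV cache or activations.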


Nvidia imposes constraints through its drivers, so you can't really make full use of them and end up having to buy high-end data centre devices.


It's true that a lot of ML development has taken place on gaming GPUs over the past decade and a half. And you can certainly get a long way with gaming GPUs in a lot of areas of ML - 8GB is enough to run things like stable diffusion, just about.

However, some ML models (such as LLMs) demand a lot of vram to train. Simultaneously, a lot of gamers have been unimpressed by the latest generations of gaming GPUs due to the limited vram.

nvidia's GTX 1070 released in 2016 with 8 GB of vram and an MSRP of $379. Then came the RTX 2060 Super (2019), with 8 GB of vram and an MSRP of $399. Then the RTX 3060 Ti (2020), with 8 GB of vram and an MSRP of $399. Then the RTX 4060 Ti (2023) with - you guessed it - 8 GB of vram and an MSRP of $399.

Some people think nvidia is deliberately being miserly with vram on gaming cards, to try and force ML users onto the $$$$$ data centre GPUs.

I suspect this is what andrewstuart means by "Nvidia is pushing upmarket" - that they're deliberately letting their gaming products languish, in pursuit of the data centre market.


> 8GB is enough to run things like stable diffusion, just about.

4GB (from personal experience, 2GB from what I’ve heard) is enough to run stable diffusion 1.x/2.x with the major current consumer UIs and the optimizations they make under the hood. Not sure about SDXL. The original first-party inferencing code takes more.
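
For anyone curious what those under-the-hood optimizations look like, here's a minimal sketch using Hugging Face's diffusers library (the checkpoint ID and settings are illustrative, not what any particular consumer UI actually does):

    # Low-VRAM Stable Diffusion 1.x inference sketch (illustrative settings).
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",   # example SD 1.5 checkpoint
        torch_dtype=torch.float16,          # half precision roughly halves VRAM
    )
    pipe.enable_attention_slicing()         # trade speed for lower peak memory
    pipe.enable_model_cpu_offload()         # keep idle submodules in system RAM

    image = pipe("a watercolor fox", num_inference_steps=25).images[0]
    image.save("fox.png")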

> Some people think nvidia is deliberately being miserly with vram on gaming cards, to try and force ML users onto the $$$$$ data centre GPUs.

And, sure, as long as there is no real competition, that kind of segmentation makes sense. OTOH, if there is competition and desktop ML demand, it will make progressively less sense. So the question seems to be: will there be competition that matters?


> AI software needs to make a choice too: does it run only on Nvidia, or does it make itself compatible with the mass market?

Who's making that choice? It sounds like a lot of people want to make this stuff run well on iPhone and Android and AMD and Mac, but they just don't have the API control or insider help. This isn't a thing where open source developers step up to the task and fix everything with a magic wand. This is a situation where either the entire industry converges on a compute standard, or Nvidia continues to dominate. Nvidia is betting on Apple, AMD and Intel being stuck in an eternal grudge match, and that its residuals won't stop rolling in.
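
To make "compatible with the mass market" concrete: frameworks like PyTorch already let you write backend-agnostic code, as in this minimal sketch (the availability checks are standard PyTorch calls; the fallback order is just one reasonable choice):

    # Pick whatever accelerator the user's machine actually has.
    import torch

    if torch.cuda.is_available():            # Nvidia (and ROCm builds of PyTorch)
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():  # Apple Silicon GPUs
        device = torch.device("mps")
    else:
        device = torch.device("cpu")         # lowest common denominator

    x = torch.randn(4, 4, device=device)
    print(device, x.sum().item())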

The real question is much less hopeful: can the industry put aside its differences to make a net positive experience for consumers?

...probably not. None of the successful companies do, anyways.



