Launching a new SKU for $500-1000 with 48GB of RAM seems like a profitable idea. The GPU wouldn't be top-of-the-line, but the RAM would be unmatched for running a lot of models locally.
It's not technically possible to just slap on more RAM. GDDR6 is point-to-point with an option for clamshell, and the largest chips in mass production are 16 Gbit with a 32-bit interface. So for a 192-bit card, the best you can get is 192/32 × 16 Gbit × 2 (clamshell) = 24 GB.
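A quick back-of-the-envelope version of that arithmetic, assuming point-to-point 32-bit chips topping out at 16 Gbit, with clamshell doubling the chip count:

```python
# Capacity ceiling for a GDDR6 bus: one chip per 32-bit channel group,
# 16 Gbit max density, clamshell mode doubles the chip count.
def max_vram_gb(bus_width_bits: int, chip_density_gbit: int = 16,
                clamshell: bool = True) -> float:
    chips = bus_width_bits // 32           # point-to-point: one chip per 32 bits
    gb_per_chip = chip_density_gbit / 8    # 16 Gbit = 2 GB
    return chips * gb_per_chip * (2 if clamshell else 1)

print(max_vram_gb(192))   # 24.0 -> the 24 GB ceiling above
print(max_vram_gb(384))   # 48.0 -> 48 GB would need a 384-bit, flagship-class die
```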
To get more memory, you have to design a new die with a wider interface. Design, test, and masks on leading-edge silicon run tens of millions of dollars in NRE, and that has to be paid well over a year before product launch. No one is going to do that for a low-priced product with an unknown market.
The savior of home inference is probably going to be AMD's Strix Halo. It's a laptop APU built to be a fairly low-end gaming chip, but it has a 256-bit LPDDR5X interface. Larger LPDDR5X packages are available (thanks to the smartphone market), and Strix Halo should eventually be available with 128GB of unified RAM, with performance probably somewhere around a 4060.
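For scale, a rough bandwidth estimate: the 256-bit width is from the comment above, while the LPDDR5X-8000 data rate is my assumption about what it ships with.

```python
# Rough memory bandwidth for a 256-bit LPDDR5X interface.
bus_bits = 256
data_rate_mts = 8000                                  # assumed LPDDR5X-8000
bandwidth_gbs = bus_bits / 8 * data_rate_mts / 1000   # bytes/transfer * MT/s
print(f"{bandwidth_gbs:.0f} GB/s")                    # ~256 GB/s, in the ballpark
                                                      # of a desktop 4060's GDDR6
```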
You can’t just throw in more RAM without the rest of the GPU being architected for it. So there’s an R&D cost involved in such a design, and there may even be performance trade-offs for the mass-market lower-tier models. I’m doubtful that the LLM enthusiast/tinkerer market is large enough for that to be obviously profitable.
That would depend on how they designed the memory controllers. GDDR6 only supports 1-2 GB modules at present (I believe GDDR6W supports 4 GB modules). If they were using twelve 1 GB modules, then increasing to 24GB shouldn't be a very large change.
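A tiny sketch of that scaling, assuming twelve x16 modules on a 192-bit bus (the layout is my assumption, not a confirmed board design):

```python
# On a fixed-width bus, capacity scales linearly with per-module density,
# with no change to the bus width itself.
bus_bits, bits_per_module = 192, 16
modules = bus_bits // bits_per_module   # 12 modules
for gb_per_module in (1, 2, 4):         # 4 GB being GDDR6W-class density
    print(f"{gb_per_module} GB modules -> {modules * gb_per_module} GB total")
```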
Honestly, Apple seems to be on the right track here. LPDDR5 is slower than GDDR6 per pin, but you can scale the amount of RAM far higher simply by swapping in higher-density packages.
Give me 48GB with reasonable power consumption so I can dev locally and I'll buy it in a heartbeat. Anyone who is fine-tuning would want a setup like that to test things before pushing to real GPUs. And realistically, even if a fine-tune takes two days on a card like that instead of a few hours, it would totally be worth it.
The bigger point here is to ask why they aren't designing that in from the start. Same with AMD. RAM capacity has stalled and it's the critical resource. Start focusing on allowing a lot more of it, even at the cost of performance, and you have a real product. I have a 12GB 3060 as my dev box, and the big limiter is RAM, not CUDA cores. If it had 48GB but the same number of cores, I would be very happy with it, especially if it were power efficient.
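To put numbers on why RAM is the limiter: a rough weight-only footprint calculation (model sizes are illustrative; KV cache, activations, and framework overhead are ignored).

```python
# Weight-only memory footprint: params (in billions) * bytes per parameter.
def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * bytes_per_param   # 1B params * 2 bytes = 2 GB

for size in (7, 13, 70):
    print(f"{size}B: {weights_gb(size, 2):.0f} GB fp16, "
          f"{weights_gb(size, 0.5):.1f} GB 4-bit")
# 12 GB can't even hold a 7B model in fp16; 48 GB fits a 13B fp16
# or a 70B 4-bit quant with room to spare.
```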
Because designing a low-end GPU with a very wide memory interface isn't useful for gaming, and that is where the vast majority of non-datacenter discrete GPU sales are right now.