
They are charging as much as Nvidia for it. Now imagine they offered such a card for $2k. Would that allow them to eat Nvidia's lunch?



Let's say for the sake of argument that you could build such a card and sell it for less than $5k. Why would you do it? You know there's huge demand, in the tens of billions of dollars per quarter, for high-end cards. Why undercut that market so heavily? To overthrow NVidia? You'd end up with a razor-thin profit margin, and then your shareholders would eat you alive.


AMD would be selling it at a loss. HBM costs roughly 3x the price of desktop DRAM, and a 192GB desktop kit costs $600 at Newegg, so the memory alone would run about $1,800, roughly 90% of a $2,000 price. The GPU die, PCB, power circuitry, etc. likely cost more than $200 to make.
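
Back-of-the-envelope version of that estimate (the 3x HBM multiplier, the $600 kit price, and the $200 for everything else are the assumptions from this comment, not actual AMD costs):

    # Rough BOM sketch for a hypothetical $2,000 card with 192GB of HBM.
    # All inputs are the assumptions from the comment above, not real costs.
    dram_kit_192gb_usd = 600       # desktop DRAM street price
    hbm_multiplier = 3             # HBM assumed ~3x desktop DRAM per GB
    other_costs_usd = 200          # GPU die, PCB, power circuitry, assembly

    memory_cost = dram_kit_192gb_usd * hbm_multiplier   # ~$1,800
    total_cost = memory_cost + other_costs_usd          # ~$2,000
    sale_price = 2000

    print(f"memory share of price: {memory_cost / sale_price:.0%}")  # 90%
    print(f"margin at $2,000: ${sale_price - total_cost}")           # $0 at best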

This does not even consider that the board of directors would crucify Lisa Su if she authorized using HBM on a consumer product while HBM is supply constrained and there is enterprise demand for products that use it. AMD can only get a limited amount of HBM, and what it does get is not enough to meet enterprise demand, where AMD's margins are extremely healthy.

Even if they by some miracle turned a profit on a $2,000 consumer card with 192GB of HBM, every sale would carry a massive opportunity cost and would effectively be a loss in the eyes of the board of directors.

Meanwhile, Nvidia would be unaffected because AMD could not produce very many of these.


NVidia would be dramatically affected, just not overnight.

If Intel or AMD sold a niche product with 48GB of RAM at high-end consumer pricing, even at a loss, a flood of people doing AI work would buy it. The end result would be that parts of NVidia's moat would start draining rather quickly, and AMD/Intel would be in a stronger position for AI products.

I use NVidia because when I bought AMD during the GPU shortage, ROCm simply didn't work for AI. That was a few years back, but I was burned badly enough that I'm unlikely to risk AMD again for a long, long time. Hardware that goes unused means the code targeting it sits broken and no ecosystem gets built up. A few years later, things are gradually improving for AMD for the kinds of things I wanted to do back then, but all my code is already built around NVidia, and all my computers have NVidia cards. It's a project with users, and all those users are buying NVidia as well (even if only for surface dependencies, like dev-ops scripts that install CUDA). That, multiplied by thousands of projects, is part of NVidia's moat.

If I could build a cheap system with around 200GB, that would be an incentive for me to move those relatively surface-level dependencies over to a different platform. I can buy a motherboard with four PCIe slots and plug in four 48GB cards to get there. I'd build things around Intel or AMD instead.

The alternative is that NVidia would start shipping competitive cards itself. If it did that, its high-end profit margins would dissolve.

The breakpoints for inference functionality are really at around 16GB, 48GB, and 200GB, for various historical reasons.
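
For a rough sense of why those breakpoints line up with model sizes, here is a weights-only sketch; the parameter counts, the bytes-per-weight figures, and the omission of KV cache and activations are all simplifying assumptions:

    # Weights-only VRAM estimate: billions of params * bytes per weight ~ GB.
    # Ignores KV cache, activations, and runtime overhead (assumptions).
    def fits(params_billion, bytes_per_weight, vram_gb):
        return params_billion * bytes_per_weight <= vram_gb

    print(fits(7, 2, 16))      # ~7B at FP16 in a 16GB card   -> True
    print(fits(70, 0.5, 48))   # ~70B at 4-bit in a 48GB card -> True
    print(fits(70, 2, 200))    # ~70B at FP16 in ~200GB       -> True
    print(fits(405, 0.5, 200)) # ~405B at 4-bit in ~200GB     -> False (just over)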


> If I could build a cheap system with around 200GB,

Even if AMD dropped the price to $2,000, you would not be able to build a system with one of these cards. You cannot buy these cards even at their current five-figure pricing. The idea that you could buy one if the price dropped to $2,000 is a fantasy; others would purchase the supply long before you had a chance, just as they do now.

AMD is already selling out at the current five-figure pricing, and Nvidia is not affected: Nvidia sells millions of cards per year and still cannot meet demand, while AMD sells around 100,000. AMD dropping the price to $2,000 would not harm Nvidia in the slightest. It would harm AMD by turning a significant money maker into a loss leader. It would also likely result in Lisa Su being fired.

By the way, the CUDA moat is overrated, since people already implement support for alternatives. llama.cpp, for example, supports at least three non-CUDA back ends, and PyTorch supports alternatives too. None of this harms Nvidia unless Nvidia stops innovating, which is unlikely. A price drop to $2,000 would not change this.


Let's compare to see if it's really the same market:

HGX B200: 36 petaflops at FP16. 14.4 terabytes/second bandwidth.

RTX 4060 (Intel's consumer offerings are similar): 15 teraflops at FP16. 272 gigabytes/second bandwidth.

Hmmm.... Note the prefixes (peta versus tera)
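
Spelling out the gap between the headline figures quoted above (keep in mind this pits a full multi-GPU HGX board against a single consumer card, so it is a deliberately lopsided comparison):

    # Ratios of the spec-sheet numbers quoted above (multi-GPU HGX B200 board
    # vs. a single RTX 4060); headline figures, not measured performance.
    hgx_b200_fp16_tflops = 36_000   # 36 petaflops
    rtx_4060_fp16_tflops = 15
    hgx_b200_bw_gb_s = 14_400       # 14.4 TB/s
    rtx_4060_bw_gb_s = 272

    print(hgx_b200_fp16_tflops / rtx_4060_fp16_tflops)  # 2400.0x compute
    print(hgx_b200_bw_gb_s / rtx_4060_bw_gb_s)          # ~52.9x bandwidth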

A lot of that is apples-to-oranges, but that's kind of the point. It's a different market.

A low-performance high-RAM product would not cut into the server market since performance matters. What it would do is open up a world of diverse research, development, and consumer applications.

Critically, even if it did cut into the server market, Intel doesn't play in that market, so launching a low-cost alternative should be a no-brainer for Intel. That said, the cannibalization wouldn't happen. What would happen is that a lot of businesses, researchers, and individuals would be able to run ≈200GB models on their own hardware for low-scale use.

> By the way, the CUDA moat is overrated since people already implement support for alternatives.

No. It's not. It's perhaps overrated if you're building a custom solution and making the next OpenAI or Anthropic. It's very much not overrated if you're doing general-purpose work and want things to just work.

https://www.nvidia.com/en-us/data-center/hgx/
https://www.techpowerup.com/gpu-specs/geforce-rtx-4060.c4107


What treprinum suggested was AMD selling its current 192GB enterprise card (MI300X) for $2,000, not a low-end card. Everything you said makes sense, but it is beside the point raised above. You want the other discussion, about attaching 128GB to a basic GPU. I agree that would be disruptive, but that is a different discussion entirely. In fact, I beat you to saying it would be disruptive by about 16 hours:

https://news.ycombinator.com/item?id=42315309


If you want to load up 405B @ FP16, how do you fit it into a single H100 box? You don't. You get two boxes. 2x the price.

Models are getting larger, not smaller. This is why the H200 has more memory but the exact same compute. MI300X vs. MI325X: more memory, same compute.
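
The memory math behind "you get two boxes," assuming a standard 8x 80GB HGX H100 box and counting weights only (KV cache, activations, and overhead would push the real requirement higher):

    import math

    # Weights-only footprint of a 405B-parameter model at FP16 (2 bytes/param).
    params = 405e9
    weights_gb = params * 2 / 1e9          # 810 GB

    h100_box_gb = 8 * 80                   # assumed 8x H100 80GB per HGX box = 640 GB
    boxes_needed = math.ceil(weights_gb / h100_box_gb)

    print(weights_gb, h100_box_gb, boxes_needed)   # 810.0 640 2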


We would also need to imagine AMD fixing their software.


I think plenty of enthusiastic open source devs would jump at it and fix the software themselves if it were reasonably open. The same effect as what happened when Meta released LLaMA.


It is open and they regularly merge PRs.

https://github.com/ROCm/ROCm/pulls?q=is%3Apr+is%3Aclosed


AMD GPUs aren't very attractive to ML folks because they don't outshine Nvidia in any single aspect. Blasting lots of RAM onto a GPU would make them attractive immediately and would draw lots of attention from devs currently occupied with more interesting things.


Does the 7900 XT outperform the 3090 Ti? If so, there's already a market, because those are the same price. I don't mean in theory, as in whether there are any workloads the 7900 XT can do better; even if they're practically equal in performance, you get a warranty and support with your new 7900 XT.

Also, I didn't know there was a 192GB AMD GPU.


MI300X already leads in VRAM as it has 192 GB.

For local inference, 7900 XTX has 24 GB of VRAM for less than $1000.

At what threshold of VRAM would you start being interested in MI?


The problem with the MI300X is the price. The problem with the 7900 XTX is that it's at best as good as Nvidia with the same RAM for a similar price. If the 7900 XTX had, e.g., 64GB of RAM, were 2x slower than a 4080, and kept its price, it would sell like crazy.
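
To make the hypothetical 64GB card concrete, a rough weights-only fit check at 4-bit quantization (0.5 bytes per weight is an assumption, and real usage also needs room for KV cache and overhead):

    # Approximate weights-only footprint at 4-bit quantization (0.5 bytes/param);
    # real limits are a bit lower once KV cache and overhead are counted.
    for params_billion in (13, 34, 70, 123):
        gb = params_billion * 0.5
        print(f"{params_billion}B ~ {gb:.1f} GB | fits 24GB: {gb <= 24} | fits 64GB: {gb <= 64}")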


I have a 7900 XTX. Honestly, I regret it. It took two years for the driver to stop randomly crashing under very pedestrian ROCm loads. And there's no future in AMD support now that they're getting out of the high-end dual-use GPU game anyway. I should have gone with NVidia.



