The RAM bandwidth is so low on this that you can barely train, run inference, or do much of anything with it. I think the only use case they have in mind is fine-tuning pretrained models.
Matrix-vector multiplication in the feed-forward layers accounts for most of the bandwidth use, as I understand things. There's not really a way to do it "better"; it's just a bunch of memory-bound dot products (some back-of-envelope numbers below).
(Posting this comment in hopes of being corrected and learning something).
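A rough sketch of why it's memory-bound: during single-token decoding, every weight is read from RAM once and used for exactly one multiply-add, so the arithmetic intensity is about one FLOP per byte, far below what any modern chip can compute relative to what its memory can feed it. The layer sizes and dtype below are my own illustrative picks, not anything specific to this hardware:

```python
# Back-of-envelope arithmetic intensity of one feed-forward matvec.
# Sizes are illustrative (roughly a 7B-class FFN projection), not measured.

d_in, d_out = 4096, 11008       # hypothetical layer dimensions
bytes_per_param = 2             # fp16/bf16 weights

flops = 2 * d_in * d_out                      # one multiply + one add per weight
bytes_moved = d_in * d_out * bytes_per_param  # every weight read once per token

print(f"FLOPs per byte: {flops / bytes_moved:.1f}")  # ~1.0 -> bandwidth-bound
```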
The problem is that different parts of the SoC (CPU, GPU, NPU) may not actually be able to consume all of the bandwidth available to the system as a whole. This is why you'd need to benchmark: different chips may be able to feed their cores better than others.
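A crude way to see the CPU side of this is a STREAM-style copy benchmark; a single-threaded copy like the sketch below usually lands well under the advertised system bandwidth, which is exactly the gap being described (it says nothing about the GPU/NPU paths, which need their own benchmarks):

```python
# Minimal copy benchmark estimating what one CPU thread can pull from RAM.
# Array size is an arbitrary choice; results are a rough lower bound only.
import time
import numpy as np

n = 1 << 27                      # ~1 GiB of float64 per array
a = np.ones(n)
b = np.empty_like(a)

np.copyto(b, a)                  # warm-up: fault in the destination pages
start = time.perf_counter()
np.copyto(b, a)                  # reads a, writes b: 2 * n * 8 bytes total
elapsed = time.perf_counter() - start

gb_moved = 2 * n * 8 / 1e9
print(f"effective bandwidth: {gb_moved / elapsed:.1f} GB/s")
```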
Training is performed in parallel with batching and is more FLOPs-heavy. I don't have an intuition for how memory-bandwidth-intensive updating the parameters is, but it shouldn't be much worse than a single forward pass.
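A quick illustration of why batching shifts the balance toward FLOPs: each weight is fetched once but reused for every row in the batch, so FLOPs per byte of weight traffic scale linearly with batch size (this ignores activation traffic, and the sizes are the same illustrative ones as above):

```python
# How batching raises arithmetic intensity: weight traffic stays fixed
# while useful FLOPs grow with batch size. Sizes are illustrative.

d_in, d_out = 4096, 11008
bytes_per_param = 2

for batch in (1, 8, 64, 512):
    flops = 2 * d_in * d_out * batch
    bytes_weights = d_in * d_out * bytes_per_param   # read once, reused per row
    print(f"batch {batch:4d}: ~{flops / bytes_weights:.0f} FLOPs per weight byte")
```

At batch 1 this is the memory-bound matvec case from above; by batch 512 the same weights support hundreds of FLOPs per byte, which is why training and batched fine-tuning are compute-bound rather than bandwidth-bound.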