I am interested in seeing if they skip M4 and go straight to M5, and only make that available in the Pro. From my unscientific observations, chips seem to be running hotter and hotter. I wouldn't be surprised if an M5 Ultra struggled in a Studio and required the cooling performance of the Mac Pro case.
Yes, the base M4 has 120 GB/s, the Pro 273 GB/s, and the Max 546 GB/s. If the Pro and Max scale by the same ratio as the base M5 (153 GB/s, up from 120 GB/s), that puts the M5 Pro at potentially around 348 GB/s and the M5 Max at almost 700 GB/s. For comparison, a 4090 has around 1,000 GB/s. So pretty incredible!
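For anyone who wants to check the projection, here is a rough sketch of the arithmetic. It assumes the base M5 figure of 153 GB/s and that the Pro and Max tiers scale by the same ratio as the base chip, which is a guess rather than anything Apple has announced. The names `M4_BANDWIDTH`, `M5_BASE`, and `project_m5` are made up for illustration.

```python
# M4-family memory bandwidth figures quoted in the thread (GB/s).
M4_BANDWIDTH = {"base": 120, "Pro": 273, "Max": 546}

# Assumption: base M5 bandwidth in GB/s, and that the higher tiers
# scale by the same base-to-base ratio. This is speculation.
M5_BASE = 153


def project_m5(m4=M4_BANDWIDTH, m5_base=M5_BASE):
    """Scale each M4 tier by the base M5 / base M4 ratio."""
    ratio = m5_base / m4["base"]
    return {tier: round(bw * ratio) for tier, bw in m4.items()}


if __name__ == "__main__":
    for tier, bw in project_m5().items():
        print(f"M5 {tier}: ~{bw} GB/s")
```

With those inputs the ratio is 1.275, which gives roughly 348 GB/s for the Pro and 696 GB/s for the Max.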
Also, I think even an M3 Ultra is more cost-effective at running LLMs than a 4090 or 5090, mostly because it is more energy efficient. It is also less fragile than running a gamer PC build.
It can run larger models, though quite slowly, but it lacks the matmul acceleration (included in the M5) that is very useful for context and prompt-processing performance at inference time. I will probably burn my budget on an M5 Max with 256 GB (maybe even 512 GB) of memory. The price will be upsetting, but I guess that is life!
Yes! I think smaller models on the M3 Ultra are interesting enough, but now with matmul/tensor acceleration on an M5 Ultra or Max, and a decent amount of unified memory, it will be a game changer.
I can easily imagine companies running Mac Studios in prod. Apple should release another Xserve.
DDR5-9600 is 153 GB/s from a single channel, and the Max has 4 channels. These are all theoretical values, of course. In the real world none of these, not even the graphics card, will get anywhere near those numbers, so I'm not sure what you're saying.
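As a sanity check on the theoretical numbers, peak bandwidth is just the transfer rate times the bytes moved per transfer. The sketch below assumes a "channel" here means a 128-bit bus, which is my reading of the 153 GB/s figure and not something stated above. `peak_bandwidth_gbs` and `CHANNEL_BITS` are illustrative names.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


# Assumption: one "channel" in the comment above is a 128-bit bus.
CHANNEL_BITS = 128

if __name__ == "__main__":
    single = peak_bandwidth_gbs(9600, CHANNEL_BITS)
    four = peak_bandwidth_gbs(9600, 4 * CHANNEL_BITS)
    print(f"1 channel:  {single:.1f} GB/s")
    print(f"4 channels: {four:.1f} GB/s")
```

Under that assumption one channel works out to 153.6 GB/s and four channels to 614.4 GB/s. Both are ceilings, not what you would measure in practice.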