
FLUX.1 Dev generation takes about a minute at 20 steps on a 4080, but 35 minutes on the CPU.


Yep. Any large GenAI image model (beyond SD 1.5) is hideously slow on Macs irrespective of how much RAM you cram in, whereas I can spit out a 1024x1024 image from the Flux.1 Dev model in ~15 seconds on an RTX 4090.
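
For reference, a rough sketch of what that kind of run looks like with the Hugging Face diffusers FluxPipeline. The model ID, prompt, and timing wrapper are assumptions for illustration, not something the parent comments specified:

    # Sketch of a Flux.1 Dev generation benchmark with diffusers (assumed setup).
    import time
    import torch
    from diffusers import FluxPipeline

    # Assumed model ID; downloading it requires accepting the FLUX.1-dev license on the Hub.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")  # switch to "cpu" to reproduce the CPU-only timings discussed above

    start = time.time()
    image = pipe(
        "a photo of a red fox in the snow",  # placeholder prompt
        height=1024,
        width=1024,
        num_inference_steps=20,
        guidance_scale=3.5,
    ).images[0]
    print(f"20 steps took {time.time() - start:.1f}s")
    image.save("flux_test.png")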


A 4080 won't do video due to low RAM. The GPU doesn't have to be as fast there; it can be 5x slower, which is still way faster than a CPU. And Intel can iterate from there.


It wouldn't be 5x slower; it would be 20-50x slower if you implemented it the way you describe.

You can't just "add more RAM" to a GPU and have it work the same way. Memory access works completely differently than on CPUs.
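
To make the tradeoff concrete: when a model doesn't fit in VRAM, the usual workaround is to keep the weights in system RAM and stream them to the GPU as needed, which is where a large slowdown comes from. A hedged sketch using diffusers' built-in offload hooks (the calls exist in the library; the specific slowdown factors quoted above come from the thread, not from this code):

    # Sketch: trading speed for memory when a model exceeds VRAM (assumed model ID).
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )

    # Keeps whole sub-models in system RAM and moves each one to the GPU only
    # while it runs: much lower VRAM use, moderately slower.
    pipe.enable_model_cpu_offload()

    # More aggressive: stream individual layers over PCIe on demand. Fits in
    # very little VRAM, but every step pays the transfer cost, which is why
    # "just add more RAM" isn't free performance-wise.
    # pipe.enable_sequential_cpu_offload()

    image = pipe("test prompt", num_inference_steps=20).images[0]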



