Looks like there is an easier path using Metal shaders: https://dev.to/craigmort...

garblegarble · on Aug 29, 2022

I've been using this on my M1 Max and it works pretty well, 1.65 iterations per second (full precision, whereas my PC's 3080 can only do half-precision due to limited memory)... a 50-iteration image in about 40 seconds or so.

MattRix · on Aug 29, 2022

Your 3080 should be able to do full precision. Are you sure you don’t have the batch size set greater than 1, or another issue along those lines?

garblegarble · on Aug 29, 2022

Thank you and smoldesu for letting me know it should work. I'll take a closer look at what's going on: it didn't immediately work on Windows in full precision (probably a batch size issue, as you suggested) and I gave up...

I shouldn't have given up so easily, but my tolerance for annoyances on Windows is pretty low (that Windows machine is kept for gaming; the last time I used a Windows machine for anything but launching Steam was when Windows 2000 was the hot new thing...)

smoldesu · on Aug 29, 2022

> full precision, whereas my PC's 3080 can only do half-precision due to limited memory

What model are you using? I've been running full-precision SD1.4 on my 3070, albeit with less than 10% VRAM headroom.

zmmmmm · on Aug 29, 2022

This worked fine for me, and running side by side with an Intel CPU + NVIDIA 2070 machine, it doesn't actually take much longer (and, as a sibling said, it seems to be working at full precision). It is one of the first things I've done that has properly made my M1 Max's fan spin up hard, though!