Fits into 32gb: 5090
Fits into 64gb - 96gb: Mac Studio
Fits into 128gb: for now 395+ $/token/s,
Mac Studio if you don't care about $
but don't have unlimited money for Hxxx
This could be great for models that fit 128gb and you want best $/token/s (if it is faster than a 395+).
In Linux, you can set it as high as you want, although you should probably have a swap drive and still be prepared for you system to die if you set it to 128GiB. Here's how you'd set it to 120GiB:
# This is deprecated, but can still be referenced
options amdgpu gttsize=122800
# This specifies GTT by # of 4KB pages:
# 31457280 * 4KB / 1024 / 1024 = 120 GiB
options ttm pages_limit=31457280