I presume the counterargument is that inference hosting is commoditized, sort of like how stateless CPU-based containerized workload hosting is commoditized. There’s no margin in that business: inference is parallelizable, arbitrarily schedulable, and easy to spread across heterogeneous hardware (just route individual requests to sub-cluster A or B). That prevents any kind of lock-in, and thus any kind of rent-extraction by the vendor.
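To make the “arbitrarily schedulable” point concrete, here’s a minimal sketch of what an inference front-end has to do. All the names and endpoints are hypothetical; the point is just that because each request is stateless and independent, any routing policy over any mix of hardware works:

    import itertools

    # Hypothetical heterogeneous sub-clusters; any one can serve any request.
    SUB_CLUSTERS = ["http://cluster-a/infer",   # e.g. vendor-X accelerators
                    "http://cluster-b/infer"]   # e.g. vendor-Y accelerators

    _next_backend = itertools.cycle(SUB_CLUSTERS)

    def route(prompt: str) -> str:
        # Round-robin (or any other policy) is enough: no session state
        # ties a request to particular hardware, so backends are swappable.
        backend = next(_next_backend)
        return f"POST {backend} <- {prompt!r}"   # stand-in for a real HTTP call

    print(route("hello"))   # -> POST http://cluster-a/infer <- 'hello'
    print(route("world"))   # -> POST http://cluster-b/infer <- 'world'

Nothing in that loop cares what silicon sits behind each endpoint, which is exactly why the hosting layer can’t extract rent.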
Which therefore means that cards that can only do inference are fungible. You don’t want to spend CapEx on getting into a new LOB just to sell something fungible.
All the gigantic GPU clusters that you can sell a million at a time to a bigcorp under a high-margin service contract, meanwhile, are training clusters. Nvidia’s market cap right now is fundamentally built on the model-training “space race” going on between the world’s ~15 big AI companies. That’s the non-fungible market.
For Intel to see any benefit (in stock price terms) from an ML-accelerator-card LOB, it’d have to be a card that competes in that space. And that’s a much taller order.
Coincidentally, it has 128GB of RAM. However, it is not a GPU, it is designed to do training too, and it uses expensive HBM.
Modern GPUs can do more than inference/training, and the original poster asked about a GPU with 128GB of RAM, not an inference-only card like the one you described. Interestingly, Qualcomm built its own inference-only card with 128GB of RAM without using HBM:
They do not sell it through PC parts channels, so I do not know the price, but it is exactly the card you described, and it has been built. Presumably, a GPU with the same memory configuration would be of interest to the original poster.