https://www.intel.com/content/www/us/en/products/details/pro...
Coincidentally, it has 128GB of RAM. However, it is not a GPU: it is designed to do training as well, and it uses expensive HBM.
Modern GPUs can do more than inference/training, and the original poster asked about a GPU with 128GB of RAM, not a card that can only do inferencing as you described. Interestingly, Qualcomm made its own inference-only card with 128GB of RAM that does not use HBM:
https://www.qualcomm.com/news/onq/2023/11/introducing-qualco...
They do not sell it through PC parts channels, so I do not know the price, but it is exactly what you described, and it has been built. Presumably, a GPU with the same memory configuration would be of interest to the original poster.