I presume the counterargument is that inference hosting is commoditized, sort of like how stateless CPU-based containerized workload hosting is commoditized. There’s no margin in that business: inference is parallelizable, arbitrarily schedulable, and easy to spread across heterogeneous hardware (just route individual requests to sub-cluster A or B). That prevents any kind of lock-in, and thus any kind of rent-extraction by the vendor.
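To make the “arbitrarily schedulable” point concrete, here’s a minimal sketch of what an inference front-end has to do. All the names and endpoints are hypothetical; the point is just that because each request is stateless and independent, any routing policy over any mix of hardware works:

    import itertools

    # Hypothetical heterogeneous sub-clusters; any one can serve any request.
    SUB_CLUSTERS = ["http://cluster-a/infer",   # e.g. vendor-X accelerators
                    "http://cluster-b/infer"]   # e.g. vendor-Y accelerators

    _next_backend = itertools.cycle(SUB_CLUSTERS)

    def route(prompt: str) -> str:
        # Round-robin (or any other policy) is enough: no session state
        # ties a request to particular hardware, so backends are swappable.
        backend = next(_next_backend)
        return f"POST {backend} <- {prompt!r}"   # stand-in for a real HTTP call

    print(route("hello"))   # -> POST http://cluster-a/infer <- 'hello'
    print(route("world"))   # -> POST http://cluster-b/infer <- 'world'

Nothing in that loop cares what silicon sits behind each endpoint, which is exactly why the hosting layer can’t extract rent.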
Which therefore means that cards that can only do inference are fungible. You don’t want to spend CapEx on getting into a new LOB just to sell something fungible.
All the gigantic GPU clusters that you can sell a million at a time to a bigcorp under a high-margin service contract, meanwhile, are training clusters. Nvidia’s market cap right now is fundamentally built on the model-training “space race” going on between the world’s ~15 big AI companies. That’s the non-fungible market.
For Intel to see any benefit (in stock price terms) from an ML-accelerator-card LOB, it’d have to be a card that competes in that space. And that’s a much taller order.
Coincidentally, it has 128GB of RAM. However, it is not a GPU, it is designed to do training too, and it uses expensive HBM.
Modern GPUs can do more than inference/training, and the original poster asked about a GPU with 128GB of RAM, not an inference-only card like the one you described. Interestingly, Qualcomm built its own inference-only card with 128GB of RAM without using HBM:
They do not sell it through PC parts channels, so I do not know the price, but it is exactly the card you described, and it has been built. Presumably, a GPU with the same memory configuration would be of interest to the original poster.