Hacker News

Google had a pretty rough start compared to ChatGPT and Claude. I suspect that left a bad taste in many people's mouths, in particular because evaluating so many LLMs is a lot of effort on its own.

Llama and DeepSeek are no-brainers; the weights are public.



No-brainer if you're sitting on a >$100k inference server.


Sure, that's fair if you're aiming for state-of-the-art performance. Otherwise, you can get close on reasonably priced hardware by using smaller distilled and/or quantized variants of Llama/R1.

Really though I just meant "it's a no-brainer that they are popular here on HN".
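A hypothetical back-of-envelope for the "quantized variants on reasonably priced hardware" point: weight memory scales roughly as parameter count times bytes per parameter, so a 4-bit quant of a 70B model fits in two 48 GB A40s, while the fp16 original does not. The function and figures below are illustrative, not from the thread, and ignore KV cache and activation memory.

```python
def weight_vram_gib(params_billion: float, bits_per_param: float) -> float:
    """Rough VRAM needed for model weights alone (ignores KV cache/activations)."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 2**30

# 70B model at 4-bit: ~32.6 GiB -> fits on 2x A40 (96 GB total)
print(round(weight_vram_gib(70, 4), 1))
# Same model at fp16: ~130 GiB -> does not fit
print(round(weight_vram_gib(70, 16), 1))
```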


I pay 78 cents an hour to host Llama.


Vast? Specs?


Runpod, 2xA40.

Not sure why you think buying an entire inference server is a necessity to run these models.



