> today you can get a kindergartener level chat model for about $100. Not hard to imagine the same model costing $10 of compute in a few years.
No, it's extremely hard to imagine, since I used one of Karpathy's own models to run a basic chatbot like six years ago. Yes, it spoke nonsense; so did my GPT-2 fine-tune four years ago, and so does this.
And so does ChatGPT
Improvement is linear at best. I still think it's actually a log curve, and GPT-3 was the peak of the "fun" part of the curve. The only evidence I've seen otherwise is bullshit benchmarks, "agents" that increase performance 2x by increasing token usage 100x, and excited salesmen proclaiming the imminence of AGI.
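To make the "2x performance for 100x tokens" complaint concrete, here's a back-of-the-envelope sketch with entirely made-up numbers (the success rates and token price are my assumptions, not figures from this thread): even if an agent loop doubles the success rate, burning 100x the tokens per attempt still makes each *successful* completion far more expensive.

```python
# Toy cost model with invented numbers: doubling success rate while
# spending 100x the tokens per attempt raises cost per solved task ~50x.
def cost_per_success(tokens_per_attempt, success_rate, price_per_1k_tokens=0.01):
    """Expected dollar cost to obtain one successful completion."""
    attempts_needed = 1 / success_rate            # expected attempts per success
    tokens_needed = attempts_needed * tokens_per_attempt
    return tokens_needed * price_per_1k_tokens / 1000

baseline = cost_per_success(tokens_per_attempt=1_000, success_rate=0.30)
agent = cost_per_success(tokens_per_attempt=100_000, success_rate=0.60)

print(f"baseline: ${baseline:.4f} per success")
print(f"agent:    ${agent:.4f} per success ({agent / baseline:.0f}x more expensive)")
```

Under these assumed numbers the agent is about 50x more expensive per solved task, which is the kind of ratio the benchmark headlines tend to omit.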
1. According to who? Open AI?
2. Its current state is "basically free and containing no ads". I don't think this will remain true given that, as far as I know, the product is very much not making money.
Yes, that number is according to OpenAI. They released that 800m number at DevDay last week.
The most recent leaked annualized revenue rate was $12bn/year. They're spending a lot more than that but convincing customers to hand over $12bn is still a very strong indicator of demand. https://www.theinformation.com/articles/openai-hits-12-billi...
Part of that comes from Microsoft API deals. Part of that will almost certainly come from the vast network of partner companies buying subscriptions to help "Open" "AI" [1].
Given the rest of the circular deals, I'd also scrutinize how much of that revenue reflects real demand. The entanglement with the Microsoft investments and the fact that "Open" "AI" is a private company make that difficult to research.
[1] In a U.S. startup, I went through three CEOs and three HR apps, which mysteriously had to change for no reason but to accommodate the new CEO's friends and their startups.
When people take the time to virtue-signal with "Open" "AI" or the annoyingly common "M$" on here, I often wonder why they're wasting their precious time on earth doing that.
Even with linear progression of model capability, the curve for model usefulness could be exponential, especially if we consider model cost which will come down.
For every little bit a model gets smarter and more accurate, there are exponentially more real-world tasks it could be used for.
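One hedged way to sketch this claim (a toy model of my own, not something from the thread): suppose a task needs n steps, each done correctly with probability p, so the whole task succeeds with probability p**n. Then the longest task solvable at a given reliability grows like log(reliability)/log(p), which blows up as per-step accuracy approaches 1, so linear gains in accuracy can unlock disproportionately longer tasks.

```python
import math

# Toy model: a task of n steps succeeds with probability p**n, where p is
# per-step accuracy. Solve p**n >= reliability for n to get the longest
# task the model handles at that reliability: n_max = ln(r) / ln(p).
def max_task_length(p, reliability=0.5):
    """Longest n-step task solvable with at least `reliability` success."""
    return math.log(reliability) / math.log(p)

for p in (0.90, 0.95, 0.99):
    print(f"per-step accuracy {p:.2f} -> ~{max_task_length(p):.0f}-step tasks")
```

With these numbers, going from 0.90 to 0.99 per-step accuracy (a 10% relative gain) takes you from roughly 7-step to roughly 69-step tasks, which is the shape of the usefulness argument above, assuming the independence of step errors holds.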