His strategy is to sell OpenAI stock like it was Bitcoin in 2020, and if for some reason the market decides that maybe a company that loses large amounts of cash isn't actually a good investment... he'll be fine, he's had plenty of time to turn some of his stock into money :)
I believe, but have no proof, that the answer is "because it's easier to sell stock in an unprofitable business than build a profitable one", although given the other comment, there's a good chance I'm wrong about this :)
I'm more "capitalism good" (8 billion people on Earth, 7 billion can read, 5 billion have internet, and almost no one dies in childbirth anymore in rich countries, which covers several billion people), but it is really interesting that he has no stock and just gets a salary.
I guess if other people buying stock in your company is what enables you to have a super high salary (plus benefits like the company plane, etc.), you are still kinda selling stock. And honestly, having considered the "start a random software company aligned with the present trend (so ~2015 DevOps/cloud, 2020 cryptocurrency/blockchain, 2024 AI/ML), pay myself a million-dollar-a-year salary, and close shop after 5 years because 'no market lol'" route to riches myself, I still wouldn't consider Altman to be completely free of perverse incentives here :)
Still, very glad you pointed that out, thanks for sharing that information ^^
I don't know how one would think doomers "ignore image and video AI models". They (Yudkowsky, Hinton, Kokotajlo, Scott Alexander) point at these things all the time.
It seems completely apparent that when HN dismisses doomers with strawmen, it's because these HN'ers simply haven't read their arguments and just handwave them away based on vibes they heard through the grapevine.
Somewhat related, but I’ve been feeling as of late what can best be described as “benchmark fatigue”.
The latest models can score something like 70% on SWE-bench Verified, and yet it's difficult to say what tangible impact this has on actual software development. Likewise, they absolutely crush humans at competitive programming but are unreliable software engineers on their own.
What does it really mean that an LLM got gold on this year’s IMO? What if it means pretty much nothing at all besides the simple fact that this LLM is very, very good at IMO style problems?
As far as I can tell, the actual advancement here is in the methodology used to create a model tuned for this problem domain, and in how efficient that method is. Theoretically, that makes it easier to build other problem-domain-specific models.
That a highly tuned model designed to solve IMO problems can solve IMO problems is impressive, maybe, but yeah, it doesn't really signal any specific utility otherwise.
I don't fault you for maintaining a healthy scepticism, but per the President of the IMO: "It is very exciting to see progress in the mathematical capabilities of AI models, but we would like to be clear that the IMO cannot validate the methods, including the amount of compute used or whether there was any human involvement, or whether the results can be reproduced. What we can say is that correct mathematical proofs, whether produced by the brightest students or AI models, are valid." [1]
The proofs are correct, and it's very unlikely that IMO problems were leaked ahead of time. So the options for cheating in this circumstance are that a) IMO are colluding with a few researchers at OpenAI for some reason, or b) @alexwei_ solved the problems himself - both seem pretty unlikely to me.
Not really. This whole thing looks like a deliberately planned PR campaign, similar to the o3 demo. OpenAI has enough talented mathematicians. They had enough time to just solve the problems themselves. Alternatively, some participants leaking the questions for a reward isn't very unlikely either, and I definitely wouldn't put it past OpenAI to try something like that. Afterwards, they could secretly give hints or tool access to the model, or simply forge the answers, or keep rerunning the model until it gave out the correct answer. We know from FrontierMath and ARC-AGI that OpenAI can't be trusted when it comes to benchmarks.
In OpenAI's own released papers they show Anthropic's models performing better than their own. They tend to be pretty transparent and honest in their benchmarks.
The thing is, only leading AI companies and big tech have the money to fund these big benchmarks and run inference on them. As long as the benchmarks are somewhat publicly available and vetted by reputable scientists/mathematicians it seems reasonable to believe they're trustworthy.
You can of course accuse OpenAI of lying or being fraudulent, and if that's how you feel there's probably not much I can say to change your mind. One piece of evidence against this is that the primary source linked above no longer works at OpenAI, and hasn't chosen to blow the whistle on the supposed fraud. I work at OpenAI myself, training reasoning models and running evals, and I can vouch that I have no knowledge or hint of any cheating; if I did, I'd probably quit on the spot and absolutely wouldn't be writing this comment.
Totally fine not to take every company's word at face value, but imo this would be a weird conspiracy for OpenAI, with very high costs on reputation and morale.
Japanese gaming industry employees have low economic stress and high feelings of freedom? I'd be surprised. Even if that is true, can that really be connected to housing availability? Again, I'd be surprised.
If the current tech plateaus (but continues to come down in price, as expected) then this is a good prediction.
But, then there will be a demand for "all-in-one" reliable mega apps to replace everything else. These apps will usher in the megacorp reality William Gibson described.
Hosting, bandwidth, storage, and compute have all come down in price by orders of magnitude over the last 20 years.
Regardless of which model is currently the best, it looks like there will be an open-weight model ~6 months behind it which can be vendored at costs closely tied to the hardware costs.
And it comes with rewards! The above lab is synonymous with several popular techniques (one of them, organocatalysis, garnered a Nobel Prize) - the association would be much weaker if the lab hadn't kept a consistent brand over so many years.
Altman: "Jony was running a design firm called LoveFrom that had established itself as really the densest collection of talent that I've ever heard of in one place AND HAS PROBABLY EVER EXISTED IN THE WORLD."
I felt physically sick from second-hand embarrassment watching this.
German has a word for second-hand embarrassment: Fremdschämen. It comes in very handy here. If Sam continues like this, it won't be long until it becomes part of regular English, like other German words such as Kindergarten.
And I’ll be happy that I don’t have to explain Fremdschämen anymore. Everything has its upsides.
Somehow, when the buzzword machine found "talent density", half the passengers forgot that density has a denominator. I see this goof a lot. If you accept the premise that Jony is literally the most talented human in the entire history of the world (I know, I know), then obviously he was more dense sitting in a room alone than after being diluted by hiring 50 other people.
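To put made-up numbers on it (a purely hypothetical talent scale, nothing from the interview): say Jony rates 100 and each of the 50 hires rates 80. Then:

  density(alone) = 100 / 1 = 100
  density(after hiring) = (100 + 50 × 80) / 51 = 4100 / 51 ≈ 80.4

The average per head can only stay at 100 if every single hire is also a 100.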