
> Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped.

I'm sure people were saying that about commercial airline speeds in the 1970's too.

But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

With LLMs at the moment, the limiting factors might turn out to be training data, cost, or inherent limits of the transformer approach and the fact that LLMs fundamentally cannot learn outside of their context window. Or a combination of all of these.

The tricky thing about S curves is, you never know where you are on them until the slowdown actually happens. Are we still only in the beginning of the growth part? Or the middle where improvement is linear rather than exponential? And then the growth starts slowing...
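(To make that concrete, here's a toy sketch of my own, nothing rigorous: fit an exponential to only the early part of a logistic curve and it matches almost perfectly, right up until the bend. The curve shape and cutoff are arbitrary choices of mine.)

```python
# Toy sketch: the early part of a logistic (S) curve is nearly
# indistinguishable from an exponential until the slowdown arrives.
import numpy as np

t = np.linspace(0, 10, 200)
s_curve = 1.0 / (1.0 + np.exp(-(t - 8.0)))       # logistic with inflection at t = 8

early = t < 4.0                                   # pretend "now" is well before the bend
slope, intercept = np.polyfit(t[early], np.log(s_curve[early]), 1)
exp_fit = np.exp(intercept + slope * t)           # best-fit exponential to the early data

resid = s_curve[early] - exp_fit[early]
r2 = 1 - np.sum(resid**2) / np.sum((s_curve[early] - s_curve[early].mean())**2)
print(f"R^2 of exponential fit on early data: {r2:.4f}")   # ~0.99+
print(f"Exponential forecast at t=10: {exp_fit[-1]:.2f} vs actual {s_curve[-1]:.2f}")
```

The forecast overshoots the true endpoint by nearly an order of magnitude, even though the fit on the observed data looks essentially perfect.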





> a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

Yes of course it’s not going to increase exponentially forever.

The point is, why predict that the growth rate is going to slow exactly now? What evidence are you going to look at?

It’s possible to make informed predictions (eg “Moore’s law can’t get you further than 1nm with silicon due to fundamental physical limits”). But most commenters aren’t basing their predictions in anything as rigorous as that.

And note, there are good reasons to predict a speedup, too; as models get more intelligent, they will be able to accelerate the R&D process. So quality per-researcher is now proportional to the exponential intelligence curve, AND quantity of researchers scales with number of GPUs (rather than population growth which is much slower).


Yeah exactly!

It’s likely that it will slow down at some point, but the highest likelihood scenario for the near future is that scaling will continue.


NOTE IN ADVANCE: I'm generalizing, naturally, because talking about specifics would require an essay and I'm trying to write a comment.

Why predict that the growth rate is going to slow now? Simple. Because current models have already been trained on pretty much the entire meaningful part of the Internet. Where are they going to get more data?

The exponential growth part of the curve was largely based on being able to fit more and more training data into the models. Now that all the meaningful training data has been fed in, further growth will come from one of two things. One is generating training data from one LLM to feed into another, which is dangerous and highly likely to lead to "down the rabbit hole forever" hallucinations; weeding those out is a LOT of work and will therefore contribute to slower growth. The other is finding better ways to tweak the models to make better use of the available training data, which will produce growth, but much more slowly than "Hey, we can slurp up the entire Internet now!" did.

And yes, there is more training data available because the Internet is not static: the Internet of 2025 has more meaningful, human-generated content than the Internet of 2024. But it also has a lot more AI-generated content, which will lead into the rabbit-hole problem where one AI's hallucinations get baked into the next one's training, so the extra data that can be harvested from the 2025 Internet is almost certainly going to produce slower growth in meaningful results (as opposed to hallucinated results).


> Where are they going to get more data?

This is a great question, but note that folks were freaking out about this a year or so ago and we seem to be doing fine.

We seem to be making progress with some combination of synthetic training datasets on coding/math tasks, textbooks authored by paid experts, and new tokens (plus preference signals) generated by users of the LLM systems.

It wouldn’t surprise me if coding/math turned out to have a dense-enough loss-landscape to produce enough synthetic data to get to AGI - though I wouldn’t bet on this as a highly likely outcome.

I have been wanting to read/do some more rigorous analysis here though.

This sort of analysis would count as the kind of rigorous prediction that I’m asking for above.

E2A: initial exploration on this: https://chatgpt.com/share/68d96124-a6f4-8006-8a87-bfa7ee4ea3...

Gives some relevant papers such as

https://arxiv.org/html/2211.04325v2#:~:text=3.1%20AI


I am extremely confident that AGI, if it is achievable at all (which is a different argument and one I'm not getting into right now), requires a world model / fact model / whatever terminology you prefer, and is therefore not achievable by models that simply chain words together without having any kind of understanding baked into the model. In other words, LLMs cannot lead to AGI.

Agreed, it surely does require a world-model.

I disagree that generic LLMs plus CoT/reasoning/tool calling (ie the current stack) cannot in principle implement a world model.

I believe LLMs are doing some sort of world modeling and likely are mostly lacking a medium-/long-term memory system in which to store it.

(I wouldn’t be surprised if one or two more architectural overhauls end up occurring before AGI, I also wouldn’t be surprised if these occurred seamlessly with our current trajectory of progress)


Isn’t the memory the pre-trained weights that let it do anything at all? Or do you mean they should be capable of refining them in real-time (learning)?

The human brain has many systems that adapt on multiple time-frames which could loosely be called “memory”.

But here I’m specifically interested in real-time updates to medium/long term memory, and the episodic/consciously accessible systems that are used in human reasoning/intelligence.

Eg if I’m working on a big task I can think through previous solutions I learned, remember the salient/surprising lessons, recall recent conversations that may indirectly affect requirements, etc. The brain is clearly doing an associative compression and indexing operation atop the raw memory traces. I feel the current LLM “memory” implementations are very weak compared to what the human brain does.

I suppose there is a sense in which you could say the weights “remember” the training data, but it’s read-only and I think this lack of real-time updating is a crucial gap.

To expand on my hunch about scaffolding - it may be that you can construct an MCP module that can let the LLM retrieve or ruminate on associative memories in such a way as to allow the LLM to not make the same mistake twice and be steerable on a longer timeframe.

I think the best argument against my hunch is that human brains have systems which update the synaptic weights themselves over a timeframe of days-to-months, and so if neural plasticity is the optimal solution here then we may not be able to efficiently solve the problem with “application layer” memory plugins.

But again, there is a lot of solution-space to explore; maybe some LoRA-like algorithm can allow an LLM instance to efficiently update its own weights at test-time, and persist those deltas for efficient inference, thus implementing the required neural plasticity algorithms?
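If it helps make that hunch concrete, here's a rough sketch of the kind of thing I mean (purely illustrative PyTorch; the class and file names are mine, not any existing system's API): freeze the pretrained weights, fit a low-rank delta at test time, and persist just that delta between sessions.

```python
# Illustrative sketch only: a LoRA-style low-rank delta that could be
# fit at test time and persisted as a tiny "memory" file between sessions.
import torch
import torch.nn as nn

class LowRankDelta(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                      # pretrained weights stay frozen
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))  # delta starts as a no-op
        self.scale = scale

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# "Remembering" a session would mean fitting only A and B on its experience,
# then saving just those deltas, e.g.:
#   torch.save({"A": layer.A, "B": layer.B}, "session_0042_delta.pt")
```

Whether something like this can substitute for genuine neural plasticity is exactly the open question.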


Ah, so you don't know anything about how they work. Thanks for the clarification.

Curiously, humans don't seem to require reading the entire internet in order to perform at human level on a wide variety of tasks... Nature suggests that there's a lot of headroom in algorithms for learning on existing sources. Indeed, we had models trained on the whole internet a couple years ago, now, yet model quality has continued to improve.

Meanwhile, on the hardware side, transistor counts in GPUs are in the tens of billions and still increasing steadily.


This is a time horizon thing though. Over the course of future human history AI development might look exponential, but that doesn't mean there won't be significant plateaus. We don't even fully understand how the human brain works, so whilst the fact it does exist strongly suggests it's replicable (and humans do it naturally), that doesn't make it practical in any time horizon that matters to us now. Nor does there seem to be fast movement in that direction, since everyone is largely working on the same underlying architecture, which isn't similar to the brain.

Alternative argument: there is no need for more training data, just better algorithms. Throwing more tokens at the problem doesn't solve the fact that training LLMs using supervised learning is a poor way to integrate knowledge. We have, however, seen promising results coming out of reinforcement learning and self play. Which means that Anthropic's and OpenAI's bet on scale is likely a dead end, but we may yet see capability improvements coming from other labs, without the need for greater data collection.

Better algorithms is one of the things I meant by "better ways to tweak the models to make better use of the available training data". But that produces slower growth than the jaw-droppingly rapid growth you can get by slurping pretty much the whole Internet. That produced the sharp part of the S curve, but that part is behind us now, which is why I assert we're approaching the slower-growth part at the top of the curve.

> The point is, why predict that the growth rate is going to slow exactly now? What evidence are you going to look at?

Why predict that the (absolute) growth rate is going to keep accelerating past exactly now?

Exponential growth always assumes a constant relative growth rate, which works in the fiction of economics, but is otherwise far from an inevitability. People like to point to Moore's law ad nauseam, but other things like "the human population" or "single-core performance" keep accelerating until they start cooling off.

> And note, there are good reasons to predict a speedup, too; as models get more intelligent, they will be able to accelerate the R&D process.

And if heaven forbid, R&D ever turns out to start taking more work for the same marginal returns on "ability to accelerate the process", then you no longer have an exponential curve. Or for that matter, even if some parts can be accelerated to an amazing extent, other parts may get strung up on Amdahl's law.

It's fine to predict continued growth, and it's even fine to predict that a true inflection point won't come any time soon, but exponential growth is something else entirely.


> Why predict that the (absolute) growth rate is going to keep accelerating past exactly now?

By following this logic you should have predicted Moore’s law would halt every year for the last five decades. I hope you see why this is a flawed argument. You prove too much.

But I will answer your “why”: plenty of exponential curves exist in reality, and empirically, they can last for a long time. This is just how technology works; some exponential process kicks off, then eventually is rate-limited, then if we are lucky another S-curve stacks on top of it, and the process repeats for a while.

Reality has inertia. My hunch is you should apply some heuristic like “the longer a curve has existed, the longer you should bet it will persist”. So I wouldn’t bet on exponential growth in AI capabilities for the next 10 years, but I would consider it very foolish to use pure induction to bet on growth stopping within 1 year.

And to be clear, I think these heuristics are weak and should be trumped by actual physical models of rate-limiters where available.


> By following this logic you should have predicted Moore’s law would halt every year for the last five decades. I hope you see why this is a flawed argument. You prove too much.

I do think it's continually amazing that Moore's law has continued in some capacity for decades. But before trumpeting the age of exponential growth, I'd love to see plenty of examples that aren't named "Moore's law": as it stands, one easy hypothesis is that "ability to cram transistors into mass-produced boards" lends itself particularly well to newly-discovered strategies.

> So I wouldn’t bet on exponential growth in AI capabilities for the next 10 years, but I would consider it very foolish to use pure induction to bet on growth stopping within 1 year.

Great, we both agree that it's foolish to bet on growth stopping within 1 year. What I'm saying is that "growth doesn't stop" ≠ "growth is exponential".

A theory of "inertia" could just as well support linear growth: it's only because we stare at relative growth rates that we treat exponential growth as a "constant" that will continue in the absence of explicit barriers.


Solar panel cost per watt has been dropping exponentially for decades as well...

Partly these are matters of economies of scale - reduction in production costs at scale - and partly it's a matter of increasing human attention leading to steady improvements as the technology itself becomes more ubiquitous.


Sorry, to be clear I was making the stronger claim:

I would consider it very foolish to use pure induction to bet on _exponential_ growth stopping within 1 year.

I think you can easily find plenty of other long-lasting exponential curves. A good starting point would be:

https://en.m.wikipedia.org/wiki/Progress_studies

With perhaps the optimistic case as

https://en.m.wikipedia.org/wiki/Accelerating_change

This is where I’d really like to be able to point to our respective Manifold predictions on the subject; we could circle back in a year’s time and review who was in fact correct. I wager internet points it will be me :)

Concretely, https://manifold.markets/JoshYou/best-ai-time-horizon-by-aug...


I think progress per dollar spent has actually slowed dramatically over the last three years. The models are better, but AI spending has increased by several orders of magnitude during the same time, from hundreds of millions to hundreds of billions. You can only paper over the lack of fundamental progress by spending on more compute for so long. And even if you manage to keep up the current capex, there certainly isn't enough capital in the world to accelerate spending for very long.

It has already been trained on all the data. The other obvious next step is to increase context window, but that's apparently very hard/costly.

I don’t think this is true. See https://arxiv.org/html/2211.04325v2 for example.

Yes, nobody knows the future of AI, but sometimes people use curve fitting to try to convince themselves or others that they know what's going to happen.

> why predict that the growth rate is going to slow exactly now?

why predict that it will continue? Nobody ever actually makes an argument that growth is likely to continue, they just extrapolate from existing trends and make a guess, with no consideration of the underlying mechanics.

Oh, go on then, I'll give a reason: this bubble is inflated primarily by venture capital, and is not profitable. The venture capital is starting to run out, and there is no convincing evidence that the businesses will become profitable.


Indeed you can't be sure. But on the other hand a bunch of the commentariat has been claiming (with no evidence) that we're at the midpoint of the sigmoid for the last three years. They were wrong. And then you had the AI frontier lab insiders who predicted an accelerating pace of progress for the last three years. They were right. Now, the frontier labs rarely (never?) provide evidence either, but they do have about a year of visibility into the pipeline, unlike anyone outside.

So at least my heuristic is to wait until a frontier lab starts warning about diminishing returns and slowdowns, or multiple labs start winding down capex, before calling the midpoint. The first signal might have misaligned incentives, but if we're in realistic danger of hitting a wall in the next year, the capex spending would not be accelerating the way it is.


Capex requirements might be on a different curve than model improvements.

E.g. you might need to accelerate spending to get sub-linear growth in model output.

If valuations depend on hitting the curves described in the article, you might see accelerating capex at precisely the time improvements are dropping off.

I don’t think frontier labs are going to be a trustworthy canary. If Anthropic says they’re reaching the limit and OpenAI holds the line that AGI is imminent, talent and funding will flee Anthropic for OpenAI. There’s a strong incentive to keep your mouth shut if things aren’t going well.


I think you nailed it. The capex is desperation in the hopes of maintaining the curve. I have heard actual AI researchers say progress is slowing, just not from the big companies directly.

> Indeed you can't be sure. But on the other hand a bunch of the commentariat has been claiming (with no evidence) that we're at the midpoint of the sigmoid for the last three years.

I haven’t followed things closely, but I’ve seen more statements that we may be near the midpoint of a sigmoid than that we are at it.

> They were wrong. And then you had the AI frontier lab insiders who predicted an accelerating pace of progress for the last three years. They were right.

I know it’s an unfair question because we don’t have an objective way to measure speed of progress in this regard, but do you have evidence for models not only getting better, but getting better faster? (Remember: even at the midpoint of a sigmoid, there still is significant growth)


I thought the original article included the strongest objective data point on this: recent progress on the METR long task benchmark isn't just on the historical "task length doubling every 7 months" best fit, but is trending above it.

A year ago, would you have thought that a pure LLM with no tools could get a gold medal level score in the 2025 IMO finals? I would have thought that was crazy talk. Given the rates of progress over the previous few years, maybe 2027 would have been a realistic target.
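For a sense of what "doubling every 7 months" compounds to (my own back-of-the-envelope arithmetic; only the doubling period comes from the METR work):

```python
# Back-of-the-envelope: task-horizon growth under a 7-month doubling time.
MONTHS_PER_DOUBLING = 7
for years in (1, 2, 3):
    factor = 2 ** (12 * years / MONTHS_PER_DOUBLING)
    print(f"after {years} year(s): tasks ~{factor:.0f}x longer")
# -> roughly 3x, 11x, and 35x
```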


> I thought the original article included the strongest objective data point on this: recent progress on the METR long task benchmark isn't just on the historical "task length doubling every 7 months" best fit, but is trending above it.

There is selection bias in that paper. For example, they chose to measure “AI performance in terms of the length of tasks the system can complete (as measured by how long the tasks take humans)”, but didn’t include calculation tasks in the set of tasks, and that’s a field in which machines have been able to reliably do tasks for years that humans would take centuries or more to perform, but at which modern LLM-based AIs are worse than, say, Python.

I think leaving out such tasks is at least somewhat defensible, but I have to wonder whether they also left out other tasks at which LLMs are not improving as rapidly.

Maybe it is a matter of posing different questions, with the article being discussed being more interested in “(When) can we (ever) expect LLMs to do jobs that now require humans to do?” than in “(How fast) do LLMs get smarter over time?”


Or are the model authors, i.e. the blog author with a vested interest, getting better at optimizing for the test while real-world performance isn't increasing as fast?

> And then you had the AI frontier lab insiders who predicted an accelerating pace of progress for the last three years.

Progress has most definitely not been happening at an _accelerating_ pace.


There are a few other limitations, in particular how much energy, hardware and funding we (as a society) can afford to throw at the problem, as well as the societal impact.

AI development is currently given a free pass on these points, but it's very unclear how long that will last. Regardless of scientific and technological potential, I believe that we'll hit some form of limit soon.


Luckily both middle eastern religious dictatorships and countries like China are throwing way too many resources at it ...

So we can rest assured the well-being of a country's people will not be allowed to be a drag on AI progress.


There's a Mulla Nasrudin joke that's sort of relevant here:

Nasrudin is on a flight, when suddenly the pilot comes on the intercom, saying, "Passengers, we apologize, but we have experienced an engine burn-out. The plane can still fly on the remaining three engines, but we'll be delayed in our arrival by two hours."

Nasrudin speaks up: "Let's not worry, what's 2 hours really?"

A few minutes later, the airplane shakes, and passengers see smoke coming out of another engine. Again, the intercom crackles to life.

"This is your captain speaking. Apologies, but due to a second engine burn-out, we'll be delayed by another two hours."

The passengers are agitated, but the Mulla once again tries to remain calm.

Suddenly, the third engine catches fire. Again, the pilot comes on the intercom and says, "I know you're all scared, but this is a very advanced aircraft, and it can safely fly on only a single engine. But we will be delayed by yet another two hours."

At this, Nasrudin shouts, "This is ridiculous! If one more engine goes, we'll be stuck up here all day"


> I'm sure people were saying that about commercial airline speeds in the 1970's too.

Or CPU frequencies in the 1990's. Also we spent quite a few decades at the end of the 19th century thinking that physics was finished.

I'm not sure that explaining it as an "S curve" is really the right metaphor either, though.

You get the "exponential" growth effect when there's a specific technology invented that "just needs to be applied", and the application tricks tend to fall out quickly. For sure generative AI is on that curve right now, with everyone big enough to afford a datacenter training models like there's no tomorrow and feeding a community of a million startups trying to deploy those models.

But nothing about this is modeled correctly as an "exponential", except in the somewhat trivial sense of "the community of innovators grows like a disease as everyone hops on board". Sure, the petri dish ends up saturated pretty quickly and growth levels off, but that's not really saying much about the problem.


Progress in information systems cannot be compared to progress in physical systems.

For starters, physical systems compete for limited resources and labor.

For another, progress in software vastly reduces the cost of improved designs, whereas progress in physical systems can enable improved designs but still increase their cost.

Finally, the underlying substrate of software is digital hardware, which has been improving in both capabilities and economics exponentially for almost 100 years.

Looking at information systems as far back as the first coordination of differentiating cells to human civilization is one of exponential improvement. Very slow, slow, fast, very fast. (Can even take this further, to first metabolic cycles, cells, multi-purpose genes, modular development genes, etc. Life is the reproduction of physical systems via information systems.)

Same with human technological information systems, from cave painting, writing, printing, telegraph, phone, internet, etc.

It would be VERY surprising if AI somehow managed to fall off the exponential information system growth path. Not industry level surprising, but "everything we know about how useful information compounds" level surprising.


> Looking at information systems as far back as the first coordination of differentiating cells to human civilization is one of exponential improvement.

Under what metric? Most of the things you mention don't have numerical values to plot on a curve. It's a vibe exponential, at best.

Life and humans have become better and better at extracting available resources and energy, but there's a clear limit to that (100%) and the distribution of these things in the universe is a given, not something we control. You don't run information systems off empty space.


> It's a vibe exponential, at best.

I am a little stunned you think so.

Life has been on Earth about 3.5-3.8 billion years.

Break that into 0.5-0.8, 1 billion, 1 billion, 1 billion "quarters", and you will find exponential increases in evolution's rate of change and production of diversity across them, by many many objective measures.

Now break up the last 1 billion into 100 million year segments. Again exponential.

Then break up the last 100 million into segments. Again.

Then the last 10 million years into segments, and watch humans progress.

The last million, in 100k year segments, watch modern humans appear.

the last 10k years into segments, watch agriculture, civilizations, technology, writing ...

The last 1000 years, incredible aggregation of technology, math, and the appearance of formal science

last 100 years, gets crazy. Information systems appear in labs, then become ubiquitous.

last 10 years, major changes, AI starts having mainstream impact

last 1 year - even the basic improvements to AI models in the last 12 months are an unprecedented level of change, per time, looking back.

I am not sure how any of this could appear "vibe", given any historical and situational awareness.

This progression is universally recognized. Aside from creationists and similar contingents.


The progression is much less clear when you don't view it anthropocentrically. For instance, we see an explosion in intelligible information: information that is formatted in human language or human-made formats. But this is concomitant with a crash in natural spaces and biodiversity, and nothing we make is as information-rich as natural environments, so from a global perspective, what we have is actually an information crash. Or hell, take something like agriculture. Cultured environments are far, far simpler than wild ones. Again: an information crash.

I'm not saying anything about the future, mind you. Just that if we manage to stop sniffing our own farts for a damn second and look at it from the outside, current human civilization is a regression on several metrics. We didn't achieve dominion over nature by being more subtle or complex than it. We achieved that by smashing nature with a metaphorical club and building upon its ruins. Sure, it's impressive. But it's also brutish. Intelligence requires intelligible environments to function, and that is almost invariably done at the expense of complexity and diversity. Do not confuse success for sophistication.

> last 1 year - even the basic improvements to AI models in the last 12 months are an unprecedented level of change, per time, looking back.

Are they? What changed, exactly? What improvements in, say, standards of living? In the rate of resource exploitation? In energy efficiency? What delta in our dominion over Earth? I'll tell you what I think: I think we're making tremendous progress in simulating aspects of humanity that don't matter nearly as much as we think they do. The Internet, smartphones, AI, speak to our brains in an incredible way. Almost like it was by design. However, they matter far more to humans within humanity than they do in the relationship of humanity with the rest of the universe. Unlike, say, agriculture or coal, which positively defaced the planet. Could we leverage AI to unlock fusion energy or other things that actually matter, just so we can cook the rest of the Earth with it? Perhaps! But let's not count our chickens before they hatch. As of right now, in the grand scheme of things, AI doesn't matter. Except, of course, in the currency of vibes.


I am curious when you think we will run out of atoms to make information systems.

How many billions of years you think that might take.

Of all the things to be limited by, that doesn't seem like a near term issue. Just an asteroid or two alone will provide resources beyond our dreams. And space travel is improving at a very rapid rate.

In the meantime, in terms of efficiency of using Earth atoms for information processing, there is still a lot of space at the "bottom", as Feynman said. Our crude systems are limited today by their power waste. Small energy-efficient systems, and more efficient heat shedding, will enable full 3D chips ("cubes"?) and vastly higher density of packing those.

The known limit on information processing for physical systems, per gram, is astronomical:

• Bremermann’s limit : 10^47 operations per second, per gram.

Other interesting limits:

• Margolus–Levitin bound - on quantum state evolution

• Landauer’s principle - Thermodynamic cost of erasing (overwriting) one bit.

• Bekenstein bound: Maximum storage by volume.

Life will go through many many singularities before we get anywhere near hard limits.


> Progress in information systems cannot be compared to progress in physical systems.

> For starters, physical systems compete for limited resources and labor.

> Finally, the underlying substrate of software is digital hardware…

See how these are related?


By physical systems, I meant systems whose purpose is to do physical work. Mechanical things. Gears. Struts.

Computer hardware is an information system. You are correct that it has a physical component. But its power comes from its organization (information), not its mass, weight, etc.

Transistors get more powerful, not less, when made from less matter.

Information systems move from substrate to more efficient substrate. They are not their substrate.


They still depend on physical resources and labor. They’re made by people and machines. There’s never been more resources going into information systems than right now, and AI accelerated that greatly. Think of all the server farms being built next to power plants.

Yes. Of course.

All information has a substrate at any given time.

But the amount of resources needed per unit of computation keeps dropping, because computation is not tied to any particular unit of matter. Nor to any particular substrate.

It is not the same as a steam engine, which can only be made so efficient.

The amount of both matter and labor per quantity of computing power is dropping exponentially. Right?

See a sibling reply on the physical limits of computation. We are several singularities away from any hard limit.

Evidence: History of industrialization vs. history of computing. Fundamental physics.


> The amount of both matter and labor per quantity of computing power is dropping exponentially. Right?

Right. The problem is the demand is increasing exponentially.

It’s not like when computers got 1000x more powerful we were able to get by with 1/1000x of them. Quite the opposite (or inverse, to be more precise).

Just to go back to my original point, I think drawing a comparison that physical systems compete for physical resources and implying information systems don’t is misleading at best. It’s especially obvious right now with all the competition for compute going on.


>[..] to first metabolic cycles, cells, multi-purpose genes, modular development genes, etc.

One example is when cells discovered energy production using mitochondria. Mitochondria added new capabilities to the cell, with (almost) no downsides like weight, temperature sensitivity, or pressure sensitivity. It's almost 100% upside.

If someone had tried to predict the future number of mitochondria-enabled cells from the first one, they could have been off by a factor of 10^20.

I have been writing a story for the last 20 days with that exact plot; I have to get my act together and finish it.


That's fallacious reasoning: you are extrapolating from survivorship bias. A lot of technologies, genes, or species have failed along the way. You are also subjectively framing progression as improvement, which is problematic as well if you are speaking about general trends. Evolution selects for adaptation, not innovation. We use the theory of evolution to explain the emergence of complexity, but that's not the sole direction, and there are many examples where species evolved towards simplicity (again).

Resource expense alone could be the end of AI. You may look up historic island populations, where technological demands (e.g. timber) usually led to extinction by resource exhaustion and consequent ecosystem collapse (e.g. deforestation leading to soil erosion).


See replies to sibling comments.

Doesn't answer the core fallacy. Historical "technological progress" can't be used as argument for any particular technology. Right now, if we are talking about AI, we're talking about specific technologies, which may just as well fail and remain inconsequential in the grand scheme of things, like most technologies, most things really, did in the past. Even more so since we don't understand much anything in either human or artificial cognition. Again and again, we've been wrong about predicting the limits and challenges in computation.

You see, your argument is just bad. You are merely guessing like everyone else.


My arguments are very strong.

Information technology does not operate by the rules of any other technology. It is a technology of math and organization, not particular materials.

The unique value of information technology is that it compounds the value of other information and technology, including its own, and lowers the bar for its own further progress.

And we know with absolute certainty we have barely scratched the computing capacity of matter. Bremermann’s limit : 10^47 operations per second, per gram. See my other comment for other relevant limits.

Do you also expect a wall in mathematics?

And yes, an unbroken historical record of 4.5 billions years of information systems becoming more sophisticated with an exponential speed increase over time, is in fact a very strong argument. Changes that took a billion years initially, now happen in very short times in today's evolution, and essentially instantly in technological time. The path is long, with significant acceleration milestones at whatever scale of time you want to look at.

Your argument, on the other hand, is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.

Substantive negative arguments about AI progress have been made. See "Perceptrons" by Marvin Minsky and Seymour Papert for an example of what a solid negative argument looks like. It delivered insights. It made some sense at the time.


> Your argument, on the other hand, is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.

Pointing out logical fallacies?

Lol.


> Historical "technological progress" can't be used as argument for any particular technology.

Historical for billions of years of natural information system evolution. Metabolic, RNA, DNA, protein networks, epigenetic, intracellular, intercellular, active membrane, nerve precursors, peptides, hormonal, neural, ganglion, nerve nets, brains.

Thousands of years of human information systems. Hundreds of years of technological information systems. Decades of digital information systems. Now, in just the last few years, progress year to year is unlike any seen before.

Significant innovations being reported virtually every day.

Yes, track records carry weight, especially when there is no good reason to expect a break, and every tangible reason to believe nothing is slowing down, right up to today.

"Past is not a predictor of future behavior" is about asset gains relative to asset prices in markets where predictable gains have had their profitability removed by the predictive pricing of others. A highly specific feedback situation making predicting asset gains less predictable even when companies do maintain strong predictable trends in fundamentals.

It is a narrow specific second order effect.

It is the worst possible argument for anything outside of those special conditions.

Every single thing you have ever learned was predicated on the past having strong predictive qualities.

You should understand what an argument means, before throwing it into contexts where its preconditions don't exist.

> Right now, if we are talking about AI, we're talking about specific technologies, which may just as well fail and remain inconsequential in the grand scheme of things, like most technologies, most things really, did in the past. Even more so since we don't understand much anything in either human or artificial cognition. Again and again, we've been wrong about predicting the limits and challenges in computation.

> Your argument [...] is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.

If I need to be clearer: nobody could know when you wrote that by reading it. It isn't an argument; it's a free-floating opinion. And you have not made it more relevant today than it would have been all the decades up till now, through all the technological transitions up until now. Your opinion was equally "applicable", and no less wrong.

This is what "Zero new insight. Zero predictive capacity" refers to.

> Substantive negative arguments about AI progress have been made. See "Perceptrons" by Marvin Minksy and Seymour Papert, for an example of what a solid negative argument looks like. It delivered insights. It made some sense at the time.

Here you go:

https://en.wikipedia.org/wiki/Perceptrons_(book)


> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

I'd argue all of them. Any true exponential eventually gets to a point where no computer can even store its numerical value. It's a physically absurd curve.


The narrative quietly assumes that this exponential curve can in fact continue since it will be the harbinger of the technological singularity. Seems more than a bit eschatological, but who knows.

If we suppose this tech rapture does happen, all bets are off; in that sense it's probably better to assume the curve is sigmoidal, since the alternative is literally beyond human comprehension.


Barring fully reversible processes as the basis for technology, you still quickly run into energy and cooling constraints. Even with that, you'd have time or energy density constraints. Unlimited exponentials are clearly unphysical.

Yes, this is an accurate description, and also completely irrelevant to the issue at hand.

At the stage of development we are today, no one cares how fast it takes for the exponent to go from eating our galaxy to eating the whole universe, or whether it'll break some energy density constraint before it and leave a gaping zero-point energy hole where our local cluster used to be.

It'll stop eventually. What we care about is whether it stops before it breaks everything for us, here on Earth. And that's not at all a given. Fundamental limits are irrelevant to us - it's like worrying that putting too many socks in a drawer will eventually make them collapse into a black hole. The limits that are relevant to us are much lower, set by technological, social and economic factors. It's much harder to say where those limits lie.


Sure, but it reminds us that we are dealing with an S-curve, so we need to ask where the inflection point is. i.e. what are the relevant constraints, and can they reasonably sustain exponential growth for a while still? At least as an outsider, it's not obvious to me whether we won't e.g. run into bandwidth or efficiency constraints that make scaling to larger models infeasible without reimagining the sorts of processors we're using. Perhaps we'll need to shift to analog computers or something to break through cooling problems, and if the machine cannot find the designs for the new paradigm it needs, it can't make those exponential self-improvements (until it matches its current performance within the new paradigm, it gets no benefit from design improvements it makes).

My experience is that "AI can write programs" is only true for the smallest tasks, and anything slightly nontrivial will leave it incapable of even getting started. It doesn't "often makes mistakes or goes in a wrong direction". I've never seen it go anywhere near the right direction for a nontrivial task.

That doesn't mean it won't have a large impact; as an autocomplete these things can be quite useful today. But when we have a more honest look at what it can do now, it's less obvious that we'll hit some kind of singularity before hitting a constraint.


I think the technological singularity has generally been a bit of a metaphor rather than a mathematical singularity.

Some exponentials are slow enough that it takes decades or centuries, though.

You clearly haven’t played my idle game.

I am getting the sense that the 2nd derivative of the curve is already hitting negative territory. Models get updated, and I don't feel I'm getting better answers from the LLMs.

On the application front though, it feels that the advancements from a couple of years ago are just beginning to trickle down to product space. I used to do some video editing as a hobby. Recently I picked it up again, and was blown away by how much AI has chipped away the repetitive stuff, and even made attempts at the more creative aspects of production, with mixed but promising results.


What are some examples of tasks you no longer have to do?

One example is auto-generating subtitles -- elements of this task, e.g. speech to text with time coding, have been around for a while (openai whisper and others), but they have only recently been integrated into video editors and become easy to use for non-coders. Other examples: depth map (estimating object distance from the camera; this is useful when you want to blur the background), auto-generating masks with object tracking.
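For anyone curious, the time-coded transcription part is only a few lines with the open-source whisper package these days (a rough sketch; the model size, file names, and SRT formatting are my own choices, not part of any editor's integration):

```python
# Rough sketch: auto-generating time-coded subtitles with openai-whisper.
# pip install openai-whisper
import whisper

def to_srt_timestamp(seconds: float) -> str:
    ms = int(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

model = whisper.load_model("base")          # larger models are slower but more accurate
result = model.transcribe("my_video.mp4")   # segments come back with start/end times

with open("my_video.srt", "w", encoding="utf-8") as f:
    for i, seg in enumerate(result["segments"], start=1):
        f.write(f"{i}\n")
        f.write(f"{to_srt_timestamp(seg['start'])} --> {to_srt_timestamp(seg['end'])}\n")
        f.write(f"{seg['text'].strip()}\n\n")
```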

>I'm sure people were saying that about commercial airline speeds in the 1970's too.

Also elegantly formulated by: https://idlewords.com/talks/web_design_first_100_years.htm


>> it would be extremely surprising if these improvements suddenly stopped.

> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

An S-curve is exactly the opposite of "suddenly" stopping.

It is possible for us to get a sudden stop, due to limiting factors.

For a hypothetical: if Moore's Law had continued until we hit atomic resolution instead of the slowdown as we got close to it, that would have been an example of a sudden stop: you can't get transistors smaller than atoms, yet it would have been possible (with arbitrarily large investments that we didn't have) to halve transistor sizes every 18 months until suddenly we couldn't.

Now I think about it, the speed of commercial airlines is also an example of a sudden stop: we had to solve sonic booms first before even considering a Concorde replacement.


Agreed!

And, maybe I'm missing something, but to me it seems obvious that flat top part of the S curve is going to be somewhere below human ability... because, as you say, of the training data. How on earth could we train an LLM to be smarter than us, when 100% of the material we use to teach it how to think, is human-style thinking?

Maybe if we do a good job, only a little bit below human ability -- and what an accomplishment that would still be!

But still -- that's a far cry from the ideas espoused in articles like this, where AI is just one or two years away from overtaking us.


Author here.

The standard way to do this is Reinforcement Learning: we do not teach the model how to do the task, we let it discover the _how_ for itself and only grade it based on how well it did, then reinforce the attempts where it did well. This way the model can learn wildly superhuman performance, e.g. it's what we used to train AlphaGo and AlphaZero.
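(For readers who want the shape of that loop, here is a minimal sketch; `model.sample` and `grade` are hypothetical stand-ins, and real recipes like GRPO add clipping, KL penalties, and so on.)

```python
# Minimal sketch of "attempt, grade, reinforce": the grader scores outcomes
# but never shows the model *how* to solve the task.
import torch

def rl_step(model, optimizer, prompt, grade, num_attempts=8):
    attempts = [model.sample(prompt) for _ in range(num_attempts)]   # (tokens, logprob) pairs
    rewards = torch.tensor([grade(prompt, tokens) for tokens, _ in attempts],
                           dtype=torch.float)
    baseline = rewards.mean()                        # compare attempts against each other
    loss = -torch.stack([(r - baseline) * logprob
                         for (_, logprob), r in zip(attempts, rewards)]).sum()
    optimizer.zero_grad()
    loss.backward()                                  # reinforce above-average attempts
    optimizer.step()
```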


Yes. It's true that we don't know, with any certainty, (1) whether we are hitting limits to growth intrinsic to current hardware and software, (2) whether we will need new hardware or software breakthroughs to continue improving models, and (3) what the timing of any necessary breakthroughs would be, because innovation doesn't happen on a predictable schedule. There are unknown unknowns.[a]

However, there's no doubt that at a global scale, we're sure trying to maintain current rates of improvement in AI. I mean, the scale and breadth of global investment dedicated to improving AI, presently, is truly unprecedented. Whether all this investment is driven by FOMO or by foresight, is irrelevant. The underlying assumption in all cases is the same: We will figure out, somehow, how to overcome all known and unknown challenges along the way. I have no idea what the odds of success may be, but they're not zero. We sure live in interesting times!

---

[a] https://en.wikipedia.org/wiki/There_are_unknown_unknowns


I hope the crash won't be unprecedented as well...

I hope so too. Capital spending on AI appears to be holding up the entire economy:

https://am.jpmorgan.com/us/en/asset-management/adv/insights/...


The cost of the next number in a GPT (3 > 4 > 5) seems to come down to two things:

1) $$$

2) data

The second (data) also isn't cheap, as it seems we've already gotten through all the 'cheap' data out there. So much so that synthetic data (fart huffing) is a big thing now. People say it's real and useful and passes the glenn-horf theore... blah blah blah.

So it really more so comes down to just:

1) $$$^2 (but really pick any exponent)

Given that, I'm not sure this thing is a true sigmoid curve (see: biology all the time). I think it's more of a logarithmic cost curve: it never really goes away, but it gets really expensive to carry out for large N.

[To be clear, lots of great shit happens out there in large N. An AI god still may lurk in the long slow slope of $N, the cure for boredom too, or knowing why we yawn, etc.]


It never ceases to amaze me how people consistently mistake the initial phase of a sigmoid curve for an exponential function.

"I'm sure people were saying that about commercial airline speeds in the 1970's too."

But there are others that keep going also. Moore's law is still going (mostly, slowing), and made it past a few pinch points where people thought it was the end.

The point is that, over the decades, many people said Moore's law was at an end, and then it wasn't; there was some breakthrough that kept it going. Maybe a new one will happen.

The thing with AI is, maybe the S curve flattens out , after all the jobs are gone.

Everyone is hoping the S curve flattens out somewhere just below human level, but what if it flattens out just beyond human level? We're still screwed.


Each specific technology can be S-shaped, but advancements in achieving goals can still maintain an exponential curve. e.g. Moore's law is dead with the end of Dennard scaling, but computation improvements still happen with parallelism.

Meta's Behemoth shows that scaling the number of parameters has diminishing returns, but we still have many different ways to continue advancements. Those who point at one thing and say "see" aren't really seeing. Of course there are limits, like energy, but with nuclear energy or photon-based computing we're nowhere near the limits.


Yes exponential is only an approximation of the first part of S curves. And this author claims that he understands the exponential better than others…

the author is an Anthropic employee

if the money dries up because the investors lose faith on the exponential continuing, then his future looks much dimmer


That is even true for COVID, for obvious reasons: COVID runs out of people it can infect at some point.

Infectious diseases rarely see actual exponential growth for logistical reasons. It's a pretty unrealistic model that ignores that the disease actually needs to find additional hosts to spread, the local availability of which starts to go down from the first victim.

If you assume the availability of hosts is local to the perimeter of the infected hosts, then the relative growth is limited to 2/R, where R is the distance from patient 0 in 2 dimensions. That's because the area of the circle defines how many hosts are already ill, but the interaction can only happen on the perimeter of the circle.
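(Writing out that bound under the stated assumption of a uniformly infected disc whose edge does the spreading:)

```latex
N(R) \propto \pi R^2, \qquad \frac{dN}{dt} \propto 2\pi R
\quad\Longrightarrow\quad
\frac{1}{N}\frac{dN}{dt} \propto \frac{2\pi R}{\pi R^2} = \frac{2}{R}
```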

The disease is obviously also limited by the total amount of hosts, but I assume there's also the "bottom" limit - i.e. the resource consumption of already-infected hosts.


It also depends on how panicked people are. Covid was never going to spread like ebola, for instance: it was worse. Bad enough to harm and kill people, but not bad enough to scare them into self-enforced isolation and voluntary compliance with public health measures.

Back on the subject of AI, I think the flat part of the curve has always been in sight. Transformers can achieve human performance in some, even many respects, but they're like children who have to spend a million years in grade school to learn their multiplication tables. We will have to figure out why that is the case and how to improve upon it drastically before this stuff really starts to pay off. I'm sure we will but we'll be on a completely different S-shaped curve at that point.


Yes, the model the S curve comes out of is extremely simplified. Looking at COVID curves we could well have said it was parabolic, but that's much less worrisome.

It's obvious, but the problem was that enough people would die in the process for people to be worried. Similarly, if the current AI will be able to replace 99% of devs in 5-10 years (or even worse, most white collar jobs) and flatten out there without becoming a godlike AGI, it will still have enormous implications for the economy.

Ironically, given that it probably mistakes a sigmoid curve for an exponential curve, "Failing to understand the exponential, again" is an extremely apt name for this blog post.

> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

S curves are exponential before they start tapering off though. It's hard to predict how long that could continue, so there's an argument to be made that we should remain optimistic and milk that while we can, lest pessimism cut off investment too early.



> I'm sure people were saying that about commercial airline speeds in the 1970's too.

They were also saying that about CPU clock speeds.


> I'm sure people were saying that about commercial airline speeds in the 1970's too.

They'd be wrong, of course - for not realizing demand is a limiting factor here. Airline speeds plateaued not because we couldn't make planes go faster anymore, but because no one wanted them to go faster.

This is partially economical and partially social factor - transit times are bucketed by what they enable people to do. It makes little difference if going from London to New York takes 8 hours instead of 12 - it's still in the "multi-day business trip" bucket (even 6 hours goes into that bucket, once you add airport overhead). Now, if you could drop that to 3 hours, like Concorde did[0], that finally moves it into "hop over for a meet, fly back the same day" bucket, and then business customers start paying attention[1].

For various technical, legal and social reasons, we didn't manage to cross that chasm before money for R&D dried out. Still, the trend continued anyway - in military aviation and, later, in supersonic missiles.

With AI, the demand is extreme and only growing, and it shows no sign of being structured into classes with large thresholds between them - in fact, models are improving faster than we're able to put them to any use; even if we suddenly hit a limit now and couldn't train even better models anymore, we have decades of improvements to extract just from learning how to properly apply the models we have. But there's no sign we're about to hit a wall with training any time soon.

Airline speeds are inherently a bad example for the argument you're making, but in general, I don't think pointing out S-curves is all that useful. As you correctly observe:

> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

But, what happens when one technology - or rather, one metric of that technology - stops improving? Something else starts - another metric of that technology, or something built on top of it, or something that was enabled by it. The exponent is S-curves on top of S-curves, all the way down, but how long that exponent is depends on what you consider in scope. So, a matter of accounting. So yeah, AI progress can flatten tomorrow or continue exponentially for the next couple years - depending on how narrowly you define "AI progress".

Ergo, not all that useful.

--

[0] - https://simpleflying.com/concorde-fastest-transatlantic-cros...

[1] - This is why Elon Musk wasn't immediately laughed out of the room after proposing using Starship for moving people and cargo across the Earth, back in 2017. Hopping between cities on an ICBM sounds borderline absurd for many reasons, but it also promised cutting flight time to less than one hour between any two points on Earth, which put it a completely new bucket, even more interesting for businesses.


Starship produces deadly noise in a large radius around it, whatever space port you're going to build, it's going to be far away from civilization.

Yes, though "far" isn't so large as to be inconceivable: the city of Starbase is only 2.75 km from the Starship launch tower.

That kind of distance may or may not be OK for a whole bunch of other reasons, many of which I'm not even qualified to guess at the nature of, but the noise at least isn't an absolute blocker to reasonably isolated, civil-infrastructure-scale development in many places.


There’s a key way to think about a process that looks exponential and might or might not flatten out into an S curve: reasoning about fundamental limits. For COVID it would obviously flatten out because there are finite humans, and it did when the disease had in fact infected most humans on the planet. For commercial airlines you could reason about the speed of sound or escape velocity and see there is again a natural upper limit- although which of those two would dominate would have very different real world implications.

For computational intelligence, we have one clear example of an upper limit in a biological human brain. It only consumes about 25W and has much more intelligence than today’s LLMs in important ways. Maybe that’s the wrong limit? But Moore’s law has been holding for a very long time. And smart physicists like Feynman in his seminal lecture predicting nanotechnology in 1959 called “there’s plenty of room at the bottom” have been arguing that we are extremely far from running into any fundamental physical limits on the complexity of manufactured objects. The ability to manufacture them we presume is limited by ingenuity, which jokes aside shows no signs of running out.

Training data is a fine argument to consider. Especially since they are training on “the whole internet”, sorta. The key breakthrough of transformers wasn’t in fact autoregressive token processing or attention or anything like that. It was that they can learn from (memorize / interpolate between / generalize) arbitrary quantities of training data. Before that every kind of ML model hit scaling limits pretty fast. Resnets got CNNs to millions of parameters but they still became quite difficult to train. Transformers train reliably on every size data set we have ever tried with no end in sight. The attention mechanism shortens the gradient path for extremely large numbers of parameters, completely changing the rules of what’s possible with large networks. But what about the data to feed them?

There are two possible counter arguments there. One is that humans don’t need exabytes of examples to learn the world. You might reasonably conclude from this that NNs have some fundamental difference vs people and that some hard barrier of ML science innovation lies in the way. Smart scientists like Yann LeCun would agree with you there. I can see the other side of that argument too - that once a system is capable of reasoning and learning it doesn’t need exhaustive examples to learn to generalize. I would argue that RL reasoning systems like GRPO or GSPO do exactly this - they let the system try lots of ways to approach a difficult problem until they figure out something that works. And then they cleverly find a gradient towards whatever technique had relative advantage. They don’t need infinite examples of the right answer. They just need a well chosen curriculum of difficult problems to think about for a long time. (Sounds a lot like school.) Sometimes it takes a very long time. But if you can set it up correctly it’s fairly automatic and isn’t limited by training data.
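(For reference, the "relative advantage" trick mentioned above is, in the group-relative variant, just each attempt's reward normalized against its sibling attempts at the same problem; a simplified sketch, ignoring clipping and KL terms:)

```python
# Group-relative advantage (simplified): each attempt's reward, normalized
# against the other attempts at the same problem, steers the gradient.
import torch

def group_relative_advantage(rewards: torch.Tensor) -> torch.Tensor:
    return (rewards - rewards.mean()) / (rewards.std() + 1e-6)
```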

The other argument is what the Silicon Valley types call “self play” - the goal of having an LLM learn from itself or its peers through repeated games or thought experiments. This is how Alpha Go was trained, and big tech has been aggressively pursuing analogs for LLMs. This has not been a runaway success yet. But in the area of coding agents, arguably where AI is having the biggest economic impact right now, self play techniques are an important part of building both the training and evaluation sets. Important public benchmarks here start from human curated examples and algorithmically enhance them to much larger sizes and levels of complexity. I think I might have read about similar tricks in math problems but I’m not sure. Regardless it seems very likely that this has a way to overcome any fundamental limit on availability of training data as well, based on human ingenuity instead.

Also, if the top of the S curve is high enough, it doesn’t matter that it’s not truly exponential. The interesting stuff will happen before it flattens out. E.g. COVID. Consider the y axis “human jobs replaced by AI” instead of “smartness” and yes it’s obviously an S curve.


> For computational intelligence, we have one clear example of an upper limit in a biological human brain. It only consumes about 25W and has much more intelligence than today’s LLMs in important ways. Maybe that’s the wrong limit?

It's a good reference point, but I see no reason for it to be an upper limit - by the very nature of how biological evolution works, human brains are close to the worst possible brains advanced enough to start a technological revolution. We're the first brain on Earth that crossed that threshold, and in evolutionary timescales, all that followed - all human history - happened in an instant. Evolution didn't have time yet to iterate on our brain design.




