I felt the better takeaway from this was that it's impossible to know with certainty how long this will or will not continue, regardless of the data or models you're using, because if you (or anyone else) could predict that accurately, you'd be one of the richest people on the planet.
I don't know when (or if) AI will implode or succeed with any degree of provable certainty, because that's not my area of expertise. Rather, I can point out and discuss flaws in the common booster and doomer arguments, and identify problems neither side seems willing to discuss. That brings me cold comfort, but it's not enough to stake my money on one direction or another with any degree of certainty - thus I limit my exposure to specific companies, and target indices or funds that will see uplift if things go well, or minimize losses if things go pear-shaped.
I also think relying on such mathematics to justify a position in the first place is kind of silly, especially for technical people. Mathematical models work until they don't, at which point entirely new models must be designed to capture our new knowledge. On the other hand, logical arguments are more readily adapted to new data, and represent critical, rather than mathematical, thinking and reasoning.
Saying AI is going boom/bust because of sigmoids or Lindy's Law or what have you is not an argument, it's an excuse. The real argument is why those things may or may not emerge, and how we address their consequences within areas inside and outside of AI through regulation, innovation, or policy.
I think his agenda here is to point out that your probability distribution for AI outcomes should be broad (what you said), but most importantly: this means you must take seriously the possibility that we are gonna get superintelligence quite soon.
Basically a lot of people say "but isn't it also pretty likely that we DON'T get superintelligence?" And, yes, it is. But superintelligence being even a remotely plausible outcome is a big fucking deal. Your investment choices in that context are not important.
People really struggle to think rationally in the face of this shape of uncertainty.
You want to go to the store to get ice cream. Ice cream is delicious and the value of eating ice cream is a small positive, let's say x. There's a one in ten million chance you'll get hit by a car on the way and die, and your life is infinitely precious, therefore the expected value of going is x times (1 - 1/10m) plus 1/10m times negative infinity, which is negative infinity, while the expected value of staying home is zero. You are a rational person, so you don't go. In fact you don't do much of anything. Your value model of every activity has collapsed to a single value.
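A quick sketch of that collapse (with made-up numbers): once any outcome carries unbounded negative utility, every action with nonzero risk evaluates to the same thing.

    # Toy expected-value model: assigning -infinity to any outcome with
    # nonzero probability collapses every risky action to the same value.
    P_FATAL = 1e-7                 # assumed chance of a fatal accident on the trip
    ICE_CREAM_UTILITY = 1.0        # the small positive value x
    LIFE_UTILITY = float("-inf")   # "infinitely precious"

    ev_go = (1 - P_FATAL) * ICE_CREAM_UTILITY + P_FATAL * LIFE_UTILITY
    ev_stay = 0.0
    print(ev_go)    # -inf: going is "irrational"
    print(ev_stay)  # 0.0: and so is every other activity that carries any risk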
That's the problem with 'singularity' arguments. The people making them ignore the fact that the mathematical definition of the word means 'the model of outcomes collapses to a single value', and therefore the model stops being useful, yet they somehow claim to be able to make predictions beyond the singularity. It's like those shitty Facebook math posts that divide both sides of an equation by 0 (a fact hidden by some sleight of hand) to 'prove' that 2=1.
The formulation of the singularity involves putting outrageous values into the parameters of the model of reality, plus denominator ignorance, and then claiming to 'rationally' determine that the consequences are too severe to ignore.
The singularity framing is really tough here, right? It comes from black hole physics. Essentially, at the event horizon, the way we know how to do physics stops working, and we rightly conclude that we can't currently say anything about the other side of the event horizon. It is not saying that nothing is occurring there. Matter, time, space, energy, whatever, that still is there (maaaaybe?) and is still undergoing something. It's just that we don't know what that is.
The same is true of these tech singularity arguments. Like, in the age of superintelligence (if that happens), there will still be things happening; the dawn will still come every day, and so will the dusk. It's just that we say our current ideas about that new day aren't all that applicable to that new age (God, this sounds like a hippie).
However, unlike with black hole physics, where we aren't even sure time exists as we know it, we are likely all going to be there in that new superintelligence age. We're still going to be making coffee and remembering bad cartoons from our youth. Like, the analogy to black hole physics breaks down here and maybe does us a disservice. It's not a stark boundary at the Schwarzschild radius; it is a continuous thing, a messy thing, a volatile thing, and very importantly for the HN userbase, a thing that we control and have the choice to participate in.
We are not passively falling into the AGI world like the gnawing grinding gravity of a black hole.
The event horizon = singularity metaphor is a little off. There is no breakdown in the laws of physics at the event horizon. It's just that no light or matter escapes from inside it. The laws of physics don't break down until you reach the center of the black hole (which you reach in finite proper time after crossing the event horizon).
So there are a couple interesting and meaningful changes at the event horizon, but it's not a mathematical singularity.
I think his agenda / point is that, viewed from Lindy's Law, given the SOTA in 2026, superintelligent AI arriving soon is vastly more probable than not, right? Making the case that "sure, AI capability and intelligence have grown exponentially over the past several years, but don't worry, they're about to abruptly level off and in fact won't blatantly surpass human-level intelligence within the coming decades" carries a high burden of proof, unless your model is less "sigmoid" and more "abrupt plateau".
>I think his agenda / point is that, viewed from Lindy's Law, given the SOTA in 2026, superintelligent AI arriving soon is vastly more probable than not, right
Why would that be? Nothing about Lindy's Law makes that promise. And even the SOTA in 2026 is over-estimated, courtesy of a trillion-dollar industry we're trusting not to influence benchmarks.
You’re 100% correct, which is why I opted for a broad investment approach rather than trying to pick “winners”.
My thought process RE: superintelligence/AGI is generally this:
* I personally don’t believe it’s likely to happen with silicon-based computing due to the immense power and resource costs involved just to get to where we are now; hence why I invest broadly to capitalize on what gains we actually attain using this current branch of AI research across all possible sectors and exposure rates
* If we do achieve AGI using silicon-based computing, its limited scale (requiring vast amounts of compute only deliverable via city-scale data centers) will limit its broader utility until more optimizations can be achieved or a superior compute platform delivered that improves access and dramatically lowers cost; again, investing broadly covers a general uplift rather than hoping for a specific winner
* If AGI is achieved, nobody - doomer or booster alike - knows what comes next, other than the complete and total destruction of existing societal structures and institutions. The stock market won’t explode with growth so much as immediately collapse from the disintegration of the consumptive base, as AGI quite literally annihilates a planet’s worth of jobs and associated business transactions. In this case, a broad spread protects me by spreading the risk around; AGI will annihilate the market globally, but not all at once, barring a significant global catastrophe instigated by it
* Which brings me to the worst outcome, where AGI follows the “if anybody builds it everyone dies” thought process: investment is irrelevant because we’re all fucked anyway.
And that’s just my investment approach. I’m too pragmatic to believe we’re at the bottom of the sigmoid curve, but too wise to begin guessing where we actually exist on it at present or how much is left in the current LLM-arm of AI research; I’m an IT dinosaur, not an AI scientist.
What I can point to is the continued demand destruction of consumer compute through higher costs and limited availability due to rampant AI speculation as proof that the harm is already here in a manner most weren’t predicting, while at the same time actual job displacement by AI is limited to the empty boasting of executives using it as a smoke screen for layoffs after RTO mandates failed to thin headcount sufficiently.
In the USA in particular, we’re facing a perfect storm of:
* consumer confidence collapse leading to a decline in spending on all goods, especially luxury ones, by all but the most monied demographics
* data center-driven cost increases (energy) and resource destruction (land, water, fossil fuel use)
* the eradication of government support for renewable energy that would’ve kept these costs in check
* the widening wealth gaps creating a new underclass not seen since before WW2
In other words, most of the discourse continues to revolve around hypotheticals of tomorrow rather than realities of today. That would be the lesson I’d hope more people take away from something like this, so we can finally begin addressing issues themselves rather than empty online circle jerking about who is right or wrong.
I think this is a bit too pessimistic. Progress in algorithms has matched or exceeded progress in hardware, so the same number of FLOPS spent training GPT-3 years ago would produce a much better model today. Ditto for energy use, and hardware is more efficient at delivering FLOPS.
> the widening wealth gaps creating a new underclass not seen since before WW2
I go back and forth on this. I think the reality is that "underclass" is a moving target. AI and automation makes things so cheap that today's underclass lives better than kings ever did.
That's literally the singularity though - the point past which predictions are meaningless.
My "plan" is to hope for a benevolent intelligence that establishes a post-human government and then enjoy post-scarcity society doing woodworking or something.
On our present trajectory, any new super intelligence will be the billionaire’s plaything. Hope is indeed needed to see how it would benefit the common person.
I mean I could structure an argument as to why that might be unlikely but that's exactly the point: it's all speculation. We don't know what super intelligence would do. It's meaningless to try and plan for.
If we don't understand the fundamental limits to any particular kind of trend, our default assumption should be that it will continue for about as long as it has gone on already.
We can, in fact, easily put a confidence interval on this. With 90% odds we're not in the first 5% of the trend, or the last 5% of the trend. Therefore it will probably go on between 1/19th longer, and 19 times longer. With a median of as long as it has gone on so far.
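A minimal sketch of that arithmetic, assuming our observation point is uniformly distributed over the trend's total lifespan:

    # If we are uniformly likely to be anywhere in the trend's lifespan, then
    # with 90% probability we are between 5% and 95% of the way through it.
    # If a fraction f has elapsed, the remaining time is age * (1 - f) / f.
    age = 1.0  # normalize "how long it has gone on already" to 1

    def remaining(f):
        return age * (1 - f) / f

    print(remaining(0.95))  # ~0.053: lower bound, 1/19th of the age so far
    print(remaining(0.05))  # 19.0:   upper bound, 19 times the age so far
    print(remaining(0.50))  # 1.0:    median, as long again as it has already run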
This is deeply counterintuitive. When we expect something to last a finite time, our intuition says that every year it goes on brings us a year closer to when it stops. But under this reasoning, every extra year the trend survives adds about a year to how much longer we should expect it to last.
How can we apply that? A simple way is stocks. How long should we expect a rapidly growing company to continue growing rapidly?
I feel like Lindy's law doesn't work for things whose observation is partly controlled by the thing itself.
For example, take something like a fad or trend; they don't have a hard end date like human lifespan, so it should follow Lindy's law.
However, the likelihood, on average across the population, that you observe a trend is going to be higher at the end of a trend lifecycle than at the beginning. This is baked into the definition - more and more people hear about a trend over time, so the largest quantity of observers will be at the end of the lifecycle, when the popularity reaches its peak.
In other words, if you are a random person, finding out about a trend likely means it is near the end rather than the middle.
Well it only works when there is no information at all apart from the past frequency.
It's the solution to the German tank problem. You know that the enemy numbers their tanks sequentially as they're produced. You capture a tank and see its number, N. What's the best guess for how many tanks the enemy has produced so far? As a pure mathematical model with no other details, the best guess is 2N. Of course, in reality you have some idea of how long it takes to make a tank, how many resources the enemy has, etc.
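A small simulation of that single-capture case (arbitrary true count), just to show where the 2N rule comes from:

    import random

    # German tank problem, single capture: tanks are numbered 1..T, we observe
    # one serial number N and estimate the total as 2N.
    T = 1000           # true (hidden) number of tanks, arbitrary for the demo
    TRIALS = 100_000
    estimates = [2 * random.randint(1, T) for _ in range(TRIALS)]
    # Averages to ~T + 1: doubling the one observed serial number recovers the
    # true total on average (2N - 1 is the exactly unbiased version).
    print(sum(estimates) / TRIALS)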
Analogously you have information about the way trends develop.
Similarly, if you are a random person being alive, it likely means that the world population is near its peak and extinction is at hand, or at least the start of a permanent decline.
We have at least global warming and impending WW3, so that line of reasoning seems to work.
It's an interesting idea, and it may be something that could be mathematically justified, but I do think this is an abuse of Lindy's Law in the absence of such a justification. Per Wikipedia [1]:
"The Lindy effect applies to non-perishable items, like books, those that do not have an "unavoidable expiration date"."
And later in the article you can see the mathematical formulation which says the law holds for things with a Pareto distribution [2]. I'd want to see some sort of good analysis that "the life span of exponential growth curves" is drawn from some Pareto distribution. I don't think it's completely out of the question. But I'm also nowhere near confident enough that it is a true statement to casually apply Lindy's Law to it.
I hadn't tried to give it a name, or thought to apply it outside of that context.
As for the mathematical qualms, I'm a big believer in not letting formal mathematical technicalities get in the way of adopting an effective heuristic. And the heuristic reasoning here is compelling enough that I would like to adopt it.
The argument sounds nice, but it's just wrong. It only works if most processes you're going to encounter that you know nothing about happen to be Lindy processes. If most processes happening around you that you know nothing about are not of that type, then the argument fails.
> I understood it as if you know absolutely nothing about a process, your best guess is that it's half done.
That is the argument that is being made, but that only holds if the process is drawn from an underlying Pareto distribution with epsilon > 1[1].
As a counterexample: I’m jetlagged and disorientated. I go to sleep and wake up. It’s light outside but I don’t know the time. What’s the best guess of the time of day? By the “Lindy law”, the process of daytime is halfway done, so my best guess is that it’s noon.
Clearly that’s not the best guess that could be made. The distribution of times I might wake up is heavily skewed towards the morning, so the best guess is going to be some time in the morning. Now you might argue that we don’t know absolutely nothing about the cycle of the day and night and that’s true. But we also don’t know absolutely nothing about any of the examples in TFA either.
The point is, the times of day I might wake up are not drawn from a pareto distribution with the right parameters so the Lindy Law heuristic completely fails. In TFA the author gives no justification for why the remaining lifespan of the exponential growth of AI might be drawn from such a distribution either, so there’s no reason to think the heuristic will be accurate in that case either.
[1] From https://en.wikipedia.org/wiki/Lindy_effect. epsilon = 1 + 1/p where p is the parameter of the conditional expectation E[T-t|T>t] = p t. So only things with p positive but finite exhibit this effect. If p is negative then the best guess is going to be that the lifetime of the thing will end immediately because we’re already past the expected lifetime, and if p is infinite then the thing will never end so all finite guesses about its length are equally bad. So whether half-way is a good heuristic depends entirely on the underlying process and you’d need to demonstrate that the majority of things have positive p for half-way to be the best guess. That’s far from clear.
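For what it's worth, a quick Monte Carlo of the p = 1 case (Pareto exponent 2, chosen arbitrarily) shows where the "half-way done" heuristic comes from, and why it depends entirely on that parameter:

    import random

    # Check the Lindy relation E[T - t | T > t] = p * t for a Pareto lifetime
    # with exponent alpha = 1 + 1/p. Here alpha = 2, so p = 1: expected
    # remaining life equals current age, i.e. "probably about half done".
    alpha, t_min, t = 2.0, 1.0, 10.0
    samples = [t_min / (1 - random.random()) ** (1 / alpha) for _ in range(1_000_000)]
    survivors = [x - t for x in samples if x > t]
    print(sum(survivors) / len(survivors))  # ~10.0, i.e. p * t with p = 1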
>If we don't understand the fundamental limits to any particular kind of trend, our default assumption should be that it will continue for about as long as it has gone on already. We can, in fact, easily put a confidence interval on this. With 90% odds we're not in the first 5% of the trend, or the last 5% of the trend. Therefore it will probably go on between 1/19th longer, and 19 times longer. With a median of as long as it has gone on so far.
People would confidently cite Lindy's law all the way up to the end of a trend. Nothing would stop a Roman from saying that just before the Fall.
We don't always need to "understand the fundamental limits" of a trend to see where it's going. We just need observations that do better than a random blind guess.
I also wouldn't trust the "see how much we're improving" benchmarks of a trillion dollar pre-IPO industry to begin with.
While this is very fun as a mathematical exercise, it's completely irrelevant as a real tool for getting a better understanding of unknown processes in the real world.
The law only applies for certain types of processes, and is completely wrong for other types (e.g. a human who has lived 50 years may live 50 more, but one who has lived 100 years will certainly not live 100 more). So the question becomes: what type of process are you looking at? And that turns out to be exactly the question you started with: is there a fundamental limit to this growth curve, or not.
But if you met an alien who said they'd been alive for 100 years you wouldn't assume they're on the verge of dropping dead: you would assume they live longer. It's a rough rule for when you don't have other information, and if you're arguing against it you need to specify what other information you're using to make that argument.
> Because if you grabbed one random human, chances are you'd find someone roughly middle aged.
That would only be true if the underlying distribution worked that way, which for the human population it doesn’t (global median age is 31, global average expected lifespan is 73), so for humans if you grabbed a random human, chances are you’d find someone less than middle-aged.
> The law only applies for certain types of processes
Did you even read the post? It’s an estimate in the context where you have zero information on which to base an accurate estimate. The author’s point is that if you’re making a different estimate you need to actually say what information is informing that.
Human lifespan is obviously not a case where we have zero information, so what is your point in bringing that up?
Yes, it is valuable to have more than zero information.
But often we don't have the information that we wish for. Even more often, the information that we do have leads us to a story that severely misleads us. Reminding ourselves of the zero-information version of the story can be an antidote to being misled that way.
Therefore it is valuable to know how to make the most out of zero information. And if we have information, to think about exactly why it leads to a different conclusion.
You can do that but you're laundering ignorance into precise-seeming mathematics. Better to just say "we're probably somewhere in the middle, not at the beginning or end" and leave it at that. Calling a peak is hard.
You speak about laundering ignorance into precise-seeming mathematics as if it was a bad thing.
But that's the entire idea of Bayesian reasoning. Which has proven to be surprisingly effective in a wide range of domains.
I'm all for quantifying my ignorance, and using it as an outside view to help guide my expectations. Read the book Superforecasting to understand how effective forecasters use an outside view to adjust their inside view, to allow them to forecast things more precisely.
Closely related is Laplace's Rule of Succession[1], which basically says that (in lieu of other information), the odds of something happening next time go down the more times in a row that it doesn't happen (and vice versa).
So for example, the longer a time bomb ticks, the less likely it is to go off any time soon. (Assuming the timer isn't visible.) :)
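The arithmetic is simple enough to spell out (a minimal sketch):

    # Laplace's rule of succession: after s successes in n trials, estimate the
    # chance of success on the next trial as (s + 1) / (n + 2).
    def rule_of_succession(successes, trials):
        return (successes + 1) / (trials + 2)

    # The hidden-timer bomb: the more silent ticks pile up, the lower the
    # estimated chance it goes off on the very next tick.
    print(rule_of_succession(0, 10))    # ~0.083
    print(rule_of_succession(0, 100))   # ~0.0098
    print(rule_of_succession(0, 1000))  # ~0.000998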
AI has scaled well according to convenient measures. Neural networks have the property that whatever objective you define, they can rapidly be trained to master it. We’re able to show that various tasks of increasing complication do not require intelligence and can be framed as autoregressive RL problems. I personally don’t think AI is any closer to sentient intelligence than LeNet; it’s almost trivially clear, we know how it works. So we’re measuring something orthogonal: basically how well a universal function approximator can fit a function we define, given arbitrary computing power, and calling that progress. What will be really interesting is if we’re able to find a way to properly measure what they can’t do and what’s different about real intelligence.
Edit: in particular I don’t agree with
> But if someone claims that the trend toward increasing AI capabilities will never reach some particular scary level...
One has to agree that the benchmark results are getting “scarier”, which is not automatically implied by finding more goals to optimize for
> We’re able to show that various tasks of increasing complication do not require intelligence and can be framed as autoregressive RL problems.
The important thing is that we can show it only in hindsight. We don't know which other tasks we are currently mistaken about requiring intelligence. Maybe none of them do?
We don't know. We don't know what intelligence is. If we look at decades and even centuries of attempts to define intelligence, it all looks like moving the goalposts. When a definition of intelligence starts to include people or things we don't like to think of as intelligent, we change the definition.
Yep. No one bats an eye at eyewitnesses "hallucinating" details, or that I'd rather have Opus as a coworker vs a random middle schooler (err, labor laws notwithstanding).
I think perhaps too much of the dialogue around intelligence has to do with the word (and its connotations) itself.
The poster you replied to even used the word "sentient", which is quite interesting (warning: opinionated tangent ahead). Merriam-Webster defines it as "capable of sensing or feeling: conscious of or responsive to the sensations of seeing, hearing, feeling, tasting, or smelling". Feels like qualia. Or if we don't want to go the qualia route... Of course, we wouldn't call Helen Keller non-sentient, so presumably we "really" mean "can it sense or feel" -- well, sense is just "act/feel according to the environment", which you could argue in the case of an LLM would be their context... so we should "really" remove "sense" from the definition, probably. So "do LLMs feel" is probably closer to what "sentient" is being used for here. Since we don't have the obvious symmetry of "you are like me and I feel (therefore you probably feel)", it's way better/easier/feel-good-ier to prefer "LLMs don't feel" rather than "oh shit, it feels and model training is actually just torturing it into the right shape". LLMs as fundamentally non-intelligent also avoids the problems of "what does that say about people" or "we may have made 'AGI' and it wasn't what we thought it would be" or "we're not ready to talk about this yet".
> basically how well a universal function approximator can fit to a function we define
That's what you've got wrong. We don't define functions that an LLM approximates. Autoregressive pretraining approximates an unknown function that produces text (that is what the brain does). RL doesn't approximate functions; it optimizes an objective by finding an unknown function that performs better.
Is it not the case that a lot of the recent gains are just from the coding harness that directs the LLM? That coding harness isn’t all that intelligent: simple pattern matching that maps to well-defined tasks a programmer might do.
FYI: The author has predicted that "AGI" will be here in 1-2 years and has staked his public reputation on it. He is personally invested in trendlines being lindy rather than sigmoid.
I don't think you can use lindy on trends as if trends are static objects, but that's another conversation.
So, this is not quite right: Alexander contributed to the report, but his personal opinion is more like the mid-2030s [1]. Freddie reads this as him backing down from the original statement, but in fact he said it at the time the report was published, in a graf just below the quote that Freddie claims does tie him to 2027:
> Do we really think things will move this fast? Sort of no - between the beginning of the project last summer and the present, Daniel’s median for the intelligence explosion shifted from 2027 to 2028. We keep the scenario centered around 2027 because it’s still his modal prediction (and because it would be annoying to change). Other members of the team (including me) have medians later in the 2020s or early 2030s, and also think automation will progress more slowly. So maybe think of this as a vision of what an 80th percentile fast scenario looks like - not our precise median, but also not something we feel safe ruling out. [2]
I don't think this changes your observation that he is "personally invested" (i.e. believes this trendline will continue), but I'm pretty sure that when AGI doesn't appear in 2027, many people will believe that this invalidates the arguments being made here (or in the report). The actual report was intended to give a feel for what a near-future "disaster" AGI scenario would look like, and settled on a date to give that some concrete immediacy. The collective review that gave that as a possible, but not inevitable, date is still ongoing (they originally pushed their best estimate out a bit further, but now they think, judging by the goals that are being hit, that their scenario was a little too conservative). [3]
LLMs are nothing close to AGI and not going to lead to it, they can’t distinguish right from wrong, they can’t count, they can’t reason, they generate plausible text from a vast databank of connected text.
Apparently that is enough to fool many people but it’s nothing close to AGI which would require internal models of the world, reasoning etc.
We are nowhere close to AGI and the fools who predicted we were will unfortunately keep lying about their stated timelines when it inevitably doesn’t arrive. You’re already hedging and trying to caveat previous predictions, as OpenAI did with their AGI predictions which they’re now furiously back-pedalling on.
This is all speculative. We don't understand intelligence, so you literally have no idea whether what we recognize as intelligence is some suitable arrangement of "statistical token generation", especially once you add feedback loops.
> "We don't understand intelligence, so you literally have no idea whether what we recognize as intelligence is some suitable arrangement of "statistical token generation""
Do you mean "token" as in the LLM sense?
Or are you thinking that thoughts in the human brain are also constructed out of some sort of underlying "token" even though the abstract thought happens and is held before any words are used to try to communicate that thought to an external party?
We understand it enough to see the obvious massive deficiencies in LLMs.
They can predict likely sentences but not evaluate truth or logic. They can fairly reliably record facts about the world but not construct internal models of the world.
> LLMs are nothing close to AGI and not going to lead to it, they can’t distinguish right from wrong, they can’t count, they can’t reason, they generate plausible text from a vast databank of connected text.
Argument?
Are LLMs close to being able to significantly help AGI researchers?
Mind you, he is only personally invested insofar as he's staked his reputation on it. Throughout his writing, he expresses the same point over and over again: he desperately wants AI to slow down, he advocates for policies that would slow it down, and most likely nothing would bring him greater peace than to see a sigmoid curve appear.
This is incorrect as written. The author contributed writing to AI-2027 but distanced himself from the underlying model. That model had 2027 as the modal year of AGI, not median or mean. The authors of that model revised it to a later date shortly after and (if I recall correctly) have since done so again.
It is broadly true that Scott believes that AGI will come in the near future and from LLMs, although his reputation runs a ways deeper than that.
If I'm not mistaken, he's either affiliated with or otherwise connected to the effective altruist movement, hence he can't be unbiased. I find this article offers an interesting perspective on it: https://www.noemamag.com/the-politics-of-superintelligence/
The whole rationalist movement is super bizarre to the "normies".
It's not at all surprising that they are increasingly getting labeled a cult (they aren't by the traditional definition, but there are a lot of similarities). I'm really surprised it hasn't hit the mainstream yet given the connections to Elon, Thiel, frontier labs, dark crypto funding, FTX/SBF, some suicides and some murders. It's all a little nuts.
Meanwhile you got all the anti-democratic NRx people on the other side of it.
I suspect this new doc coming out on HBO will spark a media frenzy.
Ok, but you can just look at the METR curve. Mythos saturated the 50% time horizon. The 80% is now at 3 hours. The rate of progress is accelerating not slowing down. There’s no indication yet that this is a sigmoid!
The METR task set contains no tasks with a duration greater than 32 hours (conservatively eyeballed from Figure 3: https://arxiv.org/abs/2503.17354 ), so any prediction that naively forecasts a longer time horizon is trivially incorrect. I guess that won't lead to a sigmoid-looking graph though, since METR will likely switch to a different evaluation methodology at that point and stop updating the old curve.
> FYI: The author has predicted that "AGI" will be here in 1-2 years and has staked his public reputation on it. He is personally invested in trendlines being lindy rather than sigmoid.
He co-authored a report, which is something more than an opinion. It may be used to inspire policy. There should be greater reputational consequences for publishing something you spent a few months studying and writing about along with several experts. Just my opinion.
I don't understand what you're trying to imply here. Yes, he co-authored a report. What is supposed to be dangerous or suspicious about this? What does your statement about "reputational consequences" have to do with your original comment, which implies that this some indicates a bias on his part?
It seems to me like you're trying to somehow imply that writing things to convince people of what you believe is somehow nefarious? It isn't! It's what we're all doing here right now! Putting it in a format that certain people will take more seriously doesn't make it nefarious either. I am quite confused by your point of view here.
There was no implication of anything you're suggesting. It's a question of correctness (bias vs facts, predicting the sun will rise vs predicting the end of the world), whether you think it's important to be correct as a matter of reputation, and how correctness should be weighed if it is indeed important to one's reputation (a once-off comment vs a full report).
He wrote articles arguing that pro-AI people are dismissive of risks or even suggesting they are intellectually lazy. He's taken a side. If he's wrong, I would hope he owns up to it.
Yes, that's called "having an opinion". Typically people writing argumentative pieces are doing so because they have a belief about the matter. I'm not sure what exactly you expect here.
> if he's wrong I would hope he owns up to it
I think Scott Alexander is pretty good about that.
> He wrote articles arguing that pro-AI people are dismissive of risks or even suggesting they are intellectually lazy
I mean.. this is 2026 right? You're not writing that comment from 2024 or something?
We already see massive problems where photos are just not believable anymore, nor is audio, and not even video actually, with many people falling for AI-faked clips from the Gaza war, for example. And since then these tools have become MASSIVELY more powerful. Disinformation is essentially free, while the cost of truth has stayed static. Meaning the "buying power" of truth has collapsed and is falling faster and faster.
Anyone who dismissed AI risks a few years ago IS ALREADY PROVEN WRONG.
AGI has become such a meaningless, nondescript term that arguing over when or whether it is here has become pointless. Even OpenAI caved and removed their AGI clause from their contract with Microsoft because they weren't fully sure that we are not there yet. The original ARC-AGI was hailed as proof that AGI is not here yet, but now that ARC 1 and 2 have been saturated, no one wants to consider that perhaps we crossed the point where average humans are getting left behind. Frontier models are primarily limited by context and modality at this point, not by intelligence.
To your point, if we had truly unlimited context to the point where at least that instance of a model could “learn” and have what seems like a continuous “consciousness” I think many of us would think that we’ve attained AGI.
Right now we have an incredibly smart thing with severe short term memory loss, and it’s hard for us to reconcile that as it’s so different from us.
Quite a few people were already led to believe that these models are conscious when we had a fraction of current context lengths. Right now the biggest problem is that the "session" info in the form of the current conversation gets lost too quickly, but that has become largely an implementation detail. You could fit an entire life's story into modern context windows. With some clever context management, you could probably build something that feels like what you describe. If we truly had a technical foundation for this sort of short-term to long-term memory system (i.e. from prompt context into weights), we'd probably be closer to runaway superintelligence than to mere AGI that could beat most humans on most tasks.
He only has 1.5 more months.
If he's wrong he needs to own it. Same for Eliezer Yudkowsky. But these people have too much riding on their brands. No one has the courage to fess up to being wrong. Given how many podcasts he and others have been on professing this belief, it will be hard to just pretend otherwise.
Eliezer Yudkowsky, now there’s a name I haven’t heard in a while.
What has he been up to since finishing the finest work of literature ever produced, Harry Potter and the Methods of Rationality? I’ve been patiently awaiting a sequel!
His new book has billboards along the highway to the Bay Bridge. Surprised you haven’t heard of him recently. His fame has skyrocketed. If anyone builds it, everyone dies.
I don't know when the sigmoid is going to kick in, but Nvidia's quarterly datacenter revenues have grown 15-fold over the past 3 years [1], and nobody, including Scott, believes this is sustainable for 3 more years; otherwise Nvidia's market cap would conservatively be at least an order of magnitude higher than it is.
Every exponential eventually becomes a sigmoid, because exponential growth always exposes limiting factors that weren't limiting at the beginning. Silicon manufacturing had lots of room for high-margin customers like Nvidia even a year ago (by the mere virtue of their outbidding lower-margin customers), but now that room is mostly gone, and no amount of money will make fabs build themselves overnight.
I think an interesting thing about recent AI developments is that it's all happening right as we hit the diminishing-returns side of another "exponential that's actually a sigmoid": Moore's law.
The naive expectation is that AI will slow down b/c Moore's law is coming to an end, but if you really think about the models and how they are currently implemented in silicon, they are still inefficient as hell.
At some point someone will build a tensor processing chip that replaces all the digital matmuls with analogue logamp matmuls, or some breakthrough in memristors will start breaking down the barrier between memory and compute.
With the right level of research funding in hardware, the ceiling for AI can be very high.
I suspect if you consider neurons as components of computation, you could draw an exponential of total computation in the world that goes back to the dawn of humanity, maybe further. Most of that would just be population, but interesting that digital computers start picking up the slack just as population growth slows.
IMO we are either limited by data or reaching the limits of what's possible with a transformer architecture. Hardware will get us efficiency but I am not sure if it will lead to smarter models
they already did put a model into the silicon and it's crazy fast. https://chatjimmy.ai/
I'm pretty sure there's a 3 year design goal starting this year that'll do that to any of the qwen, deepseek, etc models. There's a lot you could do with sped up models of these quality.
It might even be bad enough that the real bubble is how little we need giant data centers, when 80-90% of use cases could be served by a chip with a baked-in model rather than, as you say, bloated SOTA.
And this is an ASIC that is still operating digitally. Imagine a chip with baked-in weights that does its math in analog, with a 20x reduction in the number of circuit elements needed to do a multiplication op.
If there's a breakthrough in memristors, you could end up with another 20x reduction in circuit elements (get rid of memory bottlenecks, start doing multiplication ops as log-transformed voltage addition).
Except weights will be unstable - temperature and frequency dependent - and we still have issues delivering analog circuits reliably to the spec. So it would take multiple attempts.
But yeah, as soon as the digital models start to plateau, ASICs and then this will happen.
Even at orders of magnitude greater speed, we've still hit diminishing returns for quality of output. We simply haven't found anything like superhuman reasoning ability, just superhuman (potentially) reasoning speed.
I disagree with this. Reinforcement learning with verifiable rewards training is actually the secret sauce that is leading Claude and GPT to automating software engineering tasks.
All the easily verifiable domains such as mathematics, coding, and things that can be run inside a reasonable simulation are falling very very fast.
By next year if not sooner, mathematicians will be wildly outpaced by LLMs for reasoning.
It's extremely verifiable. The reinforcement finetuning strategy I'm referring to involves an LLM creating coding tasks with an expected output, implementing the code, and then having a compiler (or interpreter, in the case of languages like Python) succeed or fail to run the code. Then compare the output to the expected output. The verification process (run interpreter + run test) takes seconds. One can generate millions of training examples like this essentially for free, and there is extensive research showing that with the right policy, an agent can learn to reason - first as well as a human, and in many cases better.
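Roughly the kind of loop I mean, as a sketch (the task and candidate strings here are stand-ins, not anyone's actual training pipeline):

    import subprocess, sys, tempfile, os

    def verify(candidate_code: str, expected_output: str) -> float:
        """Reward 1.0 only if the candidate program's stdout matches the
        expected output; crashes, timeouts and wrong answers all score 0."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(candidate_code)
            path = f.name
        try:
            result = subprocess.run([sys.executable, path], capture_output=True,
                                    text=True, timeout=10)
            return 1.0 if result.stdout.strip() == expected_output.strip() else 0.0
        except subprocess.TimeoutExpired:
            return 0.0
        finally:
            os.remove(path)

    # A task with a known expected output; the reward comes back in seconds,
    # so huge numbers of (task, candidate, reward) triples are cheap to make.
    print(verify("print(sum(range(10)))", "45"))  # 1.0
    print(verify("print(44)", "45"))              # 0.0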
For basic primitives with known output it’s verifiable, but as long as you’re dealing with real systems with tons of inputs and side effects this no longer holds true.
> research showing with the right policy,
Rest of the owl.
It's not that easy to assess diminishing returns with saturated benchmarks where asymptoting to 100% is mathematically baked in. I could point to the number of Erdős problems being solved by AI going from 0 to many very recently as evidence for acceleration.
That is not evidence of acceleration, just of some measurable improvement compared to a previous model. After all, humans have made these breakthroughs since before recorded history—that never by itself implied accelerating intelligence.
What would be evidence of acceleration? What would be evidence of diminishing returns? Both questions are hard to answer because it's difficult to avoid constructing a metric where the conclusion is already baked in.
The argument I've heard about how special Moore's law is: take a baby crawling at 10cm/s and 1000x it, and you get almost Mach 3, which is almost as fast as the SR-71 Blackbird flies. Yet at an 18-month doubling cadence, that 1000x is only 15 years of progress, which has happened multiple times during the existence of the VLSI industry.
This article answers the question in the second paragraph then completely ignores the answer for the rest of it.
>My understanding is that this represents 3-4 “generations” of different technology (propellers, turbojets, etc). Each technology went through normal iterative improvement, then, when it reached its fundamental limits, got replaced by a better technology. The last technology, ramjets, reached its limit at about 3500 km/h, and there wasn’t the economic/regulatory will to develop anything better, so the record stands.
You don't have one sigmoid, you have multiple sigmoids stacked on top of each other. Airplanes aren't just one technology; they are multiple technologies that happen to do the same thing.
Each one follows a sigmoid perfectly. The overall curve only looks exponential(ish) because of unpredictable discoveries that let you switch to another sigmoid with a higher maximum potential.
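A toy illustration of the point (ceilings and midpoints picked arbitrarily): sum a few S-curves, each saturating at a higher level, and the total looks smoothly exponential-ish even though every component flattens out.

    import math

    def sigmoid(t, midpoint, ceiling, rate=1.0):
        # one "technology generation": an S-curve that saturates at `ceiling`
        return ceiling / (1 + math.exp(-rate * (t - midpoint)))

    # Three successive generations, each with a 10x higher ceiling.
    for t in range(0, 31, 5):
        total = sigmoid(t, 5, 10) + sigmoid(t, 15, 100) + sigmoid(t, 25, 1000)
        print(t, round(total, 1))  # grows roughly 10x per 10 steps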
The same is true in AI. If you used the same architecture as GPT2 today you're in for a bad time training a new frontier model. It's only because we have dozens of breakthroughs that the capabilities of models have improved as much as they have.
That said, exponentials and sigmoids are the wrong models to use for growth. Growth is a differential equation. It has independent inputs, it has outputs, and some of those outputs feed back in as inputs through causal chains of arbitrary complexity. What happens depends entirely on the specific DE that governs the given technology. We can easily have a chaotic system with completely random booms and busts that have no deep fundamental rhyme or reason. We currently call that the economy.
The book "Origins of Efficiency" by Brian Potter discusses this. Stacked sigmoids are a well-understood idea in innovation.
The idea that exponential growth will continue via stacked sigmoids is also not a given. An example is the nail. Nails used to be about half a percent of US GDP. That's a pretty big number! A series of innovations, each with its own sigmoid, stacked on top of each other to reduce the cost of nails. Nails dropped in cost by over 90%.
But eventually nail manufacturing reached a floor. And since the mid-20th century, we haven't gotten much better at making nails. The cost of nails actually started increasing slightly. We ran out of new innovation sigmoids, so we got stuck on the last one.
So what you actually have to predict is whether there will continue to be new sigmoids, not whether the existing sigmoid will asymptote (we already know it will).
This is much more difficult to forecast, because new sigmoids (major new innovations) tend to be unpredictable events. Not only are the particulars difficult to forecast (if they were knowable, the innovation would have already happened), but whether there will be a major innovation or not is also hard to forecast, because they are distinct and separate from any existing sigmoid trend.
So we are left with the idea that all current innovations in AI will asymptote in their scaling as they reach the plateau of the sigmoid, but there may be new sigmoids that keep the overall trend up. Or there may not be. We don't know.
That's not very satisfying, so we'll get to keep reading articles like this one.
I don't disagree with you, but your example of nails and their cost reductions made me wonder whether we reached a meaningful limit in say, some fundamental material terms, or whether we just reached a limit in terms of return on investment.
Return on investment can be too low because the investment required is really high, but it can also be too low because the returns are just limited. If prices had dropped 90%, surely nails became even more ubiquitous, but at that stage there's only so much more money to dig out of the cost reduction hole. It feels plausible that there may have been ideas about more digging that could be done, but the reward just wasn't there in the market, especially versus just selling what worked.
I bring it up because the distinction in one specimen may speak to a larger trend: do new sigmoid developments tend to fail to materialize more often because of serious physical limits / lack of good ideas, or because of limitations to ROI? (Or, other things?)
In the arena of AI, the ROI on more intelligence/unit-cost seems pretty high right now. So, it seems like the difficulty of applying any potential innovations would have to be staggering for none to be pursued. Or, there'd have to just not be any good ideas to try.
Overall, I think there's ideas to try. So in my opinion, that shapes out to justify a bullish sentiment on sigmoids continuing to stack until the perceived potential gains from more intelligence/unit-cost somehow fall off.
Like I said, I don't disagree, we really don't know. But I feel it's a good bet that there's more coming.
There are more ideas to try, but this doesn't necessarily mean they're better ones or that we're bound to come up with them. Training transformers on corpuses of human text has worked extremely well at enabling AIs to generate continuations which are consistent with human text including really useful human text like code, but the limit might be the human text rather than the transformer architecture...
The ROI for cornering the nail market seems like it could have been big. The ROI for making something significantly more efficient than an ICE would have been very high for most of the last century, and technology that is better in many respects than ICEs does now exist, but it took us roughly a century to get there. The ROI for coming up with something better than the ~1% annual efficiency improvement on turbofans would be extremely high, but we don't know what that is (probably some sort of propfan, but that idea's over 50 years old...)
Yes, I was surprised he never discussed the idea that such exponentials are typically made of stacked sigmoids.
That said... if the exponential is made of stacked sigmoids, it's still an exponential on the whole! The fact that it's made of stacked sigmoids is relevant to the engineers making it, but not so relevant to the users or those otherwise affected by it.
Either you black-box the curve and assume that you will keep stacking sigmoids for about as long as you have already seen.
Or you white box it and make some actual technical argument about why the curves can’t keep stacking.
There are plenty of plausible arguments here. Scott is not arguing that the exponential must go on forever.
He’s making a meta-level point about the debate; you have to pick one of the above, and you can’t just argue that “now is the time the s-curves will stop stacking” without providing some justification.
Sure, but we have no prior reason to expect that the 'rate of discoveries' is going to drop off significantly in the next few years. Certainly not stop entirely.
Overall in the economy, no, the rate of discoveries is not going to drop off.
But in any specific industry or area? You often get a bunch of big discoveries, and then there is a long period of no important discoveries, because we've figured out the main aspects of that technological paradigm. The technology becomes commoditized and standard.
And that's the trillion dollar question with AI right now -- will we soon exhaust the potential of the current LLM paradigm? And will we then just have 20 or 30 years of figuring out mainly how to make LLMs cheaper and how to integrate them into business processes, before somebody comes up with another fundamental breakthrough?
Or are we only 10% of the way in developing the current LLM paradigm? Where a decade from now models virtually never make mistakes and are smarter than basically any tenured faculty member in their field?
We have no reason to expect anything of the 'rate of discoveries' because that is completely independent of anything to do with productionizing them.
For all we know it becomes negative because all the people who understood how to train a trillion parameter model get killed by an asteroid during a conference.
> If you used the same architecture as GPT2 today you're in for a bad time training a new frontier model. It's only because we have dozens of breakthroughs
What exactly are these dozens of breakthroughs? Most frontier model architectures today still look very much like GPT2 at their core. There were various improvements like InstructGPT, finetuning techniques, efficiency improvements with KV caches, faster attention, LoRA, better tokenizers, etc. Most of these are for making things run faster. The biggest differentiator has probably been data curation and post-training data, plus the ability to fit more into the model. But I think we have had few breakthroughs that would fall into the category of different technologies.
I suspect that he started down the path of considering them more deeply here but found that they didn't add much to the analysis. A stack of sigmoids ultimately either gives you a single sigmoid if you run out of new innovations, or an exponential if you don't.
I don't know what the Y-axis is supposed to be on that Wharton AI capabilities graph, but I am not really convinced that Opus 4.6 has more than double the intelligence/capability/whatever of GPT 5.1 Max.
IIRC that graph tracks capability as the time it takes humans to solve a task (i.e. the model can now handle tasks that usually take a human ~8h). Which, depending on what tasks you look at, could be a reasonable finding. I could see Opus 4.6 handling tasks that take ~8h for humans and that 5.1 couldn't previously handle (with 5.1 being "limited" to 4h tasks, let's say). It is a bit arbitrary, but I think this is what they're tracking.
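My rough mental model of how such a horizon number gets produced (an assumption about the methodology, not their actual code): fit a logistic curve of model success against the log of human task length, then report the task length where predicted success crosses 50% (or 80%).

    import math

    def p_success(minutes, a, b):
        """Logistic success curve in log(human minutes); a, b are fit offline."""
        return 1 / (1 + math.exp(-(a + b * math.log(minutes))))

    def horizon(a, b, target=0.5):
        """Invert the logistic: the task length where predicted success == target."""
        return math.exp((math.log(target / (1 - target)) - a) / b)

    a, b = 6.0, -1.0                 # hypothetical fitted coefficients for one model
    print(horizon(a, b, 0.5) / 60)   # ~6.7 hour 50%-success horizon
    print(horizon(a, b, 0.8) / 60)   # ~1.7 hour 80%-success horizon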
Without knowing more about their methodology, it seems like a lot of the recent improvements have involved the AI itself taking time to complete the task.
At first the models turned a 5 minute task into a 5 second task (by 5 seconds I mean a very short amount of time, not precisely 5 seconds). Then they turned a 15 minute task into a 5 second task.
Opus 4.6 completes 8 hour tasks all the time but (at least in my experience) it isn't spitting the answer out in 5 seconds anymore. It's using chain of thought and tools and the time to completion is measured in minutes or maybe hours.
In my experiments with local LLMs, a substantial part of the gap between frontier and local (for everyday use) is in tooling and infrastructure.
That is why I am sympathetic to the idea we are leveling off. But to bring in the air speed example from the article, I don't think we've reached the equivalent of the ramjet yet. I suspect in the coming years there will be new architectures, new hardware, and new ways to get even more capable models.
It measures the ability to complete (with a given success rate) a task with a known human benchmark time. I.e., they set the task to human volunteers and timed how long they took to complete that task.
It also measures task coherence—ability to plan, form contingencies, recover from errors, mitigate accumulation of errors, and reconcile findings across a long context window.
Have you used the models, out of interest? They routinely do things autonomously that are not in the training set that would take me 8h, and I wouldn't say I'm slow. The profile of tasks they can do this way is jagged, and maintaining architectural coherence ("months, not hours") is still beyond them, but they're perfectly capable of writing plans and sticking to them.
Yeah, I use them all the time. I just don't see any good argument that it's anything other than statistical pattern matching plus some sort of logic encoded in language.
My overfitted LLM obviously didn't arrive at Harry Potter the same way JK Rowling did, so the amount of time she spent writing it is completely irrelevant to any discussion about whether or not the LLM should be able to reproduce it, or to discussions of AGI. Whether it took her an hour or a decade to write it, the model has seen the result, so it can reproduce it.
Yeah, what about them? As far as I read it the tasks are fixed. The AI companies should know the tasks by now, and have overfitted their models on the tests by now, in the same way I'm implying I overfitted my model to reproduce Harry Potter.
The tasks are obviously all of the form "Go do this, and if you get the following output you passed". Setting up a web server apparently takes 15 minutes for a human, which is news to me since I'm able to search for https://gist.github.com/willurd/5720255, find the python one-liner, and copy it within about ten seconds.
Anyway, this is cool but it does not mean Claude can perform any human tasks that take less than 8 hours and are within its physical capabilities.
> more than double the intelligence/capability/whatever
I'm curious what people really mean when they say this. Intelligence is famously hard to define, let alone measure; it certainly doesn't scale linearly; it only loosely correlates to real-world qualities that are easy to measure; etc. Are you referring to coding ability or...?
According to this article: whenever someone games a benchmark to make an upward chart on some y-axis, it's YOUR responsibility to prove how and why that trend can't continue indefinitely.
The first sigmoid was transformers, which let us rapidly scale into our already abundant data until we tapped it out; the second is/was reasoning, which let us scale into our available compute (and compute-manufacturing capacity). Correct me if I'm wrong, but we don't have a candidate for the third sigmoid, and scaling inference is hitting real-world supply chain constraints - electricity and chips.
Short of a third sigmoid appearing in the ML CompSci space, perhaps in the form of ongoing, repeated step-optimisations which will also have diminishing returns, intelligence growth is now limited by a few scaling problems that have already been worked on for a very long time.
One is transistors: counts have been doubling for decades, but Moore's Law has already plateaued and hit limits on energy efficiency, and simply building new fabs is not something we can do exponentially. The other growth limiter is electricity - there is no exponential supply of fossil fuels or power plants. Although manufacturing has scaled, PV tech improvements are also plateauing - and while storage is getting cheaper, it's still not economical vs fossil fuels (meaning: when we have to switch to it, growth slows down further), and we are unlikely to see battery efficiency improve quickly enough to maintain the AI sigmoid.
I don't mean to be bearish here. There's so much money sloshing around that we can afford to put the smartest people, using unlimited tokens, on the task of finding small, incremental gains on the CompSci side of things that will have large monetary payoffs - hopefully allowing further scaling and increased emergent abilities of LLMs. Maybe we can squeeze the algos for quite a while.
But I don't see that sustaining the same exponential for long, the way unlocking abundant data or maxing out the world's energy/fab capacity did.
And I don't see why this is a massive issue except for the people who want to have some god-like super AI? Frontier LLMs are genuinely magic. Not "won't delete your production database" magic, but definitely a massive productivity gain for competent knowledge workers.
We are multiple orders of magnitude away from Landauer limits - so the next big thing in matmul could be photonic multipliers; there’s a bunch of them coming up in the next ~3 years. That’s a 2-4 order of magnitude improvement. Sigmoid?
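Back-of-the-envelope on the headroom; the Landauer figure is physics, while the per-op number for today's digital accelerators is a rough assumption rather than a datasheet value.

    import math

    k_B = 1.380649e-23                            # Boltzmann constant, J/K
    T = 300.0                                     # room temperature, K
    landauer_j_per_bit = k_B * T * math.log(2)    # ~2.9e-21 J per erased bit

    assumed_j_per_op = 1e-12   # ~1 pJ per low-precision multiply-accumulate (assumption)
    print(landauer_j_per_bit)                     # ~2.87e-21 J
    print(assumed_j_per_op / landauer_j_per_bit)  # ~3.5e8, i.e. 8-9 orders of magnitude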
Reinforcement learning has become a huge portion of compute used during training runs [1] and synthetic data is letting us get lots more mileage out of the existing data. Additionally, there is lots of new, high quality data being created and collected each day. I think the "running out of data" thing was pretty poorly reported by mainstream media.
If you want a model, here's one: LLMs have never demonstrated the ability to go obviously beyond interpolating their training data. It takes an army of paid data producers solving homework problems to give ChatGPT the ability to do your homework. All vibecoded apps that turned out to be successful could be placed on something like a geological soil chart, with other apps - probably already on GitHub somewhere - at the corners. The prediction? They won't.
In this model, the exponential growth that everybody is freaking out about is only the realization of the modular software dream ("we'll only have to write an ORM once for all of human history!") and the sheer amount of knowledge in libraries.
I'm not asking for anything close enough to the boundary for questions like that to be difficult. There are some ML systems like AlphaGo that have crossed the line in specific domains. It's just that making self-play and online learning work for huge LLMs is highly non obvious.
The idea is simply that the basic idea behind LLMs, that you're distilling the entropy out of the entire available world of text, is antithetical to creativity.
Further developing on the theme of self-play, humans have the ability to sense what we want (intellectually) and reach for it communally over thousands of years. It's an innate quality, and if AI starts participating (contrast to giving people psychosis) we will all be able to tell.
> Further developing on the theme of self-play, humans have the ability to sense what we want (intellectually) and reach for it communally over thousands of years. It's an innate quality, and if AI starts participating (contrast to giving people psychosis) we will all be able to tell.
Well... that's not true, is it? There are human cultures that haven't reached for anything for thousands of years, even though they clearly saw what Western culture was doing and that they were being left badly behind.
You should really do a Bayesian fit for such predictions and give confidence intervals, it would probably show that the uncertainty is very high in these cases.
My mental model has been 3D computer graphics: doubling the polygon count had huge returns early on but delivered diminishing returns over time.
Ultimately, you can't make something look more realistic than real.
I don't know what the future holds, but the answer to the question "can LLMs be more realistic than real" will determine much about whether or not you think the curve will level off soon.
In 3D graphics there's diminishing returns on investment for the technology itself, but the real limit is one of economics. How to create all those assets required by the rendering tech, and make your money back. Preferably while also keeping your customers interested long term, by not becoming risk averse.
In the same fashion, LLMs have to pay for themselves to keep the trendlines going. In a whole-systems sense, mind, not "$2000/month is cheaper than hiring a developer" while the rest of the economy collapses.
The equivalent bar in this domain would be human intelligence, and we already have growing lists of tasks where machines outperform humans.
We even know of natural systems that outperform humans on some metrics, e.g. bird brains have higher neuron density than ours because evolution had to optimize more for weight.
If you look at problems that can be solved by reasoning in text form or maybe even images, I am more than willing to accept that we simply cannot know when the curve will level.
The situation is drastically different for problems that require interaction with the physical world to determine success.
As soon as you add a powerful simulator for physical problems to the self learning experience of the AI, you are extremely hampered by the large amount of needed computation.
> What if you don’t fully understand the process? AI forecasters know some things (like how data centers work and how much it costs to build them). But they’re unsure about other things (researchers keep inventing new paradigms of data generation that get over data walls, but for how long?), and other things are entirely opaque (What is intelligence really? Why do scaling laws work? Might they just stop working at some point?) Is there anything you can do here?
This is the crux of the article. To a large extent continued progress depends on a stable increase in compute, an increase in training data, and an increase in good ideas to squeeze more out of both of them.
One calculation you could do is a survival function: for each of the above, how long before it is disrupted? For example, China could crack down on AI or invade Taiwan. Or data centers become politically unpopular in the US. Or, we could run out of great ideas. Very hard to predict.
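A minimal sketch of that survival-function idea, with made-up hazard rates (events per year) standing in for each way the trend could be disrupted - the numbers are illustrative assumptions, not estimates:

```python
import math

hazards = {
    "chip supply shock": 0.05,
    "data centers become politically untenable": 0.03,
    "algorithmic ideas dry up": 0.10,
}

def p_trend_survives(years, hazards=hazards):
    # Assume independent exponential arrival times for each disruption; the
    # trend survives only if none of them has occurred by `years`.
    return math.exp(-sum(hazards.values()) * years)

for t in (1, 3, 5, 10):
    print(t, round(p_trend_survives(t), 2))
```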
Forecasts are a thought exercise, not the revelation of something foretold. Best thing to do is think of the outcome you wish for and then try to take whatever actions you can to help make it so. Like with climate change for example.
It can be both. In the field of high-technology (like semiconductor) manufacturing, the techniques for making chips with feature sizes many generations ahead of the current SOTA exist in various states of completion, so extrapolation can be made from that.
But what's going on in here is not that - it's reading tea leaves for maximum dramatic effect.
The curve is a smoothed step curve (y = 1 if x > 1, otherwise 0). Nature doesn't allow any change to happen instantly at any degree of rate of change. The curve is just a manifestation of a change with exponential smoothing of the sharp corners.
For example, when a car starts, its speed and acceleration become greater than zero. But what about the rates of change at higher degrees? It doesn't suddenly jump from zero acceleration to non-zero. That means the car has a non-zero derivative at every degree. In other words, the movement is exponential. The same thing happens in reverse when the car reaches a constant speed.
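If it helps, here's a tiny sketch of that "smoothed step" picture: a logistic curve has nonzero derivatives of every order, yet as its steepness grows it approaches a hard step (centered at 0 here just to keep the numbers simple):

```python
import math

def logistic(x, k):
    # Smooth approximation of a step function; larger k means a sharper corner.
    return 1 / (1 + math.exp(-k * x))

for k in (1, 10, 100):
    print(k, [round(logistic(x, k), 3) for x in (-1, -0.1, 0, 0.1, 1)])
```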
> It’s true that birth rates must eventually flatten out and become sigmoid
All positive growth eventually flattens out and becomes sigmoid, but a lot of phenomena experience negative growth and nose dive. No gentle curve, but a hard kink and perfect flat line at zero. Forever. I think it would be a stretch to categorize that pattern as sigmoid. Predicting a sigmoid pattern for negative growth implies some sort of a soft landing (depending on your definition of soft).
We can think of many populations that are no longer with us. So just a caution about over applying this reasoning in the negative case.
1. Scott Alexander is famous for writing about topics he knows little about. I'm glad to see he's found a subject he knows little about but so does everyone else.
2. What's even worse than predicting that some growth curve flattens before X happens is predicting it will flatten before X happens but after Y happens, which is what we see when it comes to AI in software development. Too many people predict that AI will be able to effectively write most software, replacing software engineers, yet not be able to replace the people who originate the ideas for the software or the people who use them. I see no reason why AI capability growth should stop after the point it's able to write air-traffic control or medical diagnosis software yet before the point where it's able to replace air traffic controllers and doctors.
3. While we don't know much about AI (or, indeed, intelligence in general), we do know something about computational complexity. Some predictions about "scary things" happening (the ones I'm guessing Alexander is alluding to, though I can't be certain) do hit known computational complexity limits. Most systems affecting people are nonlinear (from weather to the economy). Predicting them requires not intelligence but computational resources. Controlling them, similarly, requires not intelligence but either computational resources or other resources. It's possible that people choose to give control over resources to computers (although probably not enough to answer many tough, important questions), although given how some countries choose to give control to people with below-average intelligence (looking at you, America), I don't see why super-human intelligence (if such a thing even exists) would be, in itself, exceptionally risky.
>1. Scott Alexander is famous for writing about topics he knows little about. I'm glad to see he's found a subject he knows little about but so does everyone else.
This is kinda laughable. Scott has been thinking and writing about AI for a long time
That's what I'm saying. I'm glad to see that he's found a subject where his lack of knowledge isn't a glaring handicap. When I come across his posts I usually feel uncomfortable because they read like a bright 4-year-old child trying hard to explain how a car works, only not cute (yeah, I know that trying to see what conclusions you can come to, Aristotle-like, from a basis of ignorance and without careful study is the whole point, but I never found this Memento-style game appealing).
I’m not saying he’s wrong about the core thesis here, but using Claude Opus 4.6 as a “mic drop”, with a chart showing it being twice as good as the last model, feels way off in my experience.
Strictly speaking, the original paradigm of scaling laws doesn't work any more. The assumption that we could achieve better performance simply through "vertical scaling" ie infusing models with exponentially more parameters and pre-training data, is no longer the driving force of AI progress.
Instead, the industry has pivoted toward inference-time scaling. Rather than relying solely on a massive, static neural network, modern architectures allocate more compute during the actual generation process, allowing the model to "think" and verify its logic dynamically.
Furthermore, the latest state-of-the-art models are no longer pure LLMs; they are compound neuro-symbolic systems that integrate external tools like REPLs, databases, and structured skill documentation to achieve things that pure vertical parameter scaling of LLMs could not.
The "law" part of scaling laws is about predicting validation cross-entropy loss from the training configuration, analogous to physical laws allowing to predict one quantity based on the measurement of another. Most scaling laws take the form of an irreducible error plus additional terms that asymptotically decay to zero. So that there is a wall you can approach but not cross (the irreducible error) is an integrated part of the scaling law paradigm. That it isn't economical to keep increasing model size to squeeze out a few more drops of cross-entropy doesn't mean scaling laws stopped working.
Strictly speaking, "Why do scaling laws work?" is a question about the theoretical reasons the asymptotic decay takes the particular mathematical shape that it does.
Hmm. What’s the general belief about Toby Ord’s “Are the Costs of AI Agents Also Rising Exponentially?” https://www.tobyord.com/writing/hourly-costs-for-ai-agents among those who are well-equipped to judge? Is it seen as wrong or disproven or unlikely? Because if not—if indeed recent LLM capability advances have likely relied on increases in inference cost per run which can’t be much further sustained—then it seems remiss not to mention that if you point to those advances to claim that the exponential trend remains on track.
We did hit the sigmoid's plateau on airplane speed, but the applications of airplane speed are still coming (how fast can a Chinese company air-ship the PCB you ordered three minutes ago?). I expect the same will happen with LLMs, though I also happen to believe things are just getting started on end capabilities.
> But if someone claims that the trend toward increasing AI capabilities will never reach some particular scary level, then the burden is on them to explain either:…
This is not the context in which I hear about sigmoids vs exponentials. I hear it in regards to “the singularity”, not that AI won’t reach some pre-specified level. You may get AGI, you aren’t getting a singularity.
> The moral of the story is that, even though all exponentials eventually become sigmoids, this doesn’t necessarily happen at the exact moment you’re doing your analysis. Sometimes they stay exponential for much longer than that!
All exponentials eventually become sigmoids? Don’t think this can be true without qualifiers.
All models are wrong, of course, but this is kind of "common sense" so it's not hard to accept as true in a natural system. How can something continue on exponential growth forever without reaching a new blocker that causes slowdown or encountering pushback that makes it an oscillator. A pendulum looks exponential when it is at its peak and accelerating down.
The issue is that the exponential-looking part of the sigmoid might contain all of human history, sure, but most folks who espouse this theory probably agree that over time everything reaches a steady-enough state to be considered non-exponential, or become oscillatory.
My own bet is end of that decade: somewhere between 2045 and 2050.
Ofc "full labor automation" has a certain spread of meaning. A sliver of population will always find ways to hold to a job or run one or many businesses. But there will be "enough" labor automation for it to be a social ticking bomb. That, in fact, does not depend on better models nor better AI than we have today. By 2045 there will be a couple of generations that has been outsourcing their thinking to AI for most of their adult lives. Some of them may still work as legal flesh of sorts, but many won't get to be middle man and will find no job.
Also, if you could replace your senator today by an untainted version of a frontier model (of today), would you do it? Would it be a better ruler? What are the odds of you not wanting to push that button in the next twenty years, after a few more batches of incompetent and self-serving politicians?
Complexity of our human world has gone up so much that humanity actually needs something like AI to ensure further progress. It's impossible to expect a human to learn all the fields in a shallow manner (and be a generalist politician) or one field in full depth (ie expert to push the frontier).
This feels like a really verbose way of saying "things have been growing fast for a while so they should continue to grow as fast for just as long", and then he places the burden on people to prove him wrong. Um, no, the burden of proof is shared, "this will just keep going" requires just as much proof as "this is going to level off" if you're just looking at trend lines.
It's better to look at the underlying factors. Money sources are drying up; nobody is making a profit outside of Nvidia; most Blackwell GPUs are likely not even installed yet and will probably be two generations behind when they finally are used; data centers are hitting all sorts of obstacles getting built and powered, and they're getting built slowly; most AI researchers seem to think that LLMs are a dead end; the newer models seem to be getting more expensive and sometimes worse, or are even potentially showing signs of model collapse (goblins..); the supposed productivity gains are not materializing; AI has worse public sentiment than Congress... I could keep going. Some obscure "law" pales in comparison to the hard evidence that the status quo is utterly unsustainable, and none of these companies seem to have a realistic plan other than essentially trying to become too big to fail.
I like some of this guy's writing on other topics, but to me this is a prime example of what happens when you get public "intellectuals" talking about subjects far outside their area of expertise. It's not as bad as Richard Dawkins' latest fall into psychosis, but it's basically the same phenomenon.
If the scary AI is so inevitable, why do you feel such an overwhelming need to convince people about that? Surely you can just wait a bit, and they'll see for themselves.
By that reasoning, why even warn people about anything? Why do road construction crews put up signs saying "ROAD CLOSED AHEAD" when you can just drive on and see for yourself?
Indeed, why warn people about real things that exist in the world? That is EXACTLY the same as inciting fear about something imaginary (not even projected).
In your mind, dangers from AI are imaginary and not even projected, therefore, you don't see any reason to warn about them, because you don't think the dangers are real. You don't believe the road is actually closed up ahead, so you don't think it's necessary to post the sign.
In Scott's mind, dangers from AI are not a known fact, but are somewhere between highly probable and a near-certainty. In his mind, there are well-grounded justifications for believing that AI poses substantial future dangers to the public. Therefore he also believes he should inform people about this, and strives to convince skeptics, so that we might steer clear.
It's easy to understand why someone who believes what you believe about AI would of course not warn people about AI. It's also easy to understand why someone who believes what Scott believes about AI would want to warn people about AI. Your contention is with his confidence for being worried about AI, not his reason for wanting to warn people.
Gosh, it's quite embarrassing to have to spell it out, but you inserted the part about Scott's motivations. It can't be found in the text.
Neither can any specific discussion of what the dangers are and how we can steer clear. It all comes preplanted in your head. The only thing that Scott is playing on (as far as we can see) is your ingrained fear, by using an ominous headline, and a vague reference to something "scary" in the conclusion.
Of course there was no reason to "warn" you, you already believed in the scary future. Scott is just giving you fuel, which you seem to appreciate.
Yeah! And if climate change is so inevitable, why do the people who want to prevent it from happening seem hell-bent on convincing people that climate change is real?
1. It's not inevitable.
2. Those that see AI as an existential risk don't generally think it's a guarantee, but if it's say a 5% chance then that's worth addressing/mitigating.
3. That's not what this article was even about.
Sounds like the burden is on you to explain either
1. If you're not treating my claim as a black box, explain explicitly what is your model of what the article was about? Are you aware, for example of the last paragraph of the article? I think that WAS what the article was about. Do you have specific opinions on e.g. how I went wrong and where my model differs?
2. If you are treating it as a black box, what's your default expectation based on the law of Nothing Ever Happens?
Just kidding, you don't need to explain anything. A"I" fearmongers should though.
The point of the article is that people are historically bad at predicting when exponential curves plateau, even if they're correct that there will be a plateau.
This does *not* imply the inevitability of AGI. It does not imply AGI is necessarily bad.
It does mean that "the capabilities of AI will eventually plateau" offers no meaningful predictive power or relevance to the overall AI discussion.
I find him more interesting when he talks about non-AI topics. Lots of other interesting people are like this too. I'd rather get my knowledge on AI from people who have unique insights into it. Scott has a lot of unique perspectives of his own, but his views on AI are bog-standard for his social group.
Such a long article to say that neither side has a fucking idea about what will happen next.
While we're at it, the "exponentials are actually sigmoids" meme is not necessarily true. While exponentials never stay exponential, sigmoids are not guaranteed either. Overshoot-and-collapse also happens in tech, e.g. the dotcom bubble or the successive AI winters.
Lindy's Law is not actually a law, and many mathematically exacting minds will be provoked by the very name; it also fails spectacularly in certain contexts (e.g. the lifetime of a single organism, though not necessarily the existence of an entire species).
But at the same time, I am willing to take its invocation in the context of AI somewhat seriously. There is an international arms race with China, which has less compute, but more engineers and scientists. This sort of intellectual arms race does not exhaust itself easily.
A similar space race in the 1950s and 1960s progressed from the first unmanned spaceflight to a moonwalk in a mere 12 years, which is probably less than what it takes to approve a bicycle lane in Chicago now.
I keep seeing this. Where did it come from? Has China said that they intend to attack other countries using AI? Have other countries declared that they intend to attack China with AI?
Also, why does anyone believe that AI could actually be that dangerous, given its inherently unpredictable and unreliable performance? I would be terrified to rely on AI in a life-or-death situation.
AI in war is basically Palantir's whole business model. You have a system that can effectively deal with ambiguity and has superhuman performance on reasoning, plus superhuman physical abilities via embodiment…
Inherently unpredictable and unreliable performance is quite a feature of human beings as well.
It was a metaphor. I meant, and later clarified, an intellectual arms race.
BTW your handle is an actual Czech word, minus a diacritic sign ("křupan"), and a bit amusing one. It basically means hillbilly. Not that it matters, just FYI.
Anyway: AI will be used in military context, and it probably already is. Both for target acquisition and maybe even driving the weapon itself. As of now, the Ukrainians are almost certainly operating some AI-enabled killer drones.
Yeah, but a relatively mild one as far as the spectrum of Slavic curses go.
You would be a "křupan" if you wore agricultural boots to a fancy restaurant, or talked to a lady in an uncultured way. Basically, a hick who was never taught proper manners.
It's not a law per se, but there are rules for reasoning under uncertainty to get the most out of what limited knowledge you have, and Lindy's law arises from that. To do better than Lindy's law requires having additional information about the problem beyond just the one data point.
The other thing people don’t understand is exponential curves are self similar. The start of an exponential looks like an exponential. People always look at and think ‘well that’s it it’s exponential now, have missed it, can’t sustain’. Nope.
Good example of this is number of submissions to neurips/icml/iclr. In 2017 that curve was exponential.
I like how this article - about how we should assume, at any given point, that we are exactly halfway through a phenomenon - relies on a single data point on a graph (one that apparently doesn't need its relevance or importance explained) to illustrate that this is obviously true for AI in particular.
I think you may have read a different article from me. The thesis of the article is summarised at the end:
> But if someone claims that the trend toward [X] will never reach some particular scary level, then the burden is on them to explain either:
> If they’re not treating [X] as a black box, and claim to be modeling the dynamics explicitly, then what is their model? Have they calculated the obvious things…
> If they are treating [X] as a black box, why isn’t their default expectation based on Lindy’s Law?
Like, the whole point is that in real life we do actually know things about situations and can model them; we fall back to Lindy's law when we know nothing at all. Further, arguments have justification to deviate from Lindy only when they give specifics about the situation they're modelling.
"Exponentials all tend to become sigmoids but you can't predict exactly when" is a true statement, but I'm not sure it needed an article.
This doesn't say much, and the author fights their own points a couple times, suggesting that they maybe didn't think through what they wanted to write until they were in the middle of writing it and started realizing their assumptions didn't match what they expected the data to say.
The point is the tiring arguments from AI skeptics saying “things are flattening, they have to” which while technically correct says nothing because no one knows when that will happen and we see no mechanism for this yet. Lindy’s law as a reasonable prediction under total uncertainty is interesting and insightful and a lot of people don’t know about it or why it holds. I did enjoy the reference to this!
Nah this is making a category error. You're assuming that AI skeptics agree that models are demonstrating intelligence along the same axis as humans and that with further improvement they will become equivalent to humans. I am an AI skeptic, and I disagree with this assessment.
Model reasoning is on an s-curve, which is improving.
Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.
See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted. Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs. Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans on the intelligence axis to replace all labor.
> You're assuming that AI skeptics agree that models are demonstrating intelligence along the same axis as humans and that with further improvement they will become equivalent to humans.
No definitely not saying this and I don’t quite know what it means
> Model reasoning is on an s-curve, which is improving.
Is this saying two different things? I think I might agree with this in principle as in maybe there is some sort of s curve or something like it but do we see evidence of this? Where?
> Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.
Can you clarify this? What is the distinction and what makes you say you have “not seen much progress?”
> See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted
LLMs do self reflection and introspection in context, and tweaks such as value functions (serving a similar purpose to intuition or emotion) may make this better? Why do you feel self reflection and introspection are a fundamental limitation here? Models reason over tokens they have emitted and also with their own sense and learned behavior already. Are you just talking about continual learning? Also I feel people just latch onto LLMs as if this is all of AI. Why? SSMs, memory networks, recurrent neural networks etc etc etc are all part of AI but aren’t as popular because they can’t yet compete with LLMs in terms of scaling laws and training efficiency due to e.g. hardware and software optimization and investment being focused on LLMs. If something else comes along that works better we’ll just start scaling that.
> Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs.
Very strong statement, any theoretical or experimental basis for this? I also don’t particularly care personally other than as a point of curiosity. Why does it matter if AI systems will develop equivalent reasoning mechanisms as humans? In fact it may be much better not to.
> Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans to replace all labor.
Idk I didn’t say this explicitly but I also dont think it matters if we have a system “equivalent to humans” or one that “replaces all labor”.
~~Slate Star Codex~~ Astral Codex Ten, the original article, was making the argument that "model intelligence" is on an s-curve and from there it was drawing the conclusion that the curve will likely continue and models will reach human level intelligence or beyond.
I am making that argument that how we measure model intelligence is flawed, and we are actually measuring something that is closer to "reasoning" than "intelligence". If you want evidence, we'll need a different form of tests, but how about I just gesture at the fact that GPT supposedly outscored PhDs on a broad range of subjects at least a year ago and to date is not replacing PhD jobs.
We see this pattern of high scores on tests but mediocre performance in the real world all over the place. From that, I draw the conclusion that it can reason like a PhD, but it can't think like a PhD.
So, we may see an s-curve on the measure of model reasoning but that doesn't imply they will overtake us or even match us on measures of intelligence.
As to your other questions:
> LLMs do self reflection and introspection in context,
> Why do you feel self reflection and introspection are a fundamental limitation here? Models reason over tokens they have emitted and also with their own sense and learned behavior already. Are you just talking about continual learning?
I disagree that models are reflecting and introspecting in a way equivalent to human intelligence here. They can reason over tokens which have been emitted, but by the same measure they cannot reason over tokens which have not been emitted. It's hard to make this point without drawing some diagrams, but I believe that human intelligence has internal loops, where many ideas may be turned over simultaneously before an action is taken. In comparison, an LLM might "feel uncertain" about a token before emitting it, but once it is emitted that uncertainty and the other near neighbor options are lost and the LLM is locked into the track that was set by the top-choice token. I think this is where hallucinations arise from, amongst other issues.
Context isn't sufficient for an internal reasoning loop because the tokens that compose context lose a lot of the information the network itself generated when picking those tokens. They occupy a much lower dimensional space than the "internal reasoning" processes of the transformer do.
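A toy illustration of that information loss (the tokens and probabilities are invented purely for illustration):

```python
# At each step the model internally holds a full distribution over next tokens
# (plus a high-dimensional hidden state), but only the single committed token
# survives into the context that future steps can condition on.
next_token_probs = {"bank": 0.41, "shore": 0.38, "vault": 0.21}  # made-up values
chosen = max(next_token_probs, key=next_token_probs.get)

context = ["the", "river", chosen]
print(chosen)   # "bank"
print(context)  # the near-tie with "shore" is no longer visible downstream
```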
>> Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs.
> Very strong statement, any theoretical or experimental basis for this?
It's just my theory, but this is what I have been gesturing at. You already know about RNNs, so I'll put it in those terms: the core of an intelligent network should be an RNN, not a transformer, but we fundamentally cannot train a network like that to work like an LLM, because backprop doesn't work when there is infinite recursion, and without being able to bootstrap off the knowledge and reasoning baked into human text there's no sufficient source of training material beyond being embodied.
---
EDIT:
I missed this, which I also want to reply to:
> Why does it matter if AI systems will develop equivalent reasoning mechanisms as humans? In fact it may be much better not to.
I actually agree that it may be better if they did not develop equivalent reasoning, but I don't see a world in which machines replace human labor without being intellectually equivalent.
As I think about it though, "dumb" machines which can following reasoning but not think like humans are a rather scary proposition, honestly. Seems like a tool that would be wielded without restraint by those in power to control those who aren't.
But those skeptics are initially responding to the constant AI hype claims that we are exponentially growing to AGI. So this article is in fact just a (very poorly thought through) attempt at saying “nuh uh, the hype might be true, you can’t prove it’s not yet!”
Yet the evidence is on the side of the hype? We don’t see any mechanism or cogent framework for what limits exist here theoretically, at least that I’m aware of - are you? Epoch had a great article a year ago looking at several bottlenecks in terms of scale, and back then we were about 4 orders of magnitude away from hitting them. We’re probably now closer to 3. Yet scale is only part of the performance equation; a fairly big chunk of progress comes from algorithmic or curation-related contributions. The point of the article is:
> But those skeptics are initially responding to the constant AI hype claims that we are exponentially growing to AGI.
This is a meaningless statement or at best just strawmanning.
We have very stable trends in observed performance across a LOT of different measurements, and stable scaling laws that continue to hold true at the largest scales we have built. These scaling laws have a reasonable theoretical justification and make predictions which continue to hold true. Pretraining perplexity does seem related to downstream measures of performance, but those too show very stable trends in performance gains, even if you look at them in isolation. Look at the Epoch capability index as a good summary statistic across a number of benchmarks.
So: yes we can and do make predictions with it and that’s how we get funding internally and externally to build at these scales.
Bottlenecks arise about 3 orders of magnitude from now.
On the other hand, any particular justification you have in mind for your point?
As for the basis of your objection, this smacks of intellectual gatekeeping. Plenty of good writing is by people who are not academically qualified or a recognized expert in the topic they're writing about. Indeed, very often, this kind of writing is better than writing by experts. Experts often write for other experts, and this can be exclusionary to lay readers. When a non-expert learns about a topic then writes about it for a general audience, they tend to be just a step ahead of the audience, and so the reader is able to learn about the topic by following the process of discovery and reasoning that the author just experienced. Sure, they often get some details or concepts wrong, but the discussion on a site like HN can draw other perspectives, and – very often – contributions from experts, which leads to further expansion in everyone's understanding of the topic.
HN's very ethos is to gratify intellectual curiosity, and this kind of writing is highly compatible with that.
I think there are many ways someone with his lack of expertise can still be valuable, including:
- Making connections to other subjects that an expert would miss. The hall of fame of sigmoid predictions is just excellent, I already know I'm going to be reminded of it some time in the future. Very entertaining way to get the point across.
- Writing about tricky concepts in a very accessible and elegant way, which experts are notoriously bad at doing themselves - they are often optimizing for other specialists.
- Being able to write with an air of speculation and experimentation with ideas that experts and institutions often can't afford. Experts have to maintain their track record; Scott Alexander can say "lol just double the timeline"
> as close as you can come to the modern dressed up version of a eugenicist
Their writing about genetic determinism is a turnoff to me too. But this essay is about a different topic, and a piece of writing by a writer who is known for writing substantively about a variety of topics should be evaluated on its own terms.
The subtle but consistent downvoting to a score of 0 to -1 feels very botted. No matter how I write the comments, anything counter the "AI" propaganda nets me a 0 or -1 total score. I don't care about the points whatsoever, but I find it interesting how there's never a massive downvote, just enough to keep such comments at 0 to -1.
May be reading into things too much, but it is a bit odd.
So, his point with all the demand for rigor is to end on a hand-waved jump of faith from "improved AI models" to the mythical "superintelligence"?
The singularity framing is really tough here, right? It comes from black hole physics. Essentially, at the event horizon, the way we know how to do physics stops working, and we rightly conclude that we can't currently say anything about the other side of the event horizon. It is not saying that nothing is occurring there. Matter, time, space, energy, whatever, that still is there (maaaaybe?) and is still undergoing something. It's just that we don't know what that is.
The same is true with using these tech singularity arguments. Like, in the age of superintelligence (if that happens), there will still be things happening; the dawn will still come every day, and so will the dusk. It's just that we say our current ideas about that new day aren't that applicable to that new age (God, this sounds like a hippie).
However, unlike with black hole physics, where we aren't even sure time can exist as we know it, we are likely all going to be there in that new superintelligence age. We're still going to be making coffee and remembering bad cartoons from our youth. Like, the analogy to black hole physics breaks down here and maybe does us a disservice. It's not a stark boundary at the Schwarzschild radius; it is a continuous thing, a messy thing, a volatile thing, and, very importantly for the HN userbase, a thing that we control and have the choice to participate in.
We are not passively falling into the AGI world like the gnawing grinding gravity of a black hole.
So there are a couple interesting and meaningful changes at the event horizon, but it's not a mathematical singularity.
Why would that be? Nothing about Lindy's Law makes that promise. And even the SOTA in 2026 is overestimated, thanks to a trillion-dollar industry being trusted not to influence benchmarks.
My thought process RE: superintelligence/AGI is generally this:
* I personally don’t believe it’s likely to happen with silicon-based computing due to the immense power and resource costs involved just to get to where we are now; hence why I invest broadly to capitalize on what gains we actually attain using this current branch of AI research across all possible sectors and exposure rates
* If we do achieve AGI using silicon-based computing, its limited scale (requiring vast amounts of compute only deliverable via city-scale data centers) will limit its broader utility until more optimizations can be achieved or a superior compute platform delivered that improves access and dramatically lowers cost; again, investing broadly covers a general uplift rather than hoping for a specific winner
* If AGI is achieved, nobody - doomer or booster alike - will know what comes next other than complete and total destruction of existing societal structures or institutions. The stock market won’t explode with growth so much as immediately collapse from the disintegration of the consumptive base as a result of AGI quite literally annihilating a planet’s worth of jobs and associated business transactions. In this case, a broad spread protects me from harm by spreading the risk around; AGI will annihilate the market globally, but not all at once barring a significant global catastrophe instigated by it
* Which brings me to the worst outcome, where AGI follows the “if anybody builds it everyone dies” thought process: investment is irrelevant because we’re all fucked anyway.
And that’s just my investment approach. I’m too pragmatic to believe we’re at the bottom of the sigmoid curve, but too wise to begin guessing where we actually exist on it at present or how much is left in the current LLM-arm of AI research; I’m an IT dinosaur, not an AI scientist.
What I can point to is the continued demand destruction of consumer compute through higher costs and limited availability due to rampant AI speculation as proof that the harm is already here in a manner most weren’t predicting, while at the same time actual job displacement by AI is limited to the empty boasting of executives using it as a smoke screen for layoffs after RTO mandates failed to thin headcount sufficiently.
In the USA in particular, we’re facing a perfect storm of:
* consumer confidence collapse leading to a decline in spending on all goods, especially luxury ones, by all but the most monied demographics
* data center-driven cost increases (energy) and resource destruction (land, water, fossil fuel use)
* the eradication of government support for renewable energy that would’ve kept these costs in check
* the widening wealth gaps creating a new underclass not seen since before WW2
In other words, most of the discourse continues to revolve around hypotheticals of tomorrow rather than realities of today. That would be the lesson I’d hope more people take away from something like this, so we can finally begin addressing issues themselves rather than empty online circle jerking about who is right or wrong.
Add:
* Total collapse in government quality AND public trust in politicians
* Total collapse of news media into slop and paid-for content
* Total collapse of culture
(Not just the US either)
> the widening wealth gaps creating a new underclass not seen since before WW2
I go back and forth on this. I think the reality is that "underclass" is a moving target. AI and automation makes things so cheap that today's underclass lives better than kings ever did.
My "plan" is hope for a benevolent intelligence that establishes a post-human government and then enjoy poat-scarcity society doing wood working or something.
Billionaires should probably be more worried.
If we don't understand the fundamental limits to any particular kind of trend, our default assumption should be that it will continue for about as long as it has gone on already.
We can, in fact, easily put a confidence interval on this. With 90% odds we're not in the first 5% of the trend or the last 5% of the trend. Therefore it will probably go on between 1/19th longer and 19 times longer, with a median of as long as it has gone on so far.
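A quick simulation of that arithmetic - it just encodes the assumption that we're observing the trend at a uniformly random point in its total lifetime:

```python
import random

def remaining_over_elapsed_quantiles(trials=1_000_000):
    # f = fraction of the trend's lifetime already elapsed, uniform on (0, 1]
    ratios = sorted((1 - f) / f for f in (1 - random.random() for _ in range(trials)))
    return (ratios[int(0.05 * trials)],  # ~1/19: we were near the end (f ~ 0.95)
            ratios[trials // 2],         # ~1:    median, "as long again as so far"
            ratios[int(0.95 * trials)])  # ~19:   we were near the start (f ~ 0.05)

print(remaining_over_elapsed_quantiles())
```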
This is deeply counterintuitive. When we expect something to last a finite time, every year it goes on, brings us a year closer to when it stops. But every year that it goes on properly brings the expectation that it will go on for a year longer still.
We're looking at a trend. We believe that it will be finite. Our intuition is that every year spent is a year closer to the end. But our expectation becomes that every year spent means it will last yet another year more!
How can we apply that? A simple way is stocks. How long should we expect a rapidly growing company, to continue growing rapidly?
For example, take something like a fad or trend; they don't have a hard end date like human lifespan, so it should follow Lindy's law.
However, the likelihood, on average across the population, that you observe a trend is going to be higher at the end of a trend lifecycle than at the beginning. This is baked into the definition - more and more people hear about a trend over time, so the largest quantity of observers will be at the end of the lifecycle, when the popularity reaches its peak.
In other words, if you are a random person, finding out about a trend likely means it is near the end rather than the middle.
It's the solution to the tank problem. You know that the enemy numbers their tanks as they're produced. You capture a tank and know its number, N. What's the best guess about how many tanks the enemy has produced so far? As a pure mathematical model with no other details, the best guess is about 2N. Of course, in reality you have some idea of how long it takes to make a tank, how many resources the enemy has, etc.
Analogously you have information about the way trends develop.
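For what it's worth, the textbook frequentist estimate behind that heuristic looks like this (a sketch, assuming serial numbers start at 1):

```python
def tank_estimate(serials):
    # Minimum-variance unbiased estimate of the total number of tanks T,
    # given captured serial numbers drawn uniformly from 1..T.
    m, k = max(serials), len(serials)
    return m + m / k - 1   # with one capture this is 2m - 1, roughly "double what you saw"

print(tank_estimate([60]))              # ~119
print(tank_estimate([19, 40, 42, 60]))  # ~74
```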
We have at least global warming and impending WW3, so that line of reasoning seems to work.
"The Lindy effect applies to non-perishable items, like books, those that do not have an "unavoidable expiration date"."
And later in the article you can see the mathematical formulation which says the law holds for things with a Pareto distribution [2]. I'd want to see some sort of good analysis that "the life span of exponential growth curves" is drawn from some Pareto distribution. I don't think it's completely out of the question. But I'm also nowhere near confident enough that it is a true statement to casually apply Lindy's Law to it.
[1]: https://en.wikipedia.org/wiki/Lindy_effect
[2]: https://en.wikipedia.org/wiki/Pareto_distribution
The argument given is the same as the one that I first ran across, not by that name, in https://www.nature.com/articles/363315a0. https://en.wikipedia.org/wiki/Doomsday_argument claims that it was a rediscovery of something that had been hypothesized a decade earlier.
I hadn't tried to give it a name, or thought to apply it outside of that context.
As for the mathematical qualms, I'm a big believer in not letting formal mathematical technicalities get in the way of adopting an effective heuristic. And the heuristic reasoning here is compelling enough that I would like to adopt it.
I don't even think there are any "genuine" Lindy processes. What would those look like? Are they always half done?
That is the argument that is being made, but that only holds if the process is drawn from an underlying Pareto distribution with epsilon > 1[1].
As a counterexample, I’m jetlagged and disorientated. I go to sleep and wake up. It’s light outside but I don’t know the time. What’s the best guess of the time of day? By the “Lindy law” the best guess is that the process of daytime is halfway done so if I’m half-way through the day, my best guess is it’s noon.
Clearly that’s not the best guess that could be made. The distribution of times I might wake up is heavily skewed towards the morning, so the best guess is going to be some time in the morning. Now you might argue that we don’t know absolutely nothing about the cycle of the day and night and that’s true. But we also don’t know absolutely nothing about any of the examples in TFA either.
The point is, the times of day I might wake up are not drawn from a pareto distribution with the right parameters so the Lindy Law heuristic completely fails. In TFA the author gives no justification for why the remaining lifespan of the exponential growth of AI might be drawn from such a distribution either, so there’s no reason to think the heuristic will be accurate in that case either.
[1] From https://en.wikipedia.org/wiki/Lindy_effect. epsilon = 1 + 1/p where p is the parameter of the conditional expectation E[T-t|T>t] = p t. So only things with p positive but finite exhibit this effect. If p is negative then the best guess is going to be that the lifetime of the thing will end immediately because we’re already past the expected lifetime, and if p is infinite then the thing will never end so all finite guesses about its length are equally bad. So whether half-way is a good heuristic depends entirely on the underlying process and you’d need to demonstrate that the majority of things have positive p for half-way to be the best guess. That’s far from clear.
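A quick numerical sanity check of that footnote, under the assumption of Pareto-distributed lifetimes with tail exponent alpha = 2.5 (so p = 1/(alpha - 1) = 2/3), contrasted with a memoryless exponential lifetime:

```python
import random

def mean_remaining(sample, t):
    # Average additional lifetime among items that survived past age t.
    survivors = [x - t for x in sample if x > t]
    return sum(survivors) / len(survivors)

# Pareto lifetimes: Lindy-style, expected remaining life grows linearly with
# observed age t (here roughly t / 1.5).
pareto = [random.paretovariate(2.5) for _ in range(1_000_000)]
for t in (2, 4, 8, 16):
    print("pareto", t, round(mean_remaining(pareto, t), 1))

# Exponential lifetimes are memoryless: the same quantity stays flat at the
# mean (10 here) no matter how long the thing has already lasted.
expo = [random.expovariate(0.1) for _ in range(1_000_000)]
for t in (2, 4, 8, 16):
    print("expo  ", t, round(mean_remaining(expo, t), 1))
```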
Try avoiding the heat death of the universe /s
People would confidently cite Lindy's law all the way near the end of a trend. Nothing would stop a Roman saying that just before the Fall.
We don't always need to "understand the fundamental limits" of a trend to see where it's going - just to have observed more about it than a random blind guess would.
I also wouldn't trust the "see how much we're improving" benchmarks of a trillion dollar pre-IPO industry to begin with.
The law only applies for certain types of processes, and is completely wrong for other types (e.g. a human who has lived 50 years may live 50 more, but one who has lived 100 years will certainly not live 100 more). So the question becomes: what type of process are you looking at? And that turns out to be exactly the question you started with: is there a fundamental limit to this growth curve, or not.
So meeting exactly one 100-year-old alien makes it decent odds that that's somewhere near the middle of their lifespan.
Because if you grabbed one random human, chances are you'd find someone roughly middle aged.
That would only be true if the underlying distribution worked that way, which for the human population it doesn’t (global median age is 31, global average expected lifespan is 73), so for humans if you grabbed a random human, chances are you’d find someone less than middle-aged.
Did you even read the post? It’s an estimate in the context where you have zero information on which to base an accurate estimate. The author’s point is that if you’re making a different estimate you need to actually say what information is informing that.
Human lifespan is obviously not a case where we have zero information, so what is your point in bringing that up?
But often we don't have the information we wish for. Even more often, the information that we have leads us to a story that severely misleads us. Reminding ourselves of the zero-information version of the story can be an antidote to being misled that way.
Therefore it is valuable to know how to make the most out of zero information. And if we have information, to think about exactly why it leads to a different conclusion.
"And we have information to think" - then we don't have zero information right?
But that's the entire idea of Bayesian reasoning. Which has proven to be surprisingly effective in a wide range of domains.
I'm all for quantifying my ignorance, and using it as an outside view to help guide my expectations. Read the book Superforecasting to understand how effective forecasters use an outside view to adjust their inside view, to allow them to forecast things more precisely.
So for example, the longer a time bomb ticks, the less likely it is to go off any time soon. (Assuming the timer isn't visible.) :)
[1] https://en.wikipedia.org/wiki/Rule_of_succession
We expect fresh processes to terminate quickly and long running processes to last for a while longer.
Edit: in particular I don’t agree with
One has to agree that the benchmark results are getting “scarier” - which is not automatically implied by finding more goals to optimize for. The important thing is that we can only show it in hindsight. We don't know which other tasks we are currently mistaken about requiring intelligence. Maybe none of them are?
We don't know. We don't know what intelligence is. If we look at decades and even centuries of attempts to define intelligence, it all looks like moving the goalposts. When a definition of intelligence starts to include people or things we don't like to think of as intelligent, we change the definition.
The poster you replied to even used the word "sentient", which is quite interesting (warning: opinionated tangent ahead). Merriam-Webster defines it as "capable of sensing or feeling: conscious of or responsive to the sensations of seeing, hearing, feeling, tasting, or smelling". Feels like qualia. Or if we don't want to go the qualia route... Of course, we wouldn't call Helen Keller non-sentient, so presumably we "really" mean "can it sense or feel" -- well, sense is just "act/feel according to the environment", which you could argue in the case of an LLM would be their context... so we should "really" remove "sense" from the definition, probably. So "do LLMs feel" is probably closer to what "sentient" is being used for here. Since we don't have the obvious symmetry of "you are like me and I feel (therefore you probably feel)", it's way better/easier/feel-good-ier to prefer "LLMs don't feel" rather than "oh shit, it feels and model training is actually just torturing it into the right shape". LLMs as fundamentally non-intelligent also avoids the problems of "what does that say about people" or "we may have made 'AGI' and it wasn't what we thought it would be" or "we're not ready to talk about this yet".
That's what you've got wrong. We don't define the functions that an LLM approximates. Autoregressive pretraining approximates an unknown function that produces text (which is what the brain does). RL doesn't approximate functions; it optimizes an objective by finding an unknown function that performs better.
I don't think you can use lindy on trends as if trends are static objects, but that's another conversation.
> Do we really think things will move this fast? Sort of no - between the beginning of the project last summer and the present, Daniel’s median for the intelligence explosion shifted from 2027 to 2028. We keep the scenario centered around 2027 because it’s still his modal prediction (and because it would be annoying to change). Other members of the team (including me) have medians later in the 2020s or early 2030s, and also think automation will progress more slowly. So maybe think of this as a vision of what an 80th percentile fast scenario looks like - not our precise median, but also not something we feel safe ruling out. [2]
I don't think this changes your observation that he is "personally invested" (i.e. believes this trendline will continue), but I'm pretty sure that when AGI doesn't appear in 2027, many people will believe this invalidates the arguments being made here (or in the report). The actual report was intended to give a feel for what a near-future "disaster" AGI scenario would look like, and settled on a date to give that some concrete immediacy. The collective review that gave that as a possible, but not inevitable, date is still ongoing (they originally pushed their best estimate out a bit further, but now they think, judging by the goals that are being hit, that their scenario was a little too conservative). [3]
[1] https://freddiedeboer.substack.com/p/im-offering-scott-alexa... [2] https://www.astralcodexten.com/p/introducing-ai-2027 [3] https://blog.aifutures.org/p/grading-ai-2027s-2025-predictio...
LLMs are nothing close to AGI and not going to lead to it, they can’t distinguish right from wrong, they can’t count, they can’t reason, they generate plausible text from a vast databank of connected text.
Apparently that is enough to fool many people but it’s nothing close to AGI which would require internal models of the world, reasoning etc.
We are nowhere close to AGI and the fools who predicted we were will unfortunately keep lying about their stated timelines when it inevitably doesn’t arrive. You’re already hedging and trying to caveat previous predictions, as OpenAI did with their AGI predictions which they’re now furiously back-pedalling on.
Do you mean "token" as in the LLM sense?
Or are you thinking that thoughts in the human brain are also constructed out of some sort of underlying "token" even though the abstract thought happens and is held before any words are used to try to communicate that thought to an external party?
They can predict likely sentences but not evaluate truth or logic. They can fairly reliably record facts about the world but not construct internal models of the world.
Argument?
Are LLMs close to being able to significantly help AGI researchers?
It is broadly true that Scott believes that AGI will come in the near future and from LLMs, although his reputation runs a ways deeper than that.
It's not at all surprising that they are increasingly getting labeled a cult (they aren't by the traditional definition, but there are a lot of similarities). I'm really surprised it hasn't hit the mainstream yet given the connections to Elon, Thiel, frontier labs, dark crypto funding, FTX/SBF, some suicides and some murders. It's all a little nuts.
Meanwhile you got all the anti-democratic NRx people on the other side of it.
I suspect this new doc coming out on HBO will spark a media frenzy.
I expect benchmarks like ProgramBench will replace METR this year.
I mean, that's called "having an opinion".
It seems to me like you're trying to imply that writing things to convince people of what you believe is somehow nefarious? It isn't! It's what we're all doing here right now! Putting it in a format that certain people will take more seriously doesn't make it nefarious either. I am quite confused by your point of view here.
Not interested in further arguments about this.
Yes, that's called "having an opinion". Typically people writing argumentative pieces are doing so because they have a belief about the matter. I'm not sure what exactly you expect here.
> if he's wrong I would hope he owns up to it
I think Scott Alexander is pretty good about that.
I mean.. this is 2026 right? You're not writing that comment from 2024 or something?
We already see massive problems where photos are just not believable anymore, nor is audio, and not even video, actually, with many people falling for AI-faked clips from the Gaza war, for example. And since then these tools have become MASSIVELY more powerful. Disinformation is essentially free, and the cost of truth has been static. Meaning the "buying power" of truth has collapsed and is falling faster and faster.
Anyone who dismissed AI risks a few years ago IS ALREADY PROVEN WRONG.
Right now we have an incredibly smart thing with severe short term memory loss, and it’s hard for us to reconcile that as it’s so different from us.
What has he been up to since finishing the finest work of literature ever produced, Harry Potter and the Methods of Rationality? I’ve been patiently awaiting a sequel!
All exponentials eventually become sigmoids, because exponential growth always exposes limiting factors that weren't limiting at the beginning. Silicon manufacturing had lots of room for high-margin customers like Nvidia even a year ago (by the mere virtue of outbidding lower-margin customers), but now that room is mostly gone, and no amount of money will make fabs build themselves overnight.
[1]: https://stockanalysis.com/stocks/nvda/metrics/revenue-by-seg...
The naive expectation is that AI will slow down b/c Moore's law is coming to an end, but if you really think about the models and how they are currently implemented in silicon, they are still inefficient as hell.
At some point someone will build a tensor processing chip that replaces all the digital matmuls with analogue logamp matmuls, or some breakthrough in memristors will start breaking down the barrier between memory and compute.
With the right level of research funding in hardware, the ceiling for AI can be very high.
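(For anyone unfamiliar with the log-amp trick being referenced, the identity an analog multiplier exploits is just

$$\log(ab) = \log a + \log b \;\Rightarrow\; ab = \exp\big(\log a + \log b\big),$$

so a multiply can, in principle, become two log amplifiers, a voltage summation, and an exponentiator, and a matmul becomes mostly analog additions. Whether that is practical at scale is exactly the open question.)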
I'm pretty sure there's a 3-year design goal starting this year that'll do that to any of the Qwen, DeepSeek, etc. models. There's a lot you could do with sped-up models of this quality.
It might even be bad enough that the real bubble is how much we don't need giant data centers, when 80-90% of use cases could be served by a silicon chip with a baked-in model rather than, as you say, bloated SOTA.
If there's a breakthrough in memristors, you could end up with another 20x reduction in circuit elements (get rid of memory bottlenecks, start doing multiplication ops as log-transform voltage addition).
The ceiling is ultra high for how far AI can go.
But yeah, as soon as the digital models start to plateau, ASICs and then this will happen.
I'm not even kidding. Modern ML systems already eat errors - what's one more error type for them to eat?
All the easily verifiable domains such as mathematics, coding, and things that can be run inside a reasonable simulation are falling very very fast.
By next year if not sooner, mathematicians will be wildly outpaced by LLMs for reasoning.
> research showing with the right policy,
Rest of the owl.
Only if you fully detail the behavior of the system.... at that point why use a chatbot? You've coded the entire thing.
> first as good as human
We'll see. Chatbots are only as capable as you detail them to be
So it's not impossible to have things that seem orthogonal, like generation speed or context length, have an impact on quality of result.
>My understanding is that this represents 3-4 “generations” of different technology (propellers, turbojets, etc). Each technology went through normal iterative improvement, then, when it reached its fundamental limits, got replaced by a better technology. The last technology, ramjets, reached its limit at about 3500 km/h, and there wasn’t the economic/regulatory will to develop anything better, so the record stands.
You don't have one sigmoid, you have multiple, each stacked on top of the other. Airplanes aren't just one technology; they are multiple technologies that happen to do the same thing.
Each one is following a sigmoid perfectly. It only looks exponential(ish) because of unpredictable discoveries that let you switch to another sigmoid that has a higher maximum potential.
The same is true in AI. If you used the same architecture as GPT2 today you're in for a bad time training a new frontier model. It's only because we have dozens of breakthroughs that the capabilities of models have improved as much as they have.
That said, exponentials and sigmoids are the wrong models to use for growth. Growth is a differential equation. It has independent inputs, it has outputs, and some of those outputs become inputs again through causal chains of arbitrary complexity. What happens depends entirely on the specific DE that governs the given technology. We can easily have a chaotic system with completely random booms and busts which have no deep fundamental rhyme or reason. We currently call that the economy.
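A toy sketch of both points, in Python with made-up numbers (nothing here models a real technology): a single logistic DE saturating on its own, next to a stack of sigmoids with rising ceilings that looks exponential-ish until the stack runs out.

```python
import numpy as np

t = np.linspace(0, 12, 121)

def sigmoid(t, t0, cap):
    """One 'innovation': logistic-shaped growth toward its own ceiling."""
    return cap / (1 + np.exp(-(t - t0)))

# hypothetical stack: each later innovation arrives later and has a higher ceiling
stack = sum(sigmoid(t, t0, cap) for t0, cap in [(2, 1), (5, 4), (8, 16), (11, 64)])

# a single logistic DE, dy/dt = r*y*(1 - y/K), integrated with crude Euler steps
r, K, y, dt = 1.0, 100.0, 0.1, 0.1
single = []
for _ in t:
    single.append(y)
    y += dt * r * y * (1 - y / K)

for i in (0, 40, 80, 120):
    print(f"t={t[i]:5.1f}  stacked={stack[i]:8.2f}  single_logistic={single[i]:8.2f}")
```

The stacked curve keeps climbing only as long as new sigmoids keep arriving; the single logistic flattens on its own.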
The idea that exponential growth will continue with stacked sigmoids is also not a given. An example is the nail. Nails used to be about half a percent of US GDP. That's a pretty big number! A series of innovations stacked on each other, each with its own sigmoid, reduced the cost of nails. Nails dropped in cost by over 90%.
But eventually nail manufacturing reached a floor. And since the mid-20th century, we haven't gotten much better at making nails. The cost of nails actually started increasing slightly. We ran out of new innovation sigmoids, so we got stuck on the last one.
So what you actually have to predict is whether there will continue to be new sigmoids, not whether the existing sigmoid will asymptote (we already know it will).
This is much more difficult to forecast, because new sigmoids (major new innovations) tend to be unpredictable events. Not only are the particulars difficult to forecast (if they were knowable, the innovation would have already happened), but whether there will be a major innovation or not is also hard to forecast, because they are distinct and separate from any existing sigmoid trend.
So we are left with the idea that all current innovations in AI will asymptote in their scaling as they reach the plateau of the sigmoid, but there may be new sigmoids that keep the overall trend up. Or there may not be. We don't know.
That's not very satisfying, so we'll get to keep reading articles like this one.
Return on investment can be too low because the investment required is really high, but it can also be too low because the returns are just limited. If prices had dropped 90%, surely nails became even more ubiquitous, but at that stage there's only so much more money to dig out of the cost reduction hole. It feels plausible that there may have been ideas about more digging that could be done, but the reward just wasn't there in the market, especially versus just selling what worked.
I bring it up because the distinction in one specimen may speak to a larger trend: do new sigmoid developments tend to fail to materialize more often because of serious physical limits / lack of good ideas, or because of limitations to ROI? (Or, other things?)
In the arena of AI, the ROI on more intelligence/unit-cost seems pretty high right now. So, it seems like the difficulty of applying any potential innovations would have to be staggering for none to be pursued. Or, there'd have to just not be any good ideas to try.
Overall, I think there's ideas to try. So in my opinion, that shapes out to justify a bullish sentiment on sigmoids continuing to stack until the perceived potential gains from more intelligence/unit-cost somehow fall off.
Like I said, I don't disagree, we really don't know. But I feel it's a good bet that there's more coming.
The ROI for cornering the nail market seems like it could have been big. The ROI for making something significantly more efficient than an ICE would have been very high for most of the last century, and technology that is better in many respects than ICEs does now exist, but it took us roughly a century to get there. The ROI for coming up with something that's better than the ~1% annual efficiency improvement on turbofans would be extremely high, but we don't know what that is (probably some sort of propfan, but that idea's over 50 years old...)
That said... if the exponential is made of stacked sigmoids, it's still an exponential on the whole! The fact that it's made of stacked sigmoids is relevant to the engineers making it, but not so relevant to the users or those otherwise affected by it.
Either you black-box the curve and assume that you will keep stacking sigmoids for about as long as you have already seen them stack.
Or you white box it and make some actual technical argument about why the curves can’t keep stacking.
There are plenty of plausible arguments here. Scott is not arguing that the exponential must go on forever.
He’s making a meta-level point about the debate; you have to pick one of the above, and you can’t just argue that “now is the time the s-curves will stop stacking” without providing some justification.
But in any specific industry or area? You often get a bunch of big discoveries, and then there is a long period of no important discoveries, because we've figured out the main aspects of that technological paradigm. The technology becomes commoditized and standard.
And that's the trillion dollar question with AI right now -- will we soon exhaust the potential of the current LLM paradigm? And we'll just have 20 or 30 years of figuring out mainly how to make LLMs cheaper and how to integrate them into business processes, before somebody comes up with another fundamental breakthrough?
Or are we only 10% of the way in developing the current LLM paradigm? Where a decade from now models virtually never make mistakes and are smarter than basically any tenured faculty member in their field?
For all we know it becomes negative because all the people who understood how to train a trillion parameter model get killed by an asteroid during a conference.
What exactly are these dozens of breakthroughs? Most frontier model architectures today still look very much like GPT2 at their core. There were various improvements like InstructGPT, finetuning techniques, efficiency improvements with KV caches, faster attention, LoRA, better tokenizers, etc. Most of these are for making things run faster. The biggest differentiator has probably been data curation and post-training data and the ability to fit more into the model. But I think we've had few breakthroughs that would fall into the category of different technologies.
(This is a bit disingenuous, as lots/most of work is spent on the scaling and training side of things.)
I suspect that he started down the path of considering them more deeply here but found that they didn't add much to the analysis. A stack of sigmoids ultimately either gives you a single sigmoid if you run out of new innovations, or an exponential if you don't.
At first the models turned a 5 minute task into a 5 second task (by 5 seconds I mean a very short amount of time, not precisely 5 seconds). Then they turned a 15 minute task into a 5 second task.
Opus 4.6 completes 8 hour tasks all the time but (at least in my experience) it isn't spitting the answer out in 5 seconds anymore. It's using chain of thought and tools and the time to completion is measured in minutes or maybe hours.
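(For scale, taking the parent's numbers at face value: going from 5-minute tasks to 8-hour tasks is log2(480/5) = log2(96) ≈ 6.6 doublings of task length.)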
In my experiments with local LLMs, a substantial part of the gap between frontier and local (for everyday use) is in tooling and infrastructure.
That is why I am sympathetic to the idea we are leveling off. But to bring in the air speed example from the article, I don't think we've reached the equivalent of the ramjet yet. I suspect in the coming years there will be new architectures, new hardware, and new ways to get even more capable models.
I don't know if they can get their numbers right this way, but this seems a way more useful metric than theoretical capabilities.
It is purely a test of capabilities (can it do a thing that takes a human $X hours), not efficiency (how fast will it do it).
At least I want AI to solve my problems, not score high on an academic leaderboard.
I trained an LLM to write the whole Harry Potter series, and that took JK Rowling like 17 years.
For my next point on the graph, I'll train the LLM to write the Bible, something that took humans >1500 years.
The tasks are obviously all of the form "Go do this, and if you get the following output you passed". Setting up a web server apparently takes 15 minutes for a human, which is news to me since I'm able to search for https://gist.github.com/willurd/5720255, find the python one-liner, and copy it within about ten seconds.
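(For anyone who hasn't clicked through: the sort of one-liner the gist collects is the stock-library Python server; the port below is an arbitrary example.)

```python
# roughly equivalent to running `python3 -m http.server 8000` from a shell:
# serve the current directory over HTTP on port 8000
from http.server import HTTPServer, SimpleHTTPRequestHandler

HTTPServer(("0.0.0.0", 8000), SimpleHTTPRequestHandler).serve_forever()
```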
Anyway, this is cool but it does not mean Claude can perform any human tasks that take less than 8 hours and are within its physical capabilities.
I'm curious what people really mean when they say this. Intelligence is famously hard to define, let alone measure; it certainly doesn't scale linearly; it only loosely correlates to real-world qualities that are easy to measure; etc. Are you referring to coding ability or...?
🙄
Scott makes a Lindy effect argument which is plausible, but don't let that fool you, we still don't know what's going to happen.
Short of a third sigmoid appearing in the ML CompSci space, perhaps in the form of ongoing, repeated step-optimisations which will also have diminishing returns, intelligence growth is now limited by a few scaling problems that have already been worked on for a very long time.
Transistors have been doubling for decades, but Moore's Law has already plateaued and reached limits on energy efficiency, and simply building new fabs is not something we can do exponentially. The other growth limiter is electricity - there is no exponential supply of fossil fuels or power plants. Although manufacturing has scaled, PV tech improvements are also plateauing - and while storage is getting cheaper, it's still not economical vs fossil fuels (meaning: when we have to switch to it, growth slows down further), and we are unlikely to see battery efficiency sigmoid hard enough to maintain the AI sigmoid.
I don't mean to be bearish here. There's so much money sloshing around that we can afford to put the smartest people, using unlimited tokens, on the task of finding small, incremental gains on the CompSci side of things that will have large monetary payoffs - hopefully allowing further scaling and increased emergent abilities of LLMs. Maybe we can squeeze the algos for quite a while. But I don't see that maintaining the same kind of exponential that unlocking unlimited data or maxing out the world's energy/fab capacity did, at least not for long.
And I don't see why this is a massive issue except for the people who want to have some god-like super AI? Frontier LLMs are genuinely magic. Not "won't delete your production database" magic, but definitely a massive productivity gain for competent knowledge workers.
[1]. https://www.dwarkesh.com/p/dario-amodei-2
In this model, the exponential growth that everybody is freaking out about is only the realization of the modular software dream ("we'll only have to write an ORM once for all of human history!") and the sheer amount of knowledge in libraries.
It's at least falsifiable.
The idea is simply that the basic idea behind LLMs, that you're distilling the entropy out of the entire available world of text, is antithetical to creativity.
Further developing on the theme of self-play, humans have the ability to sense what we want (intellectually) and reach for it communally over thousands of years. It's an innate quality, and if AI starts participating (contrast to giving people psychosis) we will all be able to tell.
Well... that's not true, is it? There are human cultures that haven't reached for anything for thousands of years even though they clearly saw what Western culture was doing and that they were being left behind badly.
My mental model has been 3D computer graphics: doubling the polygon count had huge returns early on but delivered diminishing returns over time.
Ultimately, you can't make something look more realistic than real.
I don't know what the future holds, but the answer to the question "can LLMs be more realistic than real" will determine much about whether or not you think the curve will level off soon.
In the same fashion, LLMs have to pay for themselves to keep the trendlines going. In a whole-systems -sense, mind, not "$2000/month is cheaper than hiring a developer" while the rest of the economy collapses.
The situation is drastically different for problems that require interaction with the physical world to determine success.
As soon as you add a powerful simulator for physical problems to the self learning experience of the AI, you are extremely hampered by the large amount of needed computation.
This is the crux of the article. To a large extent continued progress depends on a stable increase in compute, an increase in training data, and an increase in good ideas to squeeze more out of both of them.
One calculation you could do is a survival function: for each of the above, how long before it is disrupted? For example, China could crack down on AI or invade Taiwan. Or data centers become politically unpopular in the US. Or, we could run out of great ideas. Very hard to predict.
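A toy sketch of that survival-function calculation, with hazard rates that are entirely made up for illustration:

```python
from math import exp

# made-up annual hazard rates for each hypothetical disruption
hazards = {
    "geopolitical shock (e.g. Taiwan)": 0.05,
    "data centers become politically untenable": 0.03,
    "the well of good ideas runs dry": 0.04,
}

total_rate = sum(hazards.values())
for year in (1, 3, 5, 10):
    # independent exponential hazards: P(no disruption by t) = exp(-sum(lambda_i) * t)
    print(f"year {year}: P(nothing has been disrupted yet) ~= {exp(-total_rate * year):.2f}")
```

The hard part, of course, is that nobody knows the real hazard rates.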
But what's going on here is not that - it's reading tea leaves for maximum dramatic effect.
For example, when a car starts, its speed and acceleration become more than zero. But what about the rates of change at higher orders? Acceleration doesn't suddenly jump from zero to non-zero either. That means the car has a non-zero derivative at every order. In other words, the movement is exponential. The same thing happens in reverse when the car reaches a constant speed.
All positive growth eventually flattens out and becomes sigmoid, but a lot of phenomena experience negative growth and nose dive. No gentle curve, but a hard kink and perfect flat line at zero. Forever. I think it would be a stretch to categorize that pattern as sigmoid. Predicting a sigmoid pattern for negative growth implies some sort of a soft landing (depending on your definition of soft).
We can think of many populations that are no longer with us. So just a caution about over applying this reasoning in the negative case.
https://en.wikipedia.org/wiki/Seneca_effect
2. What's even worse than predicting that some growth curve flattens before X happens is predicting it will flatten before X happens but after Y happens, which is what we see when it comes to AI in software development. Too many people predict that AI will be able to effectively write most software, replacing software engineers, yet not be able to replace the people who originate the ideas for the software or the people who use them. I see no reason why AI capability growth should stop after the point it's able to write air-traffic control or medical diagnosis software yet before the point where it's able to replace air traffic controllers and doctors.
3. While we don't know much about AI (or, indeed, intelligence in general), we do know something about computational complexity. Some predictions about "scary things" happening (the ones I'm guessing Alexander is alluding to, though I can't be certain) do hit known computational complexity limits. Most systems affecting people are nonlinear (from weather to the economy). Predicting them requires not intelligence but computational resources. Controlling them, similarly, requires not intelligence but either computational resources or other resources. It's possible that people choose to give control over resources to computers (although probably not enough to answer many tough, important questions), although given how some countries choose to give control to people with below-average intelligence (looking at you, America), I don't see why super-human intelligence (if such a thing even exists) would be, in itself, exceptionally risky.
This is kinda laughable. Scott has been thinking and writing about AI for a long time
https://news.ycombinator.com/item?id=46199723
Strictly speaking, the original paradigm of scaling laws doesn't work any more. The assumption that we could achieve better performance simply through "vertical scaling", i.e. infusing models with exponentially more parameters and pre-training data, is no longer the driving force of AI progress.
Instead, the industry has pivoted toward inference-time scaling. Rather than relying solely on a massive, static neural network, modern architectures allocate more compute during the actual generation process, allowing the model to "think" and verify its logic dynamically.
Furthermore, the latest state-of-the-art models are no longer pure LLMs; they are compound neuro-symbolic systems that integrate external tools like REPLs, databases, and structured skill documentation to achieve things pure LLM vertical parameter scaling was not able to do.
Strictly speaking, "Why do scaling laws work?" is a question about the theoretical reasons the asymptotic decay takes the particular mathematical shape that it does.
Is the "capability" number on these LLM strengh graphs as tangible?
I think it would be interesting to visit a reality that obeys arbitrary abstractions, but I would personally never go there.
This is not the context in which I hear about sigmoids vs exponentials. I hear it in regards to “the singularity”, not that AI won’t reach some pre-specified level. You may get AGI, you aren’t getting a singularity.
All exponentials eventually become sigmoids? Don’t think this can be true without qualifiers.
The issue is that the exponential-looking part of the sigmoid might contain all of human history, sure, but most folks who espouse this theory probably agree that over time everything reaches a steady-enough state to be considered non-exponential, or become oscillatory.
https://xcancel.com/peterwildeford/status/202963666232244661...
Going to need a big citation for that claim
Ofc "full labor automation" has a certain spread of meaning. A sliver of population will always find ways to hold to a job or run one or many businesses. But there will be "enough" labor automation for it to be a social ticking bomb. That, in fact, does not depend on better models nor better AI than we have today. By 2045 there will be a couple of generations that has been outsourcing their thinking to AI for most of their adult lives. Some of them may still work as legal flesh of sorts, but many won't get to be middle man and will find no job.
Also, if you could replace your senator today by an untainted version of a frontier model (of today), would you do it? Would it be a better ruler? What are the odds of you not wanting to push that button in the next twenty years, after a few more batches of incompetent and self-serving politicians?
Yeah well my prophet says he can beat up your prophet in a fight.
---
Here in reality, I'm not accustomed to taking random predictions without backing evidence as if they were truth.
Lol
It's better to look at the underlying factors. Money sources are drying up, nobody is making a profit outside of Nvidia, most Blackwell GPUs are likely not even installed yet and will probably be two generations behind when they finally are being used, data centers are hitting all sorts of obstacles getting built and powered and they're getting built slowly, most AI researchers seem to think that LLMs are a dead end, the newer models seem to be getting more expensive and sometimes worse, or even potentially are showing signs of model collapse (goblins..), and the supposed productivity gains are not materializing. AI has worse public sentiment than Congress. I could keep going. Some obscure "law" seems to pale in comparison to the hard evidence that the status quo is utterly unsustainable, and none of these companies seem to have a realistic plan other than essentially trying to become too big to fail.
I like some of this guy's writing on other topics, but to me this is a prime example of what happens when you get public "intellectuals" talking about subjects far outside their area of expertise. It's not as bad as Richard Dawkins's latest fall into psychosis, but it's basically the same phenomenon.
In Scott's mind, dangers from AI are not a known fact, but are somewhere between highly probable and a near-certainty. In his mind, there are well-grounded justifications for believing that AI poses substantial future dangers to the public. Therefore he also believes he should inform people about this, and strives to convince skeptics, so that we might steer clear.
It's easy to understand why someone who believes what you believe about AI would of course not warn people about AI. It's also easy to understand why someone who believes what Scott believes about AI would want to warn people about AI. Your contention is with his confidence for being worried about AI, not his reason for wanting to warn people.
Neither can any specific discussion of what the dangers are and how we can steer clear. It all comes preplanted in your head. The only thing that Scott is playing on (as far as we can see) is your ingrained fear, by using an ominous headline, and a vague reference to something "scary" in the conclusion.
Of course there was no reason to "warn" you, you already believed in the scary future. Scott is just giving you fuel, which you seem to appreciate.
If only there were a way to see more of Scott's thoughts on the subject of AI..
This does *not* imply the inevitability of AGI. It does not imply AGI is necessarily bad.
It does mean that "the capabilities of AI will eventually plateau" offers no meaningful predictive power or relevance to the overall AI discussion.
The entire plot of the Lord of the Rings could probably be compressed into less than 10 kB of text too.
Edit: this seems to be a controversial comment, but IMHO a blog of Scott Alexander's type is an art form, not just a communication channel.
While we're at it, the "exponentials are actually sigmoids" meme is not necessarily true. While exponentials are never truly exponentials forever, sigmoids are not guaranteed either. Overshoot-and-collapse examples also happen in tech, e.g. the dotcom bubble, or the successive AI winters.
Lindy's Law is not actually a law, and many exact-minded people will be provoked by the very name; it also fails spectacularly in certain contexts (e.g. the lifetime of a single organism, though not necessarily the existence of an entire species).
But at the same time, I am willing to take its invocation in the context of AI somewhat seriously. There is an international arms race with China, which has less compute, but more engineers and scientists. This sort of intellectual arms race does not exhaust itself easily.
A similar space race in the 1950s and 1960s progressed from the first unmanned spaceflight to a moonwalk in a mere 12 years, which is probably less than what it takes to approve a bicycle lane in Chicago now.
I keep seeing this. Where did it come from? Has China said that they intend to attack other countries using AI? Have other countries declared that they intend to attack China with AI?
Also, why does anyone believe that AI could actually be that dangerous, given its inherently unpredictable and unreliable performance? I would be terrified to rely on AI in a life or death situation.
Inherently unpredictable and unreliable performance is quite the feature of human beings as well.
BTW your handle is an actual Czech word, minus a diacritic sign ("křupan"), and a bit amusing one. It basically means hillbilly. Not that it matters, just FYI.
Anyway: AI will be used in military context, and it probably already is. Both for target acquisition and maybe even driving the weapon itself. As of now, the Ukrainians are almost certainly operating some AI-enabled killer drones.
You would be a "křupan" if you wore agricultural boots to a fancy restaurant, or talked to a lady in an uncultured way. Basically, a hick who was never taught proper manners.
Except innovation. When one sigmoid tapers off we keep finding new ones to keep the climb going.
Good example of this is number of submissions to neurips/icml/iclr. In 2017 that curve was exponential.
I could probably make increasingly larger fires for years if I was willing to burn the entire world.
> But if someone claims that the trend toward [X] will never reach some particular scary level, then the burden is on them to explain either:
> If they’re not treating [X] as a black box, and claim to be modeling the dynamics explicitly, then what is their model? Have they calculated the obvious things…
> If they are treating [X] as a black box, why isn’t their default expectation based on Lindy’s Law?
Like, the whole point is that in real life we do actually know things about situations and can model them; we fall back to Lindy's law when we know nothing at all. Further, arguments have justification to deviate from Lindy only when they give specifics about the situation they're modelling.
This doesn't say much, and the author fights their own points a couple times, suggesting that they maybe didn't think through what they wanted to write until they were in the middle of writing it and started realizing their assumptions didn't match what they expected the data to say.
I really don't get the point of what I just read.
Model reasoning is on an s-curve, which is improving.
Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.
See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted. Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs. Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans on the intelligence axis to replace all labor.
No definitely not saying this and I don’t quite know what it means
> Model reasoning is on an s-curve, which is improving.
Is this saying two different things? I think I might agree with this in principle as in maybe there is some sort of s curve or something like it but do we see evidence of this? Where?
> Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.
Can you clarify this? What is the distinction and what makes you say you have “not seen much progress?”
> See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted
LLMs do self reflection and introspection in context, and tweaks such as value functions (serving a similar purpose to intuition or emotion) may make this better? Why do you feel self reflection and introspection are a fundamental limitation here? Models reason over tokens they have emitted and also with their own sense and learned behavior already. Are you just talking about continual learning? Also I feel people just latch onto LLMs as if this is all of AI. Why? SSMs, memory networks, recurrent neural networks etc etc etc are all part of AI but aren’t as popular because they can’t yet compete with LLMs in terms of scaling laws and training efficiency due to e.g. hardware and software optimization and investment being focused on LLMs. If something else comes along that works better we’ll just start scaling that.
> Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs.
Very strong statement, any theoretical or experimental basis for this? I also don’t particularly care personally other than as a point of curiosity. Why does it matter if AI systems will develop equivalent reasoning mechanisms as humans? In fact it may be much better not to.
> Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans to replace all labor.
Idk I didn’t say this explicitly but I also dont think it matters if we have a system “equivalent to humans” or one that “replaces all labor”.
I am making the argument that how we measure model intelligence is flawed, and that we are actually measuring something closer to "reasoning" than "intelligence". If you want evidence, we'll need a different form of tests, but how about I just gesture at the fact that GPT supposedly outscored PhDs on a broad range of subjects at least a year ago and to date is not replacing PhD jobs.
We see this pattern of high scores on tests but mediocre performance in the real world all over the place. From that, I draw the conclusion that it can reason like a PhD, but it can't think like a PhD.
So, we may see an s-curve on the measure of model reasoning but that doesn't imply they will overtake us or even match us on measures of intelligence.
As to your other questions:
> LLMs do self reflection and introspection in context,
> Why do you feel self reflection and introspection are a fundamental limitation here? Models reason over tokens they have emitted and also with their own sense and learned behavior already. Are you just talking about continual learning?
I disagree that models are reflecting and introspecting in a way equivalent to human intelligence here. They can reason over tokens which have been emitted, but by the same measure they cannot reason over tokens which have not been emitted. It's hard to make this point without drawing some diagrams, but I believe that human intelligence has internal loops, where many ideas may be turned over simultaneously before an action is taken. In comparison, an LLM might "feel uncertain" about a token before emitting it, but once it is emitted that uncertainty and the other near neighbor options are lost and the LLM is locked into the track that was set by the top-choice token. I think this is where hallucinations arise from, amongst other issues.
Context isn't sufficient for an internal reasoning loop because the tokens that compose context lose a lot of the information the network itself generated when picking those tokens. They occupy a much lower dimensional space than the "internal reasoning" processes of the transformer do.
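A minimal sketch of that point, with a random vector standing in for a real model's forward pass (none of this is a real LLM API): each step commits to a single token id and throws the rest of the distribution away.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

rng = np.random.default_rng(0)
vocab, context = 50, [1, 7, 3]           # hypothetical token ids
for _ in range(5):
    logits = rng.normal(size=vocab)      # stand-in for a model forward pass
    probs = softmax(logits)
    next_token = int(probs.argmax())     # commit to one token...
    context.append(next_token)           # ...and feed back only its id
    # probs is dropped here: later steps never see how close the call was
print(context)
```

Whether keeping that richer internal state around would actually reduce hallucinations is, of course, the speculative part.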
>> Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs.
> Very strong statement, any theoretical or experimental basis for this?
It's just my theory, but this is what I have been gesturing at. You already know about RNNs so I'll put it in those terms: the core of an intelligent network should be an RNN, not a transformer, but we fundamentally cannot train a network like that to work like an LLM, because backprop doesn't work when there is infinite recursion, and without being able to bootstrap off the knowledge and reasoning baked into human text, there's no sufficient source of training material beyond being embodied.
---
EDIT:
I missed this, which I also want to reply to:
> Why does it matter if AI systems will develop equivalent reasoning mechanisms as humans? In fact it may be much better not to.
I actually agree that it may be better if they did not develop equivalent reasoning, but I don't see a world in which machines replace human labor without being intellectually equivalent.
As I think about it though, "dumb" machines which can follow reasoning but not think like humans are a rather scary proposition, honestly. Seems like a tool that would be wielded without restraint by those in power to control those who aren't.
> But those skeptics are initially responding to the constant AI hype claims that we are exponentially growing to AGI.
This is a meaningless statement or at best just strawmanning.
The evidence is just whatever it is - we cannot make predictions with it.
So: yes we can and do make predictions with it and that’s how we get funding internally and externally to build at these scales.
Bottlenecks arise about 3 orders of magnitude from now.
On the other hand, any particular justification you have in mind for your point?
"Attention Is All You Need" took us by surprise, and we don't know how big the wave is, let alone if there are other waves behind it.
As for the basis of your objection, this smacks of intellectual gatekeeping. Plenty of good writing is by people who are not academically qualified or a recognized expert in the topic they're writing about. Indeed, very often, this kind of writing is better than writing by experts. Experts often write for other experts, and this can be exclusionary to lay readers. When a non-expert learns about a topic then writes about it for a general audience, they tend to be just a step ahead of the audience, and so the reader is able to learn about the topic by following the process of discovery and reasoning that the author just experienced. Sure, they often get some details or concepts wrong, but the discussion on a site like HN can draw other perspectives, and – very often – contributions from experts, which leads to further expansion in everyone's understanding of the topic.
HN's very ethos is to gratify intellectual curiosity, and this kind of writing is highly compatible with that.
- Making connections to other subjects that an expert would miss. The hall of fame of sigmoid predictions is just excellent, I already know I'm going to be reminded of it some time in the future. Very entertaining way to get the point across.
- Writing about tricky concepts in a very accessible and elegant way, which experts are notoriously bad at doing themselves - they are often optimizing for other specialists.
- Being able to write with an air of speculation and experimentation with ideas that experts and institutions often can't afford. Experts have to maintain their track record; Scott Alexander can say "lol just double the timeline"
It's good that you come to HN expecting high standards of content and discussion.
> sCotT aLexAndEr
This counts as a sneer, which is against the guidelines (https://news.ycombinator.com/newsguidelines.html). You may not owe the writer anything but you owe the audience better than this.
> as close as you can come to the modern dressed up version of a eugenicist
Their writing about genetic determinism is a turnoff to me too. But this essay is about a different topic, and a piece of writing by a writer who is known for writing substantively about a variety of topics should be evaluated on its own terms.
Allowing slop articles like this literally prints them evaluation money.
May be reading into things too much, but it is a bit odd.