My take on stochastic parrots is similar to the author's concluding section.
This debate isn’t about the computations underlying cognition, it’s about wanting to feel special.
The contention that “it’s a stochastic parrot” usually implied “it’s merely a stochastic parrot, and we know that we must be so much more than that, so obviously this thing falls short”.
But… there never was any compelling proof that we’re anything more than stochastic parrots. Moreover, those folks would say that any explanation of cognition falls short because they can always move the goal posts to make sure that humans are special.
jfengel 2 hours ago [-]
The way I see it, the argument comes down to an assertion that as impressive as these technologies are, they are a local maximum that will always have some limitation that keeps it from feeling truly human (to us).
That's an assertion that has not been proven one way or the other. It's certainly true that progress has leveled off after an extraordinary climb, and most people would say it's still not a fully general intelligence yet. But we don't know if the next step requires incremental work or if it requires a radically different approach.
So it's just taking a stance on an unproven assertion rather than defining anything fundamental.
MichaelZuo 2 hours ago [-]
Well it’s even more dismal in reality.
Gather enough parrots on a stage and at least one can theoretically utter a series of seemingly meaningful word-like sounds that is legitimately novel, that has never been uttered before.
But I doubt any randomly picked HN user will actually accomplish that before, say, age 40. Most people just don’t ever get enough meaningful speaking opportunities to make that statistically likely. There are just too many tens of billions of people who have already existed and uttered words.
anon373839 2 hours ago [-]
That’s not reality, it’s theory.
MichaelZuo 2 hours ago [-]
Can you write down the actual argument?
It seems to be plausible, to me, given enough parrots.
Edit: actually that looks like it's just an offhand mention of Google's initial report, but I don't really feel like spending more time tracking down details to rebut so silly a claim.
MichaelZuo 51 minutes ago [-]
Unique gibberish and spelling errors also count as a “unique search” so I don’t see how it relates.
Do you have an argument that makes sense?
goatlover 2 hours ago [-]
We're conscious animals who communicate because we navigate social spaces, not because we're completing the next token. I wonder about hackers who think they're nothing more than the latest tech.
int_19h 2 hours ago [-]
You postulate it as if these two are mutually exclusive, but it's not at all clear why we can't be "completing the next token" to communicate in order to navigate social spaces. This last part is just where our "training" (as species) comes from, it doesn't really say anything about the mechanism.
parpfish 2 hours ago [-]
How do you know we’re not just completing the next token?
It seems eminently plausible that the way cognition works is to take in current context and select the most appropriate next action/token. In fact, it’s hard to think of a form of cognition that isn’t “given past/context, predict next thing”.
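For what it's worth, the "given past/context, predict next thing" loop is easy to sketch. A minimal Python illustration (predict_next here is a hypothetical stand-in for whatever scores candidate continuations, not any real model):

  import random

  def predict_next(context, vocabulary):
      # Hypothetical scorer: assigns each candidate token a probability
      # given the context. A real model would use learned weights here;
      # random scores just keep the sketch self-contained.
      scores = {tok: random.random() for tok in vocabulary}
      total = sum(scores.values())
      return {tok: s / total for tok, s in scores.items()}

  def generate(context, vocabulary, steps=5):
      # "Given past/context, predict next thing", appended repeatedly.
      for _ in range(steps):
          probs = predict_next(context, vocabulary)
          next_tok = max(probs, key=probs.get)  # pick the most appropriate next token
          context = context + [next_tok]
      return context

  print(generate(["the", "cat"], ["sat", "on", "the", "mat"]))

The open question in this thread is whether human cognition is usefully described by a loop of this shape, not whether the loop itself is hard to write.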
throwup238 1 hour ago [-]
Philosophers have been arguing a parallel point for centuries. Does intelligence require some sort of (ostensibly human-ish) qualia or does “if it quacks like a duck, it is a duck” apply?
I think it's better to look at large language models in the context of Wittgenstein. Humans are more than next token predictors because we participate in “language games” through which we experimentally build up a mental model for what each word means. LLMs learn to “rule follow” via a huge corpus of human text but there’s no actual intelligence there (in a Wittgensteinian analysis) because there’s no “participation” beyond RLHF (in which humans are playing the language games for the machine). There’s a lot to unpack there but that’s the gist of my opinion.
Until we get some rigorous definitions for intelligence or at least break it up into many different facets, I think pie in the sky philosophy is the best we can work with.
bluefirebrand 2 hours ago [-]
> How do you know we’re not just completing the next token
Because we (humans) weren't born into a world with computers, internet, airplanes, satellites, etc
"Complete next token" means that everything is already in the data set. It can remix things in interesting ways, sure. But that isn't the same as creating something new
Edit: I would love to hear someone's idea about how you could "parrot" your way into landing people on the moon without any novel discovery or invention
nopinsight 2 hours ago [-]
For the skeptics: Scoring just 10% or so in Math-Perturb-Hard below the original MATH Level 5 (hardest) dataset seems in line with or actually better than most people would do.
Does that mean most people are merely parrots too?
https://math-perturb.github.io/
https://arxiv.org/abs/2502.06453
Leaderboard: https://math-perturb.github.io/#leaderboard
Anyone who continues to use the parrot metaphor should support it with evidence at least as strong as the “On the Biology of a Large Language Model” research by Anthropic which the article refers to:
https://transformer-circuits.pub/2025/attribution-graphs/bio...
You seem to be coming in with the assumption that the difference between parrots and what many would consider intelligence is math, or that math is a reliable way to tell the two apart.
What makes you believe that is the case?
nopinsight 2 hours ago [-]
Solving hard math problems requires understanding the structure of complex mathematical reasoning. No animal is known to be capable of that.
Most definitions and measurements of intelligence by most laypeople and psychologists include the ability to reason, with mathematical reasoning widely accepted as part of or a proxy for it. They are imperfect but “intelligence” does not have a universally accepted definition.
Do you have a better measurement or definition?
_heimdall 41 minutes ago [-]
Math is a contrived system, though; there are no fundamental laws of nature that require math to be done the way we do it.
A human society may develop its own math in a base-13 system, or an entirely different way of representing the same concepts. When they can't solve our base-10 math problems in the way we expect, does that mean they are parrots?
Part of the problem here is that we still have yet to land on a clear, standard definition of intelligence that most people agree with. We could look to IQ, and all of its problems, but then we should be giving LLMs an IQ test to answer rather than a math test.
nopinsight 26 minutes ago [-]
The fact that much of physics can be so elegantly described by math suggests the structures of our math could be quite universal, at least in our universe.
Check out the problems in the MATH dataset, especially Level 5 problems. They are fairly advanced (by most people’s standards) and most do not depend on which base-N system is used to solve them. The answers would be different of course, but the structures of the problems and solutions remain largely intact.
Website for tracking IQ measurements of LLMs:
https://www.trackingai.org/
The best one already scores higher than all but the top 10-20% of most populations.
bluefirebrand 2 hours ago [-]
> Solving hard math problems requires understanding the structure of mathematical reasoning
Not when you already know all of the answers and just have to draw a line between the questions and the answers!
nopinsight 2 hours ago [-]
Please check out the post on Math-Perturb-Hard conveniently linked to above before making a comment without responding to it.
A relevant bit:
“for MATH-P-Hard, we make hard perturbations, i.e., small but fundamental modifications to the problem so that the modified problem cannot be solved using the same method as the original problem. Instead, it requires deeper math understanding and harder problem-solving skills.”
bluefirebrand 1 hour ago [-]
Seems like that would explain why it scored 10%, not 100%, to me
A child could score the same knowing the outcomes and guessing randomly which ones go to which questions
nkurz 9 minutes ago [-]
I think 'nopinsight' and the paper are arguing that the drop is 10%, not that the final score is 10%. For example, Deepseek-R1 dropped from 96.30 to 85.19. Are you actually arguing that a child guessing randomly would be able to score the same, or was this a misunderstanding?
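(For reference: 96.30 − 85.19 = 11.11, i.e. a drop of roughly 11 percentage points, while the absolute score stays above 85%.)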
nopinsight 1 hour ago [-]
My request:
“Could you explain this sentence concisely?
For the skeptics: Scoring just 10% or so in Math-Perturb-Hard below the original MATH Level 5 (hardest) dataset seems in line with or actually better than most people would do.”
Gemini 2.5 Pro:
“The sentence argues that even if a model's score drops by about 10% on the "Math-Perturb-Hard" dataset compared to the original "MATH Level 5" (hardest) dataset, this is actually a reasonable, perhaps even good, outcome. It suggests this performance decrease is likely similar to or better than how most humans would perform when facing such modified, difficult math problems.”
Tossrock 3 hours ago [-]
My favorite part of the "stochastic parrot" discourse was all the people repeating it without truly understanding what they were talking about.
posnet 2 hours ago [-]
Clearly all the people repeating it without truly understanding it are just simple bots with a big lookup table of canned responses.
Tossrock 2 hours ago [-]
Actually I think they're tiny homunculi, trapped in a room full of meaningless symbols but given rules on how to manipulate them.
_heimdall 2 hours ago [-]
This argument isn't particularly compelling in my opinion.
I don't actually like the stochastic parrot argument either to be fair.
I feel like the author is ignoring the various knobs (randomization factors may be a better term) applied to the models during inference that are tuned specifically to make the output more believable or appealing.
Turn the knobs too far and the output is unintelligible garbage. Don't turn them far enough and the output feels very robotic or mathematical; it's obvious that the output isn't human. The other risk of not turning the knobs far enough would be copyright infringement, but I don't know if that happens often in practice.
Claiming that LLMs aren't stochastic parrots without dealing with the fact that we forced randomization factors into the mix misses a huge potential argument that they are just cleverly disguised stochastic parrots.
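For concreteness, the "knobs" in question are typically inference-time sampling parameters such as temperature and top-p applied to the model's output distribution. A rough, self-contained sketch in plain Python (hypothetical logits, not any particular library's API):

  import math, random

  def sample_next_token(logits, temperature=1.0, top_p=1.0):
      # logits: dict mapping candidate tokens to raw model scores.
      # Temperature rescales the distribution: near 0 it becomes almost
      # deterministic ("robotic"); very high, it approaches uniform noise.
      scaled = {t: s / max(temperature, 1e-6) for t, s in logits.items()}
      m = max(scaled.values())
      exps = {t: math.exp(v - m) for t, v in scaled.items()}
      total = sum(exps.values())
      probs = {t: e / total for t, e in exps.items()}

      # Top-p (nucleus) sampling: keep the smallest set of tokens whose
      # cumulative probability reaches top_p, then sample from that set.
      kept, cum = [], 0.0
      for t, p in sorted(probs.items(), key=lambda kv: -kv[1]):
          kept.append((t, p))
          cum += p
          if cum >= top_p:
              break
      r, acc = random.random() * sum(p for _, p in kept), 0.0
      for t, p in kept:
          acc += p
          if r <= acc:
              return t
      return kept[-1][0]

  print(sample_next_token({"mat": 2.0, "moon": 0.5, "banana": -1.0},
                          temperature=0.7, top_p=0.9))

With temperature near 0 and a small top_p this collapses to always emitting the single most likely token; turning temperature up drifts toward the "unintelligible garbage" end described above.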
chongli 55 minutes ago [-]
This seems like it was inevitable. Most people do not understand the meaning of the word "stochastic" and so they're likely to simply ignore it in favour of reading the term as "_____ parrot."
What you have described, a probability distribution with carefully-tuned parameters, is perfectly captured by the word stochastic as it's commonly used by statisticians.
mbauman 1 hour ago [-]
Yes, this really seems like an argument between two contrived straw people at the absolute extremes.
int_19h 3 hours ago [-]
The whole "debate" around LMs being stochastic parrots is strictly a philosophical one, because the argument hinges on a very specific definition of intelligence. Thought experiments such as Chinese room make this abundantly clear.
jfengel 2 hours ago [-]
That's about the only thing the Chinese room makes clear. The argument otherwise strikes me as valueless.
jltsiren 1 hour ago [-]
It's important to understand the argument in context.
Theoretical computer science established early that input-output behavior captures the essence of computation. The causal mechanism underlying the computation does not matter, because all plausible mechanisms seem to be fundamentally equivalent.
The Chinese room argument showed that this does not extend to intelligence. That intelligence is fundamentally a causal rather than computational concept. That you can't use input-output behavior to tell the difference between an intelligent entity and a hard-coded lookup table.
On one level, LLMs are literally hard-coded lookup tables. But they are also compressed in a way that leads to emergent structures. If you use the LLM through a chat interface, you are interacting with a Chinese room. But if the LLM has other inputs beyond your prompt, or if it has agency to act on its own instead of waiting for input, it's causally a different system. And if the system can update the model on its own instead of using a fixed lookup table to deal with the environment, this is also a meaningful causal difference.
kentonv 60 minutes ago [-]
> The Chinese room argument showed that this does not extend to intelligence.
Searle's argument doesn't actually show anything. It just illustrates a complex system that appears intelligent. Searle then asserts, without any particular reasoning, that the system is not intelligent, simply because, well, how could it be, it's just a bunch of books and a mindless automaton following them?
It's a circular argument: a non-human system can't be intelligent because, uhh, it's not human.
This is wrong. The room as a whole is intelligent, and knows Chinese.
People have, of course, made this argument, since it is obvious. Searle responds by saying "OK, well now imagine that the man in the room memorizes all the books and does the entire computation in his head. Now where's the intelligence???" Ummm, ok, now the man is emulating a system in his head, and the system is intelligent and knows Chinese, even though the man emulating it does not -- just like how a NES emulator can execute NES CPU instructions even though the PC it runs on doesn't implement them.
Somehow Searle just doesn't comprehend this. I guess he's not a systems engineer.
As to whether a lookup table can be intelligent: I assert that a lookup table that responds intelligently to every possible query is, in fact, intelligent. Of course, such a lookup table would be infinite, and thus physically impossible to construct.
chongli 2 hours ago [-]
No, the Chinese Room is essentially the death knell for the Turing Test as a practical tool for evaluating whether an AI is actually intelligent.
pixl97 1 hour ago [-]
The Chinese Room is a sophisticated way for humans to say they don't understand systematic systems and processes.
chongli 1 hour ago [-]
No, I think the Chinese Room is widely misunderstood by non-philosophers. The goal of the argument is not to show that machines are incapable of intelligent behaviour.
Even a thermostat can show intelligent behaviour. The issue for the thermostat is that all the intelligence has happened ahead of time.
int_19h 2 hours ago [-]
Only if you buy into the whole premise, which is dubious to say the least, and is a good example of begging the question.
chongli 2 hours ago [-]
What exactly is dubious about faking an AI with a giant lookup table and fooling would-be Turing Test judges with it? Or did you mean the Turing Test is dubious? Because that’s what the Chinese Room showed (back in 1980).
treetalker 2 days ago [-]
> The parrot is dead. Don’t be the shopkeeper.
Continuing the metaphor, we never wanted to work in a pet shop in the first place. We wanted to be … lumberjacks! Floating down the mighty rivers of British Columbia! With our best girls by our side!
jrmg 2 hours ago [-]
For a while, some people dismissed language models as “stochastic parrots”. They said models could just memorise statistical patterns, which they would regurgitate back to users.
…
The problem with this theory, is that, alas, it isn’t true.
If a language model was just a stochastic parrot, when we looked inside to see what was going on, we’d basically find a lookup table. … But it doesn’t look like this.
But does that matter? My understanding is that, if you don’t inject randomness (“heat”) into a model while it’s running, it will always produce the same output for the same input. In effect, a lookup table. The fancy stuff happening inside that the article describes is, in effect, [de]compression of the lookup table.
Of course, maybe that’s all human intelligence is too (the whole ‘free will is an illusion in a deterministic universe’ argument is all about this) - but just because the internals are fancy and complicated doesn’t mean it’s not a lookup table.
skybrian 2 hours ago [-]
There’s still a lot to learn about how LLM’s do things. They could be doing it in either a deep or a shallow way (parroting information) depending on the task. It’s not something to be settled once and for all.
So what’s “dead?” Overconfidently assuming you can know how an LLM does something without actually investigating it.
cadamsdotcom 48 minutes ago [-]
Why the existential crisis?
LLMs are stochastic parrots and so are humans - but humans still get to be special. Humans are more stochastic as we act on far more input than a several-thousand token prompt.
alganet 2 hours ago [-]
"ESSE É UM ESPERTO", or, "this is a smart one", in portuguese.
So far, LLM models have not demonstrated grasp on dual language phonetic jokes and false cognates.
Humans learn a second language very quickly, and false cognates that work on phonetics are the first steps in doing so, doesn't require a genius to understand.
I am yet to see an LLM that can demonstrate that. They can translate it, or repeat known false cognates, but can't come up with new ones on the spot.
If they do acquire that, we will come up with another creative example of what humans can do that machines can't.
NooneAtAll3 2 hours ago [-]
do deaf/mute people recognize phonetic bilingual jokes?
_heimdall 2 hours ago [-]
I have a deaf friend who can read lips in two languages. As far as I am aware she can pick up humor of all kinds in both.
She knows ASL as well, but I don't think she knows any other dialect of sign language (is dialect the right term? I'm not actually sure).
alganet 2 hours ago [-]
Sign language in Brazil (Libras) is different from ASL.
I am sure there are false cognate signs among them, and dual users of both sign languages can appreciate them.
Given LLMs' out-of-distribution (OOD) performance, the parrot metaphor still looks good to me
2 hours ago [-]
devmor 2 hours ago [-]
I am getting fairly tired of seeing articles about LLMs that claim “[insert criticism] was wrong” but offer nothing more than the author’s own interpretation of a collection of other people’s writings, with limited veracity.
derbOac 2 hours ago [-]
This struck me as a strawman argument against the "stochastic parrot" interpretation. I really disagree with this premise in particular: "if a language model was just a stochastic parrot, when we looked inside to see what was going on, we’d basically find a lookup table." I'm not sure how the latter follows from the former at all.
As someone else pointed out, I think there's deep philosophical issues about intelligence and consciousness underlying all this and I'm not sure it can be resolved this way. In some sense, we all might be stochastic parrots — or rather, I don't think the problem can be waved away without deeper and more sophisticated treatments on the topic.
anothernewdude 1 hours ago [-]
> If a language model was just a stochastic parrot, when we looked inside to see what was going on, we’d basically find a lookup table
I disagree right away. There are more sophisticated probability models than lookup tables.
> It'd be running a search for the most similar pattern in its training data and copying this.
Also untrue. Sophisticated probability models combine probabilities from all the bits of context, and fuzz similar tokens together via compression (i.e. you don't care which particular token is used, just that a similar one is used).
They're parrots, just better parrots than this person can conceive of.
NooneAtAll3 2 hours ago [-]
my personal anecdote about stochastic parrot arguments is that the argument itself became so repetitive that its defenders sound like parrots...
hulitu 2 days ago [-]
> The Parrot Is Dead
The page says "Something has gone terribly wrong :(".
He's not dead, he's resting.
pyfon 3 hours ago [-]
Dead parrot is a Monty Python reference. Also where the Python language gets its name.
kerkeslager 2 hours ago [-]
> This kind of circuitry—to plan forwards and back—was learned by the model without explicit instruction; it just emerged from trying to predict the next word in other poems.
This author has no idea what's going on.
The AI didn't just start trying to predict the next word in other poems, it was explicitly instructed to do so. It then sucked in a bunch of poems and parroted them out.
And... the author drastically over-represents its success with a likely cherry-picked example. When I gave Claude lines to rhyme with, it gave me back "flicker" to rhyme with "killer" and "function" to rhyme with "destruction". Of the 10 rhymes I tried, only two actually matched two syllables ("later/creator" and "working"/"shirking"). I'm not sure how many iterations the author had to run to find a truly unusual rhyme like "rabbit/grab it", but it pretty obviously is selection bias.
And...
I actually agree with the other poster who says that part of this stochastic parrot argument is about humans wanting to feel special. Exceptionalism runs deep: we want to believe our group (be it our nation, our species, etc.) are better than other groups. It's often wrong: I don't think we're particularly unique in a lot of aspects--it's sort of a combination of things that makes us special if we are at all.
AI are obviously stochastic parrots if you know how they work. The research is largely public and unless there's something going on in non-public research, they're all just varieties of stochastic parroting.
But, these systems were designed in part off of how the human brain works. I do not think it's in evidence at all that humans aren't stochastic parrots. The problem is that we don't have a clear definition of what it means to understand something that's clearly distinct from being a stochastic parrot. At a certain level of complexity of stochastic parroting, a stochastic parrot is likely indistinguishable from someone who truly understands concepts.
I think ultimately, the big challenge for AI isn't that it is a stochastic parrot (and it is a stochastic parrot)--I think a sufficiently complex and sufficiently trained stochastic parrot can probably be just as intelligent as a human.
I think the bigger challenge is simply that entire classes of data simply have not been made available to AI, and can't be made available with current technology. Sensory data. The kind of data a baby gets from doing something and seeing what happens. Real-time experimentation. I think a big part of why humans are still ahead of AI is that we have a lot of implicit training we haven't been able to articulate, let alone pass on to AI.
zeofig 2 hours ago [-]
I'm so glad We have all Decided this Together and we can now Enjoy the Koolaid