MIND · MATTER · MEANING · No. 29 · May 2026
What a Machine Would Have to Earn
Understanding is earned in a world, not performed on a screen.
An essay · mindmatterandmeaning.com
A friend sent me a transcript last spring. He had asked a chatbot what a sunburn feels like the morning after — that specific tight, hot, can’t-find-a-way-to-lie-down misery — and the machine answered better than he could have. It named the flinch when a shirt seam drags across the shoulders. It knew the small betrayal of forgetting for a second and leaning back into a hot car seat. He found it uncanny, a little moving, and he wanted to know: does it understand what a sunburn is?
Good question, asked at the right moment. The honest answer takes a while to earn, so let me start with the answer most of us reach for first — because it’s reasonable, and because it’s wrong.
The reasonable view goes like this. Understanding shows up in what you can do. A student who can answer any question about the French Revolution, field the follow-ups, catch the trick ones, and explain the whole thing to a ten-year-old — that student understands the French Revolution, and we would be cranks to deny it on the grounds that we can’t peer inside her skull. Understanding is as understanding does. So if a machine handles every question about sunburns as well as a sunburned person could, the difference between the machine and the person starts to look like a difference we invented to feel special about ourselves. The picture has a respectable pedigree: it descends from behaviorism, and it has a famous instrument in Alan Turing’s imitation game, where the test for thinking just is indistinguishable performance.
Notice the quiet assumption, though. The picture takes understanding a word to be a matter of using it correctly, and takes “correctly” to be settled by looking only at the outputs. Pull on that thread and the whole thing comes apart in your hands.
Stevan Harnad, a cognitive scientist with a gift for naming traps, named this one in 1990: the symbol grounding problem.[1] Imagine trying to learn Chinese from a Chinese-only dictionary. Every definition sends you to other entries, which send you to others, and you ride that merry-go-round forever without once touching the ground. A system whose symbols are defined only by more symbols never means anything by them. Meaning gets in only when some of the symbols connect to the things they are about by some route other than further symbols — when “red” hooks to red, not merely to “crimson,” “scarlet,” and “the color of a stop sign.”
What supplies the hook is not anything inside the system. Hilary Putnam made the case unforgettable with a thought experiment about Twin Earth — a planet just like ours except that the stuff they call “water” there is some other compound with all of water’s surface features.[2] A person here and their molecular duplicate there can be internally identical, down to the atom, and still mean different things by “water,” because the word answers to the stuff in the world, not to the state of the head. “Meanings,” Putnam wrote, “just ain’t in the head.” Tyler Burge pushed the same point from the social side: what your word “arthritis” picks out depends on the practice of the community you defer to, not on a private definition you carry around.[3] Content lives in a relation — between a system, a world, and the company it keeps.
There is even a natural story about how the relation gets built. On teleosemantic accounts — Ruth Millikan’s and Fred Dretske’s, chiefly — a state comes to be about something by acquiring the function of tracking it, the way a frog’s strike comes to be about flies through a long history in which catching flies is what kept frogs going.[4] The clinching detail is misrepresentation: to get something wrong, a system has to have been in the business of getting it right. A state can mean fly and fire at a passing pellet only because its job, fixed by history, was flies. No history, no job; no job, nothing to be mistaken about; nothing to be mistaken about, no content.
So understanding a word turns out to be an achievement, not a knack: it consists in having states that are genuinely about the world — not states that merely accompany the right answers, but states directed at the very things the words name — and aboutness is something a system earns over time. Your “red” means red because red things have been pushing on you, through eyes and skin and the small stakes of an actual life, since before you could pronounce the word. This is what people are gesturing at, usually too vaguely, when they say minds are embodied. The word invites mysticism, so let me drain it of any. Embodiment names three sober requirements: the system takes in the world through senses and acts back on it; its inner states have been shaped by real traffic with the features they represent; and those states are there to track a world the system inhabits, not merely to emit the right strings. Michael Tye — who spent three decades building the most careful theory we have of how experience could be nothing more than representational content, and then argued that even his own theory needs history — makes the sharpest version of the point. Two creatures could be intrinsically identical at an instant, he argues, and still differ in what they experience, because one has a past of tracking the world and the other was assembled, atom for atom, five minutes ago.[5] History is not decoration on content. It is part of what fixes it.
Which lets me say, at last, what a machine would actually need. Not the right stuff — I don’t think the barrier is silicon, and here I part company with John Searle, who ties understanding to the specific causal powers of biological brains.[6] The barrier isn’t carbon; it’s a world. A system understands when its inner states have been shaped by, and stay answerable to, the things they represent — when it senses and acts, lives under stakes, and can get things wrong and pay for it. Build that, and the door to genuine artificial understanding stands open. I mean open, not slyly closed. The claim here is not the tired one that machines could never understand. It is that understanding is earned through engagement, and there is no coupon for skipping the engagement.
Skipping the engagement is precisely what today’s text-only language models do. A large model learns the statistics of how we talk — the staggeringly intricate shape of which words follow which — from a corpus of descriptions of the world, never from the world.[7] It has read everything ever written about sunburns and has never once had skin. Its “red” is a position in an immense map of words, anchored to other words, anchored to nothing outside the map. The fluency is real and the achievement is genuine; it is simply not the achievement of understanding.
Here the strongest objection arrives, and it deserves a real hearing rather than a brush-off. If the machine’s answers became indistinguishable — in principle, not merely in today’s practice — from an understander’s, then insisting it still lacks understanding looks like clinging to a ghost. A difference that makes no detectable difference, the objection runs, is no difference at all. That is the whole moral of the imitation game, and it is not a silly one.
But “makes no difference you can detect in the output” is the definition of a good simulation, not the absence of a difference. Simulate a hurricane to any precision you please: the equations are flawless and your desk stays bone dry. Modeling a process is not running it.[8] Two systems can produce the very same words while one means them and the other reports the statistics of how the word gets used — because meaning was never a property of the output. It lives in the history behind the output, and that history is exactly what an output test cannot see. The objection mistakes the instrument for the quarry. It notices that the meter reads the same and concludes there is nothing the meter is missing.
So: does the machine understand what a sunburn is? It has never had skin. It has never flinched, never dreaded an evening because of how the sheets would feel. It holds the words and not the world the words are about. Ask the question again in some later decade, of some later system that has spent years bumping into things and paying for its errors, and the answer could come back different — that is the part the doom-mongers and the hype-merchants both manage to miss. Understanding is not a performance a system delivers. It is a debt a system pays, to the world, in the one currency the world accepts: contact. Until the bill comes due, fluency is only fluency. It was always going to be the easy part.
References
Burge, Tyler. 1979. “Individualism and the Mental.” Midwest Studies in Philosophy 4: 73–121.
Dretske, Fred. 1988. Explaining Behavior: Reasons in a World of Causes. Cambridge, MA: MIT Press.
Harnad, Stevan. 1990. “The Symbol Grounding Problem.” Physica D 42: 335–346.
Harnad, Stevan. 2002. “Symbol Grounding and the Origin of Language.” In Computationalism: New Directions, edited by Matthias Scheutz. Cambridge, MA: MIT Press.
Havlík, Vladimír. 2025. “Meaning and Understanding in Large Language Models.” Synthese 205: 9.
Millikan, Ruth Garrett. 1989. “Biosemantics.” Journal of Philosophy 86 (6): 281–297.
Putnam, Hilary. 1975. “The Meaning of ‘Meaning.’” In Mind, Language and Reality: Philosophical Papers, Volume 2, 215–271. Cambridge: Cambridge University Press.
Searle, John R. 1980. “Minds, Brains, and Programs.” Behavioral and Brain Sciences 3 (3): 417–457.
Searle, John R. 1990. “Is the Brain a Digital Computer?” Proceedings and Addresses of the American Philosophical Association 64 (3): 21–37.
Tye, Michael. 2019. “Homunculi Heads and Silicon Chips: The Importance of History to Phenomenology.” In Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness, edited by Adam Pautz and Daniel Stoljar. Cambridge, MA: MIT Press.
Notes
1. Harnad (1990) coined “the symbol grounding problem” and framed it with the Chinese-dictionary regress; he later tied it to the origin of language (Harnad 2002). The problem is older than the label — it is the computational heir of the externalist worry about how any representation latches onto its object — but Harnad’s formulation is the one the AI literature inherited, and it is sharper than the Chinese Room for present purposes because it isolates grounding from Searle’s further claims about consciousness.
2. Putnam (1975). The conclusion is specifically about reference and extension: the content that fixes what “water” is true of does not supervene on the speaker’s intrinsic states. Note that Putnam later qualified his own semantic externalism in several directions; nothing here turns on the most contested versions of the thesis, only on the minimal claim that reference depends on causal-environmental relations the head alone does not settle.
3. Burge (1979) extends externalism from natural-kind reference (Putnam) to social content: holding a thinker’s physical history fixed while varying the surrounding linguistic community varies which concept the thinker exercises. The two cases are independent routes to the same structural conclusion — internal organization underdetermines content — which is why the essay leans on both rather than treating Burge as a footnote to Putnam.
4. The teleosemantic tradition, principally Millikan (1989) and Dretske (1988), grounds content in proper function: a state represents what it has the function of tracking, where functions are fixed by selection or learning history. Misrepresentation is the standard adequacy test for any naturalistic theory of content, since a theory on which states cannot be false has not yet described representation. Rival tracking theories handle reliable misrepresentation differently, but the historical structure — content fixed by what a state was for — is common ground and is what the embodiment argument borrows.
5. Tye (2019). The thesis is that two beings intrinsically alike at a time can differ in phenomenal character because they differ in history — a representationalist’s concession that current intrinsic structure does not suffice. Ned Block replies in the same volume (“Fading Qualia: A Response to Michael Tye”) that a subject could be radically wrong about their own phenomenology; the disagreement is real and unresolved, and the essay sides with Tye while granting that Block has located the genuine pressure point. That Tye, of all people, reaches for history is the relevant fact: the most developed representationalism on offer does not think structure alone fixes content.
6. Searle (1990) argues that computation is observer-relative — a physical system “computes” only under an interpretation we assign — so computational description cannot, by itself, explain intrinsic intentionality. The essay takes this negative point and leaves Searle’s positive doctrine behind. Searle’s biological naturalism holds that only the specific causal powers of brains can produce understanding; the view defended here replaces “the right biology” with “the right causal-environmental engagement,” which a non-biological system could in principle possess. The negative argument survives the amputation of the positive one.
7. Not everyone takes the contact gap to be fatal, and the most direct contrary voice deserves naming. Vladimír Havlík (2025) argues the reverse of this essay’s conclusion — that large language models do ground the meanings of their expressions, by way of what he calls semantic fragmentism, so that grounding in worldly reference is not a precondition of understanding. I think this mislocates the gap rather than closing it. Semantic fragmentism can explain how a model’s tokens come to bear stable relations to one another; the externalist and teleosemantic considerations above concern what fixes the relation between a token and the world, which is precisely what a text-only training signal never touches. The architectural premise is not what divides us — a text-only model is trained to predict the next token over a corpus of text, full stop — what divides us is whether that suffices for content, and Havlík’s affirmative answer is the live position this essay rejects.
8. The simulation/realization distinction is Searle’s reply to the Brain Simulator objection in “Minds, Brains, and Programs” (1980), generalized: a model of a process is not an instance of it, and whatever a process owes to its physical realization is not delivered by a description of that realization, however exact. The hurricane example makes the point without the contested premises about consciousness — no one is tempted to say the simulated storm is wet — which is why it does cleaner work here than the Chinese Room.