The Intelligence Mirage: Building the Scaffolding of AGI

  • Writer: exkn
  • Dec 16
  • 8 min read

Why Deep Learning is a Detour, Not the Destination




Part 1: The Modern Egyptian


Why we are building the Pyramids of Intelligence without understanding the Physics.


There is a dirty secret at the heart of the current AI boom: the people building the models don’t actually know how they work.


If you ask a researcher at a frontier lab why adding more GPUs and more data makes a model smarter, they will point you to “Scaling Laws.” These are charts that show a strikingly predictable straight line (on log-log axes): increase compute by 10x, and the loss drops by a predictable fraction. It is reliable. It is empirical. It is the foundation of a trillion-dollar industry.
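Concretely, a “scaling law” is just a fitted power law. Here is a minimal sketch, with made-up coefficients rather than any lab’s published fit, showing why every 10x of compute buys the same fractional drop in loss:

```python
# Toy scaling law: loss falls as a power of compute, loss(C) = a * C**(-b).
# The coefficients are illustrative, not any lab's published fit.
a, b = 10.0, 0.05

def predicted_loss(compute_flops: float) -> float:
    """Predicted training loss for a given compute budget (in FLOPs)."""
    return a * compute_flops ** (-b)

for c in [1e21, 1e22, 1e23]:   # each step is 10x more compute
    print(f"compute = {c:.0e}  ->  loss = {predicted_loss(c):.3f}")

# Each 10x of compute multiplies the loss by the same factor (10**-b, about 0.89),
# which is why the charts look like a perfect straight line on log-log axes.
```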


It is also, strictly speaking, not a scientific theory. It is an observation.


Gavin Baker recently compared modern AI researchers to the Ancient Egyptians. The Egyptians and other Neolithic peoples were masters of measurement. They understood the movement of the sun so precisely that they could align the Great Pyramids to the cardinal points and Stonehenge to the solstices. But if you asked an Egyptian priest why the sun moved, he would tell you it was a god in a chariot. They had perfect measurement, but zero understanding of orbital mechanics.


We are in the same boat. We know that the neural network gets smarter. We don’t know the “orbital mechanics” of why.


The mystery deepens when you look at what these models are actually doing. We train them on a single, boring task: “predict the next token.” We feed them the internet, and we ask them to guess the next word. But to play that game at a superhuman level, the model is forced to do something unexpected. It doesn’t just memorize the text; it begins to compress the underlying logic of the text.


This is the Distributional Hypothesis taken to its logical conclusion. If you compress the entire corpus of human text efficiently enough, you aren’t just storing words. You are forced to store the “shape” of the reality that generated those words. To perfectly predict a line of code, you must internally model the logic of the programming language. To perfectly predict a move in a chess game (recorded in text), you must internally model the geometry of the board.
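To make “prediction is compression” concrete, here is a toy sketch: two tiny models scored on the same string, where the model that captures more of the text’s structure needs fewer bits to encode it. The string and the models are purely illustrative; real LLMs do this over trillions of tokens.

```python
import math
from collections import Counter, defaultdict

text = "to be or not to be that is the question"
tokens = text.split()

# Unigram model: predict each token from overall frequencies alone.
unigram = Counter(tokens)
total = len(tokens)

# Bigram model: predict each token from the one before it.
bigram = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    bigram[prev][nxt] += 1

def bits(p: float) -> float:
    return -math.log2(p)

# Encoding cost of the sequence under each model.
# Better prediction == fewer bits == better compression of the same text.
uni_bits = sum(bits(unigram[t] / total) for t in tokens[1:])
bi_bits = sum(
    bits(bigram[prev][nxt] / sum(bigram[prev].values()))
    for prev, nxt in zip(tokens, tokens[1:])
)
print(f"unigram: {uni_bits:.1f} bits, bigram: {bi_bits:.1f} bits")
# The bigram model compresses better because it has internalized a little of
# the structure that generated the text, which is the point of the argument above.
```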


The intelligence isn’t magic, and it isn’t a ghost in the machine. It is archaeology. The logic, the physics, and the reasoning were already there, tangled up in the data of the internet. The model is simply a compression algorithm that found it.


We haven’t built a synthetic mind. We have built a map of our own mind so detailed that it looks like the territory.



Part 2: The Librarian and the Scientist


The hard ceiling between finding answers and creating them.


Current AI models are the greatest librarians in the history of the universe.


If you ask an LLM to solve a complex coding problem, it can scan its compressed map of all human code, find similar patterns, interpolate between them, and hand you a solution. It feels like intelligence. But strictly speaking, it is retrieval and interpolation. The answer existed within the bounds of the training data (the “library”), and the model found it.


But there is a profound difference between a Librarian and a Scientist.


A Librarian’s job is to organize and retrieve existing knowledge. A Scientist’s job is to create new knowledge. This distinction is often glossed over, but it is the hard ceiling of the current paradigm.


Consider the difference between Calculation and Explanation.


Solving a math problem or folding a protein is a calculation. The rules are known. The answer can be verified; you just need to find it. AI is spectacular at this. It’s why AlphaFold can solve protein structures: it is navigating a maze that already exists, and its answers can be checked.


But scientific breakthroughs require Explanation. Einstein didn’t discover General Relativity by looking at more data points of Newtonian physics. He didn’t just “predict the next token” in a physics textbook. He engaged in Abduction—he conjectured a new rule (spacetime is curved) that actually contradicted the training data of his time.


A model trained to minimize error against the “distribution” of 19th-century physics would have punished Einstein’s theory as a hallucination. It would have looked at his paper and said, “This does not align with the corpus. Error.”


This is why LLMs are more likely the “Language Organ” of a future AGI, not the mind itself. They can generate fluent possibilities—they can summon the concept, collate what is known, draft the paper. But they lack the internal “Critic”—the algorithm that filters conjectures not by whether they are “likely” (based on past data), but by whether they are “true” (hard-to-vary explanations of reality).


We have built a machine that can pass every exam humans have ever written, because exams are tests of existing knowledge. We have yet to build a machine that can conjecture a new exam.



Part 3: The Optimization


Why the path to truth requires walking uphill.


The engine powering the current AI revolution is an algorithm called Gradient Descent. It is a mathematical marvel, but it is inherently conservative.


Imagine a hiker trapped in a foggy mountain range, trying to find the lowest point in the entire range. Gradient Descent works by feeling the ground with one foot, finding the steepest downward slope, and taking a step. It repeats this billions of times until it reaches the bottom of whatever valley it happens to be in, optimizing for the lowest possible error.


But there is a fatal flaw in this approach when it comes to creativity.


If the solution—the radical new idea or the scientific breakthrough—lives in a completely different valley, Gradient Descent will never find it. To reach that new valley, the algorithm would have to do something counter-intuitive: it would have to walk uphill. It would have to accept higher error to cross the ridge of uncertainty and find new territory.


Current AI cannot do this. It is trapped in the valley of “what is known.”
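A toy sketch of that trap, on a made-up one-dimensional loss landscape with two valleys; nothing here corresponds to a real model’s loss surface:

```python
# A toy one-dimensional "loss landscape" with two valleys: a shallow one
# near x = -1.4 and a deeper one near x = +1.4, separated by a ridge.
def loss(x: float) -> float:
    return (x**2 - 2) ** 2 - 0.5 * x

def grad(x: float, eps: float = 1e-5) -> float:
    # numerical gradient, to keep the sketch self-contained
    return (loss(x + eps) - loss(x - eps)) / (2 * eps)

x = -1.2            # start the "hiker" in the shallow valley
lr = 0.01
for _ in range(5000):
    x -= lr * grad(x)   # always step downhill

print(f"converged at x = {x:.2f}, loss = {loss(x):.2f}")
# Ends near x = -1.38 (loss about +0.70), even though the global minimum sits
# across the ridge near x = +1.44 with loss about -0.71. To get there, the
# hiker would first have to walk uphill, which this update rule forbids.
```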


True conjecture requires a “leap”: a discontinuous jump across the search space that isn’t justified by the immediate data. This looks less like calculus and more like evolution. It requires mutating ideas and testing whether they survive. Evolution does this through blind random variation; humans do it through some process we do not yet understand, but which is clearly not random.
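For contrast, here is the same made-up landscape searched by a crude mutate-and-test loop instead of gradient steps; it is a caricature of conjecture, not a claim about how humans actually think:

```python
import random

random.seed(0)

def loss(x: float) -> float:
    # the same two-valley toy landscape as in the previous sketch
    return (x**2 - 2) ** 2 - 0.5 * x

# Crude conjecture-and-refutation loop: propose a bold random jump, keep it
# only if it survives the test (lower loss), otherwise throw it away.
x = -1.2
best = loss(x)
for _ in range(200):
    candidate = x + random.gauss(0, 1.5)   # a leap, not a calculus step
    if loss(candidate) < best:             # the "critic": does it survive?
        x, best = candidate, loss(candidate)

print(f"ended near x = {x:.2f}, loss = {best:.2f}")
# The occasional large jump crosses the ridge that gradient descent cannot,
# and the search settles in the deeper valley near x = +1.4.
```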


This brings us to the second missing piece: The “Hard to Vary” Filter.


The physicist David Deutsch argued that a true explanation is “hard to vary.” Change any detail of General Relativity and the whole theory collapses; that rigidity is what makes it a good explanation.


Current AI is the opposite. It is a “Soft to Vary” machine. It is designed to be flexible—to hallucinate, to accept “close enough,” to fudge the details to minimize the loss function. It is a politician, not a physicist.


To create knowledge, we don’t just need a generator of ideas (the LLM); we need a Critic. We need a module that ruthlessly attempts to “vary” the conjecture to see if it breaks. Does this hypothesis contradict established physics? Is it internally consistent?


Right now, the industry uses RLHF (Reinforcement Learning from Human Feedback) as this critic. But RLHF is just “what do humans prefer?” It is subjective and narrow. A true knowledge-creating engine needs a critic grounded in objective reality, not human preference.


And finally, there is the Epistemic Air Gap.


An AI lives in a universe of frozen text. It could come up with a promising novel conjecture, say “Protein X will cure Cancer Y,” but it cannot know whether that conjecture is true explanatory knowledge until it crosses the air gap and interacts with the physical world.


We probably need a different architecture where the “Brain” is a hybrid AI and a highly automated robotic lab is the “Body.”


The Brain (2 parts)

  • AI LLM (Language Organ): “I have crossed biology data with computer science data and intuited a new protein structure.” (Conjecture).

  • Critic (Algorithmic Mind): “This structure is chemically stable and valid.” (Filter).

The Robotic Body synthesizes the protein and tests it, then sends the result back to the Brain (Error Correction), closing the loop sketched below.
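A minimal sketch of that loop in code, where every name, score, and “experiment” is a hypothetical placeholder rather than a real system or API:

```python
# Schematic of the Brain/Body loop described above. Every name and every
# "result" here is a hypothetical stand-in, not a real system or API.
import random
from dataclasses import dataclass

@dataclass
class Conjecture:
    description: str
    plausibility: float  # stand-in for internal consistency checks

def language_organ() -> Conjecture:
    """LLM: fluent generator of candidate ideas (Conjecture)."""
    return Conjecture("candidate protein structure", random.random())

def critic(c: Conjecture) -> bool:
    """Algorithmic mind: try to break the idea against known constraints
    before spending lab time (Filter). Toy rule: reject the implausible."""
    return c.plausibility > 0.8

def robotic_lab(c: Conjecture) -> bool:
    """Body: synthesize and test in the physical world. A coin flip
    stands in here for a real experiment's outcome."""
    return random.random() > 0.5

def error_correct(idea: Conjecture, worked: bool) -> None:
    """Feedback: fold the experimental result back into the models."""
    print(f"{idea.description}: {'confirmed' if worked else 'refuted'}")

for _ in range(10):                  # the loop: conjecture -> filter -> test
    idea = language_organ()
    if not critic(idea):             # refuted cheaply, in silico
        continue
    error_correct(idea, robotic_lab(idea))   # refuted (or not) by reality
```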


Until AI can fundamentally “touch grass”—interact with the causal structure of reality rather than just the statistical structure of text—it will remain a brilliant librarian, but not a scientist.



Part 4: The Cost of Curiosity


Why a silicon brain needs a nuclear plant to do what you do with a sandwich.


The human brain runs on about 20 watts of power. That’s a dim lightbulb. On that energy budget, it can write poetry, calculate trajectories, navigate complex social hierarchies, and invent calculus.


To do roughly the same cognitive work, a modern AI cluster requires a small nuclear reactor.


This efficiency gap—roughly 10,000,000 to 1—isn’t just an environmental problem. It is an intelligence problem. It stems from a flaw in how we build computers known as the Von Neumann Bottleneck.


In every chip in every data center, memory and calculation are divorced. To do a simple math operation, the chip has to spend energy fetching the data from memory (the “commute”), doing the math, and sending it back. The overwhelming majority of the energy consumed by an AI model is spent on this commute, not on the math itself. The brain, by contrast, uses “compute-in-memory.” The synapse is the storage and the processor. There is no commute.
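The arithmetic behind those claims, using round illustrative numbers; the datacenter wattage and per-operation energies below are assumed orders of magnitude, not measurements:

```python
# Back-of-envelope for the gap described above, with round illustrative numbers.
brain_watts = 20            # the usual estimate for a human brain
datacenter_watts = 200e6    # an assumed ~200 MW training campus, not a measurement

print(f"efficiency gap ~ {datacenter_watts / brain_watts:,.0f} : 1")  # ~10,000,000 : 1

# The "commute": commonly cited per-operation estimates (order of magnitude only)
# put an off-chip memory access far above the arithmetic it feeds.
multiply_pj = 4             # rough picojoules for one 32-bit multiply
dram_access_pj = 640        # rough picojoules for one off-chip DRAM access
print(f"memory fetch vs. math ~ {dram_access_pj / multiply_pj:.0f}x")
```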


Why does this matter for AGI? Because of the Curiosity Tax.


True creativity and learning require “play.” A child learns physics by knocking over blocks a thousand times. A scientist learns by running hundreds of failed experiments. This process of unstructured simulation is essential for discovery.


For a biological brain, this “play” is cheap. It costs a sandwich.


For a GPU cluster, “play” is astronomically expensive. Every “thought” costs real dollars in electricity.


Because of this hardware architecture, we are pricing curiosity out of the equation. We are forced to use AI only for high-value, verifiable tasks (like coding or stock analysis) where the ROI is immediate. We cannot afford to let the model “daydream,” explore dead ends, or ponder philosophy, because the meter is running at 100 megawatts.


Until we move to a hardware substrate that allows for “cheap thoughts”—likely neuromorphic or analog architectures—AI will remain a rigid, expensive worker, not a curious explorer.



Part 5: The Hardware Lottery


Are we building the Cathedral, or just the scaffolding?


There is a concept in computer science called the Hardware Lottery. It suggests that the “best” AI algorithms aren’t necessarily the ones that win. The winners are simply the ones that fit the dominant hardware of the day.


We are currently living through the greatest Hardware Lottery in history.


We have spent fifteen years optimizing a specific type of chip, the GPU, which is really good at one specific thing: dense matrix multiplication. Because we have these chips, we have poured all our research into algorithms that fit them (Transformers). It is a self-reinforcing loop.


The saving grace is Turing Completeness: A massive GPU cluster can simulate a Neuromorphic or Biological architecture, just very inefficiently.


  • The Simulation Phase: We may need these massive, inefficient, nuclear-powered GPU clusters to discover the efficient algorithm. We use the brute force of the H100s to “emulate” a brain, figure out how it works, and then design the new chip.

  • The “ENIAC” Analogy: The current datacenters are like the ENIAC (the first general-purpose electronic computer). It filled a room and used 150 kW of power to do simple math. It was the “wrong path” (vacuum tubes were a dead end), but we needed to build it to realize we needed the transistor.

  • The Dead End: Once we find the “Explanatory Algorithm,” these GPU datacenters might essentially become “scrap metal” for AI purposes, though they will remain useful for “old world” tasks like rendering video, weather simulation, and crypto.


We know, for instance, that the brain is “sparse” (only a tiny fraction of neurons fire at once) and “asynchronous” (they don’t fire in a synchronized clock cycle). GPUs hate sparsity and they hate asynchrony. They want dense, synchronized blocks of numbers.
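A toy sketch of why, assuming NumPy as a stand-in for GPU-style dense linear algebra: a dense matrix multiply does the same work whether the activations are mostly zero or not, and exploiting sparsity requires exactly the irregular, data-dependent access pattern this hardware dislikes.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4096, 4096))           # a weight matrix

dense_x = rng.standard_normal(4096)             # typical ANN activation
sparse_x = dense_x * (rng.random(4096) < 0.02)  # "brain-like": ~2% of units active

# A dense matmul performs the same ~16.8 million multiply-adds either way;
# the zeros get multiplied just like everything else.
y_dense = W @ dense_x
y_sparse = W @ sparse_x

# Exploiting the sparsity means touching only the active columns: an irregular,
# data-dependent access pattern that dense-matrix hardware is not built around.
active = np.nonzero(sparse_x)[0]
y_sparse_smart = W[:, active] @ sparse_x[active]
print(np.allclose(y_sparse, y_sparse_smart))    # True: same answer, ~2% of the work
```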


The optimism is that this is just “scaffolding.” We need these massive clusters to simulate the brain, figure out how it works, and then build the efficient chip. The pessimistic view is that we are getting addicted to the scaffolding. With hundreds of billions of dollars sunk into GPU infrastructure, and with all the promising use cases, the industry has a massive incentive to ignore any path to AGI that renders those chips obsolete.


We are acting like 19th-century industrialists trying to achieve flight by building bigger and bigger steam engines. We might get off the ground, but we won’t get to the moon until we invent the internal combustion engine.


The current datacenter buildout is awe-inspiring. But it could quite possibly be a long, if productive, technological detour on the way to true AGI.

 
 
 
