Scaling Laws Won't Achieve AGI
Scaling LLMs (without large architectural changes) will not achieve AGI; too many of the mechanics of human thought are missing.
If you want to hedge against OpenAI & Anthropic's "LLM all the way" strategies, read on.
Humans are not statically architected
The argument for LLMs as the path to AGI is predicated on neural nets generalizing well at scale, while hoping compute costs & performance improve fast enough. That's quite the brute-force approach when our brains use less energy each day than our laptops.
LLMs unlocked understanding of intent, but are now the proverbial hammer trying to turn every problem into a nail. When we look back in a few years, even LLMs will feel hard-coded. Evolution and brains work differently.
Fundamentally, evolution ensures humans are not statically architected. We need models that don't just scale with data, but evolve their own architectures dynamically in response to environmental stimuli. We need digital Darwinism.
We are indeed many-shot learners, but our base models expand and contract organically, in a just-in-time way, which also lets us efficiently prune useless data & functions. We're constantly reindexing and defragmenting our minds and bodies.
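To make that concrete, here's a toy sketch of the grow-and-prune idea, my illustration rather than anything proposed in this post: a model that widens itself when progress stalls and prunes its least useful units while it's improving. Every name, threshold, and the ridge-regression readout are assumptions chosen for brevity.

```python
# A toy sketch (not the author's proposal) of "digital Darwinism":
# a model whose hidden width grows when progress stalls and whose
# least-useful units are pruned. All thresholds are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(H, y, lam=1e-3):
    """Least-squares readout on top of the hidden features."""
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)

# Tiny regression task standing in for "environmental stimuli".
X = rng.normal(size=(256, 8))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

W = rng.normal(size=(8, 4))          # start with 4 hidden units
best_err = np.inf
for step in range(20):
    H = np.tanh(X @ W)               # hidden activations
    w_out = fit_ridge(H, y)
    err = np.mean((H @ w_out - y) ** 2)

    if err < best_err * 0.99:        # still improving: prune dead weight
        best_err = err
        usefulness = np.abs(w_out) * H.std(axis=0)
        keep = usefulness > 0.05 * usefulness.max()
        if keep.sum() >= 2:
            W = W[:, keep]
    else:                            # plateaued: grow new random units
        W = np.concatenate([W, rng.normal(size=(8, 2))], axis=1)

print(f"final width={W.shape[1]}, mse={best_err:.4f}")
```

The point isn't the specific heuristics; it's that width, like the rest of the architecture, is something the system negotiates with its environment rather than a constant fixed before training.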
We are seekers of truth via logical systems guided by rationality and emotion, whereas LLMs are trained to be emotionless conveyor belts of pre-trained data, which often leads to mistruths & misunderstandings.
Rather than "Sorry, I can't help with that" (i.e. avoiding hallucination by avoiding helping or seeking at all) or "As an AI language model, I do not have feelings", humans benefit from a deep understanding of situational nuance that strongly guides our logical processes. Emotions, cultural norms, and familial and work norms are all a big part of this.
Hallucination as an RLHF tool
As this weekend's OpenAI drama showed, humans hallucinate most where there is a void of relevant knowledge—and the more emotional we are, the more we hallucinate.
Rather than being a purely bad thing, hallucination is an incredibly important signal in a reinforcement feedback loop that humans use to fill in each other's knowledge gaps.
Beyond attention, hallucination is the most important RLHF tool we have. We amplify our emotional responses during distress.
That signal triggers us, and the people around us, to help us collectively gain knowledge, then gain reason, and minimize future emotional reactions within that new knowledge.
Emotional responses help us generate synthetic data, which we compare against a collective or discovered truth and use to refine our ground truth. We react, seek, refine, reindex, defragment, purge, and repeat.
We seek cognitive dissonance on our path to cognitive resonance. This creates a perpetual feedback loop enabling creativity and more synthetic data to train on.
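One way to picture hallucination-as-signal, as a hedged sketch rather than a real system: treat a low-confidence answer as the trigger to go seek ground truth, fold it back into the knowledge store, and answer again. The confidence proxy, the `trusted_source` stand-in, and the store itself are all illustrative assumptions.

```python
# A minimal sketch of treating hallucination as a feedback signal rather
# than a failure mode. Everything here (the confidence proxy, the
# trusted_source oracle, the knowledge dict) is an illustrative assumption.
from dataclasses import dataclass, field

@dataclass
class Agent:
    knowledge: dict = field(default_factory=dict)

    def answer(self, question: str) -> tuple[str, float]:
        """Return (answer, confidence). Missing knowledge yields a
        confident-sounding guess with low internal confidence: a hallucination."""
        if question in self.knowledge:
            return self.knowledge[question], 0.95
        return f"plausible guess about '{question}'", 0.2

    def seek_knowledge(self, question: str, source) -> None:
        """The 'distress' response: go gather ground truth, then reindex."""
        self.knowledge[question] = source(question)

def trusted_source(question: str) -> str:
    # Stand-in for a colleague, a book, an experiment, a retrieval system.
    return f"verified answer to '{question}'"

agent = Agent()
for q in ["capital of France", "capital of France"]:
    ans, conf = agent.answer(q)
    if conf < 0.5:                      # hallucination detected: use it as a signal
        agent.seek_knowledge(q, trusted_source)
        ans, conf = agent.answer(q)     # react, seek, refine, repeat
    print(q, "->", ans, f"(confidence {conf:.2f})")
```

The first pass hallucinates, the loop uses that as the cue to seek, and the second pass answers from refined ground truth, which is the react-seek-refine cycle described above in miniature.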
Specialization
Each of us is a specialist within our niche environments, family units, and professional work. Because we operate by ruthless refinement, our latent space stays compact, which lets us hallucinate less because wires don't get crossed.
If anything, we're overtrained within our domains, which necessarily makes us weak in all others. We refuse to leave the hometown we know so well, to learn new languages, or to change careers midlife.
Because we can both specialize and generalize, we don't confuse killing a Linux process with killing a human. We don't care about someone telling us how to make a bomb; it's as useless as an advertisement.
Because our learning path heavily depends upon our environment and inherent goals, we become what we surround ourselves with—necessarily biased.
We autonomously, both consciously and subconsciously, seek to expand that bias in a sort of self-replicating, "truth"-seeking way, when the only inherent truth, in a world of subjective truths, is the short shelf life of our temporary biases.
Given enough time (for some people, this is post-death), even our biases break down and we can learn a new truth. Again, we refine, reindex, and defragment.
Because we refine and keep our latent space small—helped along by being good at generalization and pattern matching—we build centroidal systems in our minds that allow us to dynamically interpret, plan, interlock and retrieve data in a way that LLMs fundamentally cannot.
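Here is one hedged reading of "centroidal systems", purely my interpretation: compress many experiences into a handful of centroids and route new inputs to the nearest one, so the retained latent space stays small. The embedding and the k-means-style clustering are my assumptions, not the author's design.

```python
# A speculative sketch of a "centroidal" memory: many experiences are
# refined down to a few centroids, and retrieval routes a query to the
# nearest centroid instead of searching everything ever seen.
import numpy as np

rng = np.random.default_rng(1)

def kmeans(points, k, iters=25):
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(
            np.linalg.norm(points[:, None] - centroids[None], axis=2), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = points[assign == j].mean(axis=0)
    return centroids

# 10,000 "experiences" refined down to 8 centroids, then retrieval.
experiences = rng.normal(size=(10_000, 16))
centroids = kmeans(experiences, k=8)

query = rng.normal(size=16)
nearest = np.argmin(np.linalg.norm(centroids - query, axis=1))
print("query routed to centroid", nearest)
```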
Orchestrators
We are orchestrators at our core, and being good at orchestration is not determined by the knowledge in our latent space.
Orchestration is a hypervisor over knowledge and over the subsystems that generate and collect more of it.
Again: we only have the knowledge we have because our orchestration systems were determined to acquire it. Why is that?
An LLM is not an orchestration system. Very few people, if any, are building orchestration systems. Even fewer are building the hypervisor above those orchestration systems: one that is goal-based and works from the first principles of our needs and of why we seek any knowledge at all.
Some of these goals are biological (literally, in a DNA sense) while others are emotional, logical, irrational or environmental.
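As a hedged sketch of what goal-first orchestration might look like (my interpretation, not a spec from this post): the orchestrator owns goals and routing, not knowledge, and only reaches into subsystems when the goal demands it. The subsystem names and the routing rule are assumptions.

```python
# A speculative sketch of a goal-first orchestrator: it starts from a
# need, decides which subsystem should run, and treats stored knowledge
# as something subsystems fetch, not something it is.
from typing import Callable

Subsystem = Callable[[str], str]

def recall(goal: str) -> str:
    return f"retrieved what we already know about: {goal}"

def explore(goal: str) -> str:
    return f"went out and gathered new knowledge for: {goal}"

def regulate(goal: str) -> str:
    return f"dampened the emotional response around: {goal}"

class Orchestrator:
    """The 'hypervisor': owns goals and routing, not knowledge."""
    def __init__(self):
        self.subsystems = {"known": recall, "unknown": explore, "distress": regulate}

    def route(self, goal: str, state: str) -> str:
        # Routing is driven by why we need the knowledge,
        # not by what happens to be in the latent space already.
        return self.subsystems[state](goal)

orc = Orchestrator()
print(orc.route("find food", state="unknown"))
print(orc.route("calm down after the argument", state="distress"))
```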
At every step of discovery, keep asking "why?". Why do we have emotions? Why do we collect knowledge? Why are we specialists? Why do our brains require such little energy? Why do some babies learn differently than others?
Don't confuse knowledge with intelligence
By perpetually asking why at every new plateau, you'll eventually reach a new ground truth, temporarily at least.
My ground truth: Do not confuse knowledge with intelligence. Those that do will confuse LLMs with AGI.
LLMs require fundamentally different architectures at both the lowest & highest levels. We'll keep the good parts, but refine the rest.
-- Rob