My latest in The Wall Street Journal: We Now Know How AI ‘Thinks’—and It’s Barely Thinking at All. Maybe you've heard that AIs are "black boxes." But a growing body of research keeps arriving at the same conclusion: today's AIs all work in surprisingly similar -- and simplistic -- ways. Our lack of understanding of how ChatGPT and related "transformer" models work has allowed for endless speculation and hype, and also derision that they're just "stochastic parrots." Turns out they're neither. A new theory that explains much of their weirdness also points to their limitations. I give you: the Bag of Heuristics. https://lnkd.in/eBxXzmMJ
Appreciate the article, but a few clarifications from the AI research side: You're right that LLMs rely on heuristics and pattern interpolation rather than true abstraction. But calling this “barely thinking” implies failure, when it’s a design outcome: these models optimize token prediction and do not emulate human cognition. Different inductive biases yield different forms of generalization. Also, the segmentation of problem spaces (e.g. different strategies for 2- vs. 5-digit math) isn’t a flaw; it reflects how high-dimensional interpolation works in transformer architectures. It’s alien cognition, not broken cognition. The real challenge is not that models don’t “think” but that we’re judging them by human standards, and that limits how we understand what intelligence can be.
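The segmentation point can be caricatured in a few lines of Python. This is my own toy construction, not from the article: a "model" that applies an exact rule where training data was presumably dense (small numbers) and a lossy shortcut elsewhere, rather than one general algorithm.

```python
def bag_of_heuristics_add(a, b):
    """Toy analogy: different learned heuristics for different input ranges."""
    if a < 100 and b < 100:
        # Small numbers: dense "training data", so an exact, memorized-style rule.
        return a + b
    # Large numbers: a lossy estimate from the two leading digits only.
    scale = 10 ** (max(len(str(a)), len(str(b))) - 2)
    return ((a // scale) + (b // scale)) * scale

print(bag_of_heuristics_add(23, 45))        # exact: 68
print(bag_of_heuristics_add(41234, 52345))  # approximate: 93000, not 93579
```

The point of the sketch is only that a system stitched together from range-specific heuristics can look competent and broken at the same time, depending on where in the input space you probe it.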
Great article, Christopher. I’m on board with this “BOH” theory. One underreported component of what gives AI its “spark” is the role of vector indexing, which allows for efficient association of natural language terms—particularly those with nuanced meaning, such as “reasonableness” or “fair” in legal contexts. This will forever change the application and interpretation of language in general, and of the written laws that govern society in particular.
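For readers unfamiliar with vector indexing, here is a minimal, hand-rolled sketch of the core idea: terms are stored as vectors, and nearby vectors are treated as related in meaning. The 3-dimensional vectors below are my own invented toy values; real embedding models learn hundreds or thousands of dimensions from data.

```python
import math

# Toy hand-crafted "embedding" vectors (invented for illustration).
embeddings = {
    "reasonable": [0.9, 0.1, 0.2],
    "fair":       [0.8, 0.2, 0.3],
    "kettle":     [0.1, 0.9, 0.0],
}

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(term):
    """Return the other stored term whose vector is closest to `term`'s."""
    return max((t for t in embeddings if t != term),
               key=lambda t: cosine(embeddings[term], embeddings[t]))

print(nearest("reasonable"))  # "fair" -- nuanced terms cluster together
```

A real vector index (e.g. an approximate-nearest-neighbor structure) does this lookup efficiently over millions of vectors instead of a brute-force scan, but the association mechanism is the same.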
I think there’s a misunderstanding when people say “we don't know how a neural network works.” We actually know exactly how the mechanism works; what we don't know is which rules it has formed internally, or how. Still, understanding the underlying mechanism allows us to predict its limits, and we've been getting a lot of experimental confirmation of this lately.
• These "heuristics" found by LLMs through statistical Bayesian inference are nothing more than conditionally dependent and statistically well-established phrases that can also be interpreted as habits or skills • When you say "good morning, dear" in the morning or turn on the kettle or coffee maker automatically without thinking, that's exactly what you're doing. And when you say some clever phrase without thinking, you do exactly how LLMs do it • LLMs may have a "partial reasoning / understanding", which I would call a "Bayesian reasoning / understanding" • This understanding is "partial" precisely because of its probabilistic nature • This arises as a result of "Bayesian inference" forming statistical relationships between entities / words https://x.com/VictorSenkevich/status/1637169908059389955 • "Bayesian" and Boolean / logical understandings are both important and complement each other • "Bayesian"/"Boolean" understandings of LLM/AGI are analogous of the Kahneman’s System 1 “fast” / 2 “slow” thinking • "Bayesian" / Kahneman's 1 “fast” thinking, which is responsible for "common sense"/intuition/skills/reflexes, also contains accumulated results of logical reasoning in order to quickly reproduce them as a ready-made solution.
A bag of heuristics... yes, but... We can't really say it can't use context; it really depends on the prompting/RAG data. And heuristics used subjectively by emotional humans result in bias. Even though the AI also has its own layer of bias, it still uses more than one source/experience. I agree we should spend time learning how it operates, or at least what to avoid. But if we stop focusing on the inner workings of the feature (AI) and instead focus on the outcomes, we'll see it's actually not that bad, considering it's not thinking at all.
Thanks for this perspective — the "bag of heuristics" idea really resonates. Building on your exploration, it seems that what differentiates human intelligence is not just having heuristics, but combining them with experience, emotions, meaning, context, long-term memory, and imagination. In a sense, humans are also a "bag of heuristics" — but infused with lived experience and flexible understanding, which lets us adapt far beyond rigid rules. Would you agree that this combination — not just heuristics, but the integration of them into a broader model of meaning — is what fundamentally separates human intelligence from the current architectures of AI? If not, in your view, what does make human thinking so distinct?