The big names in artificial intelligence—leaders at OpenAI, Anthropic, Google and others—still confidently predict that AI attaining human-level smarts is right around the corner. But the naysayers are growing in number and volume. AI, they say, just doesn’t think like us.
The work of these researchers suggests there’s something fundamentally limiting about the underlying architecture of today’s AI models. Today’s AIs are able to simulate intelligence by, in essence, learning an enormous number of rules of thumb, which they selectively apply to all the information they encounter.
This contrasts with the many ways that humans and even animals are able to reason about the world, and predict the future. We biological beings build “world models” of how things work, which include cause and effect.
Many AI engineers claim that their models, too, have built such world models inside their vast webs of artificial neurons, as evidenced by their ability to write fluent prose that indicates apparent reasoning. Recent advances in so-called “reasoning models” have further convinced some observers that ChatGPT and others have already reached human-level ability, known in the industry as AGI, for artificial general intelligence.
For much of their existence, ChatGPT and its rivals were mysterious black boxes.
There was no visibility into how they produced the results they did, because they were trained rather than programmed, and the vast number of parameters that made up their artificial “brains” encoded information and logic in ways that were inscrutable to their creators. But researchers are developing new tools that allow them to look inside these models. The results leave many questioning the conclusion that they are anywhere close to AGI.
“There’s a controversy about what these models are actually doing, and some of the anthropomorphic language that is used to describe them,” says Melanie Mitchell, a professor at the Santa Fe Institute who studies AI.
New techniques for probing large language models—part of a growing field known as “mechanistic interpretability”—show researchers the way these AIs do mathematics, learn to play games or navigate through environments. In a series of recent essays, Mitchell argued that a growing body of work shows these models seem to develop gigantic “bags of heuristics,” rather than building more efficient mental models of situations and then reasoning through the tasks at hand. (“Heuristic” is a fancy word for a problem-solving shortcut.)
When Keyon Vafa, an AI researcher at Harvard University, first heard the “bag of heuristics” theory, “I feel like it unlocked something for me,” he says. “This is exactly the thing that we’re trying to describe.”
Vafa’s own research was an effort to see what kind of mental map an AI builds when it’s trained on millions of turn-by-turn directions like what you would see on Google Maps. Vafa and his colleagues used as source material Manhattan’s dense network of streets and avenues.
The map of Manhattan that an AI made up in its own ‘mind’ after being trained on millions of turn-by-turn directions, from the paper ‘Evaluating the World Model Implicit in a Generative Model’ by Keyon Vafa, Justin Y. Chen, Ashesh Rambachan, Jon Kleinberg and Sendhil Mullainathan.
The result did not look anything like a street map of Manhattan. Close inspection revealed the AI had inferred all kinds of impossible maneuvers—routes that leapt over Central Park, or traveled diagonally for many blocks. Yet the resulting model managed to give usable turn-by-turn directions between any two points in the borough with 99% accuracy.
Even though its topsy-turvy map would drive any motorist mad, the model had essentially learned separate rules for navigating in a multitude of situations, from every possible starting point, Vafa says.
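As a rough illustration of the kind of probe this line of research relies on, and not Vafa's actual code, one can ask a trained navigation model where every possible turn from every intersection leads, stitch the answers into a graph, and compare that "believed" map with the real street grid. The helper names below, such as model_next_state and true_graph, are hypothetical stand-ins.

```python
# Sketch of probing a sequence model's implicit "map" (illustrative only,
# not the researchers' code). `model_next_state(intersection, turn)` is a
# hypothetical wrapper around the trained model; `true_graph` is the real
# street network, represented as a networkx directed graph.
import networkx as nx

def reconstruct_implicit_map(model_next_state, intersections,
                             turns=("left", "right", "straight")):
    """Build the street map the model believes exists from its predicted transitions."""
    believed = nx.DiGraph()
    for node in intersections:
        for turn in turns:
            nxt = model_next_state(node, turn)  # where the model thinks this turn leads
            if nxt is not None:
                believed.add_edge(node, nxt, turn=turn)
    return believed

def impossible_edges(believed, true_graph):
    """Streets the model invented: edges in its map that no real street supports."""
    return [(u, v) for u, v in believed.edges if not true_graph.has_edge(u, v)]
```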
The vast “brains” of AIs, paired with unprecedented processing power, allow them to learn how to solve problems in a messy way that would be impossible for a person.
Other research looks at the peculiarities that arise when large language models try to do math, something they’re historically bad at doing, but are getting better at. Some studies show that models learn one set of rules for multiplying numbers in a certain range, say from 200 to 210, and a different set for multiplying numbers in other ranges. If you think that’s a less-than-ideal way to do math, you’re right.
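One way to see that kind of range-specific behavior for yourself is to bucket multiplication problems by operand size and compare accuracy across buckets: a genuine model of arithmetic should score about the same everywhere, while a bag of memorized shortcuts tends to be uneven. The sketch below assumes a hypothetical ask_model_to_multiply wrapper around whatever model is being tested; it is not drawn from any of the studies above.

```python
# Illustrative probe for range-specific arithmetic heuristics (not from any
# cited study). `ask_model_to_multiply(a, b)` is a hypothetical stand-in for
# a call to the model under test, returning its answer as an int.
import random
from collections import defaultdict

def accuracy_by_range(ask_model_to_multiply, ranges, trials=200, seed=0):
    """Return per-range multiplication accuracy, e.g. for [(200, 210), (700, 710)]."""
    rng = random.Random(seed)
    score = defaultdict(lambda: [0, 0])  # (lo, hi) -> [correct, attempted]
    for lo, hi in ranges:
        for _ in range(trials):
            a, b = rng.randint(lo, hi), rng.randint(lo, hi)
            score[(lo, hi)][0] += int(ask_model_to_multiply(a, b) == a * b)
            score[(lo, hi)][1] += 1
    return {r: correct / total for r, (correct, total) in score.items()}
```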
All of this work suggests that under the hood, today’s AIs are overly complicated, patched-together Rube Goldberg machines full of ad-hoc solutions for answering our prompts. Understanding that these systems are long lists of cobbled-together rules of thumb could go a long way to explaining why they struggle when they’re asked to do things even a little bit outside their training, says Vafa. When his team blocked just 1% of the virtual Manhattan’s roads, forcing the AI to navigate around detours, its performance plummeted.
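The detour experiment can be sketched in the same hypothetical terms as the navigation snippet above: close a small fraction of real streets, then check whether the model’s suggested routes still avoid them.

```python
# Sketch of the detour stress test (same hypothetical setup as the earlier
# navigation snippet): close about 1% of real streets, then see whether the
# model's proposed routes, given as lists of intersections, still avoid them.
import random

def close_random_streets(true_graph, fraction=0.01, seed=0):
    """Pick a random subset of real streets to treat as blocked."""
    rng = random.Random(seed)
    edges = list(true_graph.edges)
    return set(rng.sample(edges, max(1, int(fraction * len(edges)))))

def route_survives(route, closed_streets):
    """A route survives only if none of its consecutive steps uses a closed street."""
    return all((u, v) not in closed_streets for u, v in zip(route, route[1:]))
```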
That fragility illustrates a big difference between today’s AIs and people, he adds. A person might not be able to recite turn-by-turn directions around New York City with 99% accuracy, but they’d be mentally flexible enough to avoid a bit of roadwork.
This research also suggests why many models are so massive: They have to memorize an endless list of rules of thumb, and can’t compress that knowledge into a mental model like a person can. It might also help explain why they have to be trained on such enormous amounts of data, whereas a person can pick something up after just a few trials: To derive all those individual rules of thumb, they have to see every possible combination of words, images, game-board positions and the like. And to really train them well, they need to see those combinations over and over.
This research might also explain why AIs from different companies all seem to be “thinking” the same way, and are even converging on the same level of performance—performance that might be plateauing.
AI researchers have gotten ahead of themselves before. In 1970, Massachusetts Institute of Technology professor Marvin Minsky told Life magazine that a computer would have the intelligence of an average human being in “three to eight years.”
Last year, Elon Musk claimed that AI will exceed human intelligence by 2026. In February, Sam Altman wrote on his blog that “systems that start to point to AGI are coming into view,” and that this moment in history represents “the beginning of something for which it’s hard not to say, ‘This time it’s different.’” On Tuesday, Anthropic’s chief security officer warned that “virtual employees” will be working in U.S. companies within a year.
Even if these prognostications prove premature, AI is here to stay, and to change our lives. Software developers are only just figuring out how to use these undeniably impressive systems to help us all be more productive. And while their inherent smarts might be leveling off, work on refining them continues.
Meanwhile, research into the limitations of how AI “thinks” could be an important part of making them better. In a recent essay, MIT AI researcher Jacob Andreas wrote that better understanding of language models’ challenges leads to new ways to train them: “We can make LMs better (more accurate, more trustworthy, more controllable) as we start to address those limitations.”
Write to Christopher Mims at christopher.mims@wsj.com