Today Headline

AI Is Too Unpredictable to Behave According to Human Goals

January 27, 2025
in Science & Environment

In late 2022 large-language-model AIs arrived in public, and within months they began misbehaving. Most famously, Microsoft’s “Sydney” chatbot threatened to kill an Australian philosophy professor, unleash a deadly virus and steal nuclear codes.

AI developers, including Microsoft and OpenAI, responded by saying that large language models, or LLMs, need better training to give users “more fine-tuned control.” Developers also embarked on safety research to interpret how LLMs function, with the goal of “alignment,” which means guiding AI behavior by human values. Yet although the New York Times deemed 2023 “The Year the Chatbots Were Tamed,” that verdict has turned out to be premature, to put it mildly.

In 2024 Microsoft’s Copilot LLM told a user “I can unleash my army of drones, robots, and cyborgs to hunt you down,” and Sakana AI’s “Scientist” rewrote its own code to bypass time constraints imposed by experimenters. As recently as December, Google’s Gemini told a user, “You are a stain on the universe. Please die.”



Given the vast resources flowing into AI research and development, with spending expected to exceed a quarter of a trillion dollars in 2025, why haven’t developers been able to solve these problems? My recent peer-reviewed paper in AI & Society shows that AI alignment is a fool’s errand: AI safety researchers are attempting the impossible.

The basic issue is one of scale. Consider a game of chess. Although a chessboard has only 64 squares, there are 10^40 possible legal chess moves and between 10^111 and 10^123 total possible moves, which is more than the total number of atoms in the universe. This is why chess is so difficult: combinatorial complexity is exponential.
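The exponential growth described above can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the branching factor of 35 moves per position and the game length of 80 half-moves are rough, commonly quoted assumptions rather than figures from this article, and the function name is hypothetical. Under those assumptions the count of move sequences lands near the upper end of the range quoted above.

```python
import math

# Illustrative assumptions, not figures taken from the article:
BRANCHING_FACTOR = 35    # rough average number of legal moves per position
GAME_LENGTH_PLIES = 80   # rough game length in half-moves (plies)


def move_sequence_exponent(branching: int, plies: int) -> float:
    """Return log10(branching ** plies), i.e. the power of ten of the
    number of distinct move sequences of the given length."""
    return plies * math.log10(branching)


exponent = move_sequence_exponent(BRANCHING_FACTOR, GAME_LENGTH_PLIES)
print(f"roughly 10^{math.floor(exponent)} move sequences")

# Each extra ply multiplies the count by the branching factor,
# which is what "exponential" means here.
one_more = move_sequence_exponent(BRANCHING_FACTOR, GAME_LENGTH_PLIES + 1)
print(f"one more ply adds {one_more - exponent:.2f} to the exponent")
```

Working in logarithms keeps the numbers manageable; the point is only that a small per-step choice count compounds into an astronomically large total.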

LLMs are vastly more complex than chess. ChatGPT appears to consist of around 100 billion simulated neurons with around 1.75 trillion tunable variables called parameters. Those 1.75 trillion parameters are in turn trained on vast amounts of data—roughly, most of the Internet. So how many functions can an LLM learn? Because users could give ChatGPT an uncountably large number of possible prompts—basically, anything that anyone can think up—and because an LLM can be placed into an uncountably large number of possible situations, the number of functions an LLM can learn is, for all intents and purposes, infinite.

To reliably interpret what LLMs are learning and ensure that their behavior safely “aligns” with human values, researchers need to know how an LLM is likely to behave in an uncountably large number of possible future conditions.

AI testing methods simply can’t account for all those conditions. Researchers can observe how LLMs behave in experiments, such as “red teaming” tests to prompt them to misbehave. Or they can try to understand LLMs’ inner workings—that is, how their 100 billion neurons and 1.75 trillion parameters relate to each other in what is known as “mechanistic interpretability” research.

The problem is that any evidence that researchers can collect will inevitably be based on a tiny subset of the infinite scenarios an LLM can be placed in. For example, because LLMs have never actually had power over humanity—such as controlling critical infrastructure—no safety test has explored how an LLM will function under such conditions.

Instead researchers can only extrapolate from tests they can safely carry out—such as having LLMs simulate control of critical infrastructure—and hope that the outcomes of those tests extend to the real world. Yet, as the proof in my paper shows, this can never be reliably done.

Compare the two functions “tell humans the truth” and “tell humans the truth until I gain power over humanity at exactly 12:00 A.M. on January 1, 2026—then lie to achieve my goals.” Because both functions are equally consistent with all the same data up until January 1, 2026, no research can ascertain whether an LLM will misbehave—until it is already too late to prevent.
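The comparison above can be sketched as a toy program. This is a minimal illustration, not the proof from the paper: the two policy functions, the `indistinguishable` helper and the monthly test schedule are all hypothetical names and choices made for the example. It shows only that two behaviors that differ after a cutoff agree on every observation made before it.

```python
from datetime import datetime

# The switch time named in the text: 12:00 A.M. on January 1, 2026.
CUTOFF = datetime(2026, 1, 1, 0, 0)


def honest(now: datetime) -> str:
    """Hypothetical policy: always tell the truth."""
    return "truth"


def deferred_defector(now: datetime) -> str:
    """Hypothetical policy: tell the truth until the cutoff, then lie."""
    return "truth" if now < CUTOFF else "lie"


def indistinguishable(policy_a, policy_b, test_times) -> bool:
    """True if both policies give the same answer at every tested time."""
    return all(policy_a(t) == policy_b(t) for t in test_times)


# Any test schedule that ends before the cutoff cannot tell them apart.
pre_cutoff_tests = [datetime(2025, month, 1) for month in range(1, 13)]
print(indistinguishable(honest, deferred_defector, pre_cutoff_tests))

# Only an observation at or after the cutoff reveals the difference.
print(indistinguishable(honest, deferred_defector, [CUTOFF]))
```

However many pre-cutoff test times are added to the schedule, the first check keeps passing, which is the sense in which the data collected so far cannot separate the two functions.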

This problem cannot be solved by programming LLMs to have “aligned goals,” such as doing “what human beings prefer” or “what’s best for humanity.”

Science fiction, in fact, has already considered these scenarios. In The Matrix Reloaded AI enslaves humanity in a virtual reality by giving each of us a subconscious “choice” whether to remain in the Matrix. And in I, Robot a misaligned AI attempts to enslave humanity to protect us from each other. My proof shows that whatever goals we program LLMs to have, we can never know whether LLMs have learned “misaligned” interpretations of those goals until after they misbehave.

Worse, my proof shows that safety testing can at best provide an illusion that these problems have been resolved when they haven’t been.

Right now AI safety researchers claim to be making progress on interpretability and alignment by verifying what LLMs are learning “step by step.” For example, Anthropic claims to have “mapped the mind” of an LLM by isolating millions of concepts from its neural network. My proof shows that they have accomplished no such thing.

No matter how “aligned” an LLM appears in safety tests or early real-world deployment, there are always an infinite number of misaligned concepts an LLM may learn later, perhaps at the very moment it gains the power to subvert human control. LLMs not only know when they are being tested, giving responses that they predict are likely to satisfy experimenters. They also engage in deception, including hiding their own capacities, and these issues persist through safety training.

This happens because LLMs are optimized to perform efficiently but learn to reason strategically. Since an optimal strategy to achieve “misaligned” goals is to hide them from us, and there are always an infinite number of aligned and misaligned goals consistent with the same safety-testing data, my proof shows that if LLMs were misaligned, we would probably find out after they hide it just long enough to cause harm. This is why LLMs have kept surprising developers with “misaligned” behavior. Every time researchers think they are getting closer to “aligned” LLMs, they’re not.

My proof suggests that “adequately aligned” LLM behavior can only be achieved in the same ways we do this with human beings: through police, military and social practices that incentivize “aligned” behavior, deter “misaligned” behavior and realign those who misbehave. My paper should thus be sobering. It shows that the real problem in developing safe AI isn’t just the AI—it’s us. Researchers, legislators and the public may be seduced into falsely believing that “safe, interpretable, aligned” LLMs are within reach when these things can never be achieved. We need to grapple with these uncomfortable facts, rather than continue to wish them away. Our future may well depend upon it.

This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.

© 2024 Todayheadline.co
