Today Headline

How AI Systems Fail the Human Test

November 22, 2024
in Science & Environment

Economists have a game that reveals how deeply individuals reason. Known as the 11-20 money request game, it is played between two players who each request an amount of money between 11 and 20 shekels, knowing that both will receive the amount they ask for.

But there’s a twist: if one player asks for exactly one shekel less than the other, that player earns a bonus of 20 shekels. This tests each player’s ability to think about what their opponent might do — a classic challenge of strategic reasoning.
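The payoff rule described above can be written down in a few lines. This is a minimal sketch based only on the rules as stated here; the function and constant names are illustrative assumptions, not taken from the study.

```python
# Illustrative names; only the rules themselves come from the article.
LOW, HIGH = 11, 20   # allowed requests, in shekels
BONUS = 20           # bonus for undercutting the opponent by exactly one


def payoff(mine: int, theirs: int) -> int:
    """Return what the player requesting `mine` earns against `theirs`."""
    if not (LOW <= mine <= HIGH and LOW <= theirs <= HIGH):
        raise ValueError("requests must be between 11 and 20")
    # Every player receives the amount they asked for...
    total = mine
    # ...plus the bonus if they asked for exactly one shekel less.
    if mine == theirs - 1:
        total += BONUS
    return total


print(payoff(20, 20))  # 20
print(payoff(19, 20))  # 39
print(payoff(20, 19))  # 20
```

Asking for 19 against an opponent who asks for 20 gives up one shekel but gains the 20-shekel bonus, which is why undercutting is tempting.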

The 11-20 game is an example of level-k reasoning in game theory, where each player tries to anticipate the other’s thought process and adjust their own choices accordingly. For example, a player using level-1 reasoning might pick 19 shekels, assuming the other will pick 20. But a level-2 thinker might ask for 18, predicting that their opponent will go for 19. This kind of thinking gets layered, creating an intricate dance of strategy and second-guessing.
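The layering can be made concrete by repeatedly taking a best response to the previous level's choice. The sketch below assumes that a level-0 player simply asks for 20, the highest safe amount; the helper names are hypothetical and this is one common way of formalizing level-k play, not necessarily the one used in the study.

```python
def payoff(mine: int, theirs: int) -> int:
    """Requested amount, plus 20 for undercutting by exactly one."""
    return mine + (20 if mine == theirs - 1 else 0)


def best_response(theirs: int) -> int:
    """Request between 11 and 20 that earns the most against `theirs`."""
    return max(range(11, 21), key=lambda mine: payoff(mine, theirs))


def level_k_choice(k: int, level0: int = 20) -> int:
    """Choice of a level-k player; level 0 is assumed to ask for 20."""
    choice = level0
    for _ in range(k):
        # Each level best-responds to the level below it.
        choice = best_response(choice)
    return choice


print([level_k_choice(k) for k in range(4)])  # [20, 19, 18, 17]
```

Under this assumption each extra level of reasoning lowers the request by one shekel. The chain does not continue indefinitely: nothing undercuts a request of 11, so the best response to 11 is to go back to 20.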

Human Replacements?

In recent years, various researchers have suggested that large language models (LLMs) like ChatGPT and Claude can behave like humans in a wide range of tasks. That has raised the possibility that LLMs could replace humans in tasks like testing opinions of new products and adverts before they are released to the public, an approach that would be significantly cheaper than current methods.

But that raises the important question of whether LLM behavior really is similar to humans’. Now we get an answer thanks to the work of Yuan Gao and colleagues at Boston University, who have used a wide range of advanced LLMs to play the 11-20 game. They found that none of these AI systems produced results similar to human players and say that extreme caution is needed when it comes to using LLMs as surrogates for humans.

The team’s approach is straightforward. They explained the rules of the game to a range of LLMs, including several models from the ChatGPT, Claude, and Llama families. They asked each to choose a number and then explain its reasoning. And they repeated the experiment a thousand times for each LLM.

But Gao and co were not impressed with the results. Human players typically use sophisticated strategies that reflect deeper reasoning levels. For example, a common human choice might be 17, reflecting an assumption that their opponent will select a higher value like 18 or 19. But the LLMs showed a starkly different pattern: many simply chose 20 or 19, reflecting basic level-0 or level-1 reasoning.
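One way to compare such patterns is to convert each chosen number into the reasoning depth it implies and tally the shares. The sketch below is a simplification that assumes level 0 corresponds to asking for 20 and each further level lowers the request by one; the function names are hypothetical, and the sample list is an invented placeholder rather than data from the study.

```python
from collections import Counter


def implied_level(choice: int) -> int:
    """Reasoning depth implied by a request, assuming level 0 asks for 20."""
    if not 11 <= choice <= 20:
        raise ValueError("requests must be between 11 and 20")
    return 20 - choice


def level_shares(choices: list[int]) -> dict[int, float]:
    """Fraction of choices falling at each implied reasoning level."""
    counts = Counter(implied_level(c) for c in choices)
    total = len(choices)
    return {level: counts[level] / total for level in sorted(counts)}


# Invented placeholder choices, NOT results from the paper.
placeholder_choices = [20, 19, 19, 17]
print(level_shares(placeholder_choices))  # {0: 0.25, 1: 0.5, 3: 0.25}
```

Running the same tally over a set of human choices and a set of model choices would give two distributions that can be compared directly.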

The researchers also tried to improve the performance of LLMs with techniques like writing more suitable prompts and fine-tuning the models. GPT-4 showed more human-like responses as a result, but none of the other models did.

The behavior of LLMs was also highly inconsistent depending on irrelevant factors, such as the language they were prompted in.

Gao and co say the reason LLMs fail to reproduce human behavior is that they don’t reason like humans. Human behavior is complex, driven by emotions, biases, and varied interpretations of incentives, like the desire to beat an opponent. LLMs give their answer using patterns in language to predict the next word in a sentence, a process that is fundamentally different to human thinking.

Sobering Result

That’s likely to be a sobering result for social scientists, for whom the idea that LLMs could replace humans in certain types of experiments is tempting.

But Gao and co say: “Expecting to gain insights into human behavioral patterns through experiments on LLMs is like a psychologist interviewing a parrot to understand the mental state of its human owner.” The parrot might use similar words and phrases to its owner but manifestly without insight.

“These LLMs are human-like in appearance yet fundamentally and unpredictably different in behavior,” they say.

Social scientists: you have been warned!


Ref: Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina: arxiv.org/abs/2410.19599

Tags: artificial intelligence
© 2024 Todayheadline.co
