In an unexpected twist that bridges the worlds of artificial intelligence and neurology, researchers have discovered that leading AI chatbots display patterns similar to mild cognitive impairment when subjected to standard dementia screening tests. The findings raise intriguing questions about the true capabilities of AI in medical settings.
Published in The BMJ | Estimated reading time: 5 minutes
The race to integrate artificial intelligence into healthcare has been marked by bold predictions about AI replacing human doctors. However, a fascinating new study published in The BMJ’s Christmas issue reveals that even the most sophisticated AI language models may have significant cognitive limitations that mirror human cognitive decline.
Researchers put several leading AI chatbots through the Montreal Cognitive Assessment (MoCA), a standardized test widely used to detect early signs of dementia. The study included the latest versions of prominent AI models: OpenAI’s ChatGPT-4 and 4o, Anthropic’s Claude 3.5 “Sonnet”, and Alphabet’s Gemini versions 1.0 and 1.5.
The results were striking. ChatGPT-4o emerged as the top performer with a score of 26 out of 30 – just reaching the threshold considered normal for human cognitive function. ChatGPT-4 and Claude tied at 25 points, while Gemini 1.0 scored markedly lower at 16 points. Perhaps most telling, “older” versions of the chatbots tended to perform worse on the tests, mimicking age-related cognitive decline in humans.
The AI models showed consistent weaknesses in specific areas. As the study notes, “All chatbots showed poor performance in visuospatial skills and executive tasks,” struggling particularly with challenges such as connecting numbers and letters in sequence (the trail making test) and drawing clock faces. Most models excelled at naming, attention, and language comprehension, but stumbled when faced with tasks requiring visual abstraction and executive function.
These findings challenge the narrative of AI’s imminent takeover of medical diagnosis. As the researchers conclude, “Not only are neurologists unlikely to be replaced by large language models any time soon, but our findings suggest that they may soon find themselves treating new, virtual patients – artificial intelligence models presenting with cognitive impairment.”
Glossary
- Large Language Models (LLMs): Advanced AI systems trained on vast amounts of text data to understand and generate human-like language.
- Montreal Cognitive Assessment (MoCA): A standardized screening tool used by healthcare professionals to detect cognitive impairment and early signs of dementia.
- Executive Function: Mental skills that help with planning, focusing attention, remembering instructions, and handling multiple tasks successfully.
Test Your Knowledge
What was the highest score achieved by any AI model on the MoCA test?
ChatGPT-4o scored 26 out of 30, the highest among all tested models.
What is considered a normal score on the MoCA test?
A score of 26 or above is generally considered normal.
Which specific types of tasks proved most challenging for the AI chatbots?
The chatbots particularly struggled with visuospatial skills and executive tasks, such as trail making and clock drawing tests.
How did the performance of “older” versions of chatbots compare to newer ones, and what human parallel does this suggest?
Older versions of chatbots tended to perform worse on the tests, mirroring the pattern of age-related cognitive decline seen in human patients.