Today Headline

Alibaba’s AI model Qwen3: A smart kid prone to hallucinations

May 2, 2025
in Asia


Alibaba Group’s newly released large language model Qwen3 has shown stronger mathematical and code-writing abilities than its predecessors and some American peers, putting it at or near the top of several benchmark charts.

Qwen3 offers two mixture-of-experts (MoE) models (Qwen3-235B-A22B and Qwen3-30B-A3B) and six dense models.

An MoE model, an architecture reportedly also used by OpenAI’s ChatGPT and Anthropic’s Claude, routes each piece of input to a small number of specialized “expert” sub-networks instead of running the whole network. A dense model, by contrast, uses all of its parameters for every input. Both types can perform a wide range of tasks by learning complex patterns in data.
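The routing idea behind an MoE layer can be illustrated with a minimal sketch. It is a toy example, not Qwen3’s actual implementation: the four scalar “experts,” the router scores, and the function names `route_top_k` and `moe_layer` are all hypothetical, and a real model would learn the router scores and use neural sub-networks as experts.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_scores, k=2):
    """Run only the k selected experts and mix their outputs by weight."""
    chosen = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in chosen)

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, a=a: a * x for a in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical scores; in a real model a learned router produces these.
router_scores = [0.1, 2.0, -1.0, 1.5]

chosen = route_top_k(router_scores, k=2)
output = moe_layer(1.0, experts, router_scores, k=2)
print(chosen, output)
```

Only two of the four experts run for this input, which is why an MoE model can hold many parameters while computing with only a fraction of them.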

Alibaba, a Hangzhou-based company, used 36 trillion tokens to train Qwen3, double the number used to train the Qwen2.5 model. DeepSeek, another Hangzhou-based firm, used 14.8 trillion tokens to pre-train V3, the base model for its R1. In general, training on more tokens tends to make an AI model more knowledgeable, although data quality also matters.

At the same time, Qwen3 has a lower deployment threshold than DeepSeek V3, meaning users can deploy it at lower operating costs and with reduced energy consumption.

Qwen3-235B-A22B has 235 billion parameters but activates only 22 billion of them for any given input. DeepSeek R1 has 671 billion parameters and activates 37 billion. Fewer active parameters generally mean lower operating costs.
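The parameter figures above can be compared with some simple arithmetic. The sketch below uses only the numbers quoted in this article; the helper name `active_fraction` is made up for illustration, and treating active parameters as a proxy for per-query compute is a rough assumption, since memory use still depends on the total parameter count.

```python
# (total parameters, active parameters), in billions, as quoted in the article
models = {
    "Qwen3-235B-A22B": (235, 22),
    "DeepSeek R1": (671, 37),
}

def active_fraction(total_b, active_b):
    """Share of a model's parameters that are active for a given input."""
    return active_b / total_b

for name, (total_b, active_b) in models.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B active ({share:.1%})")
# Qwen3-235B-A22B: 22B of 235B active (9.4%)
# DeepSeek R1: 37B of 671B active (5.5%)
```

Qwen3 activates a larger share of its parameters than DeepSeek R1 does, but a smaller absolute number (22 billion versus 37 billion), and it is the absolute number that bears on operating cost.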

The US stock market slumped after DeepSeek launched its R1 model on January 20. AI stock investors were shocked by DeepSeek R1’s high performance and low training costs.

Media reports said DeepSeek will unveil its R2 model in May. Some AI fans expected DeepSeek R2 to have greater reasoning ability than R1 and the ability to catch up with OpenAI o4-mini.

‘Nonsensical benchmark hacking’

Since Alibaba released Qwen3 early on the morning of April 29, AI fans have performed various tests to check its performance.

The Yangtze Evening News reported that Qwen3 scored 70.7 on LiveCodeBench v5, which tests AI models’ code-writing ability. This beat DeepSeek R1 (64.3), OpenAI o3-mini (66.3), Gemini 2.5 Pro (70.4), and Grok 3 Beta (70.6).

On AIME’24, which tests AI models’ mathematical problem-solving ability, Qwen3 scored 85.7, better than DeepSeek R1 (79.8), OpenAI o3-mini (79.6), and Grok 3 Beta (83.9). However, it lagged behind Gemini 2.5 Pro, which scored 92.

The newspaper’s reporter found that Qwen3 fails to deal with complex reasoning tasks and lacks knowledge in some areas, resulting in “hallucinations,” a typical situation in which an AI model provides false information.

“We asked Qwen3 to write some stories in Chinese. We feel that the stories are more delicate and fluent than those written by previous AI models, but their flows and scenes are illogical,” the reporter said. “The AI model seems to be putting everything together without thinking.”

In terms of scientific reasoning, Qwen3 scored 70%, lagging behind Gemini 2.5 Pro (84%), OpenAI o3-mini (83%), Grok 3 mini (79%), and DeepSeek R1 (71%), according to Artificial Analysis, an independent AI benchmarking & analysis company. 

On Humanity’s Last Exam, which tests reasoning and knowledge, Qwen3 scored 11.7%, beating Grok 3 mini (11.1%), Claude 3.7 (10.3%), and DeepSeek R1 (9.3%). However, it still lagged behind OpenAI o3-mini (20%) and Gemini 2.5 Pro (17.1%).

In February of this year, Microsoft Chief Executive Satya Nadella said that focusing on self-proclaimed milestones, such as achieving artificial general intelligence (AGI), is only a form of “nonsensical benchmark hacking.” 

He said an AI model can declare victory only if it helps achieve a 10% annual growth in gross domestic product. 

Chip shortage

While Chinese AI firms need more time to catch up with American players, they face a new challenge – a shortage of AI chips.

In early April, Chinese media reported that ByteDance, Alibaba, and Tencent had ordered more than 100,000 H20 chips from Nvidia for 16 billion yuan (US$2.2 billion).

On April 15, Nvidia said it had been informed by the US government that the company would need a license to ship its H20 AI chips to China. The government cited the risk that Chinese firms would use the H20 chips in supercomputers.

The Information reported on May 2 that Nvidia had told some of its biggest Chinese customers that it is tweaking the design of its AI chips so that it can continue to ship them to China. A sample of the new chip will be available as early as June.

Nvidia has already tailored AI chips for the Chinese market several times. After Washington restricted the export of A100 and H100 chips to China in October 2022, Nvidia designed the A800 and H800 chips. However, the US government extended its export controls to cover them in October 2023. Then, Nvidia unveiled the H20.

Although the H20 delivers only about 15% of the H100’s performance, Chinese firms are still rushing to buy it instead of Huawei’s Ascend 910B chip, which is in limited supply due to a low production yield.

A Chinese IT columnist said the Ascend 910B is a faster chip than the H20, but the H20’s bandwidth is ten times that of the 910B. He said higher bandwidth in an AI chip, like a better gearbox in a sports car, delivers more stable performance.

The Application of Electronic Technique, a Chinese scientific journal, said China’s AI firms could try to use homegrown chips, such as Cambricon Technologies’ Siyuan 590, Hygon Information Technology’s DCU series, Moore Threads’ MTT S80, Biren Technology’s BR104, or Huawei’s upcoming Ascend 910C.  

Read: After DeepSeek: China’s Manus – the hot new AI under the spotlight

© 2024 Todayheadline.co
