
Apple’s New Research Shows That LLM Reasoning Is Completely Broken

A deep dive into Apple research that exposes the flawed thinking process in state-of-the-art Reasoning LLMs

By Dr. Ashish Bamania

Image generated by author using Google ImageFX

Large Reasoning Models (LRMs), often simply called Reasoning LLMs, are becoming quite popular.

These models are specifically trained to take their time and think before they answer, especially when solving tough problems.

Their thinking mechanism comes from generating a long Chain-of-Thought (CoT) and self-verifying it at inference (test) time before giving the final answer.
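
To make that mechanism concrete, here is a minimal sketch of the loop in Python: draft a long chain-of-thought, self-verify it at test time, and only then emit the final answer. The `call_model` function is a placeholder for whatever LLM API you use, not part of any specific framework.

```python
def call_model(prompt: str) -> str:
    # Placeholder: swap in a real LLM call (any chat/completions API).
    return "stub response"

def reason_and_answer(question: str, max_revisions: int = 3) -> str:
    # 1. Draft a long chain-of-thought before answering.
    thoughts = call_model(
        "Think step by step about the problem below. Write out your full "
        f"reasoning, but do not give a final answer yet.\n\n{question}"
    )

    # 2. Self-verify the draft reasoning at test time, revising if needed.
    for _ in range(max_revisions):
        verdict = call_model(
            f"Problem:\n{question}\n\nDraft reasoning:\n{thoughts}\n\n"
            "Check this reasoning for mistakes. Reply 'OK' if it is sound, "
            "otherwise rewrite the reasoning correctly."
        )
        if verdict.strip() == "OK":
            break
        thoughts = verdict  # use the revised chain-of-thought

    # 3. Produce the final answer conditioned on the verified reasoning.
    return call_model(
        f"Problem:\n{question}\n\nVerified reasoning:\n{thoughts}\n\n"
        "Give only the final answer."
    )

print(reason_and_answer("If a train leaves at 3 pm and travels for 2 hours, when does it arrive?"))
```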

The performance of these models on multiple reasoning benchmarks is very impressive, leading many to believe that we might achieve AGI in just the next few years.

This claim is far from the truth, though.

These are not merely my thoughts: a recent research paper from Apple confirms them (and this is not the first time Apple has exposed the exaggerated claims of the big tech giants).

The research shows that the best reasoning LLMs perform no better (or even worse) than their non-reasoning counterparts on low-complexity tasks and that their accuracy completely collapses beyond a certain problem complexity.
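
An evaluation like this amounts to sweeping problem complexity and comparing the accuracy of a reasoning model against a non-reasoning one at each level. The sketch below is a hypothetical illustration of that kind of sweep, not the paper's actual code: `make_task`, `solve_with_reasoning_model`, and `solve_with_standard_model` are stand-ins you would replace with a real task generator and real model calls.

```python
import random

def make_task(complexity: int) -> tuple[str, int]:
    # Toy stand-in task whose difficulty scales with `complexity`:
    # sum a list of `complexity` random single-digit integers.
    numbers = [random.randint(1, 9) for _ in range(complexity)]
    return f"What is the sum of {numbers}?", sum(numbers)

def solve_with_reasoning_model(question: str) -> str:
    return "0"  # placeholder: swap in a real reasoning-LLM call

def solve_with_standard_model(question: str) -> str:
    return "0"  # placeholder: swap in a real non-reasoning-LLM call

def accuracy(solver, complexity: int, trials: int = 20) -> float:
    # Fraction of trials at this complexity level answered exactly right.
    correct = 0
    for _ in range(trials):
        question, answer = make_task(complexity)
        if solver(question).strip() == str(answer):
            correct += 1
    return correct / trials

# Accuracy per complexity level for both model types; the paper's headline
# finding is that the reasoning model's curve collapses past some threshold.
for complexity in (2, 4, 8, 16, 32):
    print(
        complexity,
        accuracy(solve_with_reasoning_model, complexity),
        accuracy(solve_with_standard_model, complexity),
    )
```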
