
Science


cbabe

(5,275 posts)
Tue Jun 10, 2025, 11:29 AM

When billion-dollar AIs break down over puzzles a child can do, it's time to rethink the hype

https://www.theguardian.com/commentisfree/2025/jun/10/billion-dollar-ai-puzzle-break-down

When billion-dollar AIs break down over puzzles a child can do, it’s time to rethink the hype

Gary Marcus

The tech world is reeling from a paper that shows the powers of a new generation of AI have been wildly oversold

Tue 10 Jun 2025 06.18 EDT



Apple did this by showing that leading models such as ChatGPT, Claude and Deepseek may “look smart – but when complexity rises, they collapse”. In short, these models are very good at a kind of pattern recognition, but often fail when they encounter novelty that forces them beyond the limits of their training, despite being, as the paper notes, “explicitly designed for reasoning tasks”.



The Tower of Hanoi is a classic game with three pegs and multiple discs, in which you need to move all the discs on the left peg to the right peg, never stacking a larger disc on top of a smaller one. With practice, though, a bright (and patient) seven-year-old can do it.
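For readers who have not seen it, the puzzle described above has a well-known recursive solution: move the top n-1 discs to the spare peg, move the largest disc to the target, then move the n-1 discs on top of it. What follows is only a minimal sketch of that textbook approach, not anything taken from the Apple paper or the article. The function names `hanoi` and `is_valid_solution` and the peg labels "left", "middle" and "right" are assumptions made for illustration.

```python
def hanoi(n, source="left", target="right", spare="middle"):
    """Yield (disc, from_peg, to_peg) moves that transfer n discs from source to target.

    Discs are numbered 1 (smallest) to n (largest). Peg names are illustrative.
    """
    if n == 0:
        return
    # Step 1: clear the n-1 smaller discs out of the way, onto the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Step 2: move the largest remaining disc straight to the target.
    yield (n, source, target)
    # Step 3: bring the n-1 smaller discs from the spare peg onto the target.
    yield from hanoi(n - 1, spare, target, source)


def is_valid_solution(n, moves):
    """Replay the moves and check the rules: only top discs move, and never larger onto smaller."""
    pegs = {"left": list(range(n, 0, -1)), "middle": [], "right": []}
    for disc, src, dst in moves:
        if not pegs[src] or pegs[src][-1] != disc:
            return False  # the disc being moved is not on top of its peg
        if pegs[dst] and pegs[dst][-1] < disc:
            return False  # would place a larger disc on a smaller one
        pegs[dst].append(pegs[src].pop())
    # Solved only if every disc ended up on the right peg, largest at the bottom.
    return pegs["right"] == list(range(n, 0, -1)) and not pegs["left"] and not pegs["middle"]


if __name__ == "__main__":
    for n in (3, 7, 8):
        moves = list(hanoi(n))
        print(n, "discs:", len(moves), "moves, valid:", is_valid_solution(n, moves))
```

The recursion always produces 2^n - 1 moves, so seven discs take 127 moves and eight take 255. The puzzle gets long quickly, but the procedure itself never changes.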

What Apple found was that leading generative models could barely do seven discs, getting less than 80% accuracy, and pretty much can’t get scenarios with eight discs correct at all. It is truly embarrassing that LLMs cannot reliably solve Hanoi.



What the Apple paper shows, most fundamentally, regardless of how you define AGI, is that these LLMs that have generated so much hype are no substitute for good, well-specified conventional algorithms. (They also can’t play chess as well as conventional algorithms, can’t fold proteins like special-purpose neurosymbolic hybrids, can’t run databases as well as conventional databases, etc.)

… more …
