Large language models not fit for real-world use, scientists warn — even slight changes cause their world models to collapse

New research indicates that large language models (LLMs) such as GPT-4 and Claude 3 Opus lack a coherent understanding of the real world. They can generate accurate outputs, such as driving directions, yet the world models underlying those outputs are flawed and contain non-existent routes. This raises significant concerns about their reliability in dynamic environments, such as autonomous vehicles.

The study finds that even minor changes, such as detours, can sharply reduce the accuracy of LLMs, which points to how fragile they are. The researchers used deterministic finite automata (DFAs) to assess the LLMs' world models and found that most failed to produce accurate representations. The results underline the need for improved approaches so that LLMs can adapt to real-world complexity.

• LLMs struggle to create accurate world models for real-world applications.

• Minor changes in input can lead to significant drops in LLM performance.

Key AI Terms Mentioned in this Article

Large Language Models

LLMs are AI systems designed to understand and generate human language, but according to the study they lack a coherent model of the real world.

Transformer Models

Transformers are the neural network architecture on which LLMs are built; they process sequences of data and learn patterns from them.

Deterministic Finite Automata

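A DFA describes a problem as a fixed set of states and the transitions allowed between them. In the study, DFAs are used to evaluate LLMs by assessing how well they handle sequences and states in problem-solving.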
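
As a rough illustration of the idea, the sketch below encodes a tiny street map as a DFA and checks whether a sequence of moves is a real route. The map, the intersection names and the helper functions are all hypothetical and are not taken from the study; this is only a minimal example of what a DFA-based validity check can look like.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical toy street map (an assumption for illustration only).
# States are intersections; input symbols are the moves a driver can make.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("A", "east"): "B",
    ("B", "south"): "C",
    ("A", "south"): "D",
    ("D", "east"): "C",
}


def run_dfa(start: str, moves: Iterable[str]) -> Optional[str]:
    """Follow the moves from start; return the final intersection,
    or None if any move uses a street that does not exist in the map."""
    state = start
    for move in moves:
        key = (state, move)
        if key not in TRANSITIONS:
            return None  # a "non-existent route"
        state = TRANSITIONS[key]
    return state


def is_valid_route(start: str, goal: str, moves: Iterable[str]) -> bool:
    """A route is valid only if every move exists and it ends at the goal."""
    return run_dfa(start, moves) == goal
```

With this map, `is_valid_route("A", "C", ["east", "south"])` and the detour `is_valid_route("A", "C", ["south", "east"])` both hold, while a sequence such as `["east", "east"]` is rejected because there is no street leading east from intersection B. Directions produced by a model could be checked against the true map in the same way.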

Institutions Mentioned in this Article

MIT

MIT is involved in AI research, contributing to the study of LLMs and their limitations.

Harvard

Harvard researchers are exploring the capabilities and shortcomings of LLMs in real-world scenarios.

Cornell

Cornell's involvement in the study highlights the academic interest in AI's real-world applications.
