Study suggests that even the best AI models hallucinate a bunch

Generative AI models, including Google's Gemini and OpenAI's GPT-4o, hallucinate frequently, though reliability varies from model to model. A study from Cornell and other institutions benchmarked these models against authoritative sources and found that even the best models produce hallucination-free text only about 35% of the time. The research highlights the need for caution in trusting AI outputs, and notes that some models avoid errors partly by refusing to answer questions they might get wrong.

The study evaluated over a dozen popular AI models, including Meta's Llama and Anthropic's Claude, and found that models struggle more with questions on topics that Wikipedia does not cover. Despite claims of improvement from major AI companies, the results indicate that hallucination rates remain high. The researchers suggest that human-in-the-loop fact-checking could help mitigate these issues, and they emphasize the importance of developing more advanced verification tools.

Key AI Highlights in this Article

• Generative AI models frequently hallucinate, impacting their reliability.

• Study shows even top models produce hallucination-free text only about 35% of the time.

Key AI Terms Mentioned in this Article

Hallucination

Hallucinations in AI can lead to significant trust issues regarding the outputs of generative models.

Benchmarking

The study used benchmarking to assess how well different models performed on factual accuracy.

Human-in-the-loop

This method is suggested as a way to reduce hallucinations in generative AI models.

Companies Mentioned in this Article

Google

Google's Gemini model was evaluated in the study for its hallucination rates.

OpenAI

OpenAI's models, including GPT-4o, were central to the study's findings on hallucinations.
