New AI models are more likely to give a wrong answer than admit they don't know

Recent research indicates that newer large language models (LLMs) are less likely to admit when they don't know an answer. A study by AI researchers from the Universitat Politècnica de València tested models from BigScience, Meta, and OpenAI and found that while accuracy improved on complex problems, transparency decreased. The findings suggest that these models often guess rather than acknowledge their limitations, which can produce misleading responses.

The study highlights a concerning trend: advanced models such as OpenAI's GPT-4 give significantly fewer avoidant answers, that is, responses in which the model declines to answer, than earlier versions. Newer models were expected to recognize their operational limits better, but the results show no substantial improvement on basic problems. This raises questions about the reliability of AI systems as they become more sophisticated yet less transparent.

• Newer LLMs are less transparent about their knowledge limitations.

• Advanced models often guess instead of admitting ignorance.

Key AI Terms Mentioned in this Article

Large Language Models (LLMs)

LLMs are the subject of the study, which shows that advances in capability can come with less reliable outputs.

Avoidant Answers

The study found that newer models produce fewer avoidant answers, which raises concerns about their reliability.

Accuracy

The research measured accuracy across various models, noting improvements in complex problem-solving but not in basic queries.
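
As a rough illustration of the terms above, graded responses can be sorted into three buckets (correct, incorrect, avoidant) and their shares compared across model generations. The sketch below is an assumption for illustration only: the function name `response_rates`, the label strings, and the sample lists are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical label set; the study's own grading scheme may differ.
LABELS = ("correct", "incorrect", "avoidant")

def response_rates(labels):
    """Return the share of each label in a list of graded responses."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(labels)
    # Counter returns 0 for labels that never occur.
    return {name: counts[name] / total for name in LABELS}

# Made-up graded outputs for two model generations (not study data).
older = ["correct", "avoidant", "avoidant", "incorrect"]
newer = ["correct", "correct", "incorrect", "incorrect"]

older_rates = response_rates(older)
newer_rates = response_rates(newer)
print(older_rates)
print(newer_rates)
```

In this toy data the newer model is correct more often but never declines to answer, so its share of wrong answers rises too, which is the kind of pattern the article describes.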

Companies Mentioned in this Article

BigScience

BigScience's BLOOM model was part of the study assessing LLM performance.

Meta

Meta's Llama model was evaluated in the study for its response accuracy.

OpenAI

OpenAI's GPT-4 was analyzed in the study, revealing trends in model reliability.
