OpenAI claims its newest chatbot GPT-4.5 should 'hallucinate less'. How is that measured?

OpenAI has introduced GPT-4.5, claiming it will 'hallucinate less' than previous models. Hallucinations refer to errors where the AI generates incorrect information, which can lead to serious consequences. The company has developed a benchmark called SimpleQA to measure the accuracy of its models.

In testing on SimpleQA, GPT-4.5 hallucinated 37% of the time, down from 62% for its predecessor, GPT-4o. However, experts argue that the evaluation method is flawed because it relies on short, factual queries rather than the longer, more complex responses users typically seek. The open question is whether AI can ever be completely free of hallucinations.
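To make the percentages concrete, here is a minimal sketch of how a hallucination rate could be computed from graded answers. It assumes each answer gets one of three grades ("correct", "incorrect", "not_attempted") and that the rate is the share of all questions graded "incorrect". The article does not give OpenAI's exact formula, so this definition, the function name and the sample grades are all hypothetical.

```python
from collections import Counter


def hallucination_rate(grades):
    """Return the fraction of graded answers marked 'incorrect'.

    Assumption: a hallucination is any answer graded 'incorrect';
    declined answers ('not_attempted') are not counted as hallucinations.
    """
    if not grades:
        raise ValueError("no grades to score")
    counts = Counter(grades)
    return counts["incorrect"] / len(grades)


# Made-up grades for illustration only; not real benchmark data.
sample = ["correct"] * 6 + ["incorrect"] * 3 + ["not_attempted"]
print(f"{hallucination_rate(sample):.0%}")  # prints 30%
```

A score like this only covers questions with a single short answer that can be checked, which is the limitation the experts quoted above point to.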

Key AI Highlights in this Article

• OpenAI's GPT-4.5 aims to reduce hallucination errors in AI responses.

• Experts question the effectiveness of OpenAI's evaluation methods for AI accuracy.

Key AI Terms Mentioned in this Article

Hallucinations

Hallucinations in AI refer to instances where the model generates incorrect or misleading information.

Benchmark

A benchmark is a standard test used to evaluate the performance and accuracy of AI models.

Large Language Models (LLMs)

LLMs are AI models designed to understand and generate human-like text based on vast datasets.

Companies Mentioned in this Article

OpenAI

