Google has introduced DataGemma, a new pair of open-source AI models designed to reduce hallucinations in large language models (LLMs) when handling statistical queries. These models leverage extensive data from Google's Data Commons, which contains over 240 billion data points from trusted sources. By utilizing this real-world data, DataGemma aims to enhance the accuracy of responses to statistical questions.
The DataGemma models employ two innovative approaches, Retrieval Interleaved Generation (RIG) and Retrieval Augmented Generation (RAG), to improve factual accuracy. Initial tests show promising results, with RIG achieving a factuality improvement of up to 58% on certain queries. Google anticipates that the public release of these models will stimulate further research and development in AI accuracy.
• DataGemma models aim to reduce hallucinations in statistical queries.
• Initial tests show significant improvements in factual accuracy with the RIG and RAG approaches.
Hallucinations are particularly prevalent in large language models when responding to statistical queries.
RIG generates natural language queries to retrieve accurate data from external sources.
RAG extracts the relevant variables from the original statistical question and fetches the matching data before the answer is generated.
Google developed DataGemma to address challenges in AI accuracy, particularly in statistical queries.
Hugging Face hosts the DataGemma models for academic and research use.
Data Commons serves as a foundational resource for the DataGemma models to enhance their factual accuracy.
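The difference between the RIG and RAG descriptions above can be illustrated with a toy sketch. This is not DataGemma's implementation: the in-memory `TOY_STORE`, the `[QUERY: ... | GUESS: ...]` marker format, and the function names are all hypothetical, and the stored values are made-up placeholders rather than real statistics. It only shows the general shape of the two ideas: RIG checks figures inline as the text is produced, while RAG gathers data first and hands it to the model as context.

```python
import re

# Hypothetical stand-in for an external statistical store.
# The values are made-up placeholders, not real statistics.
TOY_STORE = {
    "population of exampleland": "1,000",
    "unemployment rate in exampleland": "5%",
}


def lookup(query):
    """Return the stored value for a natural-language query, or None."""
    return TOY_STORE.get(query.strip().lower())


# Assumed marker format: the draft text carries a natural-language query
# next to the model's own guess, e.g. [QUERY: ... | GUESS: ...].
MARKER = re.compile(r"\[QUERY:\s*(.*?)\s*\|\s*GUESS:\s*(.*?)\s*\]")


def rig_interleave(draft):
    """RIG-style: replace each guessed figure with a retrieved one if found."""
    def substitute(match):
        query, guess = match.group(1), match.group(2)
        retrieved = lookup(query)
        # Fall back to the model's guess when nothing is retrieved.
        return retrieved if retrieved is not None else guess
    return MARKER.sub(substitute, draft)


def rag_build_prompt(question):
    """RAG-style: find variables mentioned in the question, fetch their
    values, and place them ahead of the question as context."""
    q = question.lower()
    facts = [f"{name}: {value}" for name, value in TOY_STORE.items() if name in q]
    context = "\n".join(facts) if facts else "(no data found)"
    return f"Context:\n{context}\n\nQuestion: {question}"


draft = "Exampleland has [QUERY: population of Exampleland | GUESS: 900] residents."
print(rig_interleave(draft))
print(rag_build_prompt("What is the population of Exampleland?"))
```

In this sketch the RIG path keeps the model's guess when the lookup fails, while the RAG path states that no data was found; a real system would need its own policy for both cases.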