AI has a stupid secret: we're still not sure how to test for human levels of intelligence

Two San Francisco-based AI organizations, Scale AI and the Center for AI Safety (CAIS), have launched an initiative called 'Humanity's Last Exam' to test large language models such as Google Gemini and OpenAI's o1. The initiative invites the public to submit questions that can effectively evaluate the capabilities of these AI systems, with a prize of $5,000 for the best submissions. The goal is to assess how close AI has come to achieving expert-level intelligence.

Although current AI models excel at many established tests, it is unclear how much those results mean, because the models may already have seen the answers in their vast training datasets. The challenge extends to defining and measuring intelligence itself, particularly artificial general intelligence (AGI), commonly understood as AI that matches or surpasses human capabilities across a broad range of tasks. As AI continues to evolve, new benchmarks and testing methods are needed to assess its intelligence accurately.

Key AI Highlights in this Article

• Scale AI and CAIS challenge the public to test AI capabilities.

• New benchmarks needed to measure AI intelligence effectively.

Key AI Terms Mentioned in this Article

Large Language Models (LLMs)

LLMs like Google Gemini and OpenAI's o1 are central to the testing initiative launched by Scale AI and CAIS.

Artificial General Intelligence (AGI)

The article discusses the importance of measuring AGI as AI systems demonstrate broader intelligent behavior.

Model Collapse

The article highlights concerns about model collapse as AI-generated content floods training datasets.

Companies Mentioned in this Article

Scale AI

Scale AI is collaborating with CAIS to evaluate AI capabilities through public engagement.

Center for AI Safety

CAIS partners with Scale AI to launch the initiative aimed at testing AI intelligence.
