Cognition's Devin, touted as the first AI software engineer, performed poorly in independent testing, succeeding at only 15% of the tasks it was given. Researchers from Answer.AI conducted a month-long analysis and found that, of 20 tasks, 14 ended in failure, 3 succeeded, and 3 were inconclusive. The findings raise concerns about the reliability of AI in software engineering roles, especially given the high expectations set by Cognition's marketing.
The analysis highlighted Devin's inability to recognize fundamental blockers, which led it to spend prolonged effort on impossible tasks. Despite impressive early demos, Devin's actual performance contrasts starkly with Cognition's claims about its capabilities. The case underscores the ongoing gap between AI promises and actual performance, and calls into question whether AI can replace human engineers in the near future.
• Cognition's Devin achieved only a 15% success rate in software tasks.
• Devin's performance raises doubts about AI's readiness to replace human engineers.
An AI software engineer is an AI system designed to perform software development tasks, which is how Cognition markets Devin.
Autonomous AI systems operate independently to complete tasks, but Devin's failures highlight the limits of this capability.
Machine learning involves algorithms that improve through experience; Devin struggled to apply this effectively in real-world scenarios.
Cognition is an AI tech company that developed Devin, claiming it to be the first AI software engineer.
Answer.AI is an independent AI research lab that conducted the analysis of Devin's performance.
Futurism on MSN.com, 8 months ago