ByteDance has introduced DAPO, a scalable reinforcement learning algorithm that improves complex reasoning in large language models. On the AIME 2024 mathematics examination, a model trained with DAPO scored 50 points, ahead of the 47 scored by DeepSeek's R1 reasoning model. The algorithm was developed in collaboration with Tsinghua University's Institute for AI Industry Research.
The DAPO algorithm not only achieved better results but did so with 50% fewer training steps, indicating significant efficiency improvements. The project, led by an intern, has garnered positive feedback from industry experts, although some skepticism remains regarding the comparison of training steps. ByteDance continues to recruit top AI talent, aiming to push the boundaries of intelligence and open-source collaboration.
• DAPO algorithm improves AI reasoning with fewer training steps.
• ByteDance's collaboration with Tsinghua University enhances AI model performance.
Reinforcement learning is a machine learning paradigm where agents learn to make decisions by receiving rewards or penalties based on their actions.
A large language model is a type of AI that processes and generates human-like text based on vast amounts of data.
A scalable algorithm can efficiently handle increasing amounts of data or complexity without a significant drop in performance.
ByteDance is a technology company known for its investment in AI and development of platforms like TikTok.
Alibaba Group Holding is a multinational conglomerate specializing in e-commerce and technology, providing AI models for various applications.
South China Morning Post