Meet DeepSeek: the Chinese start-up that is changing how AI models are trained

DeepSeek, a Chinese start-up, has made a significant impact in the open-source large language model (LLM) space with its latest release, DeepSeek V3. The model has 671 billion parameters and was developed at a fraction of the cost incurred by larger tech firms such as Meta and OpenAI. Jim Fan of Nvidia highlighted how resource constraints have pushed DeepSeek to innovate.

Training DeepSeek V3 required only 2.78 million GPU hours, far fewer than the more than 30 million GPU hours needed for Meta's Llama 3.1 model. This achievement underscores the progress of Chinese AI firms despite US sanctions on semiconductor access. The model's performance has sparked discussion about the future of AI development in a competitive landscape.
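To put the two GPU-hour figures side by side, here is a minimal arithmetic sketch. It uses only the numbers quoted above and treats "over 30 million" as a lower bound. The 2,048-GPU cluster size is a hypothetical value chosen purely for illustration and does not come from the article.

```python
# GPU-hour figures as quoted in the article.
DEEPSEEK_V3_GPU_HOURS = 2.78e6   # "2.78 million GPU hours"
LLAMA_3_1_GPU_HOURS = 30e6       # "over 30 million", used here as a lower bound


def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """GPU hours = number of GPUs running in parallel x hours each one runs."""
    return num_gpus * wall_clock_hours


# How many times more GPU hours Llama 3.1 needed (at least).
ratio = LLAMA_3_1_GPU_HOURS / DEEPSEEK_V3_GPU_HOURS

# Assumption for illustration only: a hypothetical cluster of 2,048 GPUs.
HYPOTHETICAL_GPUS = 2048
wall_clock_days = DEEPSEEK_V3_GPU_HOURS / HYPOTHETICAL_GPUS / 24

print(f"Llama 3.1 used at least {ratio:.1f}x the GPU hours of DeepSeek V3")
print(f"On {HYPOTHETICAL_GPUS} GPUs, 2.78M GPU hours is about {wall_clock_days:.0f} days")
```

The same GPU-hour budget can be spent as many GPUs for a short time or fewer GPUs for a long time, which is why the figure is a convenient measure of total compute rather than of elapsed time.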

• DeepSeek V3, a 671-billion-parameter model, was trained at low cost.

• DeepSeek's efficiency challenges larger firms such as Meta and OpenAI.

Key AI Terms Mentioned in this Article

Large Language Model (LLM)

LLMs are AI models that process and generate human-like text; they underpin applications such as chatbots.

GPU Hours

GPU hours measure the total compute time spent training an AI model (the number of GPUs multiplied by the hours each one runs), indicating efficiency and resource usage.

Open Source

Open-source software makes its code publicly accessible, enabling collaboration and innovation in AI development.

Companies Mentioned in this Article

DeepSeek

DeepSeek is a start-up that has developed an efficient LLM, DeepSeek V3, showcasing innovation in AI.

Nvidia

Nvidia provides the GPUs used for training DeepSeek's models, playing a critical role in AI development.
