DeepSeek’s figures are a stark contrast to Meta’s Llama 3.1, which needed 30.8 million GPU hours and more advanced hardware to train. Image Credit: Reuters

Chinese start-up DeepSeek is making waves among AI developers all over the world with the release of its latest large language model (LLM), DeepSeek V3. Launched in December 2024, the model has been hailed as a game-changer for its remarkable efficiency in development and cost-effectiveness. The Hangzhou-based company has quickly become a standout player in the global AI community, showcasing innovative strategies to overcome resource constraints and geopolitical challenges.
DeepSeek’s model boasts an impressive 671 billion parameters, placing it on par with some of the most advanced models globally. Yet it was developed at a fraction of the cost incurred by giants like Meta and OpenAI, requiring only US$5.58 million and 2.78 million GPU hours. These figures are a stark contrast to Meta’s Llama 3.1, which needed 30.8 million GPU hours and more advanced hardware to train. DeepSeek’s success highlights the rapid advancement of Chinese AI firms, even under US semiconductor sanctions.

Revolutionary approach to LLM training
DeepSeek attributes its efficiency to a novel architecture designed for cost-effective training. By leveraging NVIDIA’s H800 GPUs, customised for the Chinese market, the company optimised its resources to achieve results that rival those of much larger players. This pragmatic approach underscores the potential of resource constraints to drive innovation, as noted by industry experts like NVIDIA’s Jim Fan and OpenAI’s Andrej Karpathy.
Fan commended DeepSeek for demonstrating how limited resources can lead to groundbreaking achievements in AI. Similarly, Jia Yangqing, founder of Lepton AI, praised the start-up’s ability to produce world-class outcomes through intelligent research and strategic investments. DeepSeek’s early acquisition of over 10,000 GPUs, prior to US export restrictions, laid the groundwork for its success.

DeepSeek and controversies
DeepSeek has embraced open-source principles, making its models accessible to the global community. Its V1 model remains the most popular on Hugging Face, a leading platform for machine learning and open-source AI tools. This openness has put pressure on commercial AI developers to accelerate their own innovations.
However, DeepSeek […]