It has long been widely believed in the field of artificial intelligence that building state-of-the-art large language models requires substantial financial and technical resources. That assumption is one of the primary reasons the $500 billion Stargate Project, announced by President Donald Trump, has the backing of the U.S. government.
However, that idea has been upended by DeepSeek, a Chinese AI development company. DeepSeek released its R1 LLM on January 20, 2025, having trained it for a fraction of the cost that other vendors have spent developing comparable models. DeepSeek also makes its R1 models freely available under an open-source license.
The DeepSeek AI assistant, a mobile app that provides a chatbot interface for DeepSeek R1, surpassed OpenAI’s ChatGPT app to become the top-ranked app on the Apple App Store within days of its debut. On January 27, 2025, DeepSeek’s explosive growth in popularity and usage led investors to question the valuations of major U.S.-based AI providers, most notably Nvidia. As investors reassessed AI valuations, shares of Microsoft, Meta Platforms, Oracle, Broadcom, and other industry giants also fell sharply.
What is DeepSeek?
DeepSeek is an AI development company based in Hangzhou, China. Liang Wenfeng, a Zhejiang University alumnus, founded the company in May 2023. Liang is also a co-founder of High-Flyer, a China-based quantitative hedge fund that owns DeepSeek. DeepSeek currently operates as an independent AI research lab under High-Flyer’s umbrella. Neither DeepSeek’s valuation nor its total funding has been made public.
DeepSeek’s goal is to create open-source LLMs. The company released its first model in November 2023 and has since iterated on its core LLM through several versions. However, it did not gain international recognition until January 2025, following the release of its R1 reasoning model.
The company offers several ways to use its models, including mobile applications, web interfaces, and API access.
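For developers, access typically goes through the API. The snippet below is a minimal sketch, assuming the OpenAI-compatible endpoint and "deepseek-reasoner" model name that DeepSeek documents; verify both against the current API reference before relying on them.

```python
# Minimal sketch of calling the DeepSeek API, assuming its documented
# OpenAI-compatible endpoint and the "deepseek-reasoner" model name (R1).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # key issued from the DeepSeek platform
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",            # R1 reasoning model
    messages=[
        {"role": "user", "content": "Explain chain-of-thought reasoning in one sentence."}
    ],
)
print(response.choices[0].message.content)
```

Because the API follows the OpenAI chat-completions format, existing tooling built around that format can generally be pointed at DeepSeek by changing the base URL and model name.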
DeepSeek vs. OpenAI
DeepSeek is the most recent challenger to OpenAI, which became the industry leader with the release of ChatGPT in 2022. OpenAI has helped advance the generative AI sector with its GPT family of models and its o1 class of reasoning models. Although both companies build generative AI LLMs, their approaches differ.
DeepSeek training innovations
DeepSeek trains its R1 models differently than OpenAI trains its models, requiring less time, less money, and fewer AI accelerators. DeepSeek’s goal is to create artificial general intelligence, and its improvements in reasoning capabilities mark a major step forward in AI development.
In a research paper, DeepSeek describes several of the advances it made in developing the R1 model, including the following:
- Reinforcement learning. DeepSeek focused on reasoning tasks using a large-scale reinforcement learning approach.
- Reward engineering. Reward engineering is the process of designing the incentive structure that guides an AI model’s learning during training. For R1, researchers developed a rule-based reward system that outperforms the neural reward models commonly used for this purpose (a simplified sketch follows this list).
- Distillation. Using efficient knowledge transfer techniques, DeepSeek researchers compressed R1’s capabilities into models as small as 1.5 billion parameters.
- Emergent behavior. DeepSeek found that sophisticated reasoning patterns can emerge spontaneously through reinforcement learning, without being explicitly programmed.
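To make the reward engineering idea concrete, here is a minimal, hypothetical sketch of a rule-based reward for a verifiable task. The tag format, weights, and checks are illustrative assumptions rather than the actual rules DeepSeek used; the point is simply that correctness and output format can be scored with deterministic rules instead of a learned neural reward model.

```python
import re

def rule_based_reward(response: str, ground_truth: str) -> float:
    """Hypothetical rule-based reward for a math-style reasoning task.

    Combines two simple deterministic checks instead of a learned reward model:
    - format: the final answer must appear inside <answer>...</answer> tags
    - accuracy: the extracted answer must match the known ground truth
    """
    reward = 0.0

    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match:
        reward += 0.1  # small bonus for following the required output format
        if match.group(1).strip() == ground_truth.strip():
            reward += 1.0  # main bonus for a correct final answer

    return reward

# Example: a correctly formatted, correct answer earns the full reward.
print(rule_based_reward("Reasoning steps... <answer>42</answer>", "42"))  # 1.1
```

Because the reward depends only on checkable rules, it cannot be "gamed" the way a learned reward model sometimes can, which is one reason such rules are attractive for reasoning tasks with verifiable answers.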
Large language models from DeepSeek
DeepSeek has released a number of generative AI models since its founding in 2023, working to improve their performance and capabilities with each successive generation:
- DeepSeek Coder. Released in November 2023, this is the company’s first open-source model, designed specifically for coding tasks.
- DeepSeek LLM. Released in December 2023, this is the first version of the company’s general-purpose model.
- DeepSeek-V2. Released in May 2024, this second iteration of the company’s LLM focuses on strong performance and lower training costs.
- DeepSeek-Coder-V2. This 236 billion-parameter model, which was released in July 2024, is intended for challenging coding tasks and has a context window of 128,000 tokens.
- DeepSeek-V3. Released in December 2024, DeepSeek-V3 uses a mixture-of-experts architecture that lets it handle a wide range of tasks (see the toy routing sketch after this list). The model has 671 billion parameters and a context length of 128,000 tokens.
- DeepSeek-R1. Released in January 2025, this model is built on DeepSeek-V3 and is geared toward complex reasoning tasks. It competes directly with OpenAI’s o1 model in performance while maintaining a much lower cost structure. Like DeepSeek-V3, it has 671 billion parameters and a context length of 128,000 tokens.
- Janus-Pro-7B. Released in January 2025, this is a vision model that can both understand and generate images.
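A mixture-of-experts model routes each token through only a small subset of specialized sub-networks ("experts"), so just a fraction of the total parameters is active for any given token. The toy sketch below illustrates that routing idea only; the dimensions, number of experts, and gating scheme are invented for illustration and do not reflect DeepSeek-V3’s actual design.

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Toy mixture-of-experts routing for a single token vector x.

    A gating projection scores every expert, only the top_k experts run,
    and their outputs are combined weighted by the (renormalized) gate scores.
    """
    scores = x @ gate_weights                 # one score per expert
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                      # softmax over experts
    top = np.argsort(probs)[-top_k:]          # indices of the top_k experts
    weights = probs[top] / probs[top].sum()   # renormalize over selected experts
    return sum(w * experts[i](x) for i, w in zip(top, weights))

# Example: 8 tiny "experts", each a random linear map over a 16-dim token.
rng = np.random.default_rng(0)
dim, n_experts = 16, 8
experts = [lambda x, W=rng.normal(size=(dim, dim)): x @ W for _ in range(n_experts)]
gate_weights = rng.normal(size=(dim, n_experts))
token = rng.normal(size=dim)
print(moe_forward(token, experts, gate_weights).shape)  # (16,)
```

The practical effect is that a model can hold a very large total parameter count while keeping the compute cost per token closer to that of a much smaller dense model.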
DeepSeek cyberattack
DeepSeek faced significant cyberattacks on January 27, 2025, forcing the company to temporarily restrict new user registrations. The attacks coincided with DeepSeek’s AI assistant app surpassing ChatGPT as the top-downloaded app on the Apple App Store.
Despite a DDoS attack targeting its API and web chat platform, DeepSeek maintained service for existing users; on January 28, it reported that it had identified the issue and deployed a fix. The exact nature of the attack remains unknown.