DeepSeek (Chinese: 深度求索; Pinyin: Shēndù Qiúsuǒ) is a Chinese artificial intelligence company that develops open-source large language models (LLMs). Headquartered in Hangzhou, Zhejiang, it is fully owned and funded by the Chinese hedge fund High-Flyer. The company was founded in 2023 by Liang Wenfeng, who also serves as its CEO.
DeepSeek performs comparably to ChatGPT despite a far lower development cost, reported at roughly US$6 million versus the US$100 million OpenAI spent on GPT-4 in 2023, while using about a tenth of the computing power required by a comparable LLM. The company achieved this despite US export restrictions on Nvidia chips for China, which were intended to limit the country's ability to build advanced AI systems.

DeepSeek released its first free chatbot app on January 10, 2025; by January 27 it had overtaken ChatGPT as the most downloaded free app on the US iOS App Store, and Nvidia's share price fell 18%. Some have dubbed DeepSeek's success over larger, more established rivals the "rise of emerging AI" and a "first shot in the global AI space race."
Background.
In February 2016, Liang Wenfeng, an AI enthusiast who had been trading since the 2007–2008 financial crisis while studying at Zhejiang University, co-founded High-Flyer. By 2019, High-Flyer was a hedge fund focused on developing and using AI-powered trading algorithms, and by 2021 it was trading exclusively with AI.
Before the US government imposed restrictions on AI chips for China, Liang had amassed an estimated 50,000 Nvidia A100 GPUs. In April 2023, High-Flyer set up an artificial general intelligence (AGI) lab to develop AI tools separate from its financial business.

The company gained prominence as the main force behind China's AI model price war of May 2024, when it unveiled DeepSeek-V2, which offered strong performance at a low price. It soon became known as the "Pinduoduo of AI," and major Internet companies such as ByteDance, Tencent, Baidu, and Alibaba cut the prices of their AI models to compete. Despite charging less than its competitors, DeepSeek remained profitable.
DeepSeek is focused solely on research and has no specific commercialization plans. In China's AI ecosystem, this focus also lets the company sidestep the strictest regulations, such as government oversight of consumer-facing products.

Because DeepSeek prioritizes technical ability over work experience, most new hires are either fresh university graduates or developers whose AI careers are less established. The company also employs people without a computer science background to help its models cover a wider range of fields, such as writing poetry and performing well on the notoriously difficult Chinese college admission exams (gaokao).
Release History of DeepSeek LLMs.
DeepSeek released its first model series, DeepSeek-Coder, on November 2, 2023, as a free download for researchers and corporate clients. The models' source code was published under the MIT License, with use of the models governed by an additional "DeepSeek License" agreement.
The series comprises eight models: four pre-trained (Base) and four instruction-fine-tuned (Instruct), each with a context length of 16,000 tokens. The training process was as follows:
1. Pretraining: 1.8 trillion tokens, of which 87% was source code, 10% code-related English (from Stack Exchange and GitHub Markdown), and 3% non-code-related Chinese (the implied per-category token budgets are sketched after this list).
2. Extended-context pretraining: the context length was stretched from 4,000 to 16,000 tokens using a further 200 billion tokens, producing the Base models.
3. Supervised fine-tuning (SFT): the Instruct models were built on top of the Base models using 2 billion tokens of instruction data.
Training ran on clusters of Nvidia A100 and H800 GPUs connected by InfiniBand, NVLink, and NVSwitch.
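As a quick sanity check on the pretraining mix in step 1 above, here is a minimal sketch of the per-category token budgets it implies (the category labels are paraphrases of the reported mix, not official dataset names):

```python
# Approximate per-category token counts implied by the reported
# DeepSeek-Coder pretraining mix (1.8 trillion tokens total).
TOTAL_TOKENS = 1.8e12

mix = {
    "source code": 0.87,
    "code-related English (Stack Exchange, GitHub Markdown)": 0.10,
    "non-code Chinese": 0.03,
}

for category, share in mix.items():
    print(f"{category}: ~{share * TOTAL_TOKENS / 1e12:.2f}T tokens")

# source code: ~1.57T tokens
# code-related English (Stack Exchange, GitHub Markdown): ~0.18T tokens
# non-code Chinese: ~0.05T tokens
```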

Features of DeepSeek-Coder Models.
Parameter range: 1.3B to 33B.
All sizes share the same decoder-only Transformer architecture, with layer count and hidden dimensions scaled up for the larger models.
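Because the Coder checkpoints are openly downloadable, they can be queried with the Hugging Face transformers library. The following is a minimal usage sketch, assuming the publicly listed 6.7B Instruct checkpoint ID and an illustrative prompt; it is not an official example:

```python
# Minimal sketch: generating code with a DeepSeek-Coder Instruct checkpoint
# via Hugging Face transformers. The model ID is the publicly listed
# 6.7B Instruct variant; other sizes work the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-6.7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the request with the model's chat template and generate.
messages = [{"role": "user",
             "content": "Write a Python function that reverses a linked list."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```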
On November 29, 2023, DeepSeek released its DeepSeek LLM series, containing 7B- and 67B-parameter models in both Base and Chat variants, developed to compete with other LLMs available at the time. Benchmark results reportedly showed the 67B model outperforming LLaMA 2 70B and most open-source LLMs of the period.
Through its sustained focus on open-source contributions, DeepSeek pairs continued technical advancement with competitive pricing, providing a strong foundation for AI research and innovation in China.