DeepSeek, a Chinese AI research lab, made headlines last week with the launch of its open-source AI model, DeepSeek-R1. The Chinese lab claims that its model competes with major players like OpenAI, especially in areas such as math reasoning, code creation, and cost-effectiveness.
What is DeepSeek?
DeepSeek is an AI research lab that came from Fire-Flyer, a deep-learning division of High-Flyer, a Chinese hedge fund. High-Flyer was founded in 2015 and became well-known for using advanced computing to analyze financial data. In 2023, its founder, Liang Wenfeng, shifted focus to creating DeepSeek, with the goal of developing cutting-edge AI models.
Unlike many Chinese AI firms, DeepSeek is not tied to tech giants like Baidu or Alibaba. Liang started this project out of scientific curiosity rather than for quick financial gains. He noted, “Basic science research rarely offers high returns on investment.”
What is DeepSeek-R1?
DeepSeek-R1 is an AI model that DeepSeek claims outperforms rivals on several important tasks. The model, along with variants such as DeepSeek-R1-Zero, uses large-scale reinforcement learning (RL) and multi-stage training to build its capabilities.
The company has also taken a big step by open-sourcing not only its flagship model but also six smaller versions, ranging from 1.5 billion to 70 billion parameters. These models are MIT-licensed, which means researchers and developers can freely refine and commercialize them.
How Does DeepSeek Compare to OpenAI?
Both OpenAI and DeepSeek have developed large language models (LLMs), but there’s a key difference in how they are trained. Traditional models, like those from OpenAI, rely on supervised fine-tuning, while DeepSeek says DeepSeek-R1-Zero excels at reasoning tasks after being trained using RL alone. Because R1-Zero's output could be hard to read, DeepSeek introduced DeepSeek-R1, which the lab says performs similarly to OpenAI's o1 model on reasoning tasks.
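To give a feel for how RL can train a model without supervised examples, here is a minimal sketch of a rule-based reward: a function that scores a response automatically so that higher-scoring outputs can be reinforced. The function name, the `<think>` tag convention as used here, and the reward values are illustrative assumptions, not DeepSeek's actual training code.

```python
import re

def rule_based_reward(response: str, expected_answer: str) -> float:
    """Score a response with simple rules instead of human-written examples.

    Hypothetical scoring: 0.5 for showing reasoning in the expected format,
    plus 1.0 if the final answer matches. Both values are assumptions.
    """
    reward = 0.0
    # Format rule: reasoning must appear inside <think>...</think> tags.
    if re.search(r"<think>.+?</think>", response, flags=re.DOTALL):
        reward += 0.5
    # Accuracy rule: whatever follows the reasoning must equal the answer.
    final = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    if final == expected_answer:
        reward += 1.0
    return reward

good = "<think>2 + 2 is 4, then times 3 is 12.</think>12"
bad = "The answer is probably 10"
print(rule_based_reward(good, "12"))  # 1.5
print(rule_based_reward(bad, "12"))   # 0.0
```

In a real RL setup, scores like these would feed a policy-optimization algorithm that nudges the model toward higher-reward responses; that part is omitted here.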
DeepSeek also made advances with techniques like multi-head latent attention (MLA) and a mixture-of-experts architecture, making its models cheaper to train and run. The latest DeepSeek model requires just a fraction of the computing power of Meta’s comparable Llama 3.1 model, according to a report from Epoch AI.
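The cost saving from a mixture of experts comes from running only a few "expert" sub-networks for each input instead of the whole model. The following is a toy sketch of that idea under stated assumptions: the expert count, the top-2 routing, and the trivial scaling "experts" are all made up for illustration and do not reflect DeepSeek's actual architecture.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Each toy "expert" simply scales its input vector.
    return lambda x: [scale * v for v in x]

def moe_forward(x, router_weights, experts, top_k=2):
    """Send input x to the top_k highest-scoring experts only."""
    # Router score per expert: dot product of its weight row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Gate weights are normalized over the chosen experts only.
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, i in zip(gates, chosen):
        y = experts[i](x)  # only the chosen experts are ever evaluated
        out = [o + gate * v for o, v in zip(out, y)]
    return out, chosen

random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # hypothetical sizes
experts = [make_expert(scale=i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
x = [0.5, -1.0, 2.0, 0.1]

output, chosen = moe_forward(x, router_weights, experts, top_k=2)
active_fraction = len(chosen) / NUM_EXPERTS
print(f"experts used: {chosen}, fraction of experts active: {active_fraction:.2f}")
```

With 2 of 8 experts active, only a quarter of the expert computation runs for this input, which is the basic reason the approach lowers compute per token.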
Who is Behind DeepSeek?
Liang Wenfeng, born in 1985, is the founder and CEO of DeepSeek. He also co-founded the hedge fund High-Flyer. Liang has a background in electronic and communication engineering, with degrees from Zhejiang University. In 2016, he co-founded Ningbo High-Flyer, which used AI for investment strategies. He later expanded into AI algorithms and applications by founding High-Flyer AI in 2019. With DeepSeek, Liang aims to lead in AI research.
Young Talent Driving AI at DeepSeek
DeepSeek’s team consists of recent graduates from top Chinese universities such as Peking University and Tsinghua University. Despite their lack of industry experience, these young researchers bring strong academic knowledge and a collaborative mindset, which Liang believes is essential for tackling challenging, long-term AI problems.
According to Liang, these researchers are determined to break down global technological barriers and help China become a leader in innovation.
Overcoming US Chip Restrictions
DeepSeek's success is particularly remarkable considering the ongoing tech competition between the US and China. In October 2022, the US imposed export controls on advanced computing hardware, including Nvidia’s H100 chips, which limited Chinese AI firms' access to necessary resources.
Although DeepSeek started with a stockpile of 10,000 Nvidia A100 chips, it quickly realized more were needed to stay competitive with companies like OpenAI and Meta. Liang explained that funding wasn’t the issue, but the restrictions on advanced chips were a significant challenge.
With limited access to high-tech chips, many Chinese firms have focused more on applying existing AI models rather than advancing the underlying science. But DeepSeek has defied this trend by rethinking AI’s design and optimizing its use of resources.
Efficient Strategies Fueling DeepSeek’s AI
A tech analyst pointed out that DeepSeek represents a new wave of Chinese firms focused on long-term innovation rather than short-term profits. To overcome its hardware limitations, DeepSeek adopted several efficiency-focused strategies, such as the multi-head latent attention and mixture-of-experts techniques described earlier.
DeepSeek's Global Impact
By open-sourcing its models under an MIT license, DeepSeek has earned recognition in the global AI community. The company shares model weights and outputs, allowing developers worldwide to build upon its technology. This move not only makes advanced AI tools accessible to more people but also challenges the dominance of Western firms in the AI industry.
Global Market Outlook
Stock markets are bracing for a rough day after DeepSeek's release made waves in the world of artificial intelligence.
Investors are questioning whether the US can remain the leader in AI even as it spends billions of dollars on computer chips. This uncertainty is causing stocks to tumble, especially in the tech sector.
In the US, the Nasdaq index is expected to drop by 3.1 per cent, while the S&P 500 index is set to fall by 1.8 per cent. Some big tech companies are already taking a hit, with Nvidia down 6.5 per cent and Microsoft weakening by 3.5 per cent.
In Europe, things are not looking much better. Shares in ASML, a major chip equipment maker, are down by over 8 per cent, contributing to a 4 per cent drop in the Stoxx Europe 600 technology index.
It is clear that DeepSeek's breakthrough is sending shockwaves through the tech industry, and investors are scrambling to adjust.