It’s not every day that a relatively unknown AI model goes from obscurity to shaking up the industry. But that’s exactly what DeepSeek has done.
In just a matter of weeks, it became the #1 free app on the US Apple App Store, sparked debates about AI openness vs. security, and even sent Nvidia’s stock tumbling. While much of the debate frames DeepSeek as a China vs. US story, Meta’s Chief AI Scientist Yann LeCun argues the real takeaway is slightly different:
“The correct reading is: ‘Open-source models are surpassing proprietary ones.’”
LeCun points out that DeepSeek built upon open research and open-source tools and models like PyTorch and Meta’s Llama, and that the world can now build upon DeepSeek’s own innovations in the same way. So, what makes DeepSeek so special, and why is it sending ripples through the industry? Is it usable for US companies, or do security and censorship concerns outweigh its cost advantages?
Let’s break down the potential, the problems, and the conversations it’s triggered.
Big Potential: How Did DeepSeek R1 Cut Costs So Dramatically?
In its first week, DeepSeek R1 achieved milestones few saw coming:
- Top free app in the U.S. Apple App Store.
- Hundreds of derivative models spun off quickly, thanks to its open-source availability.
- Adoption by major AI platforms, including those of AWS, Microsoft, and Nvidia.
But what really turned heads was the price tag.
AI at a Fraction of the Cost
The company claims to have built its model for just $5.6 million—a fraction of the $80M–$100M estimated for GPT-4.
Why is DeepSeek so much cheaper?
Its secret lies in three interconnected innovations:
- Mixture-of-Experts (MoE) architecture – Instead of running all 671 billion parameters for every token, it activates only about 37 billion at a time, significantly reducing computational overhead (see the toy sketch after this list).
- FP8 mixed-precision computation – Uses 8-bit floating-point arithmetic where numerically safe, cutting training costs while maintaining accuracy.
- Knowledge distillation – Rather than training every model from scratch, DeepSeek used smaller student models (built on open architectures such as Meta’s Llama and Alibaba’s Qwen) to learn from the outputs of larger, more capable teacher models, compressing the training process and avoiding massive data-labelling costs.
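To make the MoE idea concrete, here is a toy top-k gating layer in PyTorch. It is a minimal sketch of the general technique with made-up sizes; DeepSeek’s actual router is considerably more elaborate (shared experts, load balancing, and other refinements).

```python
# Toy mixture-of-experts layer: a router picks the top-k experts per token,
# so only a small slice of the total parameters does any work on a given input.
# Illustrative sizes only; not DeepSeek's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, dim=512, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)  # pick k experts per token
        weights = F.softmax(weights, dim=-1)                    # normalize mixing weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                        # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(16, 512)   # 16 tokens of width 512
y = ToyMoE()(x)            # only 2 of the 8 expert MLPs run for each token
```

With eight experts and top-2 routing, only about a quarter of the expert parameters are active per token; DeepSeek applies the same principle at far larger scale (671 billion total parameters, roughly 37 billion active).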
Until now, AI companies have justified billion-dollar investments in massive computing infrastructure as necessary for building competitive models. DeepSeek suggests that there may be a cheaper way forward.
If AI models can be trained for a fraction of today’s costs, it could lead to:
- Lower AI service costs, making advanced AI more accessible to startups.
- A shift from foundation model development to specialized AI applications.
- More rapid AI adoption across industries that previously couldn’t afford to implement it.
However, efficiency alone doesn’t tell the whole story. The environmental impact of AI remains a growing concern, and DeepSeek’s cost savings could come with an unintended consequence.
Benchmark Performance: DeepSeek R1 vs. OpenAI
Besides the lower training and inference costs, what makes DeepSeek R1 a surprising alternative is that it has achieved performance comparable to OpenAI’s models on several industry-standard benchmarks.
- DeepSeek R1 matched or surpassed GPT-4 on mathematical reasoning benchmarks such as GSM8K and MATH.
- On general language understanding (evaluated by MMLU and HellaSwag), R1 showed competitive performance—especially when reasoning step-by-step.
- Its multi-language capabilities also performed well across Asian and Western languages, making it attractive for businesses with global operations.
Efficiency vs. Environmental Impact: The Jevons Paradox at Play
One of AI’s biggest challenges is its hunger for energy. Already, data centers contribute an estimated 2.5–3.7% of global greenhouse gas emissions, and as AI adoption grows, so will its environmental footprint.
Could this more efficient model be part of the solution? Not necessarily. Satya Nadella recently weighed in on X, invoking the Jevons paradox.
This concept was first described by British economist William Stanley Jevons in 1865 in his book The Coal Question. Jevons observed that as coal-powered engines became more efficient, overall coal consumption increased rather than decreased. Why? Because greater efficiency made coal cheaper and more accessible, driving higher demand across industries.
The Microsoft CEO pointed out that as AI models become more efficient and cheaper, their overall usage skyrockets. Instead of reducing total energy consumption, these improvements often result in more widespread AI adoption, ultimately increasing demand for computational resources.
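To see the arithmetic behind the paradox, here is a toy example with made-up numbers (illustrative only, not real measurements):

```python
# Toy Jevons-paradox arithmetic: efficiency cuts energy per query 10x,
# but cheaper AI drives 50x more queries. All numbers are made up.
energy_per_query_before = 1.0        # arbitrary energy units
queries_before = 1_000_000

energy_per_query_after = energy_per_query_before / 10  # 10x more efficient
queries_after = queries_before * 50                     # demand explodes

total_before = energy_per_query_before * queries_before
total_after = energy_per_query_after * queries_after
print(total_after / total_before)   # -> 5.0: total consumption still rises 5x
```

Whenever demand grows faster than efficiency improves, total consumption goes up, not down.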
In DeepSeek’s case, its low-cost efficiency could trigger a wave of new AI applications—some from companies that previously lacked the budget for advanced AI. As a result:
- More organizations will deploy AI models, leading to higher overall power consumption.
- Data center demands will continue to rise, further straining global energy infrastructure.
- The sustainability impact of AI could worsen, despite individual models being more efficient.
Already, AI’s footprint is substantial:
- Data centers account for an estimated 2.5–3.7% of global greenhouse gas emissions, more than the aviation industry.
- Water-intensive cooling systems in data centers use up to 500 mL of water per AI request.
- GenAI adoption keeps adding workloads, with processing demands expected to grow 50x by 2028.
If DeepSeek’s cost-saving methods lead to an AI adoption boom, we might face a new phase of infrastructure demand, pushing:
- Cloud providers to invest in greener data centers.
- More companies to adopt smaller, purpose-built models.
- Regulators to establish clearer environmental guidelines for AI operations.
But energy concerns aside, there’s an even bigger issue to consider: Can DeepSeek be trusted with corporate and personal data?
Big Problems: Can U.S. Companies Trust DeepSeek?
DeepSeek may be open-source, but open doesn’t always mean safe. The model was developed in China, where companies are required to comply with government intelligence requests. That alone makes many U.S. businesses hesitant to use it, especially in regulated industries.
Privacy and Compliance Risks
Several government agencies, including NASA and the U.S. Navy, have already banned DeepSeek due to security concerns. The main issues?
- Possible government access to user data – Chinese law mandates that companies assist intelligence agencies upon request.
- Keystroke tracking – Some reports indicate that DeepSeek’s app collects keystroke rhythm data, raising questions about user identification and monitoring.
- Potential for unseen telemetry – Even if you run DeepSeek locally, some experts worry about hidden data collection mechanisms.
For companies handling financial, healthcare, or intellectual property data, these risks are too big to ignore.
A Trojan Horse for Your Data? China’s Data Laws and Their Impact on DeepSeek
Under the 2017 Chinese National Intelligence Law, all companies operating in China—including DeepSeek—are required to:
- Cooperate with intelligence-gathering efforts upon request.
- Provide data access if authorities deem it necessary for national security.
- Ensure secrecy regarding the involvement of intelligence agencies.
This means that, regardless of DeepSeek’s privacy policy or public assurances, the Chinese government can compel the company to hand over data if it believes it serves national interests.
Data Routing and the Cloud Factor
You can download DeepSeek’s weights and run the model entirely on your own hardware, which keeps prompts and data off DeepSeek’s servers. That mitigates the routing concern, but it introduces a limitation beyond security: a locally hosted model gives up one of the key capabilities that make hosted LLMs such powerful tools, retrieving current data.
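For teams that do want to experiment locally, a minimal sketch of fully offline inference with Hugging Face transformers looks like this. The checkpoint name is illustrative (one of the published R1 distillations), and the weights are assumed to have been downloaded in advance:

```python
# Minimal sketch: offline inference with a distilled DeepSeek R1 checkpoint.
# Assumes the weights were fetched beforehand; the model ID is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # assumed checkpoint name

# local_files_only=True refuses all network access at load time,
# so nothing is sent anywhere: the weights must already be cached locally.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, local_files_only=True, device_map="auto"
)

inputs = tokenizer("Summarize the Jevons paradox.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that a setup like this only controls the inference path; it says nothing about what behaviors are baked into the weights themselves.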
Even setting that aside, cybersecurity experts warn of additional risks:
- AI models can contain backdoor behaviors designed to extract information during certain interactions.
- Seemingly benign telemetry data might reveal insights into user behavior or system architecture.
Built-in Censorship at the Training Level
Beyond privacy, content neutrality has become another focal point in discussions about DeepSeek. A Wired investigation tested DeepSeek R1 in offline mode and found certain content-related behaviors that sparked questions about model neutrality.
According to Wired, the R1 model exhibited noticeable response patterns:
- It provided detailed answers about historical events in the U.S.
- But when asked about sensitive events within China’s modern history, like the 1989 Tiananmen Square protests, it responded with variations of “I cannot answer”—even without an active internet connection.
The investigation suggested that these tendencies stem from training-time decisions rather than application-layer censorship. In other words, the model was likely optimized to avoid certain topics based on the data it was exposed to during training.
For U.S. businesses, this finding raises pragmatic concerns:
- Content Bias in Global Operations: If AI-powered tools are used for research, journalism, or regulatory analysis, content patterns like these could skew outcomes or omit context in certain regions.
- Corporate Messaging and Knowledge Bases: Businesses relying on AI for cross-market communication may need to evaluate whether regional biases align with their global objectives.
- Ethical Transparency: The presence of content gaps in historical events could become problematic for industries that depend on factual, unbiased information, such as education, legal advisory, and policy research.
To be fair, regional biases in AI models aren’t exclusive to DeepSeek. OpenAI’s ChatGPT has also faced criticism for overly cautious content moderation, particularly on politically charged or ethically complex topics.
Meta’s LLaMA models have implemented cultural and linguistic filters to comply with local regulations when deployed internationally.
In this sense, DeepSeek R1’s patterns might reflect a broader challenge in the AI industry: How do you balance content neutrality with region-specific legal requirements?
The Knowledge Distillation Controversy
The latest controversy centers on DeepSeek’s use of knowledge distillation. This technique involves training a smaller “student” model to replicate the behavior of a larger “teacher” model by learning from its outputs, thereby avoiding the need for extensive training from scratch. While common in AI development, its application here has raised ethical and legal questions, particularly concerning intellectual property rights.
Reports suggest that DeepSeek may have employed knowledge distillation using outputs from proprietary models like OpenAI’s GPT-4 without proper authorization. This could violate OpenAI’s terms of service, which prohibit using their outputs to develop competing models. OpenAI has stated it possesses evidence supporting these claims and has initiated an investigation into the matter.
But what exactly is knowledge distillation?
Think of it like teaching an intern everything a seasoned expert knows—without having them start from scratch. In AI terms, this process involves taking a large, computationally intensive model (the “teacher”) and transferring its insights to a smaller, more efficient model (the “student”).
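As a toy illustration, here is the classic distillation loss (in the style of Hinton et al.): the student is trained to match the teacher’s softened output distribution. This is a generic sketch of the technique, not DeepSeek’s actual training recipe:

```python
# Toy knowledge-distillation loss: the student mimics the teacher's softened
# output distribution. Generic technique sketch; not DeepSeek's actual recipe.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Softening with a temperature > 1 exposes the teacher's "dark knowledge":
    # the relative probabilities it assigns to all the non-top answers.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence, rescaled by T^2 to keep gradients on a comparable scale.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Example: a batch of 4 token positions over a 32,000-token vocabulary.
student_logits = torch.randn(4, 32000, requires_grad=True)
teacher_logits = torch.randn(4, 32000)  # frozen teacher outputs
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student
```

The controversy is not about the technique itself, which is standard, but about whose outputs served as the teacher signal.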
The Big Picture: Why It Matters for U.S. Businesses
Open Source vs. Proprietary Models: A Growing Divide
The emergence of DeepSeek has reignited the debate between open-source and proprietary AI models, presenting critical considerations for U.S. enterprises. While some view DeepSeek’s rise through a geopolitical lens, industry leaders like Yann LeCun, Meta’s Chief AI Scientist, emphasize a different perspective. LeCun suggests that DeepSeek’s success underscores the ascendancy of open-source models over proprietary ones, highlighting the collaborative nature of AI advancements. He points out that DeepSeek built upon open research and models such as Meta’s Llama, enabling further innovation through shared knowledge.
This open-source paradigm offers businesses greater control, customization, and cost efficiency. Jonathan Ross, CEO of Groq, notes that enterprises are increasingly adopting open-source large language models (LLMs) to avoid vendor lock-in and to tailor AI solutions to specific needs. Meta’s Llama models exemplify this trend, with downloads approaching 350 million, reflecting a tenfold increase from the previous year.
Major corporations like Zoom, Spotify, and Goldman Sachs have integrated these models, benefiting from their flexibility and transparency.
OpenAI’s CEO, Sam Altman, has acknowledged that the company may have been “on the wrong side of history” regarding open-source strategies, indicating a possible shift towards more openness. For enterprises, this debate centers on balancing control and customization offered by open-source models against the perceived security and competitive advantages of proprietary systems. As open-source models continue to close the quality gap, businesses are increasingly considering them for greater flexibility and cost efficiency, while remaining mindful of potential risks and the need for robust governance.
However, small to medium-sized businesses (SMBs) may still prefer proprietary models due to their ease of implementation and the reduced need for in-house AI expertise. Proprietary models often come with dedicated support and user-friendly interfaces, making them attractive to companies with limited resources.
Conclusion: What DeepSeek’s Rise Means for AI’s Future
For U.S. companies, the risks of adopting DeepSeek R1 extend beyond technical performance. They enter the domain of geopolitical tension, regulatory compliance, and trust.
While open-source AI models like DeepSeek offer cost advantages and performance gains, their regulatory environment and training practices create significant uncertainty for companies that prioritize data privacy and operational neutrality.
In essence, the promise of cheaper AI comes with a high-stakes gamble on data integrity—a trade-off that many U.S. enterprises are unwilling to make. DeepSeek R1 has challenged long-held assumptions about AI costs and performance. It’s shown that large-scale models can be built efficiently—and has ignited debates about:
- Open-source vs. proprietary AI.
- AI’s efficiency/environmental impact.
- Global AI competition.
- Data and security concerns.
For American companies, the fundamental question remains: Is DeepSeek R1 worth the risk? For some applications—like non-sensitive R&D or academic research—the low costs and strong performance make it compelling. But in industries dealing with intellectual property, financial data, or regulated information, the uncertainties surrounding data security and built-in content biases present significant hurdles.
DeepSeek R1 may not end up being the model businesses adopt en masse, but it teaches a valuable lesson: AI architectures can always be made more efficient and cheaper, and this is just the tip of the iceberg. Discussions about collective intelligence, security, bias, and AI costs will only intensify from here.
In the meantime, follow us on LinkedIn to understand every move in the industry as it happens.
And if your company is building its data foundation for AI, improving its data pipelines and integrations, or already in the model development phase, we can help with the right strategy and the best resources. Book a call with us and start your AI projects with confidence.