
Efficiency comes at a price in DeepSeek’s debut

Credit: Outlever
Key Points
  • DeepSeek revolutionizes AI efficiency with its transformer architecture, achieving performance similar to Llama 3 using significantly fewer resources.
  • But organizations face privacy concerns: data falls under Chinese jurisdiction, potentially exposing sensitive information, and model outputs may reflect government biases.

DeepSeek is reframing what is possible for AI in terms of efficiency and power, but its leap forward doesn’t come without risk.

We spoke with Jonathan Baker, Principal Cloud Engineer at Financial Ops giant BlackLine and host of the Cloud-Pod weekly podcast, to shed light on what DeepSeek’s developments really mean, both as a computational feat and as a potential security risk.

AI efficiency redefined: "DeepSeek’s transformer architecture improvements achieve performance comparable to Llama 3 while using just one-tenth of the compute resources, and we are entering an era where advances due to algorithmic innovation outpace the gains we see by scaling hardware," says Baker. 

This efficiency leap is credited to the mixture-of-experts (MoE) approach, which, according to Baker, "activates only 37 billion of its total 671 billion parameters during each forward pass, and the significant reduction in memory usage during inference will become the foundation for models that scale to trillions of parameters while keeping operational costs manageable."
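To make the numbers in Baker's quote concrete, here is a minimal sketch of the idea behind sparse activation. It is not DeepSeek's actual routing code: the two parameter counts come from the quote above, while the function names, the four-expert score list, and the choice of k=2 are hypothetical and chosen only for illustration.

```python
import math

TOTAL_PARAMS = 671e9   # total parameters, as quoted in the article
ACTIVE_PARAMS = 37e9   # parameters active per forward pass, as quoted


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used in a single forward pass."""
    return active / total


def route_top_k(gate_scores: list[float], k: int) -> dict[int, float]:
    """Toy top-k gate: keep the k highest-scoring experts and
    softmax-normalise their scores into mixing weights.

    Hypothetical helper for illustration; real MoE routers differ in detail.
    """
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    peak = max(gate_scores[i] for i in top)          # subtract max for stability
    exps = {i: math.exp(gate_scores[i] - peak) for i in top}
    norm = sum(exps.values())
    return {i: e / norm for i, e in exps.items()}


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per forward pass: {share:.1%}")   # about 5.5%

    # Hypothetical gate scores for four experts; only two are selected.
    weights = route_top_k([0.1, 2.0, -1.0, 1.5], k=2)
    print(weights)  # experts 1 and 3, with weights that sum to 1
```

The point of the sketch is that each token only passes through the experts the gate selects, so per-token compute scales with the active parameters (roughly 5.5% of the total here) rather than with the full model size.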

Cost in question: While the reported $6 million training cost for DeepSeek is eye-catching, more recent reports suggest that the model may have cost far more to develop.

Baker emphasizes context: "While the reports that training cost a meager $6 million are impressive, it should be interpreted in the broader context of their parent company. DeepSeek is a research venture funded by High-Flyer, a hedge fund and quantitative trading operation who adopted AI several years ago. The research relies on pre-existing infrastructure funded through other High-Flyer revenue streams which provided them with a fairly unique advantage not easily duplicated by others."


Privacy trade-off: For organizations considering DeepSeek’s services, privacy risks loom large. Baker notes, "When evaluating DeepSeek, organizations must be cognizant of the implications of using AI services hosted within China. The policy states that in addition to conversational data, keystrokes are used to ‘fingerprint’ the user, and uploaded content is stored indefinitely and potentially without a mechanism to request deletion. The collected data falls under Chinese regulatory jurisdiction, which means that foreign business data could be examined in accordance with Chinese law, exposing intellectual property, sensitive, or personal information."

Moreover, AI outputs might reflect biases shaped by Chinese government policies. Baker explains, "It’s well documented that some topics are explicitly restricted, but the model's training could incorporate biases or limitations that align with Chinese government policies - creating inconsistencies in generated content for businesses operating globally or exposing them to compliance issues in other jurisdictions."
