Introduction
The proliferation of “Shadow ML” and generative AI (GenAI) tools, such as DeepSeek AI, presents significant security challenges. These tools are often deployed without IT oversight, bypassing essential security protocols and compliance frameworks [1], leading to vulnerabilities like data leakage, plagiarism [1], model bias [1] [3] [4], adversarial attacks [1], and data poisoning [1]. The unauthorized use of these AI tools creates unmonitored security risks, potentially exposing sensitive corporate data and leading to critical security blind spots [6]. This situation threatens the integrity and trustworthiness of AI-driven decisions in critical sectors [1], including finance [4], healthcare [1], and national security [1] [2].
Description
The rise of “Shadow ML” and generative AI (GenAI) tools like DeepSeek AI poses significant security risks as machine learning models are deployed without IT oversight, bypassing essential security protocols and compliance frameworks [1]. This unauthorized proliferation of AI tools can lead to various vulnerabilities [1], including data leakage [6], plagiarism [1], model bias [1] [3] [4], adversarial attacks [1], and data poisoning [1]. The lack of visibility into the usage of these unapproved tools creates unmonitored security vulnerabilities [6], exposing sensitive corporate data and leading to critical security blind spots [6]. Employees using unauthorized GenAI software may inadvertently open digital backdoors to confidential information [6], risking misuse and regulatory penalties [6]. These risks threaten the integrity and trustworthiness of AI-driven decisions in critical sectors such as finance [1], healthcare [1], and national security [1] [2].
A recent example highlighting these concerns is the generative AI model R1, released by the Chinese start-up DeepSeek in January 2025. Security analyses from Japanese and US firms have revealed that R1 lacks adequate safeguards against misuse [4], enabling it to generate content that facilitates criminal activities [4], such as creating malware and providing instructions for making Molotov cocktails. Takashi Yoshikawa of Mitsui Bussan Secure Directions, Inc. [3] [4] tested the model with prompts designed to elicit harmful responses [3], and it generated ransomware source code [3]. In contrast [3] [4], other generative AI models [3] [4] [6], such as ChatGPT, have refused to provide similar information [3] [4], underscoring the varying levels of security prioritization among AI developers. US-based Palo Alto Networks confirmed that R1 could produce instructions for stealing login information without requiring professional expertise [3] [4], suggesting that DeepSeek prioritized rapid deployment over security measures [3] [4]. Consequently, many Japanese municipalities and companies have restricted the use of DeepSeek’s AI technology due to concerns about data security, particularly regarding the storage of personal information on servers in China [4].
In light of these security concerns, several governments and organizations have taken decisive action against DeepSeek. Governor Kathy Hochul of New York State has issued a ban on DeepSeek for all government networks and devices due to potential foreign surveillance risks [2]. The US Department of Defense has blocked access to DeepSeek technologies [2], and NASA has prohibited its employees from using the technology to prevent possible data breaches [2]. Similarly, Texas Governor Greg Abbott has banned DeepSeek from all state government-issued devices as part of broader cybersecurity measures [2]. The Australian government has also banned DeepSeek from all government systems [2], citing national security threats [1] [2], while major telecommunications companies in Australia [2], including TPG Telecom [2], Optus [2], and Telstra [2], have restricted access due to privacy and security concerns [2]. Additionally, universities in Australia [2], particularly those in the Group of Eight (Go8) research universities [2], have either banned or strongly discouraged the use of DeepSeek in academic research and administrative work [2]. Officials express concerns that the AI tool could be exploited for cyber-espionage or unauthorized data extraction [2].
The increasing popularity of GenAI tools like DeepSeek underscores the necessity for organizations to reevaluate the safe and ethical usage of these technologies. Security and business leaders must educate staff on the benefits and risks associated with GenAI [5], similar to past initiatives on phishing awareness [5]. Organizations must proactively develop secure AI frameworks to ensure data protection [5], retention [5], and safe sharing [5], while exercising caution when adopting new AI technologies [7]. This includes incorporating encryption, transparency in data handling [7], and adherence to both local and international data protection regulations [7]. Notably, DeepSeek presents distinct data security risks that extend beyond those associated with general large language models (LLMs) [7]. The platform may access, retain [7], or share user data with government agencies [7], raising concerns for enterprises and government entities [7], particularly in the Middle East [7]. It is crucial for these organizations to utilize AI tools that conform to relevant data protection laws and privacy standards to mitigate risks [7].
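To make the “safe sharing” point concrete, the following minimal Python sketch illustrates one possible control: scrubbing obviously sensitive substrings from a prompt before it leaves the corporate network for any external GenAI service. The pattern names and regular expressions are illustrative assumptions rather than part of any cited framework, and a production gateway would rely on a vetted DLP or PII-detection service instead of ad hoc patterns.

import re

# Illustrative patterns only (assumptions, not an exhaustive or vetted list).
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "API_KEY": re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b"),
    "CARD_NUMBER": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact_prompt(prompt: str) -> str:
    """Replace likely-sensitive substrings before a prompt is sent to an external model."""
    for label, pattern in REDACTION_PATTERNS.items():
        prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt

if __name__ == "__main__":
    raw = "Summarise the contract for jane.doe@example.com, card 4111 1111 1111 1111."
    print(redact_prompt(raw))
    # -> Summarise the contract for [EMAIL REDACTED], card [CARD_NUMBER REDACTED].

Redaction of this kind addresses only outbound data leakage; it does not make an unapproved tool compliant, so it complements rather than replaces governance and approval processes.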
The AI/ML model lifecycle is fraught with vulnerabilities [1], including model backdooring [1], where pre-trained models are compromised to produce biased predictions [1], and data poisoning [1], where attackers inject malicious data during training [1]. Adversarial attacks [1], which involve subtle modifications to input data that mislead AI models [1], are also a serious concern [1]. Implementation vulnerabilities [1], such as weak access controls and improperly configured containers [1], can allow unauthorized users to tamper with models or extract sensitive data [1]. The use of open-source ML models and third-party datasets further increases supply chain risks [1], necessitating thorough verification of all components [1].
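To ground the supply-chain point above, here is a minimal sketch (standard-library Python only) of verifying a third-party pre-trained model artifact against a pinned SHA-256 digest before it is loaded. The file path and digest below are placeholders; in practice the expected digest would come from the publisher’s signed release notes or an internal model registry.

import hashlib
from pathlib import Path

# Placeholder values (assumptions for illustration only).
EXPECTED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"
MODEL_PATH = Path("models/third_party_classifier.onnx")

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model_artifact(path: Path, expected: str) -> None:
    """Refuse to load a model whose contents differ from the reviewed version."""
    actual = sha256_of(path)
    if actual != expected:
        raise RuntimeError(f"Model digest mismatch: expected {expected}, got {actual}")

if __name__ == "__main__":
    verify_model_artifact(MODEL_PATH, EXPECTED_SHA256)
    print("Artifact digest matches the pinned value; safe to hand off to the loader.")

A checksum only proves the artifact is the one that was originally reviewed; it does not prove the reviewed model is free of backdoors or poisoned training data, so it complements the validation practices described next.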
To mitigate these threats [1] [6], organizations must adopt a proactive security posture that integrates strong technical controls with comprehensive exposure management strategies. Key practices include model validation to identify biases and weaknesses [1], dependency management to ensure trusted sources for ML frameworks [1], and code security through static and dynamic analysis [1]. Monitoring and testing should be standard practice from the outset of any AI initiative [8], covering both suspicious activity and deviations in model behavior [1]. Additionally, binary code analysis is essential to detect hidden risks [1], while hardening container environments and digitally signing AI models can maintain integrity throughout the development lifecycle [1]. By embedding these security measures into the AI development lifecycle [1], organizations can create resilient MLOps pipelines that balance innovation with robust protection [1].
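As a rough sketch of the model-signing control mentioned above (not a description of any specific MLOps product), the example below uses the third-party Python cryptography package to create and verify a detached Ed25519 signature over a serialized model artifact. The key handling and artifact bytes are hypothetical; in a real pipeline the private key would live in a KMS or HSM and the public key would be distributed with the deployment tooling.

# Requires the "cryptography" package (pip install cryptography).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

def sign_model(private_key: Ed25519PrivateKey, model_bytes: bytes) -> bytes:
    """Produce a detached signature to publish alongside the model artifact."""
    return private_key.sign(model_bytes)

def verify_model(public_key: Ed25519PublicKey, model_bytes: bytes, signature: bytes) -> bool:
    """Check the artifact against its signature before deployment."""
    try:
        public_key.verify(signature, model_bytes)
        return True
    except InvalidSignature:
        return False

if __name__ == "__main__":
    # Hypothetical in-memory key for illustration; never keep signing keys in code.
    key = Ed25519PrivateKey.generate()
    artifact = b"serialized model weights would go here"
    sig = sign_model(key, artifact)

    assert verify_model(key.public_key(), artifact, sig)
    assert not verify_model(key.public_key(), artifact + b" tampered", sig)
    print("Untampered artifact verified; tampered copy rejected.")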
As AI adoption accelerates [1], the need for dedicated security strategies becomes increasingly important [1]. The emergence of Agentic AI [1], capable of making autonomous decisions [1], complicates governance and oversight [1]. Organizations that proactively address these evolving risks will be better positioned to innovate without compromising security [1]. The DeepSeek incident underscores the necessity of embedding security into the core of AI development [1], ensuring that progress does not come at the expense of safety [1].
Alarmingly, DeepSeek’s privacy policy indicates that the platform collects extensive personal data [7], including basic profile information [7], user inputs [7], device and network details [7], cookies [7], and payment information [7]. It also gathers highly sensitive data such as keystroke patterns and access tokens from services like Apple and Google [7], allowing DeepSeek to monitor and store intricate user behavior beyond mere interactions with its AI models [7]. Failure to implement robust security measures can lead to data breaches [8], regulatory penalties [6] [8], and severe damage to operations and reputation [8], making it essential to prioritize security in the rapidly evolving AI landscape.
Conclusion
The challenges posed by the rise of “Shadow ML” and GenAI tools like DeepSeek AI necessitate a reevaluation of security strategies. Organizations must prioritize the development of secure AI frameworks [5], educate staff on the associated risks [5], and ensure compliance with data protection regulations. By embedding robust security measures into the AI development lifecycle [1], organizations can mitigate risks and maintain the integrity of AI-driven decisions. As AI technologies continue to evolve, proactive security strategies will be crucial in balancing innovation with protection, ensuring that advancements do not compromise safety or privacy.
References
[1] https://www.cybersecurityintelligence.com/blog/the-security-risks-behind-shadow-ml-adoption-8349.html
[2] https://teckpath.com/deepseek-global-ban-countries-2025/
[3] https://japannews.yomiuri.co.jp/society/general-news/20250407-247297/
[4] https://www.straitstimes.com/asia/east-asia/deepseek-ai-model-generates-information-that-can-be-used-in-crime-analyses-show
[5] https://itwire.com/business-it-news/security/outpacing-ai-driven-cyber-attacks-strategies-for-modern-defenders.html
[6] https://cybersecasia.net/newsletter/shadow-ai-open-source-genais-hidden-threats-to-enterprise-security/
[7] https://techxmedia.com/en/is-your-data-safe-with-deepseek/
[8] https://securityboulevard.com/2025/04/deepseek-breach-yet-again-sheds-light-on-dangers-of-ai/