In its 2026 Global Threat Report, CrowdStrike reported prompt injection attacks at more than 90 organizations during 2025. The injected prompts generated commands that stole credentials and cryptocurrency, marking a significant shift as these prompts now function as malware.
The report documented an 89% year-over-year rise in AI-enabled adversary operations. Additionally, 82% of intrusions involved no traditional malicious code, occurring as enterprises transitioned to using agents, copilots, and browser automations that access email, code, payments, and file shares.
Prompt injection has maintained its top ranking as LLM01 on the OWASP Top 10 for large language model applications for two consecutive editions. OWASP highlighted that language models are unable to reliably distinguish developer instructions from untrusted text, transforming what was once a research curiosity into an operational vulnerability.
Direct prompt injection takes place when a user types instructions to override a system prompt, while indirect prompt injection occurs when an attacker embeds instructions within content the model reads later, such as emails or documents. The user does not see the payload, and the agent executes the malicious commands without interaction.
Two notable incidents shed light on the severity of these vulnerabilities. In August 2024, PromptArmor disclosed that a Slack AI attacker could exfiltrate data from private channels by planting instructions in public channels or uploaded files. The following year, Aim Security reported EchoLeak (CVE-2025-32711), where a crafted email directed Microsoft 365 Copilot to retrieve internal files and send them to an attacker-controlled server, achieving a CVSS score of 9.3. Both vulnerabilities were patched, but the class of attacks remains unresolved.
The surface area of vulnerability has expanded to include a broader agentic stack, where agents that execute various tasks treat their context as authoritative. This development means long-term agent memory can retain and execute malicious instructions repeatedly.
OpenAI acknowledged in December 2025 that prompt injection is unlikely to be fully solved, often likening it to social engineering. Anthropic’s Claude Opus 4.6 system card indicated a 17.8% success rate for a single prompt injection attempt, escalating to 78.6% over 200 attempts without safeguards in place. Google reported a 53.6% success rate for prompt injection against its Gemini deployment.
In December 2025, Gartner advised CISOs to block all AI browsers, citing indirect prompt injection and other risks associated with insufficient controls. Cyberhaven reported that 27.7% of organizations had at least one user with the blocked AI tool Atlas installed, a warning echoed by the UK National Cyber Security Centre and Germany’s BSI.
The limitations of existing defenses against prompt injection stem from the shared text channels in language models. Input validation, output filtering, and other detection methods struggle due to the inherent inability to separate authorized commands from untrusted content within the model.
A separate finding indicated that 65.3% of organizations lack dedicated defenses against prompt injection, relying instead on vendor-supplied measures and policy training. Effective controls should include limiting each agent’s authority, requiring human approval for critical actions, tagging retrieval sources based on sensitivity, and implementing auditing practices.
As organizations consider AI deployments, security teams are encouraged to ask vendors about detection capabilities, success rates against prompt injections, adherence to OWASP recommendations, and the capacity to log exact agent actions. Given the vulnerabilities, it’s critical for enterprises to assume that models may occasionally follow injected instructions, necessitating robust external controls.





