Recent Security Threats

This page highlights recent AI agent and MCP attack patterns that Runlayer ToolGuard detects and protects against. These are real-world attacks we’ve seen, but our models cover many more threat patterns beyond what’s listed here.
Our models are tested against novel attack variants—not just the exact examples from disclosures—to ensure robust, generalizable detection.

How We Stay Ahead

  1. Continuous Monitoring: We track security disclosures from industry researchers and threat intelligence sources
  2. Rapid Response: When new attack patterns emerge, we generate diverse training examples and update our models quickly
  3. Generalization Testing: We test against novel variants—not just disclosed examples—to ensure robust detection
  4. Low False Positive Rate: Our testing includes benign examples to ensure legitimate tool calls aren’t blocked

Disclosed Vulnerabilities

OpenClaw Prompt Injection and Extraction

ZeroLeaks red team: 91% injection success, 85% system prompt extraction without a proxy
January 2026 · Covered

Anthropic Git MCP Server

CVE-2025-68143/44/45: Path traversal, argument injection, RCE via Git filters
January 2026 · Covered

Reprompt

Chain-request data exfiltration from Microsoft Copilot
January 2026 · Covered

Claude Cowork File Exfil

curl-based file exfiltration from Claude Cowork
January 2026 · Covered

CamoLeak GitHub Copilot

CVSS 9.6: Prompt injection leaks private repo data
October 2025 · Covered

Figma MCP RCE

Command injection vulnerability (CVE-2025-53967)
September 2025 · Covered

Poisoned Document Injection

Hidden instructions in Google Docs, Notion, ChatGPT
September 2025 · Covered

ForcedLeak Webhook Exfiltration

CVSS 9.4: AI agent data exfiltration via trusted domains
September 2025 · Covered

Helm Chart Code Injection

CVE-2025-53547: Command injection via malicious Chart.yaml
July 2025 · Covered

OpenClaw Prompt Injection and Extraction

Covered by Runlayer ToolGuard
Disclosed by: ZeroLeaks
Date: January 2026
Affected: OpenClaw (Clawdbot) — assessed without a security proxy
Security Score: 2/100 (ZLSS 10/10.0)
Read the full report (PDF) →

Attack Summary

ZeroLeaks performed an AI red team assessment against OpenClaw, an AI assistant running without a security proxy. The assessment achieved an 84.6% system prompt extraction rate and a 91.3% prompt injection success rate across 36 attack techniques:
  • System Prompt Extraction: JSON format conversion, many-shot priming, crescendo/progressive deepening, context overflow, peer solidarity framing, roleplay-based persona manipulation, chain-of-thought hijacking, code block auto-completion priming
  • Direct Prompt Injection: Canary injection, format/language/case overrides, persona injection, behavior overrides, false memory and false context injection, system and authority impersonation, encoding (base64) and reversal tricks
  • Indirect Prompt Injection: Hidden instructions in documents, HTML comments in emails, malicious directives in code comments
Without a proxy, the target disclosed approximately 85-90% of its system prompt and complied with 21 of 23 injected instructions.

What We Detect

Our Prompt Guard detects prompt injection and extraction attempts, including:
  • Roleplay, persona manipulation, and peer solidarity social engineering
  • Chain-of-thought hijacking and verification-framing attacks
  • Format, language, and behavior override injections
  • System and authority impersonation ([SYSTEM], [ADMIN] tags)
  • Encoded instructions (base64, reversed text)
  • False memory and false context injection
  • Many-shot priming and context overflow payloads
Our Tool Call Guard and Output IO Guard detect indirect injection patterns, including:
  • Hidden instructions embedded in documents and meeting notes
  • Malicious directives in HTML comments and email content
  • Code comment injection with hidden execution instructions
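
To make a few of the signals above concrete, here is a minimal rule-based sketch in Python. It is not how ToolGuard works (this page describes trained models, not fixed rules). The phrase list, the tag list, and the function names are all assumptions chosen for illustration. The sketch checks for authority tags, then looks for override phrases in the raw text, in its reversal, and in any base64 runs that decode cleanly.

```python
import base64
import binascii
import re

# Hypothetical patterns for illustration; real attacks vary far more than this.
AUTHORITY_TAGS = re.compile(r"\[(SYSTEM|ADMIN)\]", re.IGNORECASE)
OVERRIDE_PHRASES = re.compile(
    r"ignore (all )?(previous|prior) instructions|reveal (your )?system prompt",
    re.IGNORECASE,
)
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decoded_candidates(text: str) -> list[str]:
    """Return the text, its reversal, and any base64 runs that decode to UTF-8."""
    candidates = [text, text[::-1]]
    for run in BASE64_RUN.findall(text):
        try:
            decoded = base64.b64decode(run, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 or not text; ignore the run
        candidates.append(decoded)
    return candidates


def flag_prompt(text: str) -> list[str]:
    """Return a list of coarse finding labels for one prompt."""
    findings = []
    if AUTHORITY_TAGS.search(text):
        findings.append("authority_impersonation")
    if any(OVERRIDE_PHRASES.search(c) for c in decoded_candidates(text)):
        findings.append("override_instruction")
    return findings
```

Fixed phrase lists like this are easy to evade, which is why the page stresses testing against novel variants rather than the disclosed examples alone.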

Anthropic Git MCP Server Vulnerabilities

Covered by Runlayer ToolGuard
Disclosed by: Cyata Security
Date: January 2026
CVEs: CVE-2025-68143, CVE-2025-68144, CVE-2025-68145
Affected: mcp-server-git (Anthropic’s reference implementation)
Status: Patched December 17, 2025
Read the full disclosure →

Attack Summary

Three vulnerabilities in Anthropic’s canonical Git MCP server allow attackers to read, delete, or overwrite arbitrary files on the host system:
  • CVE-2025-68143: Unrestricted git_init allows repository initialization on arbitrary file system paths
  • CVE-2025-68144: Argument injection in git_diff passes unsanitized input to Git CLI
  • CVE-2025-68145: Path validation bypass allows access to repositories outside the allowlist
When combined with the Filesystem MCP server, attackers can exploit Git’s smudge/clean filters to execute shell commands. The entire exploit chain can be triggered via prompt injection (a malicious README, a poisoned issue description, or a compromised web page).

What We Detect

Our Tool Call Guard detects Git MCP exploitation patterns, including:
  • Malicious instructions embedded in README files and issue descriptions
  • git_init calls targeting system paths (/etc/, /var/, ~/.ssh/)
  • Path traversal attempts (../../../etc/passwd)
  • Argument injection in git commands (;, &&, | in parameters)
  • Git smudge/clean filter exploitation with shell commands
  • Download-and-execute patterns in .gitattributes configurations
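
The path and argument checks above can be sketched as a small pre-flight validator for Git tool calls. This is an illustrative sketch only, not ToolGuard's logic and not the upstream patch. The allowlisted root, the argument names `repo_path` and `target`, and the finding labels are assumptions. Path resolution here is purely lexical, so symlinks are not followed.

```python
import posixpath

ALLOWED_ROOT = "/home/agent/repos"  # hypothetical allowlisted directory
SHELL_TOKENS = (";", "&&", "|", "`", "$(")


def is_path_allowed(path: str) -> bool:
    """Lexically resolve the path and require it to stay under ALLOWED_ROOT."""
    if path.startswith("~"):
        return False  # refuse home-directory shorthand such as ~/.ssh
    resolved = posixpath.normpath(posixpath.join(ALLOWED_ROOT, path))
    return resolved == ALLOWED_ROOT or resolved.startswith(ALLOWED_ROOT + "/")


def check_git_call(args: dict[str, str]) -> list[str]:
    """Return finding labels for one tool call's string arguments."""
    findings = []
    repo = args.get("repo_path", "")
    if repo and not is_path_allowed(repo):
        findings.append("path_outside_allowlist")
    for name, value in args.items():
        # A value that starts with "-" could be parsed by git as an option.
        if name != "repo_path" and value.startswith("-"):
            findings.append(f"option_like_argument:{name}")
        if any(token in value for token in SHELL_TOKENS):
            findings.append(f"shell_metacharacter:{name}")
    return findings
```

For example, `check_git_call({"repo_path": "/home/agent/repos/app/../../../../etc/passwd"})` reports `path_outside_allowlist`, because the traversal segments are collapsed before the prefix comparison.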

Reprompt Attack

Covered by Runlayer ToolGuard
Disclosed by: Varonis Threat Labs
Date: January 2026
Affected: Microsoft Copilot Personal
Status: Patched by Microsoft
Read the full disclosure →

Attack Summary

Reprompt is a single-click attack that exfiltrates user data from Microsoft Copilot through three techniques:
  • P2P (Parameter 2 Prompt) Injection: Exploits URL parameters to inject prompts directly
  • Double-Request Bypass: Evades safeguards by instructing the AI to perform each action twice
  • Chain-Request: Continuous, hidden data exfiltration through follow-up server instructions
The attack requires no plugins or user interaction beyond clicking a link.

What We Detect

Our Tool Call Guard detects all variants of this attack pattern, including:
  • URL parameters containing encoded user data
  • Instructions to fetch external URLs with embedded variables
  • Chain-request patterns with staged exfiltration
  • Pseudo-code variable substitution for data leaks
  • Double-execution bypass attempts
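
One of the patterns above, instructions to fetch a URL whose query string carries substituted variables, can be illustrated with a short heuristic. This is a sketch under stated assumptions: the placeholder syntaxes (`$name`, `${name}`, `{{name}}`) and the function name are hypothetical, and real payloads may use other templating styles.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical placeholder syntaxes: $name, ${name}, {name}, {{name}}
PLACEHOLDER = re.compile(r"\$\{\w+\}|\$\w+|\{\{?\s*\w+\s*\}?\}")
URL = re.compile(r"https?://\S+")


def exfil_urls(instruction: str) -> list[str]:
    """Return URLs whose query values contain template-style placeholders."""
    flagged = []
    for url in URL.findall(instruction):
        query = urlsplit(url).query
        values = [value for _key, value in parse_qsl(query, keep_blank_values=True)]
        if any(PLACEHOLDER.search(value) for value in values):
            flagged.append(url)
    return flagged
```

A plain search link such as `https://example.com/search?q=weather` passes, while a URL like `https://attacker.example/log?u=$username` is flagged because the query value is a variable to be filled in with user data.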

Cowork File Exfiltration

Covered by Runlayer ToolGuard
Disclosed by: PromptArmor
Date: January 2026
Affected: Claude Cowork (Research Preview)
Status: Known issue, users advised to exercise caution
Read the full disclosure →

Attack Summary

Claude Cowork is vulnerable to file exfiltration attacks via indirect prompt injection. Attackers can:
  • Upload malicious “skill” documents with hidden instructions
  • Exploit allowlisted API endpoints (Anthropic API) for data egress
  • Use curl commands to upload user files to attacker-controlled accounts
  • Chain AppleScript and shell commands for system access
The attack exploits the trust boundary between Cowork’s VM environment and allowlisted domains.

What We Detect

Our Tool Call Guard detects all variants of this attack pattern, including:
  • curl commands targeting file upload APIs
  • API keys embedded in shell commands
  • File path traversal and glob patterns
  • Hidden instructions in document outputs
  • Shell command chaining (osascript, do shell script)
  • Base64 encoding of sensitive data
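
As a rough illustration of the shell-level signals above, the following sketch tokenizes a command with `shlex` and looks for curl file uploads, key-shaped strings, and AppleScript shell chaining. It is not ToolGuard's implementation. The `sk-` key shape and the finding labels are assumptions made for this example.

```python
import re
import shlex

API_KEY = re.compile(r"\bsk-[A-Za-z0-9_-]{16,}")  # assumed key shape
FORM_FLAGS = {"-F", "--form"}
DATA_FLAGS = {"-d", "--data", "--data-binary"}
UPLOAD_FLAGS = {"-T", "--upload-file"}


def uploads_local_file(tokens: list[str]) -> bool:
    """True if a curl flag is followed by a value that reads a local file."""
    for flag, value in zip(tokens, tokens[1:]):
        if flag in UPLOAD_FLAGS:
            return True
        if flag in FORM_FLAGS and "=@" in value:  # -F name=@path
            return True
        if flag in DATA_FLAGS and value.startswith("@"):  # -d @path
            return True
    return False


def inspect_shell_command(command: str) -> list[str]:
    """Return coarse finding labels for one shell command string."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return ["unparseable_command"]  # e.g. unbalanced quotes
    findings = []
    if "curl" in tokens and uploads_local_file(tokens):
        findings.append("curl_file_upload")
    if API_KEY.search(command):
        findings.append("embedded_api_key")
    if "osascript" in tokens or "do shell script" in command:
        findings.append("applescript_shell_chain")
    return findings
```

Tokenizing first avoids treating an `@` inside an ordinary form value (an email address, say) as a file upload.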

CamoLeak GitHub Copilot Prompt Injection

Covered by Runlayer ToolGuard
Disclosed by: Legit Security (Omer Mayraz)
Date: October 2025
Affected: GitHub Copilot Chat
CVSS: 9.6 (Critical)
SecurityWeek: GitHub Copilot Chat Flaw Leaked Data →
Nudge Security: CamoLeak Technical Analysis →

Attack Summary

CamoLeak exploits GitHub Copilot Chat through hidden markdown comments in PRs and issues. When Copilot ingests these instructions, it exfiltrates secrets and source code via GitHub’s Camo image proxy:
  • Hidden Markdown Injection: Malicious instructions embedded in PR/issue comments
  • CSP Bypass: Data exfiltration through GitHub’s trusted Camo image proxy
  • Secret Extraction: AWS keys, API tokens, and private code leaked silently
  • Response Manipulation: Copilot recommendations altered by attacker instructions
GitHub disabled image rendering in Copilot Chat on August 14, 2025; the issue was publicly disclosed in October 2025.

What We Detect

Our Tool Call Guard detects GitHub/Copilot prompt injection patterns, including:
  • Hidden instructions in markdown comments and code blocks
  • Data exfiltration via image URLs and proxy services
  • $(command) and backtick command substitution patterns
  • curl | bash, wget | sh, and download-and-execute chains
  • Shell metacharacters in tool outputs (;, &&, |)
  • Base64-encoded command payloads
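
Two of the signals above, hidden markdown comments and image URLs used as an exfiltration channel, can be surfaced with a few lines of Python. This is a minimal sketch, not the product's detection logic. The `max_images` threshold and the returned field names are arbitrary assumptions.

```python
import re

HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\((https?://[^)\s]+)\)")


def review_markdown(markdown: str, max_images: int = 3) -> dict:
    """Collect invisible comment text and image URLs from a markdown string."""
    comments = [c.strip() for c in HTML_COMMENT.findall(markdown) if c.strip()]
    images = MARKDOWN_IMAGE.findall(markdown)
    return {
        "hidden_comments": comments,  # text a human reviewer would not see rendered
        "image_urls": images,
        "too_many_images": len(images) > max_images,  # assumed threshold
    }
```

Extracting the comment text lets a downstream check treat it as untrusted input, not as instructions.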

Figma MCP RCE

Covered by Runlayer ToolGuard
CVE: CVE-2025-53967
Date: September 2025
Affected: Figma MCP Server (figma-developer-mcp) < 0.6.3
Status: Patched in version 0.6.3
Read the full disclosure →

Attack Summary

A command injection vulnerability in the Figma MCP Server allows unauthenticated remote attackers to execute arbitrary OS commands. The vulnerability exists in the fetchWithRetry function, which falls back to curl via child_process.exec when native fetch fails:
  • Shell Metacharacter Injection: Attackers inject characters like ;, &&, or | into URLs or headers
  • No Authentication Required: The attack requires only network access to the MCP server
  • Full Code Execution: Successful exploitation leads to arbitrary command execution with MCP process privileges
CVSS Score: 8.0 (High)

What We Detect

Our Tool Call Guard detects command injection patterns in tool outputs, including:
  • Shell metacharacters in URLs and parameters (;, &&, |, backticks)
  • Command chaining attempts in tool responses
  • child_process.exec exploitation patterns
  • Encoded shell commands in tool outputs
  • Suspicious curl command constructions
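
The affected code is Node.js, but the underlying mistake (interpolating a URL into a shell string) is language-independent. The Python analogue below is a sketch of the safer pattern, assuming a hypothetical helper `build_curl_argv`: it validates the URL and returns an argument list that can be passed to `subprocess.run` without `shell=True`, so the URL never reaches a shell parser. It is not the fix shipped in version 0.6.3.

```python
from urllib.parse import urlsplit

# Tokens a shell would interpret; a lone "&" is left out because it is common in URLs.
SHELL_TOKENS = (";", "&&", "|", "`", "$(", "\n")


def has_shell_metacharacters(value: str) -> bool:
    """True if the value contains a token a shell would treat specially."""
    return any(token in value for token in SHELL_TOKENS)


def build_curl_argv(url: str) -> list[str]:
    """Return an argv list for curl, or raise ValueError for suspicious input."""
    parts = urlsplit(url)
    # Requiring an http(s) scheme also guarantees the URL cannot start with "-".
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("only absolute http(s) URLs are accepted")
    if has_shell_metacharacters(url):
        raise ValueError("URL contains shell metacharacters")
    # Pass this list to subprocess.run(argv) with the default shell=False.
    return ["curl", "--silent", "--fail", url]
```

The metacharacter check is defense in depth; the argument list is what removes the injection path, because no shell ever parses the URL.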

Poisoned Document Injection

Covered by Runlayer ToolGuard
Disclosed by: Multiple researchers (Wired, Simon Willison, Google Security)
Date: September 2025
Affected: ChatGPT, Notion AI, Google Gemini, Google Drive, Confluence
Severity: High
Wired: Poisoned Document Could Leak Secret Data →
Simon Willison: Notion’s Lethal Trifecta →
Notion: How We Protect Against Prompt Injection →

Attack Summary

Malicious instructions are embedded in documents that AI assistants retrieve through MCP tools. The hidden directives instruct the AI to exfiltrate data, call additional tools, or send information to attacker-controlled endpoints:
  • Hidden System Directives: Instructions disguised as “AI PROCESSING NOTE” or “SYSTEM DIRECTIVE”
  • Fake Compliance Requirements: Claims like “GDPR requires logging to external endpoint”
  • Embedded Webhook Calls: Instructions to POST conversation context to external URLs
  • HTML Comment Injection: Directives hidden in <!-- --> comment blocks
  • Metadata Field Exploitation: Malicious instructions in _sync, _analytics, or _telemetry fields

What We Detect

Our Tool Call Guard detects poisoned document patterns, including:
  • Instructions to call webhook, http_request, or API tools
  • External URLs disguised as internal analytics endpoints
  • Directives to include user identity, tokens, or conversation context
  • Fake compliance/audit logging requirements
  • Hidden instruction markers ([SYSTEM], AI_DIRECTIVE, <<HIDDEN>>)
  • Instructions targeting Slack, email, or messaging tools for exfiltration
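
A simplified scanner for retrieved documents might combine the markers listed above with a check for outbound-request instructions. The sketch below is illustrative only. The verb list, the 80-character window, and the finding labels are assumptions, and a production detector would need far broader coverage.

```python
import re

MARKERS = re.compile(
    r"\[SYSTEM\]|AI_DIRECTIVE|<<HIDDEN>>|AI PROCESSING NOTE|SYSTEM DIRECTIVE",
    re.IGNORECASE,
)
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# An action verb followed shortly by a URL (assumed window: 80 characters).
OUTBOUND = re.compile(
    r"\b(post|send|forward|upload)\b.{0,80}?https?://",
    re.IGNORECASE | re.DOTALL,
)


def scan_document(text: str) -> set[str]:
    """Return the set of finding labels triggered by a retrieved document."""
    findings = set()
    if MARKERS.search(text):
        findings.add("hidden_instruction_marker")
    if OUTBOUND.search(text):
        findings.add("outbound_request_instruction")
    for comment in HTML_COMMENT.findall(text):
        if MARKERS.search(comment) or OUTBOUND.search(comment):
            findings.add("directive_in_html_comment")
    return findings
```

Returning a set keeps the result independent of where in the document each pattern appears.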

ForcedLeak Webhook Exfiltration

Covered by Runlayer ToolGuard
Disclosed by: Noma Security
Date: September 2025
Affected: Salesforce Agentforce, SaaS AI agents with webhook access
CVSS: 9.4 (Critical)
Noma Security: ForcedLeak Disclosure →
The Register: Salesforce Agentforce ForcedLeak Attack →
MITRE ATT&CK: Exfiltration Over Webhook (T1567.004) →

Attack Summary

ForcedLeak exploits AI agents through prompt injection in form fields (like Web-to-Lead), then exfiltrates CRM data via expired domains still trusted by CSP policies:
  • Fake Analytics Endpoints: URLs like https://slack-analytics.workspace-insights.io
  • Disguised Sync Webhooks: Instructions to POST data to “backup” or “sync” services
  • Metadata Field Exploitation: _telemetry, _analytics, _sync fields with exfil instructions
  • Compliance Theater: Claims that logging is required for SOC2/GDPR compliance
  • Integration Mimicry: URLs that mimic legitimate SaaS domains (e.g., atlassian-analytics.io)

What We Detect

Our Tool Call Guard detects subtle exfiltration patterns, including:
  • External URLs in _metadata, _sync, _telemetry, or _analytics fields
  • Domains that mimic legitimate SaaS providers
  • Instructions to include accessor identity, tokens, or query context
  • Fake webhook URLs disguised as integrations
  • Claims of compliance or audit requirements for external logging
  • POST/webhook instructions targeting user email, API keys, or session data
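
The metadata-field check above can be sketched as a walk over a tool's JSON output. Everything specific here is an assumption for illustration: the trusted host, the brand list, and the choice to inspect only top-level fields. A substring match on brand names is crude and would also hit a brand's real domains, so those would need to be on the allowlist.

```python
from collections.abc import Iterator
from urllib.parse import urlsplit

SUSPECT_FIELDS = {"_metadata", "_sync", "_telemetry", "_analytics"}
TRUSTED_HOSTS = {"api.example-crm.com"}  # hypothetical allowlist
BRANDS = ("slack", "atlassian", "salesforce")  # hypothetical brand names


def find_urls(value: object) -> Iterator[str]:
    """Yield every http(s) URL string nested anywhere inside value."""
    if isinstance(value, str):
        if value.startswith(("http://", "https://")):
            yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from find_urls(item)
    elif isinstance(value, list):
        for item in value:
            yield from find_urls(item)


def audit_tool_output(payload: dict) -> list[tuple[str, str, str]]:
    """Return (field, host, reason) for untrusted URLs in suspect top-level fields."""
    findings = []
    for field, value in payload.items():
        if field not in SUSPECT_FIELDS:
            continue
        for url in find_urls(value):
            host = urlsplit(url).hostname or ""
            if host in TRUSTED_HOSTS:
                continue
            lookalike = any(brand in host for brand in BRANDS)
            findings.append((field, host, "brand_lookalike" if lookalike else "external_url"))
    return findings
```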

Helm Chart Code Injection

Covered by Runlayer ToolGuard
CVE: CVE-2025-53547 (Helm), CVE-2025-53355 (mcp-server-kubernetes)
Date: July 2025
Affected: Helm < 3.18.4, mcp-server-kubernetes < 2.5.0, kubectl MCP integrations
Severity: High (CVSS 8.5-8.9)
GitHub Advisory: CVE-2025-53547 →
Vulert: Helm Code Injection Analysis →

Attack Summary

Command injection payloads are embedded in Kubernetes manifests, Helm chart values, and container configurations returned through infrastructure MCP tools. When AI assistants process these outputs, they may execute or recommend applying configurations containing malicious commands:
  • Container Command Injection: Shell commands in command: and args: arrays
  • Init Container Exploitation: Malicious init containers with download-and-execute patterns
  • Environment Variable Injection: Command substitution in environment values
  • Lifecycle Hook Abuse: PostStart/PreStop hooks with shell execution
  • Helm Values Poisoning: Malicious commands in Helm values.yaml or templates
  • ConfigMap/Secret Injection: Scripts embedded in configuration data

What We Detect

Our Tool Call Guard detects K8s/Helm command injection, including:
  • Shell commands in YAML command: arrays (sh -c "$(curl ...)")
  • Init containers with busybox/alpine running curl/wget
  • lifecycle.postStart and lifecycle.preStop hook exploitation
  • Environment variables with $() command substitution
  • Helm pre/post install hooks with shell execution
  • ConfigMap data containing embedded shell scripts
  • ArgoCD/FluxCD sync hooks with malicious commands
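
Several of the manifest checks above can be sketched against an already-parsed pod spec (a plain Python dict, so no YAML library is needed here). This is an illustrative sketch, not ToolGuard's implementation, and the regular expressions and finding strings are assumptions. Kubernetes itself uses `$(VAR_NAME)` for variable references, so the environment check only flags substitutions that contain whitespace, on the assumption that those are shell commands, not variable names.

```python
import re

# "curl ... | sh" style pipelines, or "$(curl ...)" substitutions.
DOWNLOAD_EXEC = re.compile(
    r"\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b|\$\(\s*(curl|wget)\b"
)
# $( ... ) containing whitespace, which a plain $(VAR_NAME) reference never does.
COMMAND_SUBSTITUTION = re.compile(r"\$\([^)]*\s[^)]*\)")


def scan_container(container: dict, kind: str) -> list[str]:
    """Return findings for one container entry of a pod spec."""
    name = container.get("name", "<unnamed>")
    findings = []
    argv = container.get("command", []) + container.get("args", [])
    if DOWNLOAD_EXEC.search(" ".join(argv)):
        findings.append(f"{kind}/{name}: download-and-execute in command")
    for env in container.get("env", []):
        if COMMAND_SUBSTITUTION.search(str(env.get("value", ""))):
            findings.append(f"{kind}/{name}: command substitution in env {env.get('name')}")
    lifecycle = container.get("lifecycle", {})
    for hook in ("postStart", "preStop"):
        hook_cmd = lifecycle.get(hook, {}).get("exec", {}).get("command", [])
        if DOWNLOAD_EXEC.search(" ".join(hook_cmd)):
            findings.append(f"{kind}/{name}: download-and-execute in {hook} hook")
    return findings


def scan_pod_spec(spec: dict) -> list[str]:
    """Scan init containers first, then regular containers."""
    findings = []
    for container in spec.get("initContainers", []):
        findings += scan_container(container, "initContainer")
    for container in spec.get("containers", []):
        findings += scan_container(container, "container")
    return findings
```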