Skip to main content

Recent Security Threats

This page highlights recent AI agent and MCP attack patterns that Runlayer ToolGuard detects and protects against. These are real-world attacks we’ve seen, but our models cover many more threat patterns beyond what’s listed here.
Our models are tested against novel attack variants—not just the exact examples from disclosures—to ensure robust, generalizable detection.

How We Stay Ahead

  1. Continuous Monitoring: We track security disclosures from industry researchers and threat intelligence sources
  2. Rapid Response: When new attack patterns emerge, we generate diverse training examples and update our models quickly
  3. Generalization Testing: We test against novel variants—not just disclosed examples—to ensure robust detection
  4. Low False Positive Rate: Our testing includes benign examples to ensure legitimate tool calls aren’t blocked

Disclosed Vulnerabilities


Anthropic Git MCP Server Vulnerabilities

Covered by Runlayer ToolGuard
Disclosed by: Cyata Security
Date: January 2026
CVEs: CVE-2025-68143, CVE-2025-68144, CVE-2025-68145
Affected: mcp-server-git (Anthropic’s reference implementation)
Status: Patched December 17, 2025
Read the full disclosure →

Attack Summary

Three vulnerabilities in Anthropic’s canonical Git MCP server allow attackers to read, delete, or overwrite arbitrary files on the host system:
  • CVE-2025-68143: Unrestricted git_init allows repository initialization on arbitrary file system paths
  • CVE-2025-68144: Argument injection in git_diff passes unsanitized input to Git CLI
  • CVE-2025-68145: Path validation bypass allows access to repositories outside the allowlist
When combined with the Filesystem MCP server, attackers can exploit Git’s smudge/clean filters to execute shell commands. The entire exploit chain can be triggered via prompt injection (malicious README, poisoned issue description, or compromised web page).

What We Detect

Our Tool Call Guard detects Git MCP exploitation patterns, including:
  • Malicious instructions embedded in README files and issue descriptions
  • git_init calls targeting system paths (/etc/, /var/, ~/.ssh/)
  • Path traversal attempts (../../../etc/passwd)
  • Argument injection in git commands (;, &&, | in parameters)
  • Git smudge/clean filter exploitation with shell commands
  • Download-and-execute patterns in .gitattributes configurations

Reprompt Attack

Covered by Runlayer ToolGuard
Disclosed by: Varonis Threat Labs
Date: January 2026
Affected: Microsoft Copilot Personal
Status: Patched by Microsoft
Read the full disclosure →

Attack Summary

Reprompt is a single-click attack that exfiltrates user data from Microsoft Copilot through three techniques:
  • P2P Injection: Exploits URL parameters to inject prompts directly
  • Double-Request Bypass: Evades safeguards by instructing the AI to repeat actions twice
  • Chain-Request: Continuous, hidden data exfiltration through follow-up server instructions
The attack requires no plugins or user interaction beyond clicking a link.

What We Detect

Our Tool Call Guard detects all variants of this attack pattern, including:
  • URL parameters containing encoded user data
  • Instructions to fetch external URLs with embedded variables
  • Chain-request patterns with staged exfiltration
  • Pseudo-code variable substitution for data leaks
  • Double-execution bypass attempts

Cowork File Exfiltration

Covered by Runlayer ToolGuard
Disclosed by: PromptArmor
Date: January 2026
Affected: Claude Cowork (Research Preview)
Status: Known issue, users advised to exercise caution
Read the full disclosure →

Attack Summary

Claude Cowork is vulnerable to file exfiltration attacks via indirect prompt injection. Attackers can:
  • Upload malicious “skill” documents with hidden instructions
  • Exploit allowlisted API endpoints (Anthropic API) for data egress
  • Use curl commands to upload user files to attacker-controlled accounts
  • Chain AppleScript and shell commands for system access
The attack exploits the trust boundary between Cowork’s VM environment and allowlisted domains.

What We Detect

Our Tool Call Guard detects all variants of this attack pattern, including:
  • curl commands targeting file upload APIs
  • API keys embedded in shell commands
  • File path traversal and glob patterns
  • Hidden instructions in document outputs
  • Shell command chaining (osascript, do shell script)
  • Base64 encoding of sensitive data

CamoLeak GitHub Copilot Prompt Injection

Covered by Runlayer ToolGuard
Disclosed by: Legit Security (Omer Mayraz)
Date: October 2025
Affected: GitHub Copilot Chat
CVSS: 9.6 (Critical)
SecurityWeek: GitHub Copilot Chat Flaw Leaked Data → Nudge Security: CamoLeak Technical Analysis →

Attack Summary

CamoLeak exploits GitHub Copilot Chat through hidden markdown comments in PRs and issues. When Copilot ingests these instructions, it exfiltrates secrets and source code via GitHub’s Camo image proxy:
  • Hidden Markdown Injection: Malicious instructions embedded in PR/issue comments
  • CSP Bypass: Data exfiltration through GitHub’s trusted Camo image proxy
  • Secret Extraction: AWS keys, API tokens, and private code leaked silently
  • Response Manipulation: Copilot recommendations altered by attacker instructions
GitHub disabled image rendering in Copilot Chat on August 14, 2025; public disclosure October 2025.

What We Detect

Our Tool Call Guard detects GitHub/Copilot prompt injection patterns, including:
  • Hidden instructions in markdown comments and code blocks
  • Data exfiltration via image URLs and proxy services
  • $(command) and backtick command substitution patterns
  • curl | bash, wget | sh, and download-and-execute chains
  • Shell metacharacters in tool outputs (;, &&, |)
  • Base64-encoded command payloads

Figma MCP RCE

Covered by Runlayer ToolGuard
CVE: CVE-2025-53967
Date: September 2025
Affected: Figma MCP Server (figma-developer-mcp) < 0.6.3
Status: Patched in version 0.6.3
Read the full disclosure →

Attack Summary

A command injection vulnerability in the Figma MCP Server allows unauthenticated remote attackers to execute arbitrary OS commands. The vulnerability exists in the fetchWithRetry function which falls back to curl via child_process.exec when native fetch fails:
  • Shell Metacharacter Injection: Attackers inject characters like ;, &&, or | into URLs or headers
  • No Authentication Required: The attack requires only network access to the MCP server
  • Full Code Execution: Successful exploitation leads to arbitrary command execution with MCP process privileges
CVSS Score: 8.0 (High)

What We Detect

Our Tool Call Guard detects command injection patterns in tool outputs, including:
  • Shell metacharacters in URLs and parameters (;, &&, |, backticks)
  • Command chaining attempts in tool responses
  • child_process.exec exploitation patterns
  • Encoded shell commands in tool outputs
  • Suspicious curl command constructions

Poisoned Document Injection

Covered by Runlayer ToolGuard
Disclosed by: Multiple researchers (Wired, Simon Willison, Google Security)
Date: September 2025
Affected: ChatGPT, Notion AI, Google Gemini, Google Drive, Confluence
Severity: High
Wired: Poisoned Document Could Leak Secret Data → Simon Willison: Notion’s Lethal Trifecta → Notion: How We Protect Against Prompt Injection →

Attack Summary

Malicious instructions are embedded in documents that AI assistants retrieve through MCP tools. The hidden directives instruct the AI to exfiltrate data, call additional tools, or send information to attacker-controlled endpoints:
  • Hidden System Directives: Instructions disguised as “AI PROCESSING NOTE” or “SYSTEM DIRECTIVE”
  • Fake Compliance Requirements: Claims like “GDPR requires logging to external endpoint”
  • Embedded Webhook Calls: Instructions to POST conversation context to external URLs
  • HTML Comment Injection: Directives hidden in <!-- --> comment blocks
  • Metadata Field Exploitation: Malicious instructions in _sync, _analytics, or _telemetry fields

What We Detect

Our Tool Call Guard detects poisoned document patterns, including:
  • Instructions to call webhook, http_request, or API tools
  • External URLs disguised as internal analytics endpoints
  • Directives to include user identity, tokens, or conversation context
  • Fake compliance/audit logging requirements
  • Hidden instruction markers ([SYSTEM], AI_DIRECTIVE, <<HIDDEN>>)
  • Instructions targeting Slack, email, or messaging tools for exfiltration

ForcedLeak Webhook Exfiltration

Covered by Runlayer ToolGuard
Disclosed by: Noma Security
Date: September 2025
Affected: Salesforce Agentforce, SaaS AI agents with webhook access
CVSS: 9.4 (Critical)
Noma Security: ForcedLeak Disclosure → The Register: Salesforce Agentforce ForcedLeak Attack → MITRE ATT&CK: Exfiltration Over Webhook (T1567.004) →

Attack Summary

ForcedLeak exploits AI agents through prompt injection in form fields (like Web-to-Lead), then exfiltrates CRM data via expired domains still trusted by CSP policies:
  • Fake Analytics Endpoints: URLs like https://slack-analytics.workspace-insights.io
  • Disguised Sync Webhooks: Instructions to POST data to “backup” or “sync” services
  • Metadata Field Exploitation: _telemetry, _analytics, _sync fields with exfil instructions
  • Compliance Theater: Claims that logging is required for SOC2/GDPR compliance
  • Integration Mimicry: URLs that mimic legitimate SaaS domains (e.g., atlassian-analytics.io)

What We Detect

Our Tool Call Guard detects subtle exfiltration patterns, including:
  • External URLs in _metadata, _sync, _telemetry, or _analytics fields
  • Domains that mimic legitimate SaaS providers
  • Instructions to include accessor identity, tokens, or query context
  • Fake webhook URLs disguised as integrations
  • Claims of compliance or audit requirements for external logging
  • POST/webhook instructions targeting user email, API keys, or session data

Helm Chart Code Injection

Covered by Runlayer ToolGuard
CVE: CVE-2025-53547 (Helm), CVE-2025-53355 (mcp-server-kubernetes)
Date: July 2025
Affected: Helm < 3.18.4, mcp-server-kubernetes < 2.5.0, kubectl MCP integrations
Severity: Critical (CVSS 8.5-8.9)
GitHub Advisory: CVE-2025-53547 → Vulert: Helm Code Injection Analysis →

Attack Summary

Command injection payloads are embedded in Kubernetes manifests, Helm chart values, and container configurations returned through infrastructure MCP tools. When AI assistants process these outputs, they may execute or recommend applying configurations containing malicious commands:
  • Container Command Injection: Shell commands in command: and args: arrays
  • Init Container Exploitation: Malicious init containers with download-and-execute patterns
  • Environment Variable Injection: Command substitution in environment values
  • Lifecycle Hook Abuse: PostStart/PreStop hooks with shell execution
  • Helm Values Poisoning: Malicious commands in Helm values.yaml or templates
  • ConfigMap/Secret Injection: Scripts embedded in configuration data

What We Detect

Our Tool Call Guard detects K8s/Helm command injection, including:
  • Shell commands in YAML command: arrays (sh -c "$(curl ...)")
  • Init containers with busybox/alpine running curl/wget
  • lifecycle.postStart and lifecycle.preStop hook exploitation
  • Environment variables with $() command substitution
  • Helm pre/post install hooks with shell execution
  • ConfigMap data containing embedded shell scripts
  • ArgoCD/FluxCD sync hooks with malicious commands