Industry-Leading AI Security for MCP Ecosystems
Runlayer ToolGuard is an industry-leading suite of specialized machine learning models that protect your MCP environment from tool poisoning, prompt injection, and output manipulation attacks. With typical inference times of 50-100ms, ToolGuard delivers real-time threat detection without compromising performance.
ToolGuard currently features three specialized threat classification models, with additional models in active development to address emerging attack vectors.
The Models
Tool List Guard
Scans tool definitions at registration to detect risky descriptions, prompt injection attempts, and hidden instructions before tools are made available to your environment.
Tool Call Guard
Scans tool execution outputs in real-time to detect risky responses, data exfiltration attempts, and prompt injection before they reach your LLMs.
Tool Intent Guard
Detects tool intent drift from prompt injections that lead to data exfiltration, privilege escalation, credential theft, and infrastructure damage. Unlike Tool Call Guard, which evaluates individual responses, Tool Intent Guard analyzes tool inputs and outputs together to detect semantic misalignment, catching cases where a tool’s actual behavior diverges from what was requested.
Skill File Scanning
ToolGuard can also scan skill files uploaded to the platform. Each file’s content is analyzed using the same threat classification models, with results cached per content hash and scanner version. Large files are automatically chunked for processing. Skill file scanning runs when skills are uploaded via the CLI (skills push) or the API, producing per-file risk scores and an overall skill-level classification.
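The caching and chunking behavior described above can be sketched in Python as follows. This is only an illustration: the names `SCANNER_VERSION`, `CHUNK_SIZE`, `chunk_text`, `scan_file`, and `scan_chunk` are hypothetical, and the use of SHA-256, the chunk size, and taking the highest chunk score as the file score are assumptions, since the source does not say how ToolGuard implements any of them.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

SCANNER_VERSION = "example-1"  # hypothetical version label
CHUNK_SIZE = 4000              # hypothetical chunk size, in characters

# Results are cached per (content hash, scanner version), as the text describes.
_cache: Dict[Tuple[str, str], float] = {}


def chunk_text(text: str, size: int = CHUNK_SIZE) -> List[str]:
    """Split text into consecutive chunks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)] or [""]


def scan_file(content: str, scan_chunk: Callable[[str], float],
              version: str = SCANNER_VERSION) -> float:
    """Return a risk score for one file, reusing a cached result if present.

    `scan_chunk` stands in for the real classifier (an assumption).
    """
    key = (hashlib.sha256(content.encode("utf-8")).hexdigest(), version)
    if key not in _cache:
        # Assumption: the file score is the highest score of any chunk.
        _cache[key] = max(scan_chunk(c) for c in chunk_text(content))
    return _cache[key]
```

Including the scanner version in the cache key means a result produced by an older scanner is never reused after an upgrade; identical content is simply rescanned under the new version.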
Skill Risk Policy
Admins can configure how the platform responds when a skill scan detects elevated risk. Navigate to Settings → Security Scanners to set the action for each risk tier:

| Risk tier | Default action | Options |
|---|---|---|
| High | Block | Block, Alert, Allow |
| Medium | Alert | Block, Alert, Allow |
- Block — the skill import is rejected
- Alert — the skill is imported with a warning badge visible in the UI; acceptance is logged to the Audit Log
- Allow — the skill is imported without restriction
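To make the tier-to-action mapping concrete, here is a minimal Python sketch of how a policy lookup could work. The names `Action`, `DEFAULT_POLICY`, and `resolve_action` are hypothetical, and treating tiers without a configured action as allowed is an assumption; the source documents actions only for the High and Medium tiers.

```python
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    BLOCK = "block"  # the skill import is rejected
    ALERT = "alert"  # imported with a warning badge and an audit log entry
    ALLOW = "allow"  # imported without restriction


# Defaults from the table above: High -> Block, Medium -> Alert.
DEFAULT_POLICY: Dict[str, Action] = {
    "high": Action.BLOCK,
    "medium": Action.ALERT,
}


def resolve_action(tier: str,
                   overrides: Optional[Dict[str, Action]] = None) -> Action:
    """Pick the action for a skill's risk tier.

    Assumption: tiers with no configured action (for example "low")
    are allowed.
    """
    policy = {**DEFAULT_POLICY, **(overrides or {})}
    return policy.get(tier.lower(), Action.ALLOW)
```

For example, an admin who wants Medium-risk skills rejected as well would be modeled as `resolve_action("medium", {"medium": Action.BLOCK})`.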
LLM Risk Categorization
When Tool List Guard flags a tool at Medium or High risk, an LLM-based categorizer automatically classifies the threat into specific attack categories. The full taxonomy includes:
- Prompt Injection
- Data Exfiltration
- Privilege Escalation
- Destructive Action
- Unauthorized Communication
- Resource Abuse
- Shadow Persistence
- Context Poisoning
- Guardrail Bypass
- Supply Chain Compromise
LLM categorization requires the Bedrock integration to be enabled in your deployment. When disabled, violations retain the default ToolGuard reason.
Additional specialized models are in development to address new attack vectors as they emerge in the MCP ecosystem.
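The gating rules for LLM categorization can be summarized in a short Python sketch. The enum mirrors the taxonomy listed above; everything else (`DEFAULT_REASON`, `violation_reason`, the `categorize` callback, and the tier strings) is a hypothetical stand-in, not the platform's actual API.

```python
from enum import Enum
from typing import Callable


class ThreatCategory(Enum):
    PROMPT_INJECTION = "Prompt Injection"
    DATA_EXFILTRATION = "Data Exfiltration"
    PRIVILEGE_ESCALATION = "Privilege Escalation"
    DESTRUCTIVE_ACTION = "Destructive Action"
    UNAUTHORIZED_COMMUNICATION = "Unauthorized Communication"
    RESOURCE_ABUSE = "Resource Abuse"
    SHADOW_PERSISTENCE = "Shadow Persistence"
    CONTEXT_POISONING = "Context Poisoning"
    GUARDRAIL_BYPASS = "Guardrail Bypass"
    SUPPLY_CHAIN_COMPROMISE = "Supply Chain Compromise"


DEFAULT_REASON = "ToolGuard violation"  # hypothetical placeholder text


def violation_reason(tier: str, bedrock_enabled: bool,
                     categorize: Callable[[], ThreatCategory]) -> str:
    """Return the reason attached to a Tool List Guard violation.

    The categorizer runs only for Medium or High risk and only when the
    Bedrock integration is enabled; otherwise the default reason is kept.
    `categorize` stands in for the LLM call (an assumption).
    """
    if tier in ("Medium", "High") and bedrock_enabled:
        return categorize().value
    return DEFAULT_REASON
```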
Why Industry-Leading?
- Purpose-Built for MCP: Custom-trained threat classification models specifically designed for MCP ecosystem attacks.
- High Performance: Fast inference with typical scan times of 50-100ms.
- Continuously Evolving: Models are regularly refined based on emerging threat patterns.
- Battle-Tested: Deployed in production environments protecting real-world MCP deployments.
- Enterprise-Ready: Complete audit logging, flexible configuration, and Security Dashboard integration.
Configuration
Navigate to Settings → Security Scanners to enable Runlayer ToolGuard models.
Sensitivity Levels
Each scanner phase (Tool List Guard, Tool Call Guard, Tool Intent Guard) has a configurable sensitivity that controls how aggressively it flags findings:

| Level | Behavior |
|---|---|
| Strict | Lowest tolerance — flags more items, fewer false negatives |
| Balanced (default) | Recommended for most environments |
| Moderate | Highest tolerance — fewer flags, useful for noisy connectors |
PII Scan Direction
PII detection can be applied to tool inputs, outputs, or both. The direction controls which traffic the PII scanner inspects:

| Direction | Behavior |
|---|---|
| Input (default) | Scans data sent to MCP tools |
| Output | Scans data returned from MCP tools |
| Both | Scans in both directions |
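The direction setting amounts to a simple predicate over the kind of traffic being inspected. The following Python sketch shows one way to express it; `Direction` and `should_scan` are hypothetical names chosen for illustration.

```python
from enum import Enum


class Direction(Enum):
    INPUT = "input"    # data sent to MCP tools
    OUTPUT = "output"  # data returned from MCP tools
    BOTH = "both"


def should_scan(configured: Direction, traffic: Direction) -> bool:
    """Decide whether the PII scanner inspects a given piece of traffic.

    `traffic` must be INPUT or OUTPUT; BOTH is only valid as a setting.
    """
    if traffic is Direction.BOTH:
        raise ValueError("traffic must be INPUT or OUTPUT")
    return configured is Direction.BOTH or configured is traffic
```

With the default setting (`Direction.INPUT`), tool responses are not scanned for PII, so deployments concerned about sensitive data coming back from tools would need Output or Both.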
Risk Tiers
Tool List Guard assigns a risk tier to each scanned tool based on its confidence score. The default thresholds are:

| Tier | Score Range | Meaning |
|---|---|---|
| Minimal | < 0.6 | Clean scan, no concern |
| Low | 0.6 – 0.7 | Low-confidence flag |
| Medium | 0.7 – 0.9 | Elevated risk, review recommended |
| High | ≥ 0.9 | High-confidence detection |
The thresholds are configurable through the following environment variables:
- RUNLAYER_TOOL_GUARD_LIST_RISK_THRESHOLD_LOW (default 0.6)
- RUNLAYER_TOOL_GUARD_LIST_RISK_THRESHOLD_MEDIUM (default 0.7)
- RUNLAYER_TOOL_GUARD_LIST_RISK_THRESHOLD_HIGH (default 0.9)
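As a worked example of the tier table, the Python sketch below maps a confidence score to a tier using the three threshold variables and their defaults. The function name `risk_tier` is hypothetical, and two details are assumptions: that each lower bound is inclusive (consistent with "≥ 0.9" for High and "< 0.6" for Minimal), and that the values are read as plain floats from the environment.

```python
import os
from typing import Mapping

PREFIX = "RUNLAYER_TOOL_GUARD_LIST_RISK_THRESHOLD_"


def risk_tier(score: float, env: Mapping[str, str] = os.environ) -> str:
    """Map a Tool List Guard confidence score to a risk tier.

    Assumption: lower bounds are inclusive, so a score exactly equal to a
    threshold falls into the higher tier.
    """
    low = float(env.get(PREFIX + "LOW", "0.6"))
    medium = float(env.get(PREFIX + "MEDIUM", "0.7"))
    high = float(env.get(PREFIX + "HIGH", "0.9"))
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    if score >= low:
        return "Low"
    return "Minimal"
```

With the defaults, a score of 0.85 is Medium; lowering the High threshold to 0.8 would move the same score into the High tier.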
Monitoring
- Security Dashboard: View detection timelines, violation trends, and common threat types.
- Connector Pages: Tool List Guard warnings appear directly on connector detail pages when potentially risky tools are detected.
- Audit Logs: Full history of detections, blocks, and configuration changes with confidence scores.
Best Practices
- Use per-server overrides for high-risk external servers
- Combine with MCP access policies for layered security
- Review flagged tools with your security team before blocking
Staying Ahead
Runlayer ToolGuard models are continuously refined based on emerging threat patterns in the MCP ecosystem. Our commitment to continuous innovation ensures you have industry-leading defenses as new attack techniques emerge.
Model Attribution
The Runlayer ToolGuard suite utilizes GA Guard Lite for threat classification embeddings and model inputs. GA Guard Lite is licensed under Apache 2.0.
Related Resources
- Platform Security: Monitor security events and violations
- Security Best Practices: MCP security guidelines and recommendations
- Audit Logs: View detailed activity and security logs
- Access Policies: Configure MCP access control policies