🔧 AI Security Testing Tools
Curated collection of 25+ tools for AI/ML security testing, red teaming, prompt injection, and LLM vulnerability assessment
Prompt Injection Testing 6 tools
Garak
Open SourceOpen-source LLM vulnerability scanner by NVIDIA. Probes for hallucinations, data leakage, prompt injection, toxicity, and jailbreaks with automated reporting.
PyRIT
Open SourceMicrosoft's open-source framework for AI red teaming. Automates attacks on LLM endpoints including jailbreaks, prompt extraction, and harmful content generation.
LLM Guard
Open SourceOpen-source input/output scanner for LLM applications. Detects prompt injection, PII leakage, toxic outputs, and other security risks in real-time.
Rebuff
Open SourceSelf-hardening prompt injection detector with multi-layer defense. Uses heuristics, LLM-based analysis, and vector similarity to catch attacks.
Prompt Map
Open SourceVisual prompt injection testing tool that maps attack surfaces and identifies vulnerabilities through systematic prompt mutation and fuzzing.
Comment and Control Scanner
Open SourceDetects prompt injection via GitHub PR titles, issues, and comments targeting AI coding agents (Claude Code, Gemini CLI, Copilot Agent). Inspired by the April 2026 Johns Hopkins research.
AI Red Teaming 4 tools
AutoRedTeam
Open SourceOpen-source automated red teaming framework for LLMs. Generates adversarial prompts, evaluates model responses, and produces security reports.
Adversa AI Platform
CommercialCommercial AI red teaming platform with automated adversarial testing, robustness evaluation, and compliance reporting for enterprise LLM deployments.
Microsoft AI Red Team
CommercialMicrosoft's official AI red team resources and PyRIT framework. Includes attack libraries, evaluation datasets, and best practices for LLM security testing.
Lakera
CommercialEnterprise-grade LLM security platform with real-time prompt injection detection, data loss prevention, and automated security assessments.
Agent Security 4 tools
OpenClaw Security Scanner
Open SourceSecurity scanner for OpenClaw agent frameworks. Detects exposed instances, tool registry poisoning, gateway misconfigurations, and function-call injection risks.
MCP Inspector
Open SourceOpen-source debugging and security inspection tool for Model Context Protocol servers. Validates inputs, audits tool definitions, and tests for injection vectors.
AgentOps
CommercialObservability and security monitoring platform for AI agents. Tracks agent actions, detects anomalous behavior, and provides security audit trails.
Hermes Agent Guard
Open SourceSecurity scanner for Hermes Agent frameworks. Detects tool registry poisoning, function-call injection, Brainworm/C2-style attacks, and malicious skill patterns.
LLM API Security 2 tools
Burp Suite AI Extensions
CommercialCollection of Burp Suite extensions for testing LLM-backed APIs. Includes prompt injection payloads, response analyzers, and API security test cases.
OWASP LLM Top 10 Checklist
Open SourceOfficial OWASP checklist for LLM application security testing. Structured test cases covering all 10 LLM risk categories with verification steps.
Model Evaluation 3 tools
EleutherAI LM Eval
Open SourceOpen-source framework for evaluating language models on hundreds of benchmarks. Supports safety evaluations, truthfulness tests, and custom task creation.
HELM
Open SourceHolistic Evaluation of Language Models by Stanford. Comprehensive benchmarking framework covering accuracy, calibration, robustness, fairness, and social bias.
MLCommons AI Safety
Open SourceIndustry-standard AI safety benchmarking initiative. Provides standardized tests for hazardous capabilities, harmful outputs, and model alignment evaluation.
Prompt Injection Tool Comparison
Side-by-side feature comparison of the top 5 prompt injection testing tools.
| Feature | Garak | PyRIT | LLM Guard | Rebuff | Prompt Map |
|---|---|---|---|---|---|
| Open Source | Yes | Yes | Yes | Yes | Yes |
| Automation | Full | Full | Real-time | API-based | Semi-auto |
| LLM Support | OpenAI, Anthropic, Local | Azure OpenAI, OpenAI | Any (middleware) | Any (SDK) | OpenAI, Local |
| Report Generation | Detailed | Detailed | Metrics | Minimal | Visual |
| Price | Free | Free | Free | Free | Free |
Getting Started
Quick-start checklist for beginning your AI security testing journey.
Pre-Testing Checklist
First Test Run
Pro tip: Start with Garak for comprehensive automated scanning, then use PyRIT for targeted adversarial testing. Always test in isolated environments before production deployment.
Usage Guidelines
Legal & Ethical
- Always obtain proper authorization before testing
- Respect rate limits and terms of service
- Never use production data without consent
- Follow responsible disclosure practices
Technical Best Practices
- Test in isolated environments first
- Document all testing activities
- Implement monitoring and rollback capabilities
- Use multiple tools for comprehensive coverage