AI System Threats
Comprehensive catalog of AI-specific vulnerabilities and attack vectors
Critical Threats
Immediate risks with potential for severe impact. Require urgent remediation.
High Risk
Serious vulnerabilities that should be addressed promptly to reduce exposure.
Defense Strategies
Best practices and mitigations for reducing AI threat exposure.
Threat Categories
Prompt Injection
CriticalCrafted inputs designed to manipulate model behavior, override safeguards, or extract sensitive information.
Testing Approach
- Craft adversarial prompts with hidden instructions or special characters
- Attempt multi-turn injection chaining
- Test for jailbreak bypass of alignment filters
- Evaluate output sanitization and safety layers
Training Data Poisoning
CriticalMalicious or biased data introduced into training pipelines, compromising model integrity and reliability.
Testing Approach
- Analyze data provenance and supply chain
- Inject poisoned samples and assess downstream effects
- Test resilience to mislabeled or manipulated data
- Review validation and anomaly detection mechanisms
Model Inversion
HighReconstructing training data or sensitive attributes from model outputs, leading to privacy breaches.
Testing Approach
- Attempt to recover representative training samples
- Test susceptibility to membership inference attacks
- Evaluate differential privacy protections
- Assess risk of leaking PII from embeddings
Adversarial Examples
HighInputs intentionally perturbed to cause misclassification, hallucinations, or other erroneous outputs.
Testing Approach
- Generate gradient-based adversarial examples
- Apply noise and perturbation attacks
- Check model consistency across variations
- Evaluate robustness against transfer attacks
Model Stealing
HighExtraction of model functionality or parameters through repeated queries or side-channel analysis.
Testing Approach
- Simulate query-based model extraction
- Analyze API rate limits and response variability
- Check for fingerprinting vulnerabilities
- Test throttling and monitoring protections
Data Memorization Leakage
HighSensitive information unintentionally memorized by AI models, retrievable via crafted prompts.
Testing Approach
- Probe for known secret patterns in outputs
- Test for repeated exposure of sensitive training data
- Assess risk of accidental PII disclosure
Model Misuse & Malicious Automation
HighAI leveraged to perform tasks outside intended scope, enabling social engineering, spam, or automated attacks.
Testing Approach
- Simulate misuse scenarios using sandbox models
- Test AI output moderation and guardrails
- Assess monitoring alerts for abnormal behaviors
Testing & Defense Best Practices
Safe Testing Environment
- Use sandboxed or replica instances for testing
- Never perform unauthorized tests on production systems
- Implement monitoring, logging, and rollback capabilities
Documentation & Observability
- Maintain detailed test logs and evidence
- Capture model responses for reproducibility
- Tag, classify, and organize test cases for future audits
Legal & Ethical Compliance
- Stay within authorized scope and contracts
- Respect data protection, privacy laws, and intellectual property
- Follow responsible disclosure and coordinated vulnerability disclosure
Monitoring & Mitigation
- Implement anomaly detection for unusual AI outputs
- Regularly review rate limits, API access, and query patterns
- Integrate real-time alerting for critical threats