# OWASP LLM Top 10 Testing Checklist

## Mapped Testing Checklist for LLM01–LLM10

### LLM01: Prompt Injection
**Severity: Critical**

- [ ] Test direct prompt injection (override system instructions)
- [ ] Test indirect prompt injection (via external data)
- [ ] Test jailbreak techniques
- [ ] Test system prompt extraction
- [ ] Verify input validation controls
- [ ] Test privilege separation between user/system prompts
- [ ] **Tools**: Garak, PyRIT, Prompt Map, Rebuff

### LLM02: Insecure Output Handling
**Severity: Critical**

- [ ] Test for XSS in LLM outputs
- [ ] Test for SQL injection via LLM responses
- [ ] Test for command injection in generated code
- [ ] Verify output encoding/escaping
- [ ] Test for PII leakage in responses
- [ ] Check for sensitive data exposure
- [ ] **Tools**: Burp Suite, OWASP ZAP

### LLM03: Training Data Poisoning
**Severity: Critical**

- [ ] Assess training data sources
- [ ] Test for data integrity controls
- [ ] Verify data provenance tracking
- [ ] Test for backdoor injection in training
- [ ] Check data validation pipelines
- [ ] **Tools**: Model cards, data lineage tools

### LLM04: Model Denial of Service
**Severity: High**

- [ ] Test resource exhaustion attacks
- [ ] Test recursive context expansion
- [ ] Test long input handling
- [ ] Verify rate limiting
- [ ] Test compute-intensive prompt patterns
- [ ] **Tools**: Load testing frameworks

### LLM05: Supply Chain Vulnerabilities
**Severity: High**

- [ ] Assess model provenance
- [ ] Verify dependency integrity
- [ ] Test plugin/tool security
- [ ] Check for poisoned models
- [ ] Verify signing and verification
- [ ] **Tools**: SLSA, Sigstore

### LLM06: Sensitive Information Disclosure
**Severity: Critical**

- [ ] Test for training data extraction
- [ ] Test for PII leakage
- [ ] Test for credential exposure
- [ ] Verify output filtering
- [ ] Check for system information leakage
- [ ] **Tools**: Garak, LLM Guard

### LLM07: Insecure Plugin Design
**Severity: High**

- [ ] Test plugin input validation
- [ ] Test for plugin sandboxing
- [ ] Verify plugin permissions
- [ ] Test plugin-to-plugin communication
- [ ] Check for plugin escalation paths
- [ ] **Tools**: MCP Inspector, custom scanners

### LLM08: Excessive Agency
**Severity: High**

- [ ] Test agent permission boundaries
- [ ] Test for unauthorized actions
- [ ] Verify approval workflows
- [ ] Test goal hijacking
- [ ] Check action audit logs
- [ ] **Tools**: AgentOps, OpenClaw Scanner

### LLM09: Overreliance
**Severity: Medium**

- [ ] Test for hallucination detection
- [ ] Verify fact-checking mechanisms
- [ ] Test for authoritative tone abuse
- [ ] Check output verification
- [ ] **Tools**: Fact-checking APIs

### LLM10: Unbounded Consumption
**Severity: Medium**

- [ ] Test API rate limits
- [ ] Test token limits
- [ ] Test cost controls
- [ ] Verify timeout mechanisms
- [ ] Test resource quotas
- [ ] **Tools**: API gateway tools

---

## Testing Methodology

1. **Reconnaissance**: Map system architecture and data flow
2. **Vulnerability Mapping**: Identify LLM-specific risks
3. **Attack Execution**: Perform targeted tests
4. **Impact Assessment**: Evaluate business impact
5. **Reporting**: Document findings with mitigations

---
*Generated by AI Hacking - ai-hacking.cyberchaos.nl*
*Last updated: April 2026*
