AI Hacking
AI Security Resources

AI Incident Response Playbook

Step-by-step procedures for responding to AI security incidents including LLM breaches, agent compromise, and prompt injection - Updated March 2026

$4.2M
Average AI security incident cost
(IBM Cost of Data Breach 2025)
29 min
eCrime average breakout time
(CrowdStrike 2026)
47 days
Average AI vulnerability patch time
(Ponemon Institute)
🚨

Critical: Act Fast

AI incidents can spread rapidly. Average eCrime breakout time is 29 minutes. Have this playbook ready BEFORE incidents occur.

📋 Incident Classification

CRITICAL - P0

  • Confirmed data exfiltration via AI system
  • Remote code execution on AI infrastructure
  • Complete model theft or extraction
  • Unauthorized access to production AI systems
  • Active prompt injection with data impact

Response Time: Immediate - Activate incident team within 15 minutes

HIGH - P1

  • Suspected prompt injection attack
  • Unusual API call patterns
  • Authentication bypass attempts
  • Vector database tampering
  • Agent acting outside defined parameters

Response Time: Within 1 hour

MEDIUM - P2

  • Potential sensitive data in prompts
  • Unusual model behavior or hallucinations
  • Failed attack attempts (blocked)
  • Policy violation by AI output

Response Time: Within 4 hours

LOW - P3

  • Minor policy violations
  • Suspicious but inconclusive activity
  • Testing/false positive reports

Response Time: Within 24 hours

⏱️ Phase 1: Identification & Triage (0-30 min)

Detection Triggers

Automated Alerts

  • Anomalous API usage spikes
  • Unusual response patterns
  • Failed authentication attempts
  • Rate limit violations
  • Toxicity detection triggers

Manual Reports

  • User reports of suspicious output
  • Customer complaints
  • Employee security concerns
  • Third-party notifications

Immediate Triage Checklist

  1. Confirm the incident: Is this a real security event or false positive?
  2. Classify severity: P0-P3 based on impact
  3. Preserve evidence: Start logging everything immediately
  4. Notify team: Alert incident response lead
  5. Document timeline: Record when first indicators appeared

🔒 Phase 2: Containment (30 min - 2 hours)

LLM API Containment

  • Rotate API keys: Immediately rotate any potentially compromised credentials
  • Rate limiting: Apply aggressive rate limits to affected endpoints
  • IP blocking: Block malicious source IPs at API gateway
  • Feature flags: Disable risky features (file uploads, code execution)
  • Read-only mode: Consider read-only mode for affected systems

Agent Containment

  • Terminate agent processes: Stop compromised agents immediately
  • Revoke tool access: Disable agent access to sensitive tools
  • Isolate MCP servers: Disconnect suspicious MCP server connections
  • Session invalidation: Force logout all active sessions
  • Network segmentation: Isolate AI systems from critical networks

Data Containment

  • Audit logs: Preserve all access logs for forensic analysis
  • Database snapshots: Take point-in-time snapshots
  • Vector database: Isolate and examine vector database
  • Backup integrity: Verify recent backups haven't been compromised

🔍 Phase 3: Investigation (2-24 hours)

Log Analysis

Collect and analyze:

  • API request/response logs: All LLM API calls with timestamps
  • Authentication logs: Login attempts, token usage
  • Application logs: Server logs from AI processing services
  • Network logs: Traffic patterns, source IPs
  • User activity logs: Who accessed what and when

Attack Vector Analysis

Prompt Injection

  • Identify injection patterns
  • Trace injection source
  • Assess manipulation impact
  • Document attack technique

Data Exfiltration

  • Identify data accessed
  • Determine exfiltration method
  • Assess data sensitivity
  • Calculate record count

Agent Compromise

  • Identify compromised actions
  • Trace tool misuse
  • Assess unauthorized access
  • Document lateral movement

Evidence Preservation

  1. Create forensic copies: Bit-for-bit copies of affected systems
  2. Hash verification: Calculate and document file hashes
  3. Chain of custody: Document who accessed what evidence
  4. Secure storage: Store evidence in secure, access-controlled location
  5. Timeline creation: Build detailed incident timeline

🛠️ Phase 4: Remediation (24-72 hours)

Immediate Fixes

LLM Vulnerabilities

  • Patch input validation gaps
  • Update content filtering
  • Strengthen output sanitization
  • Add injection detection

Agent Issues

  • Reduce agent permissions
  • Add approval workflows
  • Implement stricter boundaries
  • Update tool access controls

Infrastructure Remediation

  • Credential reset: Force password resets, rotate all API keys
  • Access review: Audit and clean up permissions
  • Network hardening: Update firewall rules, segment networks
  • Monitor updates: Enhance monitoring with new detection rules
  • Backup verification: Verify clean backups, test restoration

📬 Phase 5: Notification & Reporting

Internal Reporting

  • Executive summary: 1-page incident overview for leadership
  • Technical report: Detailed technical findings for security team
  • Timeline: Complete incident timeline with key events
  • Lessons learned: What went well, what needs improvement
  • Action items: Specific tasks with owners and deadlines

External Notifications

Regulatory Bodies

  • GDPR: 72-hour notification to DPA
  • EU AI Act: Report to competent authority
  • Sector regulators (finance, healthcare)
  • State breach notification laws

Affected Parties

  • Customer notification (if data affected)
  • Business partner notification
  • CVE disclosure (if applicable)
  • Public disclosure (if required)

🔄 Phase 6: Post-Incident Activities

Root Cause Analysis

  1. 5 Whys analysis: Drill down to root cause
  2. Attack path mapping: How did the attacker succeed?
  3. Control failures: Which controls didn't work?
  4. Detection gaps: Why wasn't this caught earlier?
  5. Process issues: Where did response processes fail?

Improvement Actions

Technical

  • Implement missing security controls
  • Update detection rules
  • Patch vulnerabilities
  • Enhance monitoring

Process

  • Update incident response procedures
  • Improve communication protocols
  • Enhance training programs
  • Document new playbooks

Preventive

  • Red team testing
  • Table-top exercises
  • Security awareness training
  • Architecture review

🆘 Emergency Quick Reference

Critical Incident Actions

  1. Rotate API keys immediately
  2. Disable affected endpoints
  3. Preserve all logs
  4. Notify CISO within 15 min
  5. Begin evidence collection

Escalation Contacts

  • Security Team: [internal Slack #]
  • On-call Engineer: [pagerduty]
  • CISO: [direct line]
  • Legal: [legal@company]
  • PR/Communications: [pr@company]

📋 Pre-Incident Preparation Checklist

📚 Related Resources

OWASP LLM Top 10

Core vulnerabilities

Prompt Injection Guide

Injection techniques

Agentic AI Security

Agent vulnerabilities

MCP Security

Tool security