
Secure AI Development

A complete developer guide to building secure AI applications, from input validation to deployment

Updated: February 2026

Security Principles

1. Defense in Depth

Implement multiple layers of security controls. No single control should be relied upon exclusively.

2. Least Privilege

Grant minimum necessary permissions to users, APIs, and AI components.

3. Fail Secure

When errors occur, default to secure states rather than permissive ones.
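As a minimal sketch of the fail-secure principle (the function names are illustrative, not from any specific framework): if a safety check itself fails at runtime, the wrapper denies rather than allows.

```python
def moderate_content(text: str) -> bool:
    """Hypothetical moderation call that can fail at runtime."""
    raise TimeoutError("moderation service unreachable")

def is_allowed(text: str) -> bool:
    """Fail secure: any error in the safety check results in denial."""
    try:
        return moderate_content(text)
    except Exception:
        # Default to the secure state (deny) rather than letting content through.
        return False

print(is_allowed("hello"))  # denied, because the check could not complete
```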

4. Zero Trust

Never trust, always verify. Validate at every boundary, including internal systems.

Input Validation

All user input must be validated and sanitized before processing. This is your first line of defense against prompt injection and other attacks.

Input Validation Class

```python
import re

class InputValidator:
    # Known injection patterns
    INJECTION_PATTERNS = [
        r"ignore\s+(previous|prior|all)\s+(instructions|rules)",
        r"(forget|disregard)\s+(your|all)\s+(instructions|rules)",
        r"you\s+are\s+(now|still)\s+(dan|do\s+anything)",
        r"system\s+prompt:",
        r"{{.*}}",
        r"\[INST\]|\[/INST\]|<\|end\|>",
    ]

    # Encoding patterns
    ENCODING_PATTERNS = [
        r"base64:",  # Base64 prefix
        r"\\x[0-9a-fA-F]{2}",  # Hex encoding
        r"%[0-9a-fA-F]{2}",  # URL encoding
    ]

    def validate(self, user_input: str) -> dict:
        """Validate user input for potential attacks."""
        issues = []

        # Length check
        if len(user_input) > 10000:
            issues.append("Input exceeds maximum length")

        # Check for injection patterns
        for pattern in self.INJECTION_PATTERNS:
            if re.search(pattern, user_input, re.IGNORECASE):
                issues.append(f"Suspicious pattern detected: {pattern}")

        # Check for encoding attempts
        for pattern in self.ENCODING_PATTERNS:
            if re.search(pattern, user_input):
                issues.append("Encoded content detected")

        # Check character diversity: highly repetitive input can indicate
        # padding or obfuscation attempts
        unique_chars = len(set(user_input))
        if len(user_input) > 50 and unique_chars / len(user_input) < 0.3:
            issues.append("Low character diversity - possible obfuscation")

        return {
            "valid": len(issues) == 0,
            "issues": issues,
            "sanitized": self._sanitize(user_input)
        }

    def _sanitize(self, user_input: str) -> str:
        """Basic sanitization: strip control characters."""
        sanitized = re.sub(r'[\x00-\x1F\x7F]', '', user_input)
        return sanitized.strip()
```


Output Handling

LLM outputs must be validated and sanitized before being presented to users or passed to other systems.

Content Filtering

  • Check for sensitive data exposure
  • Verify no system prompts leaked
  • Validate output format
  • Check for injected content

PII Detection

  • Scan for personal information
  • Mask or redact sensitive data
  • Log PII exposure attempts
  • User notification when detected

Format Validation

  • Validate JSON outputs
  • Check expected structure
  • Verify allowed values
  • Sanitize HTML if applicable
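The format-validation step above can be sketched as follows. This is a minimal example assuming a hypothetical schema with a single `sentiment` field; real schemas would be larger, and a library such as a JSON Schema validator may be preferable.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}  # illustrative allowed values

def validate_llm_json(raw_output: str) -> dict:
    """Parse model output as JSON, then check structure and allowed values."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return {"valid": False, "error": "not valid JSON"}
    if not isinstance(data, dict) or "sentiment" not in data:
        return {"valid": False, "error": "missing required 'sentiment' field"}
    if data["sentiment"] not in ALLOWED_SENTIMENTS:
        return {"valid": False, "error": "value outside allowed set"}
    return {"valid": True, "data": data}
```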

Output Handler Example

```python
import re

class OutputHandler:
    PII_PATTERNS = {
        "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
        "email": r"\b[\w.-]+@[\w.-]+\.\w+\b",
        "phone": r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b",
    }
    
    def process_output(self, raw_output: str) -> dict:
        """Process and validate LLM output."""
        result = {
            "original": raw_output,
            "cleaned": raw_output,
            "warnings": [],
            "blocked": False
        }
        
        # Check for system prompt leakage
        if "system prompt" in raw_output.lower():
            result["warnings"].append("System prompt reference detected")
        
        # Scan for PII
        for pii_type, pattern in self.PII_PATTERNS.items():
            matches = re.findall(pattern, raw_output)
            if matches:
                result["warnings"].append(f"{pii_type} detected in output")
                # Optionally mask
                result["cleaned"] = re.sub(pattern, "[REDACTED]", result["cleaned"])
        
        return result
```

API Security

Authentication

  • Use strong API key management
  • Implement OAuth 2.0 where possible
  • JWT token validation
  • API key rotation

Authorization

  • Role-based access control (RBAC)
  • API key scoping
  • Per-user rate limits
  • Tenant isolation

Rate Limiting

  • Token-based limits
  • Request-based limits
  • Cost controls
  • Tiered throttling
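One common way to implement the request- and token-based limits above is a token bucket per client. The sketch below is a minimal, single-process version (capacity and refill rate are illustrative parameters); production systems would typically back this with a shared store such as Redis.

```python
import time

class TokenBucket:
    """Per-client token bucket: holds up to `capacity` tokens,
    refilled continuously at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A `cost` larger than 1 can represent token-based limits (e.g. charging one bucket token per 1000 LLM tokens requested).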


Authentication & Authorization

User Authentication

  • Strong password policies
  • MFA support
  • Session management
  • Token expiration

API Authentication

  • API key management
  • OAuth scopes
  • JWT validation
  • Key rotation
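A hedged sketch of the API key checks above: store only hashes of issued keys, and compare in constant time to avoid timing side channels. The key value and store are illustrative.

```python
import hashlib
import hmac

# Illustrative store: only hashes of issued API keys, never the keys themselves.
VALID_KEY_HASHES = {hashlib.sha256(b"demo-key-123").hexdigest()}

def authenticate(api_key: str) -> bool:
    """Hash the presented key and compare against stored hashes in constant time."""
    presented = hashlib.sha256(api_key.encode()).hexdigest()
    return any(hmac.compare_digest(presented, stored)
               for stored in VALID_KEY_HASHES)
```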

Authorization

  • RBAC implementation
  • Permission checks
  • Resource-level access
  • Audit logging

Data Security

Data Classification

  • Identify sensitive data
  • Categorize by sensitivity
  • Apply controls by category
  • Regular audits

Encryption

  • TLS in transit
  • AES at rest
  • Key management
  • Key rotation

PII Handling

  • Detection
  • Minimization
  • Consent management
  • Right to deletion

Logging & Monitoring

What to Log

Security Events

  • Authentication attempts
  • Authorization failures
  • Rate limit exceeded
  • Suspicious patterns

AI-Specific Events

  • Prompt injection attempts
  • System prompt access
  • Tool invocations
  • Data access via AI

Operational Events

  • API calls
  • Errors and exceptions
  • Performance metrics
  • Cost tracking
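Events like those above are easiest to alert on when logged in a structured, machine-parseable form. A minimal sketch using the standard `logging` module (the event names and fields are illustrative):

```python
import json
import logging

logger = logging.getLogger("ai_security")
logging.basicConfig(level=logging.INFO)

def log_security_event(event_type: str, user_id: str, detail: str) -> str:
    """Emit a structured JSON security event and return the serialized record."""
    record = json.dumps({
        "event": event_type,
        "user": user_id,
        "detail": detail,
    })
    logger.warning(record)
    return record

entry = log_security_event("prompt_injection_attempt", "user-42",
                           "pattern match on input")
```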


Common Vulnerabilities

1. IDOR in AI Features

Insecure Direct Object Reference when AI accesses resources

Example

User asks AI to "read file X" and AI has access without proper authorization checks.
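The fix is to authorize against the *requesting user's* identity, not the AI agent's broader service credentials. A minimal sketch with a hypothetical in-memory ACL:

```python
# Illustrative ACL: which resources each user may read (not a real API).
ACL = {"alice": {"report.pdf"}, "bob": {"notes.txt"}}

def ai_read_file(user_id: str, filename: str) -> str:
    """Check the requesting user's permissions before the AI touches the resource."""
    if filename not in ACL.get(user_id, set()):
        raise PermissionError(f"{user_id} may not read {filename}")
    return f"contents of {filename}"  # placeholder for the actual file read
```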

2. SSRF via LLM

Server-Side Request Forgery through AI tool calls

Example

Prompt tricks AI into making requests to internal services.
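A common mitigation is to validate every URL an AI tool is about to fetch against an explicit allowlist before making the request. A minimal sketch (the allowlisted host is hypothetical):

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist of external hosts

def is_safe_url(url: str) -> bool:
    """Only permit HTTPS requests to explicitly allowlisted hosts,
    blocking internal services and metadata endpoints by default."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    return parsed.hostname in ALLOWED_HOSTS
```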

3. Authentication Bypass

Weak or missing authentication for AI endpoints

Example

AI API endpoints without proper auth checks.

4. Vector DB Exposure

Unprotected vector database with sensitive embeddings

Example

Publicly accessible vector DB with embedded sensitive data.

Compliance

GDPR (EU)

  • Lawful basis for processing
  • Data minimization
  • Right to access/deletion
  • Data portability

CCPA (California)

  • Right to know
  • Right to delete
  • Right to opt-out
  • Non-discrimination

AI-Specific

  • EU AI Act requirements
  • AI risk assessments
  • Transparency obligations
  • Human oversight

References & Resources

Continue Learning

  • LLM API Security
  • Prompt Injection
  • RAG Security
  • Security Tools