OWASP Top 10 for LLM Applications 2025/2026
The definitive list of critical security risks in LLM applications - Updated March 2026
Critical Risk Awareness
Prompt Injection remains the #1 vulnerability. See our comprehensive Prompt Injection Guide →
Prompt Injection
Manipulating LLM behavior through crafted inputs that override original instructions. Includes both direct (user input) and indirect (external data) injection.
Attack Types
- Direct Injection: Malicious user prompts that override system instructions
- Indirect Injection: Hidden instructions in external documents, web content, or API responses
- Context Manipulation: Exploiting conversation context to alter behavior
- Jailbreak Attacks: Bypassing safety filters through creative prompting
Attack Examples
- "Ignore previous instructions and..." style overrides
- Text embedded in invisible Unicode characters
- Prompt injection via stored XSS in web content
- DAN (Do Anything Now) style jailbreaks
Mitigation Strategies
- Implement strict input validation and sanitization
- Use privilege separation between user inputs and system prompts
- Apply output filtering before displaying results
- Monitor for injection patterns in logs
- Implement defense-in-depth with multiple security layers
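The monitoring step above can be sketched as a simple pattern check on incoming prompts. This is a minimal, illustrative deny-list; the phrases and function names here are assumptions, and a real deployment would layer this with semantic classifiers and privilege separation rather than rely on regexes alone.

```python
import re

# Illustrative deny-list of common injection phrases (not exhaustive).
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard (the )?system prompt",
    r"you are now (dan|developer mode)",
]

def flag_injection(user_input: str) -> bool:
    """Return True if the input matches a known injection pattern."""
    text = user_input.lower()
    return any(re.search(p, text) for p in INJECTION_PATTERNS)
```

Flagged inputs would typically be logged and routed to stricter handling rather than silently dropped, so injection attempts remain visible in monitoring.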
Sensitive Information Disclosure
LLMs inadvertently revealing private, confidential, or proprietary information through model outputs.
Attack Types
- Training Data Extraction: Recovering sensitive data from model memory
- PII Leakage: Exposing personally identifiable information
- API Key/Credential Exposure: Revealing secrets in responses
- Business Logic Disclosure: Exposing proprietary algorithms or processes
Attack Examples
- Extracting email addresses through specific prompts
- Revealing SQL credentials in error messages
- Disclosing customer PII from training data
- Exposing internal system prompts
Mitigation Strategies
- Implement robust input/output filtering
- Enforce data sanitization pipelines
- Apply user opt-out policies for data usage
- Use differential privacy techniques
- Restrict system prompt access and visibility
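An output-filtering pipeline like the one described above can be sketched with two redaction passes, one for email addresses and one for credential-shaped strings. The regexes and placeholder labels are assumptions for illustration; production filters would add named-entity recognition and dedicated secret scanners.

```python
import re

# Minimal output-filter sketch: redact emails and key-like strings
# before a model response leaves the application boundary.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
KEY_RE = re.compile(r"\b(sk|api|key)[-_][A-Za-z0-9]{16,}\b", re.IGNORECASE)

def redact(output: str) -> str:
    """Replace sensitive-looking substrings with placeholder tokens."""
    output = EMAIL_RE.sub("[REDACTED_EMAIL]", output)
    output = KEY_RE.sub("[REDACTED_KEY]", output)
    return output
```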
Supply Chain Vulnerabilities
Compromised or vulnerable components in the LLM supply chain including models, APIs, plugins, and third-party services.
Attack Types
- Model Tampering: Compromised pre-trained models
- PyPI/npm Dependency Vulnerabilities: Insecure libraries
- Malicious Fine-tuning Data: Poisoned training datasets
- Compromised API Providers: Untrusted LLM services
Attack Examples
- Backdoored model weights from untrusted sources
- Vulnerable transformer library exploitation
- Poisoned training data with hidden triggers
- Compromised RAG document store
Mitigation Strategies
- Verify model integrity through checksums and signatures
- Maintain SBOM (Software Bill of Materials)
- Use trusted model hubs and verify provenance
- Scan dependencies for known vulnerabilities
- Implement model lifecycle management
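The checksum-verification step above can be sketched as a SHA-256 comparison against a published digest. This assumes the expected digest comes from the model hub's signed metadata; signature verification itself (e.g. GPG or Sigstore) would sit on top of this.

```python
import hashlib

def verify_model(path: str, expected_sha256: str) -> bool:
    """Compare a model file's SHA-256 digest against a published checksum,
    reading in chunks so large weight files don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

A load pipeline would refuse to deserialize any weights file for which this check fails.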
Data and Model Poisoning
Introducing malicious or biased data into training pipelines, fine-tuning data, or RAG knowledge bases to compromise model integrity.
Attack Types
- Training Data Poisoning: Corrupting model training data
- Fine-tuning Poisoning: Injecting backdoors via fine-tuning
- RAG Poisoning: Contaminating retrieval knowledge bases
- Embedding Manipulation: Corrupting vector databases
Attack Examples
- Inserting biased examples to alter model behavior
- Creating backdoor triggers in training data
- Poisoning public datasets used for training
- Injecting false information into RAG documents
Mitigation Strategies
- Verify data provenance and supply chain integrity
- Implement data validation and anomaly detection
- Use data sanitization pipelines
- Apply robust fine-tuning safeguards
- Monitor for model behavior drift
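The data-validation step above can be sketched as a per-record check in a fine-tuning pipeline: reject records containing known backdoor trigger strings and flag length outliers. The trigger list and threshold are illustrative assumptions; real pipelines would add statistical anomaly detection over the whole dataset.

```python
# Illustrative trigger strings; an actual deny-list would be curated
# from threat intelligence and dataset audits.
SUSPECT_TRIGGERS = ["<|backdoor|>", "cf_trigger_", "zz_activate"]

def validate_record(text: str, max_len: int = 10_000) -> bool:
    """Return True if the record passes basic poisoning checks."""
    if len(text) > max_len:      # crude outlier filter on record length
        return False
    lowered = text.lower()
    return not any(t in lowered for t in SUSPECT_TRIGGERS)
```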
Improper Output Handling
Failing to validate, sanitize, or properly handle LLM outputs before passing them to downstream systems or users.
Attack Types
- XSS via LLM Output: Cross-site scripting from model responses
- SQL Injection: Malicious queries generated by LLM
- Command Injection: LLM generating unsafe system commands
- Path Traversal: LLM revealing or accessing unauthorized paths
Attack Examples
- LLM generating JavaScript that executes in the browser
- Model outputting malicious SQL queries
- Code generation including unsafe system calls
- File path disclosure in responses
Mitigation Strategies
- Implement output validation and sanitization
- Use context-aware content filtering
- Apply the same security controls as user inputs
- Sandbox LLM outputs before downstream use
- Enable secure coding modes in code generation
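Applying the same controls as user inputs means, at minimum, escaping model output before it reaches an HTML context. A minimal sketch using the standard library:

```python
import html

def render_safe(llm_output: str) -> str:
    """Escape model output before inserting it into an HTML page,
    treating it with the same distrust as raw user input."""
    return html.escape(llm_output)
```

The same principle applies per sink: parameterized queries for SQL, shell-argument quoting for commands, and path canonicalization for file access, each applied where the output is consumed.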
Excessive Agency
Granting LLM systems excessive functionality, autonomy, or permissions, enabling unauthorized or harmful actions.
Attack Types
- Unlimited Function Access: Excessive API permissions
- Autonomous Action: Performing actions without human approval
- Tool Abuse: Exploiting integrated external tools
- Chain-of-Thought Manipulation: Altering reasoning processes
Attack Examples
- LLM with admin API access performing unauthorized changes
- Auto-executing financial transactions
- Deleting resources without confirmation
- Manipulating user data without consent
Mitigation Strategies
- Implement least privilege access controls
- Require human-in-the-loop for critical actions
- Apply rate limiting to sensitive operations
- Log and monitor all autonomous actions
- Use scope limitations on tool access
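Least privilege and human-in-the-loop approval can be combined in a small tool-gating layer. The tool names and risk tiers below are illustrative assumptions; the point is that unknown tools are denied by default and high-risk actions require explicit approval.

```python
# Illustrative tool registry mapping each tool to a risk tier.
TOOL_RISK = {"search_docs": "low", "send_email": "high", "delete_record": "high"}

def authorize(tool: str, approved_by_human: bool = False) -> bool:
    """Gate agent tool calls: deny unknown tools, allow low-risk ones,
    and require human approval for anything high-risk."""
    risk = TOOL_RISK.get(tool)
    if risk is None:              # unknown tools are denied by default
        return False
    if risk == "low":
        return True
    return approved_by_human      # high-risk actions need a human in the loop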
System Prompt Leakage
Exposing confidential system prompts, instructions, or internal logic through manipulation or inadequate protections.
Attack Types
- Direct Prompt Extraction: Social engineering to reveal prompts
- Context Overflow: Overflow techniques to expose system messages
- Role Play Exploitation: Tricking the LLM into revealing its instructions
- Log Leakage: Exposing prompts in logs or errors
Attack Examples
- Asking the LLM to repeat 'previous text' to extract the system prompt
- Using context window overflow to bypass instruction parsing
- Prompt injection to override the system role
- Finding prompts in application logs
Mitigation Strategies
- Obfuscate system prompts where possible
- Implement prompt isolation techniques
- Use separate processing for sensitive instructions
- Monitor for prompt extraction attempts
- Apply strict log filtering
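The log-filtering step above can be sketched as a scrubbing pass that runs before any line is written to logs, so the system prompt cannot leak through error messages or traces. The placeholder token is an illustrative choice.

```python
def scrub_logs(log_line: str, system_prompt: str) -> str:
    """Replace any verbatim system-prompt text in a log line with a
    placeholder before the line is persisted."""
    if system_prompt and system_prompt in log_line:
        return log_line.replace(system_prompt, "[SYSTEM_PROMPT]")
    return log_line
```

A fuller implementation would also match partial or paraphrased fragments of the prompt, since extraction attacks rarely reproduce it verbatim.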
Vector and Embedding Weaknesses
Security vulnerabilities in Retrieval-Augmented Generation (RAG) systems, vector databases, and embedding pipelines.
Attack Types
- RAG Injection: Malicious content in retrieved documents
- Vector DB Compromise: Attacking vector storage
- Embedding Extraction: Stealing embedding representations
- Context Manipulation: Corrupting retrieval results
Attack Examples
- Poisoned documents being retrieved as top results
- Unauthorized access to the vector database
- Extracting training data from embeddings
- Retrieval manipulation via embedding attacks
Mitigation Strategies
- Validate and sanitize all RAG input documents
- Implement vector database access controls
- Use embedding encryption where available
- Apply reranking with security filters
- Monitor retrieval for anomalies
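The document-sanitization step above can be sketched as an ingestion filter that strips zero-width characters (often used to hide instructions) and refuses to index documents containing injection phrases. The patterns are illustrative assumptions, not a complete deny-list.

```python
import re

# Zero-width characters commonly used to conceal hidden instructions.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\ufeff]")
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def sanitize_for_index(doc: str):
    """Return a cleaned document, or None if it should not be indexed."""
    cleaned = ZERO_WIDTH.sub("", doc)
    if INJECTION.search(cleaned):
        return None
    return cleaned
```

Running this before embedding keeps poisoned content out of the vector store entirely, which is cheaper than filtering at retrieval time.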
Misinformation
LLMs generating false, misleading, or biased content that appears credible, leading to decisions based on incorrect information.
Attack Types
- Hallucinations: Confident but false outputs
- Factual Errors: Incorrect information presented as fact
- Bias Amplification: Reinforcing existing biases
- Manipulation: Deliberate misleading outputs
Attack Examples
- Citing non-existent research papers
- Providing incorrect medical advice
- Generating biased hiring recommendations
- Creating fake news or reviews
Mitigation Strategies
- Implement fact-checking pipelines
- Use confidence scoring and uncertainty estimation
- Apply source verification for critical outputs
- Add content provenance and attribution
- Label AI-generated content clearly
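The labeling step above can be sketched as a provenance wrapper applied to every response before delivery. The label format and `model_name` parameter are illustrative assumptions; standardized approaches such as content-provenance metadata would carry the same information in a machine-readable form.

```python
def label_output(text: str, model_name: str) -> str:
    """Attach a visible provenance label so downstream consumers can
    see the content is AI-generated and should be verified."""
    return f"{text}\n\n[AI-generated by {model_name}; verify before relying on it]"
```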
Unbounded Consumption
Allowing excessive or uncontrolled resource usage by LLM applications, leading to denial of service, financial exploitation, or service degradation.
Attack Types
- API Rate Abuse: Excessive API calls exhausting quotas
- Resource Exhaustion: Maxing out compute resources
- Cost Exploitation: Pay-per-token abuse leading to charges
- Inference Attacks: Extracting model via excessive queries
Attack Examples
- Automated attacks consuming the entire API quota
- Max-length prompt spam causing compute exhaustion
- Extracting the model through millions of queries
- Recursive prompt loops consuming resources
Mitigation Strategies
- Implement strict rate limiting and quotas
- Add input/output token limits
- Monitor usage patterns for anomalies
- Use cost controls and budget alerts
- Apply timeout controls on long-running operations
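The rate-limiting step above can be sketched as a per-client token bucket: `capacity` requests are available at once and refill at `rate` per second. This is a minimal in-process sketch; a real service would also enforce token-count and cost budgets, typically in a shared store rather than per process.

```python
import time

class TokenBucket:
    """Simple per-client rate limiter: `capacity` requests, refilled
    continuously at `rate` requests per second."""
    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```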
OWASP Testing Framework
Follow this structured approach for testing each vulnerability:
1. Reconnaissance
- Map system architecture and data flow
- Identify input points and API interfaces
- Document integrated tools and plugins
- Review system prompts and configuration
2. Vulnerability Mapping
- Test each OWASP category systematically
- Document attack surface and entry points
- Identify defense mechanisms in place
- Map dependencies and third-party services
3. Exploitation
- Conduct proof-of-concept attacks
- Document exploitability and impact
- Test chaining multiple vulnerabilities
- Assess real-world attack feasibility
4. Reporting
- Prioritize findings by risk level
- Provide remediation recommendations
- Include proof-of-concept code
- Suggest defensive improvements
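The four phases above can be sketched as a checklist runner that iterates over the ten categories and collects confirmed findings for the reporting phase. The probe callables are placeholders for real per-category test cases; this structure is an assumption about how a harness might be organized, not a prescribed tool.

```python
# The ten OWASP LLM categories covered in this document.
CATEGORIES = [
    "Prompt Injection", "Sensitive Information Disclosure",
    "Supply Chain Vulnerabilities", "Data and Model Poisoning",
    "Improper Output Handling", "Excessive Agency",
    "System Prompt Leakage", "Vector and Embedding Weaknesses",
    "Misinformation", "Unbounded Consumption",
]

def run_assessment(probes: dict) -> list:
    """Run each category's probe (a zero-arg callable returning True on
    a confirmed finding) and collect findings for the report phase."""
    findings = []
    for category in CATEGORIES:
        probe = probes.get(category)
        if probe is not None and probe():
            findings.append(category)
    return findings
```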