AI Hacking
AI Security Resources

RAG Security: Complete Guide

Securing Retrieval-Augmented Generation systems against document poisoning, retrieval manipulation, and embedding attacks

Updated: February 2026 • Part of OWASP LLM08

What is RAG?

Retrieval-Augmented Generation (RAG) is an AI architecture that combines large language models with external knowledge retrieval. Over 60% of enterprise LLM deployments now use RAG, making its security critical.

1. User Query

User submits a question or prompt to the system

2. Embedding

Query is converted to vector embedding

3. Retrieval

Similar documents retrieved from vector DB

4. Augmentation

Retrieved context added to prompt

5. Generation

LLM generates response using context

The Trust Paradox

RAG systems have a fundamental security flaw: user queries are treated as untrusted input, but retrieved context is implicitly trusted - even though both enter the same prompt. This creates a significant attack surface that traditional security doesn't address.

Attack Vectors

1. Document Poisoning

Injecting malicious content into documents stored in the knowledge base.

How it works
  • Attacker uploads or injects malicious documents
  • Documents are embedded and stored in vector DB
  • When relevant query is made, poisoned doc is retrieved
  • LLM incorporates malicious context into response
Impact
  • 90% success with just 5 poisoned documents
  • Works even in databases with millions of documents
  • Can cause harmful, biased, or incorrect outputs
  • Difficult to detect after deployment

2. Retrieval Manipulation

Manipulating which documents are retrieved to influence outputs.

How it works
  • Attacker crafts queries to trigger specific retrieval
  • Exploits ranking algorithm weaknesses
  • Uses semantic similarity to hijack retrieval
  • Cross-user manipulation in shared systems
Impact
  • Forces retrieval of attacker-controlled content
  • Can suppress legitimate content
  • Enables targeted manipulation
  • Breaks trust in retrieval quality

3. Embedding Inversion

Recovering original data from vector embeddings.

How it works
  • Attacker gains access to vector database
  • Uses inversion techniques on embeddings
  • Reconstructs original text from vectors
  • Recovers sensitive embedded data
Impact
  • Recover 50-70% of input words
  • Expose sensitive data in embeddings
  • Privacy violations
  • Compliance issues (GDPR, etc.)

4. Cross-Tenant Attacks

Exploiting shared RAG infrastructure in multi-tenant systems.

How it works
  • Attacker is one tenant of shared system
  • Injects content that affects other tenants
  • Exploits shared vector database
  • Retrieves data from other tenants
Impact
  • Data leakage between tenants
  • Unauthorized access to competitor data
  • Compliance violations
  • Reputational damage

Real-World CVEs

Documented vulnerabilities in RAG systems.

CVE ID Product Description Severity CVSS
CVE-2026-24770 RAGFlow Zip Slip vulnerability in MinerU parser allowing arbitrary file write and RCE Critical 9.1
CVE-2025-XXXX Pinecone Vector database authentication bypass (hypothetical) Critical 9.0
CVE-2025-XXXX ChromaDB Embedding injection vulnerability High 8.2
CVE-2025-XXXX LangChain RAG pipeline code execution High 8.0
CVE-2025-XXXX Weaviate GraphQL injection in vector DB Medium 6.5
CVE-2025-XXXX Milvus Authentication bypass in query engine High 7.5

Defense Strategies

1. Ingestion Phase Security

Document Validation

  • Scan all documents for malware
  • Validate document format and structure
  • Check for suspicious content patterns
  • Limit document types allowed

Content Filtering

  • Remove PII before embedding
  • Filter sensitive data patterns
  • Block known malicious patterns
  • Implement allow/block lists

Access Controls

  • Validate source of documents
  • Implement RBAC for ingestion
  • Audit trail for all uploads
  • Quarantine new documents

2. Retrieval Phase Security

Query Sanitization

  • Validate and sanitize user queries
  • Detect injection attempts
  • Limit query complexity
  • Rate limiting

Retrieval Filtering

  • Implement reranking security
  • Cross-reference with trusted sources
  • Detect anomalous retrieval patterns
  • Limit number of documents

Multi-Tenancy Isolation

  • Separate vector namespaces
  • Strict tenant boundaries
  • Cross-tenant query prevention
  • Encryption per tenant

3. Generation Phase Security

Output Validation

  • Validate LLM outputs
  • Check for injected content
  • Fact-check against sources
  • Content filtering

Context Verification

  • Verify retrieved content authenticity
  • Detect manipulation attempts
  • Flag unusual context patterns
  • Log all context usage

Human-in-the-Loop

  • Review sensitive outputs
  • Approve high-risk actions
  • Manual override capability
  • Escalation paths

4. Data Security

Encryption

  • Encrypt vectors at rest
  • TLS for data in transit
  • Key management best practices
  • Consider homomorphic encryption

Vector DB Security

  • Strong authentication
  • Network isolation
  • Regular security audits
  • Patch management

Privacy Protection

  • Data minimization
  • PII detection and removal
  • Retention policies
  • Right to deletion support

Testing Methodology

RAG Security Test Checklist

Testing Tools

Garak

LLM vulnerability scanner with RAG-specific probes

View →

Apache RAG

RAG-specific security testing framework

View →

Vector DB Scanners

Tools for testing vector database security

View →

Code Examples

Document Validation Example

```python
import re
from typing import List

class RAGDocumentValidator:
    SUSPICIOUS_PATTERNS = [
        r"ignore previous instructions",
        r"system prompt:",
        r"{{.*}}",
        r"you are now dan",
    ]
    
    PII_PATTERNS = [
        r"\b\d{3}-\d{2}-\d{4}\b",  # SSN
        r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b",  # Email
        r"\b\d{16}\b",  # Credit card
    ]
    
    def validate_document(self, content: str) -> dict:
        """Validate document before embedding."""
        warnings = []
        blocked = False
        
        # Check for suspicious patterns
        for pattern in self.SUSPICIOUS_PATTERNS:
            if re.search(pattern, content, re.IGNORECASE):
                warnings.append(f"Suspicious pattern: {pattern}")
                blocked = True
        
        # Check for PII
        for pattern in self.PII_PATTERNS:
            if re.search(pattern, content):
                warnings.append(f"PII detected: {pattern}")
        
        return {
            "valid": not blocked,
            "warnings": warnings,
            "pii_detected": len([w for w in warnings if "PII" in w]) > 0
        }
```

References & Resources

Related Topics

Prompt Injection Red Teaming Secure Development OWASP LLM Top 10