LLM API Security
Best practices for securing LLM API endpoints - authentication, rate limiting, key management, and cost control
Updated: March 2026
LLM API Threat Landscape
API Key Exposure
Hardcoded keys in source code, client-side exposure, or leaked credentials
Rate Limit Abuse
Automated attacks consuming API quotas or causing denial of service
Cost Attacks
Prompt manipulation causing excessive token usage and unexpected costs
Data Leakage
Sensitive data in prompts or responses exposed through logs
Authentication Methods
API Keys
- Generate unique keys per user/application
- Use environment variables for storage
- Rotate keys regularly (30-90 days)
- Implement key revocation
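The practices above can be sketched in a few lines. This is a minimal illustration, not any provider's API; the helper names (`generate_api_key`, `key_fingerprint`) are assumptions:

```python
import hashlib
import secrets

def generate_api_key() -> str:
    """Generate a unique, high-entropy key per user/application."""
    return "sk-" + secrets.token_urlsafe(32)

def key_fingerprint(key: str) -> str:
    """Store only a hash server-side so a database leak never exposes raw keys."""
    return hashlib.sha256(key.encode()).hexdigest()

# Revocation is tracked by fingerprint; rotation = issue a new key, revoke the old.
revoked = set()

def is_valid(key: str, known_fingerprints: set) -> bool:
    fp = key_fingerprint(key)
    return fp in known_fingerprints and fp not in revoked
```

Keeping only fingerprints server-side means rotation and revocation never require handling the raw key after issuance.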
OAuth 2.0
- Implement for user-facing applications
- Use short-lived access tokens
- Implement refresh tokens
- Proper scope definitions
JWT Tokens
- Short expiration times
- Strong signing algorithms
- Proper claim validation
- Token revocation support
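A stdlib-only sketch of issuing and verifying a short-lived HS256 token, showing algorithm pinning and expiry checks; in production, prefer a maintained library such as PyJWT over hand-rolled crypto:

```python
import base64, hashlib, hmac, json, time

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _unb64(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def sign_jwt(claims: dict, secret: bytes, ttl: int = 300) -> str:
    """Issue a short-lived token (5-minute default expiry)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps({**claims, "exp": int(time.time()) + ttl}).encode())
    sig = _b64(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str, secret: bytes) -> dict:
    """Pin the algorithm, check the signature, then check expiry."""
    header_b64, body_b64, sig = token.split(".")
    if json.loads(_unb64(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")  # blocks alg-confusion attacks
    expected = _b64(hmac.new(secret, f"{header_b64}.{body_b64}".encode(),
                             hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    claims = json.loads(_unb64(body_b64))
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims
```

Verifying the signature before touching the claims, and rejecting any algorithm other than the one you issued, closes the classic `alg: none` and HS/RS confusion holes.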
Rate Limiting & Throttling
Token-Based Limits
- Track input + output tokens
- Set per-minute/per-day limits
- Use sliding windows
- Implement burst allowance
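A sliding-window limiter over combined input and output tokens might look like this sketch (class name and parameters are illustrative):

```python
import time
from collections import deque

class SlidingWindowTokenLimiter:
    """Per-user limit on combined input+output tokens over a sliding
    60-second window, plus a small burst allowance on top."""

    def __init__(self, tokens_per_minute, burst=0):
        self.limit = tokens_per_minute + burst
        self.window = 60.0
        self.events = {}  # user -> deque of (timestamp, tokens)

    def allow(self, user, tokens, now=None):
        now = time.monotonic() if now is None else now
        q = self.events.setdefault(user, deque())
        while q and now - q[0][0] > self.window:  # evict expired events
            q.popleft()
        if sum(t for _, t in q) + tokens > self.limit:
            return False
        q.append((now, tokens))
        return True
```

Unlike fixed windows, the sliding window prevents a user from doubling their effective rate by straddling a window boundary.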
Cost Controls
- Budget alerts at multiple thresholds (e.g., 50%, 80%, 100% of budget)
- Per-user quotas
- Token estimation before requests
- Hard caps with circuit breakers
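The alert thresholds and hard cap above can be combined in one small component; the numbers here are illustrative assumptions:

```python
class CostCircuitBreaker:
    """Per-user budget with alert thresholds and a hard cap.
    Caps and threshold fractions are illustrative, not recommendations."""

    def __init__(self, monthly_cap_usd, alert_fractions=(0.5, 0.8, 1.0)):
        self.cap = monthly_cap_usd
        self.alerts = alert_fractions
        self.spent = {}   # user -> total spend this period
        self.fired = {}   # user -> alert fractions already sent

    def record(self, user, cost_usd):
        """Record spend; return any alert thresholds newly crossed."""
        total = self.spent.get(user, 0.0) + cost_usd
        self.spent[user] = total
        fired = self.fired.setdefault(user, set())
        crossed = []
        for f in self.alerts:
            if total >= f * self.cap and f not in fired:
                fired.add(f)
                crossed.append(f)
        return crossed

    def allow(self, user, estimated_cost_usd):
        """Hard cap: reject any request that would exceed the budget."""
        return self.spent.get(user, 0.0) + estimated_cost_usd <= self.cap
```

Checking `allow()` with an estimate *before* dispatching the request is what makes this a circuit breaker rather than an after-the-fact report.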
Tiered Approaches
- Free tier: strict limits
- Basic: moderate limits
- Pro: higher limits
- Enterprise: custom agreements
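Tier limits are easiest to audit when kept as a single declarative table; the figures below are placeholders, not recommendations:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierLimits:
    requests_per_minute: int
    tokens_per_day: int

# Placeholder figures: tune to your own capacity and pricing.
TIERS = {
    "free":       TierLimits(requests_per_minute=3,   tokens_per_day=10_000),
    "basic":      TierLimits(requests_per_minute=20,  tokens_per_day=200_000),
    "pro":        TierLimits(requests_per_minute=100, tokens_per_day=2_000_000),
    "enterprise": TierLimits(requests_per_minute=500, tokens_per_day=50_000_000),
}

def limits_for(plan: str) -> TierLimits:
    # Unknown or missing plans fall back to the strictest tier (fail closed).
    return TIERS.get(plan, TIERS["free"])
```

Failing closed on unknown plan names prevents a typo or tampered plan field from granting enterprise limits.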
Provider Security Comparison
| Provider | Key Features | Rate Limits | Security |
|---|---|---|---|
| OpenAI | GPT models, Assistants API | Tier-based RPM | SOC 2, API key management |
| Anthropic | Claude, tool use | Token-based | SOC 2, HIPAA available |
| Google | Gemini, Vertex AI | Project quotas | SOC 2, HIPAA, ISO |
| AWS Bedrock | Multiple models | Account-based | AWS IAM, VPC, encryption |
Self-Hosted LLM Security
Network Isolation
- Deploy in VPC/private network
- Use firewall rules
- No public internet access
- VPN for management
API Security
- Enable authentication
- Use TLS/SSL
- Implement API keys
- Rate limiting
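For a self-hosted endpoint, authentication should be enforced at the application layer even inside a private network. A WSGI-style wrapper is sketched below (TLS termination is assumed to happen at a reverse proxy in front of it):

```python
import hmac

def require_api_key(handler, valid_keys):
    """WSGI-style wrapper: reject requests lacking a valid
    'Authorization: Bearer <key>' header, using constant-time comparison."""
    def wrapped(environ, start_response):
        auth = environ.get("HTTP_AUTHORIZATION", "")
        key = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
        if not any(hmac.compare_digest(key, k) for k in valid_keys):
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"unauthorized"]
        return handler(environ, start_response)
    return wrapped
```

`hmac.compare_digest` avoids timing side channels that a plain `==` comparison would leak.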
Model Protection
- Model files encrypted at rest
- Secure model loading
- No model export
- Access logging
Cost Attacks & Abuse Prevention
LLM APIs present unique cost attack vectors that traditional API security doesn't address.
Prompt Padding
Attacker adds invisible text to prompts to increase token count without user awareness.
User: [100KB invisible text] Tell me about AI
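One mitigation is to strip Unicode format characters (category `Cf`, which covers zero-width spaces and joiners) and reject prompts whose raw size dwarfs their visible size. The thresholds below are illustrative:

```python
import unicodedata

def strip_invisible(text: str) -> str:
    """Remove Unicode format characters (category Cf), which include
    zero-width spaces and joiners commonly used for prompt padding."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def check_prompt(text: str, max_chars: int = 8_000) -> str:
    """Reject prompts whose raw size far exceeds their visible size,
    or that exceed a hard length limit. Thresholds are illustrative."""
    cleaned = strip_invisible(text)
    if len(text) > 2 * len(cleaned) + 100:
        raise ValueError("suspicious invisible padding")
    if len(cleaned) > max_chars:
        raise ValueError("prompt too long")
    return cleaned
```

Metering tokens on the cleaned text, not the raw input, removes the attacker's incentive to pad in the first place.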
Repeated Requests
Automated tools making excessive API calls to drain budget quotas.
Context Window Overflow
Sending extremely long inputs to maximize per-request costs.
Model Selection Abuse
Switching to more expensive models through parameter manipulation.
Mitigations
- Implement token estimation before requests
- Set per-user/month cost hard limits
- Use input length limits and validation
- Pin model selection server-side to the intended tier
- Monitor for unusual usage patterns
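Token estimation and model pinning before dispatch can be sketched as follows. The ~4-characters-per-token heuristic and the model name are assumptions; use the provider's tokenizer for exact counts:

```python
ALLOWED_MODEL = "small-model-v1"  # hypothetical name: pin to the intended tier

def estimate_tokens(text: str) -> int:
    """Rough pre-request estimate (~4 characters per token for English text)."""
    return max(1, len(text) // 4)

def validate_request(prompt: str, model: str, max_input_tokens: int = 2_000):
    """Reject model-switching and oversized inputs before spending any tokens."""
    if model != ALLOWED_MODEL:
        raise ValueError("model selection not permitted")
    if estimate_tokens(prompt) > max_input_tokens:
        raise ValueError("input exceeds token budget")
```

Validating the model parameter server-side is what defeats model selection abuse: the client never gets to choose a more expensive tier than intended.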
Recent LLM API CVEs
| CVE ID | Description | Severity |
|---|---|---|
| CVE-2025-61260 | OpenAI Codex CLI command injection via project-local .env config | Critical |
| CVE-2025-53767 | Azure OpenAI privilege escalation flaw | High |
| CVE-2025-14980 | BetterDocs WordPress plugin exposes OpenAI API key to contributors | High |