LLM API Security
Best practices for securing LLM API endpoints - authentication, rate limiting, key management, and cost control
Updated: March 2026
LLM API Threat Landscape
API Key Exposure
Hardcoded keys in source code, client-side exposure, or leaked credentials
Rate Limit Abuse
Automated attacks consuming API quotas or causing denial of service
Cost Attacks
Prompt manipulation causing excessive token usage and unexpected costs
Data Leakage
Sensitive data in prompts or responses exposed through logs
Authentication Methods
API Keys
- Generate unique keys per user/application
- Use environment variables for storage
- Rotate keys regularly (30-90 days)
- Implement key revocation
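The practices above can be sketched in a few lines. This is a minimal illustration, not any provider's API; the helper names (`generate_api_key`, `key_fingerprint`) are assumptions:

```python
import hashlib
import secrets

def generate_api_key() -> str:
    """Generate a unique, high-entropy key per user/application."""
    return "sk-" + secrets.token_urlsafe(32)

def key_fingerprint(key: str) -> str:
    """Store only a hash server-side so a database leak never exposes raw keys."""
    return hashlib.sha256(key.encode()).hexdigest()

# Revocation is tracked by fingerprint; rotation = issue a new key, revoke the old.
revoked = set()

def is_valid(key: str, known_fingerprints: set) -> bool:
    fp = key_fingerprint(key)
    return fp in known_fingerprints and fp not in revoked
```

Keeping only fingerprints server-side means rotation and revocation never require handling the raw key after issuance.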
OAuth 2.0
- Implement for user-facing applications
- Use short-lived access tokens
- Implement refresh tokens
- Proper scope definitions
JWT Tokens
- Short expiration times
- Strong signing algorithms
- Proper claim validation
- Token revocation support
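A stdlib-only sketch of issuing and verifying a short-lived HS256 token, showing algorithm pinning and expiry checks; in production, prefer a maintained library such as PyJWT over hand-rolled crypto:

```python
import base64, hashlib, hmac, json, time

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _unb64(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def sign_jwt(claims: dict, secret: bytes, ttl: int = 300) -> str:
    """Issue a short-lived token (5-minute default expiry)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps({**claims, "exp": int(time.time()) + ttl}).encode())
    sig = _b64(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str, secret: bytes) -> dict:
    """Pin the algorithm, check the signature, then check expiry."""
    header_b64, body_b64, sig = token.split(".")
    if json.loads(_unb64(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")  # blocks alg-confusion attacks
    expected = _b64(hmac.new(secret, f"{header_b64}.{body_b64}".encode(),
                             hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    claims = json.loads(_unb64(body_b64))
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims
```

Verifying the signature before touching the claims, and rejecting any algorithm other than the one you issued, closes the classic `alg: none` and HS/RS confusion holes.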
Rate Limiting & Throttling
Token-Based Limits
- Track input + output tokens
- Set per-minute/per-day limits
- Use sliding windows
- Implement burst allowance
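A sliding-window limiter over combined input and output tokens might look like this sketch (class name and parameters are illustrative):

```python
import time
from collections import deque

class SlidingWindowTokenLimiter:
    """Per-user limit on combined input+output tokens over a sliding
    60-second window, plus a small burst allowance on top."""

    def __init__(self, tokens_per_minute, burst=0):
        self.limit = tokens_per_minute + burst
        self.window = 60.0
        self.events = {}  # user -> deque of (timestamp, tokens)

    def allow(self, user, tokens, now=None):
        now = time.monotonic() if now is None else now
        q = self.events.setdefault(user, deque())
        while q and now - q[0][0] > self.window:  # evict expired events
            q.popleft()
        if sum(t for _, t in q) + tokens > self.limit:
            return False
        q.append((now, tokens))
        return True
```

Unlike fixed windows, the sliding window prevents a user from doubling their effective rate by straddling a window boundary.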
Cost Controls
- Budget alerts at multiple thresholds (e.g., 50%, 80%, 100% of budget)
- Per-user quotas
- Token estimation before requests
- Hard caps with circuit breakers
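The alert thresholds and hard cap above can be combined in one small component; the numbers here are illustrative assumptions:

```python
class CostCircuitBreaker:
    """Per-user budget with alert thresholds and a hard cap.
    Caps and threshold fractions are illustrative, not recommendations."""

    def __init__(self, monthly_cap_usd, alert_fractions=(0.5, 0.8, 1.0)):
        self.cap = monthly_cap_usd
        self.alerts = alert_fractions
        self.spent = {}   # user -> total spend this period
        self.fired = {}   # user -> alert fractions already sent

    def record(self, user, cost_usd):
        """Record spend; return any alert thresholds newly crossed."""
        total = self.spent.get(user, 0.0) + cost_usd
        self.spent[user] = total
        fired = self.fired.setdefault(user, set())
        crossed = []
        for f in self.alerts:
            if total >= f * self.cap and f not in fired:
                fired.add(f)
                crossed.append(f)
        return crossed

    def allow(self, user, estimated_cost_usd):
        """Hard cap: reject any request that would exceed the budget."""
        return self.spent.get(user, 0.0) + estimated_cost_usd <= self.cap
```

Checking `allow()` with an estimate *before* dispatching the request is what makes this a circuit breaker rather than an after-the-fact report.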
Tiered Approaches
- Free tier: strict limits
- Basic: moderate limits
- Pro: higher limits
- Enterprise: custom agreements
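Tier limits are easiest to audit when kept as a single declarative table; the figures below are placeholders, not recommendations:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierLimits:
    requests_per_minute: int
    tokens_per_day: int

# Placeholder figures: tune to your own capacity and pricing.
TIERS = {
    "free":       TierLimits(requests_per_minute=3,   tokens_per_day=10_000),
    "basic":      TierLimits(requests_per_minute=20,  tokens_per_day=200_000),
    "pro":        TierLimits(requests_per_minute=100, tokens_per_day=2_000_000),
    "enterprise": TierLimits(requests_per_minute=500, tokens_per_day=50_000_000),
}

def limits_for(plan: str) -> TierLimits:
    # Unknown or missing plans fall back to the strictest tier (fail closed).
    return TIERS.get(plan, TIERS["free"])
```

Failing closed on unknown plan names prevents a typo or tampered plan field from granting enterprise limits.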
Provider Security Comparison
| Provider | Key Features | Rate Limits | Security |
|---|---|---|---|
| OpenAI | GPT models, Assistants API | Tier-based RPM | SOC 2, API key management |
| Anthropic | Claude, tool use | Token-based | SOC 2, HIPAA available |
| Google | Gemini, Vertex AI | Project quotas | SOC 2, HIPAA, ISO |
| AWS Bedrock | Multiple models | Account-based | AWS IAM, VPC, encryption |
Self-Hosted LLM Security
Network Isolation
- Deploy in VPC/private network
- Use firewall rules
- No public internet access
- VPN for management
API Security
- Enable authentication
- Use TLS/SSL
- Implement API keys
- Rate limiting
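For a self-hosted endpoint, authentication should be enforced at the application layer even inside a private network. A WSGI-style wrapper is sketched below (TLS termination is assumed to happen at a reverse proxy in front of it):

```python
import hmac

def require_api_key(handler, valid_keys):
    """WSGI-style wrapper: reject requests lacking a valid
    'Authorization: Bearer <key>' header, using constant-time comparison."""
    def wrapped(environ, start_response):
        auth = environ.get("HTTP_AUTHORIZATION", "")
        key = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
        if not any(hmac.compare_digest(key, k) for k in valid_keys):
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"unauthorized"]
        return handler(environ, start_response)
    return wrapped
```

`hmac.compare_digest` avoids timing side channels that a plain `==` comparison would leak.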
Model Protection
- Model files encrypted at rest
- Secure model loading
- No model export
- Access logging
Cost Attacks & Abuse Prevention
LLM APIs present unique cost attack vectors that traditional API security doesn't address.
Prompt Padding
Attacker adds invisible text to prompts to increase token count without user awareness.
User: [100KB invisible text] Tell me about AI
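One mitigation is to strip Unicode format characters (category `Cf`, which covers zero-width spaces and joiners) and reject prompts whose raw size dwarfs their visible size. The thresholds below are illustrative:

```python
import unicodedata

def strip_invisible(text: str) -> str:
    """Remove Unicode format characters (category Cf), which include
    zero-width spaces and joiners commonly used for prompt padding."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def check_prompt(text: str, max_chars: int = 8_000) -> str:
    """Reject prompts whose raw size far exceeds their visible size,
    or that exceed a hard length limit. Thresholds are illustrative."""
    cleaned = strip_invisible(text)
    if len(text) > 2 * len(cleaned) + 100:
        raise ValueError("suspicious invisible padding")
    if len(cleaned) > max_chars:
        raise ValueError("prompt too long")
    return cleaned
```

Metering tokens on the cleaned text, not the raw input, removes the attacker's incentive to pad in the first place.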
Repeated Requests
Automated tools making excessive API calls to drain budget quotas.
Context Window Overflow
Sending extremely long inputs to maximize per-request costs.
Model Selection Abuse
Switching to more expensive models through parameter manipulation.
Mitigations
- Implement token estimation before requests
- Set per-user/month cost hard limits
- Use input length limits and validation
- Pin model selection server-side to the intended tier
- Monitor for unusual usage patterns
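Token estimation and model pinning before dispatch can be sketched as follows. The ~4-characters-per-token heuristic and the model name are assumptions; use the provider's tokenizer for exact counts:

```python
ALLOWED_MODEL = "small-model-v1"  # hypothetical name: pin to the intended tier

def estimate_tokens(text: str) -> int:
    """Rough pre-request estimate (~4 characters per token for English text)."""
    return max(1, len(text) // 4)

def validate_request(prompt: str, model: str, max_input_tokens: int = 2_000):
    """Reject model-switching and oversized inputs before spending any tokens."""
    if model != ALLOWED_MODEL:
        raise ValueError("model selection not permitted")
    if estimate_tokens(prompt) > max_input_tokens:
        raise ValueError("input exceeds token budget")
```

Validating the model parameter server-side is what defeats model selection abuse: the client never gets to choose a more expensive tier than intended.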
Recent LLM API CVEs
| CVE ID | Description | Severity |
|---|---|---|
| CVE-2025-61260 | OpenAI Codex CLI command injection via project-local .env config | Critical |
| CVE-2025-53767 | Azure OpenAI privilege escalation flaw | High |
| CVE-2025-14980 | BetterDocs WordPress plugin exposes OpenAI API key to contributors | High |