NGINX Ingress limit-burst-multiplier
Configure nginx.ingress.kubernetes.io/limit-burst-multiplier for rate limiting burst control. Tune burst size, rate limits, and 429 response handling.
Quick Answer:
nginx.ingress.kubernetes.io/limit-burst-multiplier sets the burst bucket size as a multiple of the per-second rate limit. Default is 5. With limit-rps: "10" and limit-burst-multiplier: "3", the burst bucket holds 30 requests. This means a client can send 30 requests instantly, then must stay under 10/s. Set it to 1 for strict rate limiting, 5-10 for APIs with natural traffic spikes.
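To make the "30 instantly, then 10/s" behavior concrete, here is a minimal sketch of a simplified token bucket. It is an illustration only: the function name `simulate` and the arrival times are assumptions, and NGINX's own accounting (a leaky bucket tracked in milliseconds) may differ by a request at the edges.

```python
def simulate(rps, multiplier, arrivals):
    """Return (time, accepted) for each arrival time in seconds.

    Simplified model: capacity = rps * multiplier, refilled continuously
    at `rps` tokens per second, one token consumed per accepted request.
    """
    capacity = rps * multiplier
    tokens = float(capacity)  # assume the bucket starts full
    last = 0.0
    results = []
    for t in sorted(arrivals):
        # Refill for the time elapsed since the previous arrival
        tokens = min(capacity, tokens + (t - last) * rps)
        last = t
        if tokens >= 1.0:
            tokens -= 1.0
            results.append((t, True))
        else:
            results.append((t, False))
    return results


# 30 requests at t=0, one more at t=0.05s, one more at t=0.2s
arrivals = [0.0] * 30 + [0.05, 0.2]
outcome = simulate(rps=10, multiplier=3, arrivals=arrivals)
accepted = sum(1 for _, ok in outcome if ok)
print(accepted, outcome[-2], outcome[-1])
```

In this model the 30 simultaneous requests pass, the request at 0.05s is rejected because only half a token has refilled, and the request at 0.2s is accepted.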
The Problem
NGINX Ingress rate limiting with limit-rps alone is too strict:
- Legitimate browser page loads send 10-20 requests simultaneously (CSS, JS, images)
- API clients batch requests then pause
- WebSocket upgrades need burst capacity
- Default burst of 5x can be too generous for API protection
The Solution
How limit-burst-multiplier Works
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    # Rate: 10 requests per second
    nginx.ingress.kubernetes.io/limit-rps: "10"
    # Burst: 3x rate = 30 request burst bucket
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 8080
What happens:
t=0.0s: Client sends 30 requests instantly → all accepted (burst bucket)
t=0.05s: Client sends 1 more request → rejected (bucket empty; 503 by default, 429 if limit_req_status is set to 429)
t=0.1s: Bucket refills at 10/s → 1 token available after 100ms
t=1.0s: Bucket has 10 tokens → can burst 10 more
t=3.0s: Bucket full again at 30 tokens
Configuration Examples
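When comparing limit-rps configurations, it can help to work out the bucket size and how long an empty bucket takes to refill. The helper below is a hypothetical sketch (the name `burst_profile` is not part of ingress-nginx) that applies the rps × multiplier rule to the three limit-rps profiles used in this section.

```python
def burst_profile(rps, multiplier=5):
    """Bucket size and full-refill time for limit-rps with a burst multiplier.

    The default of 5 mirrors the annotation's documented default.
    """
    bucket = rps * multiplier
    refill_seconds = bucket / rps  # time to refill a completely empty bucket
    return bucket, refill_seconds


# Labels and values are taken from the limit-rps examples in this section
for label, rps, mult in [("strict API", 5, 1), ("web app", 20, 5), ("webhook", 2, 10)]:
    bucket, refill = burst_profile(rps, mult)
    print(f"{label}: bucket={bucket}, refill={refill:.0f}s")
```

Because the bucket is a multiple of the rate, the full-refill time in seconds equals the multiplier itself, so a high multiplier on a low rate means a long recovery after a spike.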
# Strict API protection (no burst)
annotations:
  nginx.ingress.kubernetes.io/limit-rps: "5"
  nginx.ingress.kubernetes.io/limit-burst-multiplier: "1"
  # Bucket: 5 requests, refill at 5/s

# Web application (generous burst for page loads)
annotations:
  nginx.ingress.kubernetes.io/limit-rps: "20"
  nginx.ingress.kubernetes.io/limit-burst-multiplier: "5"
  # Bucket: 100 requests, refill at 20/s

# Webhook endpoint (low rate, moderate burst)
annotations:
  nginx.ingress.kubernetes.io/limit-rps: "2"
  nginx.ingress.kubernetes.io/limit-burst-multiplier: "10"
  # Bucket: 20 requests, refill at 2/s

# Per-minute rate limit with burst
annotations:
  nginx.ingress.kubernetes.io/limit-rpm: "300"
  nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
  # Bucket: 900 requests (300 × 3; the multiplier applies to the per-minute limit),
  # refill at 300/min (5/s)
Combine with Other Rate Limit Annotations
annotations:
  # Request rate
  nginx.ingress.kubernetes.io/limit-rps: "10"
  nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
  # Connection limit (concurrent connections per IP)
  nginx.ingress.kubernetes.io/limit-connections: "10"
  # Whitelist IPs (bypass rate limiting)
  nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8,192.168.0.0/16"
  # Custom response code (default: 503, change to 429)
  # Needs snippet annotations enabled on the controller; the ConfigMap key
  # limit-req-status-code: "429" is a controller-wide alternative
  nginx.ingress.kubernetes.io/server-snippet: |
    limit_req_status 429;
What NGINX Generates
# The annotations translate roughly to these nginx.conf directives
# (simplified; the controller generates its own zone names and key variable):
limit_req_zone $binary_remote_addr zone=ingress_rps:10m rate=10r/s;
location / {
    # burst=30 comes from 10 × 3; nodelay serves burst requests without delaying them
    limit_req zone=ingress_rps burst=30 nodelay;
    proxy_pass http://upstream;
}
Monitor Rate Limiting
# Check NGINX controller logs for rate-limited requests
kubectl logs -n ingress-nginx deployment/ingress-nginx-controller | grep "limiting"
# Prometheus metrics
# nginx_ingress_controller_requests{status="429"} → rate-limited count
#   (use status="503" if the default rejection status has not been changed)
# nginx_ingress_controller_request_duration_seconds → latency impact
Common Issues
All requests getting 429 with low rps
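The burst multiplier is too low. A browser loading a page can send 20+ requests at once, so the bucket (rps × multiplier) must cover a full page load. Set limit-burst-multiplier: "5" as a minimum for web apps, and go higher when limit-rps is small.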
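A quick way to check whether a bucket covers a page load is to compare the spike size with rps × multiplier. The two helpers below are hypothetical names used for illustration, and they treat the spike as arriving all at once with a full bucket.

```python
def rejected_in_spike(simultaneous_requests, rps, multiplier=5):
    """How many requests of an instantaneous spike exceed the burst bucket."""
    bucket = rps * multiplier
    return max(0, simultaneous_requests - bucket)


def min_multiplier(simultaneous_requests, rps):
    """Smallest integer multiplier whose bucket covers the whole spike."""
    return -(-simultaneous_requests // rps)  # ceiling division on integers


# A 20-request page load against limit-rps: "2"
print(rejected_in_spike(20, rps=2, multiplier=5))
print(min_multiplier(20, rps=2))
```

With limit-rps: "2" and the default multiplier of 5, the bucket holds 10 requests, so half of a 20-request page load is rejected; a multiplier of 10 is the smallest that admits all of it.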
Rate limiting not working at all
Check annotation spelling exactly: nginx.ingress.kubernetes.io/limit-rps (not limit_rps). Also verify the Ingress is using the nginx IngressClass.
Different behavior with limit-rpm vs limit-rps
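limit-rpm: "300" is a rate of 300r/m, which averages out to 5 requests per second. According to the ingress-nginx annotation documentation, the burst is the configured limit multiplied by the burst multiplier, so with limit-rpm the burst is 300 × 3 = 900, not (300/60) × 3 = 15. Lower the multiplier if that burst is too generous for a per-minute limit.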
Best Practices
- Web apps: multiplier 5-10, because browsers send a burst of requests per page load
- APIs: multiplier 1-3, for tighter control and a predictable rate
- Webhooks: multiplier 5-10 with low rps, to handle deployment bursts while preventing abuse
- Always whitelist internal CIDRs so that health checks are not rate limited
- Return 429, not 503: 429 is semantically correct for rate limiting
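Before relying on a whitelist, it is worth confirming that the CIDR list actually covers the addresses your health checks and probes come from. The sketch below uses Python's standard `ipaddress` module for an offline sanity check; `is_whitelisted` is a hypothetical helper and does not reflect how the controller evaluates the annotation internally.

```python
import ipaddress


def is_whitelisted(client_ip, limit_whitelist):
    """True if client_ip falls inside any CIDR of a comma-separated list."""
    ip = ipaddress.ip_address(client_ip)
    networks = [
        ipaddress.ip_network(cidr.strip())
        for cidr in limit_whitelist.split(",")
        if cidr.strip()
    ]
    return any(ip in net for net in networks)


# Same value format as the limit-whitelist annotation
whitelist = "10.0.0.0/8,192.168.0.0/16"
print(is_whitelisted("10.42.0.7", whitelist))    # pod-range style address
print(is_whitelisted("203.0.113.9", whitelist))  # external documentation address
```

The first address falls inside 10.0.0.0/8 and would bypass rate limiting; the second matches neither range and would be limited.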
Key Takeaways
- limit-burst-multiplier sets burst bucket = rps × multiplier (default 5)
- Burst allows an instant request spike, then enforces the steady rate
- Set to 1 for strict limiting, 5-10 for user-facing applications
- Combine with limit-connections for concurrent connection control
- Whitelist internal traffic to avoid rate limiting health checks and probes
