Kubernetes Rate Limiting with Gateway API
Implement rate limiting for Kubernetes services using Gateway API, Istio, Kong, NGINX, and Envoy. Protect APIs from abuse
π‘ Quick Answer: Rate limiting in Kubernetes protects services from traffic spikes and abuse. Implement at the ingress/gateway layer using Gateway API
BackendTrafficPolicy, NGINX annotations (limit-rps), IstioEnvoyFilter, or KongRateLimitingplugin. For CI/CD webhooks, use a global rate limit service (Envoy ratelimit) with Redis backend.
The Problem
- APIs exposed via Ingress/Gateway have no default rate limiting
- A single misbehaving client can overwhelm backend services
- Webhook endpoints (CI/CD, payment) receive uncontrolled burst traffic
- Need per-client, per-route, and global rate limits with different policies
- Rate limit state must be shared across multiple gateway replicas
The Solution
Gateway API with Envoy Gateway Rate Limiting
# Envoy Gateway BackendTrafficPolicy
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
name: api-rate-limit
namespace: production
spec:
targetRefs:
- group: gateway.networking.k8s.io
kind: HTTPRoute
name: api-route
rateLimit:
type: Global
global:
rules:
- clientSelectors:
- headers:
- name: x-api-key
type: Distinct # Per unique API key
limit:
requests: 100
unit: Minute
- clientSelectors:
- sourceCIDR:
value: "0.0.0.0/0"
type: Distinct # Per source IP
limit:
requests: 1000
unit: HourNGINX Ingress Rate Limiting
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: api-ingress
namespace: production
annotations:
# Rate limit by requests per second
nginx.ingress.kubernetes.io/limit-rps: "10"
# Rate limit by requests per minute
nginx.ingress.kubernetes.io/limit-rpm: "300"
# Rate limit by connections
nginx.ingress.kubernetes.io/limit-connections: "5"
# Burst allowance (queue up to N extra requests)
nginx.ingress.kubernetes.io/limit-burst-multiplier: "5"
# Allowlist IPs exempt from rate limiting
nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8,172.16.0.0/12"
# Custom error response when rate limited
nginx.ingress.kubernetes.io/custom-http-errors: "429"
# Rate limit key (default: $binary_remote_addr)
# Use X-Forwarded-For behind load balancer:
nginx.ingress.kubernetes.io/limit-rate-after: "1m"
spec:
ingressClassName: nginx
rules:
- host: api.example.com
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api-service
port:
number: 8080Istio Rate Limiting with EnvoyFilter
# Local rate limit (per-pod, no shared state)
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: api-local-ratelimit
namespace: production
spec:
workloadSelector:
labels:
app: api-service
configPatches:
- applyTo: HTTP_FILTER
match:
context: SIDECAR_INBOUND
listener:
filterChain:
filter:
name: envoy.filters.network.http_connection_manager
patch:
operation: INSERT_BEFORE
value:
name: envoy.filters.http.local_ratelimit
typed_config:
"@type": type.googleapis.com/udpa.type.v1.TypedStruct
type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
value:
stat_prefix: http_local_rate_limiter
token_bucket:
max_tokens: 100
tokens_per_fill: 100
fill_interval: 60s
filter_enabled:
runtime_key: local_rate_limit_enabled
default_value:
numerator: 100
denominator: HUNDRED
filter_enforced:
runtime_key: local_rate_limit_enforced
default_value:
numerator: 100
denominator: HUNDRED
response_headers_to_add:
- append_action: OVERWRITE_IF_EXISTS_OR_ADD
header:
key: x-rate-limited
value: "true"Global Rate Limiting with Envoy + Redis
# Deploy rate limit service
apiVersion: apps/v1
kind: Deployment
metadata:
name: ratelimit
namespace: rate-limiting
spec:
replicas: 2
selector:
matchLabels:
app: ratelimit
template:
spec:
containers:
- name: ratelimit
image: envoyproxy/ratelimit:latest
env:
- name: REDIS_SOCKET_TYPE
value: "tcp"
- name: REDIS_URL
value: "redis.rate-limiting.svc:6379"
- name: RUNTIME_ROOT
value: "/data"
- name: RUNTIME_SUBDIRECTORY
value: "ratelimit"
- name: USE_STATSD
value: "false"
ports:
- containerPort: 8080 # HTTP
- containerPort: 8081 # gRPC
volumeMounts:
- name: config
mountPath: /data/ratelimit/config
volumes:
- name: config
configMap:
name: ratelimit-config
---
apiVersion: v1
kind: ConfigMap
metadata:
name: ratelimit-config
namespace: rate-limiting
data:
config.yaml: |
domain: production
descriptors:
# Per API key: 1000 req/min
- key: api_key
rate_limit:
unit: minute
requests_per_unit: 1000
# Per source IP: 100 req/min
- key: remote_address
rate_limit:
unit: minute
requests_per_unit: 100
# Per path: different limits
- key: path
value: "/api/v1/webhooks"
rate_limit:
unit: second
requests_per_unit: 50
# Tiered: free vs premium
- key: plan
value: "free"
rate_limit:
unit: hour
requests_per_unit: 100
- key: plan
value: "premium"
rate_limit:
unit: hour
requests_per_unit: 10000Kong Rate Limiting Plugin
# Kong rate limiting via KongPlugin CRD
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: rate-limit-api
namespace: production
config:
minute: 300
hour: 5000
policy: redis # local | cluster | redis
redis_host: redis.infrastructure.svc
redis_port: 6379
redis_timeout: 2000
limit_by: consumer # consumer | credential | ip | header | path
header_name: x-api-key # When limit_by=header
plugin: rate-limiting
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: api-ingress
annotations:
konghq.com/plugins: rate-limit-api
spec:
ingressClassName: kong
rules:
- host: api.example.com
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api-service
port:
number: 8080Response Headers for Rate Limit Visibility
Standard rate limit response headers:
X-RateLimit-Limit: 100 # Max requests allowed
X-RateLimit-Remaining: 42 # Requests remaining in window
X-RateLimit-Reset: 1717244400 # Unix timestamp when limit resets
Retry-After: 30 # Seconds to wait (on 429 response)
HTTP 429 Too Many Requests response:
{
"error": "rate_limit_exceeded",
"message": "API rate limit exceeded. Try again in 30 seconds.",
"retry_after": 30
}Common Issues
Rate limit not applied β all requests pass through
- Cause: Annotation typo, or Ingress class mismatch
- Fix: Verify ingress controller matches annotations; check controller logs
Rate limit too aggressive β legitimate users blocked
- Cause: Limiting by IP behind a load balancer (all traffic from same IP)
- Fix: Use
X-Forwarded-Forheader; configureuse-forwarded-headers: true
Shared rate limit state inconsistent across replicas
- Cause: Using local rate limiting instead of global (Redis-backed)
- Fix: Deploy centralized rate limit service with Redis; use
policy: redis
429 responses not reaching the client
- Cause: CDN or upstream proxy retrying silently
- Fix: Configure CDN to pass through 429; set
Retry-Afterheader
Best Practices
- Layer rate limits β global (gateway) + per-service + per-endpoint
- Use Redis for shared state β local limits are per-pod, not global
- Return proper headers β
X-RateLimit-Remaininghelps clients self-throttle - Allowlist internal traffic β donβt rate-limit service-to-service calls
- Different tiers β free/premium plans with distinct limits
- Burst allowance β allow short bursts above sustained rate
- Monitor 429 rates β high 429s may indicate legitimate traffic growth
- Rate limit by API key, not just IP β IPs are shared (NAT, proxies)
Key Takeaways
- Rate limiting protects Kubernetes services from abuse and overload
- NGINX Ingress: simple annotations (
limit-rps,limit-rpm,limit-connections) - Gateway API:
BackendTrafficPolicywith per-client selectors (IP, header, CIDR) - Istio: EnvoyFilter for local rate limits, external ratelimit service for global
- Kong:
RateLimitingplugin with Redis backend for cluster-wide consistency - Always return
429 Too Many RequestswithRetry-Afterheader - Use Redis-backed global rate limiting when running multiple ingress replicas

Recommended
Kubernetes Recipes β The Complete Book100+ production-ready patterns with detailed explanations, best practices, and copy-paste YAML. Everything in one place.
Get the Book βLearn by Doing
CopyPasteLearn β Hands-on Cloud & DevOps CoursesMaster Kubernetes, Ansible, Terraform, and MLOps with interactive, copy-paste-run lessons. Start free.
Browse Courses βπ Deepen Your Skills β Hands-on Courses
Courses by CopyPasteLearn.com β Learn IT by Doing
