Configuration | advanced | ⏱ 15 minutes | K8s 1.36+

Kubernetes 1.36 L3 Cache Topology in CPU Manager

Configure L3 cache topology awareness in Kubernetes 1.36 CPU Manager. Allocate CPUs sharing L3 cache for better performance in latency-sensitive workloads.

By Luca Berton • 5 min read

💡 Quick Answer: Kubernetes 1.36 adds L3 cache topology awareness to CPU Manager (KEP-5109). The CPUs allocated to a container can be constrained to share the same L3 cache, which reduces cache misses and can improve performance for latency-sensitive workloads by 10-30%.

The Problem

Modern CPUs have complex cache hierarchies:

  • L1/L2: Private per core (fast, small)
  • L3: Shared across a group of cores (larger, slower)
  • Cross-L3: Accessing data cached in a different L3 domain is 2-3x slower

When CPU Manager allocates cores, it considers NUMA nodes but ignores L3 boundaries. On processors where one NUMA node contains several L3 caches, a container with 4 CPUs might get cores from different L3 cache domains, causing:

  • Cache thrashing between L3 domains
  • Higher memory access latency
  • 10-30% performance degradation for cache-sensitive workloads
  • Unpredictable latency spikes in real-time applications
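
To see how the cores on a node are grouped, you can read the cache topology that Linux exposes under sysfs. The sketch below is a minimal helper, not part of Kubernetes: `l3_domains` is a hypothetical function name, and its optional directory argument exists only so it can be pointed at a copy of the sysfs tree. It prints one line per distinct L3 domain.

```shell
# Print each distinct L3 cache domain as a kernel CPU list (for example "0-3").
# $1 (optional, hypothetical override): directory laid out like /sys/devices/system/cpu.
l3_domains() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cache/index*; do
        # Skip unmatched globs and entries without a cache level file.
        [ -r "$dir/level" ] || continue
        # The index number is not guaranteed to equal the cache level, so check it.
        if [ "$(cat "$dir/level")" = "3" ]; then
            cat "$dir/shared_cpu_list"
        fi
    done | sort -u
}
```

Run `l3_domains` directly on the node. If it prints a single line per NUMA node, L3 and NUMA boundaries coincide and L3 alignment adds little beyond NUMA alignment.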

The Solution

L3 cache topology awareness ensures CPUs allocated to a container share the same L3 cache.

Enable L3 Cache Topology

# Kubelet configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  full-pcpus-only: "true"
  distribute-cpus-across-numa: "false"
  prefer-align-cpus-by-uncorecache: "true"    # L3 (uncore) cache alignment option defined by KEP-5109
topologyManagerPolicy: single-numa-node
featureGates:
  CPUManagerPolicyBetaOptions: true    # gate for beta-level policy options; use CPUManagerPolicyAlphaOptions where the option is still alpha

Pod Requesting L3-Aligned CPUs

apiVersion: v1
kind: Pod
metadata:
  name: latency-sensitive
spec:
  containers:
    - name: app
      image: registry.example.com/trading:v5.0
      resources:
        requests:
          cpu: "4"
          memory: "8Gi"
        limits:
          cpu: "4"        # Guaranteed QoS → static CPU allocation
          memory: "8Gi"

With prefer-align-cpus-by-uncorecache: "true", the kubelet draws the 4 CPUs from cores sharing the same L3 cache.
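
The static policy only hands out exclusive CPUs to containers in Guaranteed pods whose CPU request is a whole number, so a request such as 500m or 1500m falls back to the shared pool and gets no alignment. As a quick pre-flight check, the sketch below tests a CPU quantity string. `is_whole_cpu` is a hypothetical helper, and it only understands plain integers and millicpu values; decimal forms such as "2.0" are rejected for simplicity.

```shell
# Return 0 if a Kubernetes CPU quantity is a whole, non-zero number of CPUs
# (for example "4" or "2000m"); return non-zero otherwise ("500m", "1.5", "").
is_whole_cpu() {
    case "$1" in
        # Empty, zero or leading-zero values, and anything but digits and "m".
        ''|0*|*[!0-9m]*) return 1 ;;
        *m) millis="${1%m}"
            case "$millis" in ''|*m*) return 1 ;; esac
            # Millicpu values must be an exact multiple of 1000.
            [ $((millis % 1000)) -eq 0 ] ;;
        *m*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For the Pod above, `is_whole_cpu 4` succeeds, so the container qualifies for exclusive CPUs as long as requests and limits stay equal.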

Verify L3 Cache Alignment

# Check CPU topology on the node (the CACHE column prints the L1d:L1i:L2:L3 ids)
lscpu -e=CPU,CORE,SOCKET,NODE,CACHE
# CPU CORE SOCKET NODE L1d:L1i:L2:L3
# 0   0    0      0    0:0:0:0        ← L3 domain 0
# 1   1    0      0    1:1:1:0        ← L3 domain 0
# 2   2    0      0    2:2:2:0        ← L3 domain 0
# 3   3    0      0    3:3:3:0        ← L3 domain 0
# 4   4    0      0    4:4:4:1        ← L3 domain 1
# ...

# Check which CPUs were allocated to the container
kubectl exec latency-sensitive -- cat /sys/fs/cgroup/cpuset.cpus
# 0-3  (all in L3 domain 0 ✓)

# Verify with lstopo (hwloc)
kubectl exec latency-sensitive -- lstopo --of txt
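
Comparing the cpuset against the topology by eye gets tedious on large nodes. The check can be scripted as a rough sketch: `expand_cpulist` and `same_l3` are hypothetical helper names, the second argument is an optional stand-in for the sysfs CPU directory, and the script assumes `seq`, `tr`, `sort`, and `wc` are available wherever it runs.

```shell
# Expand a kernel CPU list such as "0-3,8" into one CPU id per line.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        [ -n "$first" ] || continue
        seq "$first" "${last:-$first}"
    done
}

# Return 0 if every CPU in the list reports the same level-3 shared_cpu_list.
# $1: CPU list, $2 (optional): directory laid out like /sys/devices/system/cpu.
same_l3() {
    cpulist="$1"
    root="${2:-/sys/devices/system/cpu}"
    domains=$(
        for cpu in $(expand_cpulist "$cpulist"); do
            for dir in "$root/cpu$cpu"/cache/index*; do
                [ -r "$dir/level" ] || continue
                [ "$(cat "$dir/level")" = "3" ] && cat "$dir/shared_cpu_list"
            done
        done | sort -u | wc -l
    )
    # Exactly one distinct domain means the whole set shares an L3 cache.
    [ "$domains" -eq 1 ]
}
```

Inside the container, `same_l3 "$(cat /sys/fs/cgroup/cpuset.cpus)"` returns success when the allocation stays within one L3 domain. It also returns failure when no L3 information is visible, so treat a failure as "check manually" rather than proof of a split allocation.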

Performance Impact

# Without L3 alignment (CPUs 0,1,4,5: crosses L3 boundary):
# Cache miss rate: ~15%
# P99 latency: 450μs

# With L3 alignment (CPUs 0,1,2,3: same L3 domain):
# Cache miss rate: ~3%
# P99 latency: 180μs

# ~60% latency improvement for cache-heavy workloads
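
The 60% figure is the relative drop in P99 latency between the two runs. When you benchmark your own workload, the same arithmetic applies to whatever numbers you measure. `pct_reduction` below is a hypothetical helper that only does the calculation; it measures nothing itself.

```shell
# Relative reduction, in percent, from a baseline value to an improved value.
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

pct_reduction 450 180   # P99 latency in microseconds, prints 60
pct_reduction 15 3      # cache miss rate in percent, prints 80
```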

Common Issues

Pod stuck in Pending

  • Cause: Not enough free CPUs within a single L3 cache domain
  • Fix: Reduce CPU request to fit within one L3 domain, or relax the constraint

No performance improvement

  • Cause: Workload is memory-bandwidth bound, not cache-bound
  • Fix: L3 alignment helps cache-sensitive workloads; memory-bound workloads need NUMA alignment instead

Feature gate not recognized

  • Cause: Kubernetes version < 1.36
  • Fix: Upgrade kubelet to 1.36+

Best Practices

  1. Use for latency-sensitive workloads: trading systems, real-time analytics, game servers
  2. Combine with NUMA alignment: single-numa-node topology + L3 cache alignment
  3. Set Guaranteed QoS: requests == limits required for static CPU assignment
  4. Right-size CPU requests: match L3 domain size (check with lscpu)
  5. Benchmark before and after: measure actual cache miss rates and latency
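
For right-sizing, it helps to know how many logical CPUs one L3 domain holds. Linux reports each domain as a CPU list such as 0-7,16-23 in the `shared_cpu_list` file of a CPU's level-3 cache directory under /sys/devices/system/cpu. The helpers below are a small sketch with hypothetical names: they count the CPUs in such a list and compare the total with a whole-CPU request.

```shell
# Count the logical CPUs in a kernel CPU list such as "0-7,16-23".
cpulist_count() {
    echo "$1" | tr ',' '\n' |
        awk -F- 'NF { n += (NF > 1 ? $2 - $1 + 1 : 1) } END { print n + 0 }'
}

# Return 0 if a whole-CPU request ($1) fits inside one L3 domain ($2, a CPU list).
fits_in_l3() {
    [ "$1" -le "$(cpulist_count "$2")" ]
}
```

For example, `fits_in_l3 4 "0-7"` succeeds while `fits_in_l3 12 "0-7"` fails. The count includes SMT siblings, which matches how Kubernetes counts CPU requests in logical CPUs.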

Key Takeaways

  • L3 Cache Topology Awareness is available in Kubernetes 1.36 (KEP-5109)
  • CPUs allocated to containers share the same L3 cache domain
  • 10-30% performance improvement for cache-sensitive workloads
  • Requires static CPU Manager policy and Guaranteed QoS class
  • Complements NUMA alignment for maximum performance