Networking intermediate ⏱ 15 minutes K8s 1.28+

VT-x vs VT-d vs SR-IOV Explained

Understand the difference between CPU virtualization (VT-x/SVM), I/O virtualization (VT-d/AMD-Vi/IOMMU), and SR-IOV. Which to enable or disable for GPU

By Luca Berton • 📖 7 min read

💡 Quick Answer: VT-x (CPU virtualization) and VT-d (I/O virtualization/IOMMU) are completely different technologies. You can disable VT-d (IOMMU) to fix GPU-Direct P2P issues without affecting containers: standard containers rely on Linux namespaces and cgroups and need neither VT-x nor VT-d. Keep VT-x enabled for VMs and VM-based runtimes such as Kata Containers. SR-IOV specifically requires VT-d enabled.

The Problem

Three BIOS settings are often confused:

  • VT-x: "do I need this for containers?"
  • VT-d: "can I disable this for GPU performance?"
  • SR-IOV: "why does this need VT-d?"

Disabling the wrong one breaks your cluster. Disabling the right one gives you GPU-Direct P2P at full speed.

The Solution

The Three Virtualization Technologies

Technology     Full Name                  Layer     What It Does
──────────────────────────────────────────────────────────────────────────────────────────
VT-x / SVM     CPU Virtualization         CPU       Hardware-assisted VM execution
               (Intel VT-x / AMD-V SVM)             Used by KVM and Kata, not by runc containers
                                                    NEVER disable on K8s nodes

VT-d / AMD-Vi  I/O Virtualization         PCIe/DMA  IOMMU: translates DMA addresses
               (Intel VT-d / AMD-Vi)                Isolates device DMA per VM/container
                                                    SAFE to disable if no SR-IOV/VMs

SR-IOV         Single Root I/O Virt.      NIC/PCIe  Splits 1 physical NIC into N VFs
                                                    Each VF appears as separate device
                                                    REQUIRES VT-d enabled

Relationship Diagram

┌─────────────────────────────────────────────────────────────┐
│                         BIOS                                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │   VT-x/SVM   │  │ VT-d/AMD-Vi  │  │    SR-IOV    │       │
│  │  (CPU layer) │  │ (PCIe/IOMMU) │  │ (NIC layer)  │       │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│         │                 │                 │               │
│         │                 │◄── DEPENDS ON ──┘               │
│         │                 │                                 │
└─────────┼─────────────────┼─────────────────────────────────┘
          │                 │
          ▼                 ▼
   ┌─────────────┐  ┌───────────────┐
   │ VMs         │  │ DMA isolation │
   │ KVM/QEMU    │  │ Passthrough   │
   │ Kata        │  │ SR-IOV VFs    │
   └─────────────┘  └───────────────┘

VT-x: Required for VMs and VM-based runtimes such as Kata (CPU instruction trapping); standard containers do not use it
VT-d: Required ONLY for DMA isolation / SR-IOV / VM device passthrough
SR-IOV: Requires VT-d (VFs need IOMMU address translation)
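
The dependency summarized above can be written down as a small consistency check to run before changing a BIOS profile. This is only a sketch: `check_combo` and its `on`/`off` arguments are hypothetical names, and it encodes nothing beyond the rules stated in this article.

```shell
#!/bin/sh
# Hypothetical helper: validate a planned BIOS combination.
# Arguments are the literal strings "on" or "off" for VT-x, VT-d and SR-IOV.
check_combo() {
    vtx="$1"; vtd="$2"; sriov="$3"
    # Rule from the summary above: SR-IOV VFs need IOMMU address translation.
    if [ "$sriov" = "on" ] && [ "$vtd" = "off" ]; then
        echo "INVALID: SR-IOV requires VT-d/IOMMU"
        return 1
    fi
    # VT-x is independent of the other two settings; only warn, because
    # VMs and VM-based runtimes would stop working.
    if [ "$vtx" = "off" ]; then
        echo "WARNING: VMs and Kata Containers will not start"
        return 0
    fi
    echo "OK"
}

check_combo on off off           # bare-metal GPU training profile
check_combo on off on || true    # inconsistent: VFs without IOMMU
```

The non-zero return code on the invalid combination makes the function usable as a guard in a provisioning script.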

What Each Technology Is Used For

VT-x / AMD-V SVM (CPU Virtualization):
────────────────────────────────────────
Used by:
  ✅ KVM / QEMU virtual machines
  ✅ Kata Containers (microVMs)
  ❌ NOT used by Docker / containerd (Linux namespaces/cgroups)
  ❌ NOT used by standard Kubernetes Pods (runc)

Disable? NEVER on a K8s/OpenShift node (there is no GPU performance gain)
Impact if disabled: Standard containers still work (they don't use VT-x)
                    but VM-based runtimes (Kata) and all VMs will break

─────────────────────────────────────────────────────────────────
VT-d / AMD-Vi (I/O Virtualization / IOMMU):
────────────────────────────────────────
Used by:
  ✅ SR-IOV Virtual Functions (VF address translation)
  ✅ VM device passthrough (GPU passthrough to VM)
  ✅ VFIO (device assignment)
  ❌ NOT needed for standard containers
  ❌ NOT needed for GPU-Direct P2P (actually hurts it)

Disable? SAFE if you don't use SR-IOV or VM passthrough
Impact if disabled:
  • GPU-Direct P2P works at full speed ✅
  • Containers work perfectly ✅
  • SR-IOV VFs will NOT work ❌
  • VM device passthrough will NOT work ❌

─────────────────────────────────────────────────────────────────
SR-IOV (Single Root I/O Virtualization):
────────────────────────────────────────
Used by:
  ✅ Network VFs for Pods (high-performance networking)
  ✅ RDMA VFs for GPU-Direct RDMA (inter-node NCCL)

Requires: VT-d/AMD-Vi ENABLED
Disable? If you don't need VFs (use host networking instead)
Impact if disabled: No Virtual Functions, Pods get regular veth interfaces
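
To see which devices on a node could provide VFs at all, the kernel's `sriov_totalvfs` and `sriov_numvfs` sysfs attributes can be read directly. A minimal sketch follows; the function name `list_sriov_devices` and its optional directory argument are assumptions added here so the scan can be pointed at a test directory.

```shell
#!/bin/sh
# List PCI devices that advertise SR-IOV, with configured/maximum VF counts.
# $1 (optional, an assumption for testability) overrides the directory to
# scan; it defaults to the standard Linux sysfs location.
list_sriov_devices() {
    pci_root="${1:-/sys/bus/pci/devices}"
    for dev in "$pci_root"/*; do
        # sriov_totalvfs exists only on SR-IOV capable physical functions.
        [ -r "$dev/sriov_totalvfs" ] || continue
        total="$(cat "$dev/sriov_totalvfs")"
        active="$(cat "$dev/sriov_numvfs" 2>/dev/null || echo 0)"
        echo "$(basename "$dev") VFs: $active/$total"
    done
    return 0
}

list_sriov_devices
```

An empty result means no device exposes the SR-IOV capability to the kernel, which is the expected state when SR-IOV is disabled in the BIOS.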

Decision Matrix for GPU Clusters

Scenario                              VT-x    VT-d    SR-IOV   ACS
──────────────────────────────────────────────────────────────────────
Single-node training (no RDMA)        ON      OFF     OFF      N/A
  → Max GPU-Direct P2P, simplest

Multi-node training with host NIC     ON      OFF     OFF      N/A
  → NCCL uses host InfiniBand directly

Multi-node with SR-IOV RDMA VFs       ON      ON+pt   ON       Override
  → VFs for Pods + GPU-Direct RDMA

Mixed (VMs + GPUs on same node)       ON      ON+pt   ON       Override
  → Full virtualization stack

Inference only (no P2P needed)        ON      ON      ON       Don't care
  → Single GPU per Pod, no P2P
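
For automation, the matrix can be kept as a lookup next to the node provisioning scripts. The sketch below is illustrative only: `recommend_bios` and the scenario keywords are made-up names, and the values are copied from the rows above (with "Don't care" written as `any`).

```shell
#!/bin/sh
# Hypothetical lookup: print the matrix row for a scenario keyword.
recommend_bios() {
    case "$1" in
        single-node)         echo "VT-x=ON VT-d=OFF SR-IOV=OFF ACS=N/A" ;;
        multi-node-host-nic) echo "VT-x=ON VT-d=OFF SR-IOV=OFF ACS=N/A" ;;
        multi-node-sriov)    echo "VT-x=ON VT-d=ON+pt SR-IOV=ON ACS=Override" ;;
        mixed-vm-gpu)        echo "VT-x=ON VT-d=ON+pt SR-IOV=ON ACS=Override" ;;
        inference)           echo "VT-x=ON VT-d=ON SR-IOV=ON ACS=any" ;;
        # Unknown keywords go to stderr and fail, so typos are not silent.
        *) echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}

recommend_bios multi-node-sriov
```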

BIOS Settings Summary

Bare-metal GPU training (no SR-IOV):
────────────────────────────────────────
VT-x / AMD-V:              ENABLED  (needed by KVM/Kata; no GPU impact)
VT-d / AMD-Vi:             DISABLED (removes IOMMU overhead + ACS)
SR-IOV:                    DISABLED (no VFs needed)
Above 4G Decoding:         ENABLED  (large BAR GPUs)
ACS:                       N/A      (no IOMMU = no ACS enforcement)

Kernel args: (none needed, or intel_iommu=off)

────────────────────────────────────────

GPU training with SR-IOV RDMA:
────────────────────────────────────────
VT-x / AMD-V:              ENABLED
VT-d / AMD-Vi:             ENABLED  (SR-IOV requires it)
SR-IOV:                    ENABLED
Above 4G Decoding:         ENABLED
ACS:                       DISABLED in BIOS (or kernel override)

Kernel args: intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction
(pcie_acs_override only takes effect on kernels built with the out-of-tree ACS override patch)
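
After a reboot it is worth confirming that the intended arguments actually reached the kernel. A minimal sketch, assuming the running command line is readable at `/proc/cmdline`; the helper name `has_karg` and its optional second parameter are hypothetical.

```shell
#!/bin/sh
# Report whether a kernel argument is present on a command line.
# $1 = argument to look for
# $2 = command line text (optional; defaults to the running kernel's)
has_karg() {
    arg="$1"
    cmdline="${2:-$(cat /proc/cmdline 2>/dev/null)}"
    # Pad with spaces so only whole, space-delimited arguments match
    # (for example, amd_iommu=pt must not count as iommu=pt).
    case " $cmdline " in
        *" $arg "*) return 0 ;;
        *) return 1 ;;
    esac
}

for karg in intel_iommu=on iommu=pt intel_iommu=off; do
    if has_karg "$karg"; then
        echo "$karg: present"
    else
        echo "$karg: absent"
    fi
done
```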

Common Misconception

❌ WRONG: "Disable VT-x to improve GPU performance"
   → VT-x is CPU-level. Has ZERO impact on GPU/PCIe performance.
   → Disabling breaks Kata, KVM, and some security features.

❌ WRONG: "Containers need VT-d"
   → Containers use Linux namespaces + cgroups, NOT IOMMU.
   → VT-d is only for DMA address translation (device isolation).

❌ WRONG: "SR-IOV works without IOMMU"
   → VFs need IOMMU to translate their DMA addresses.
   → Without VT-d, VFs cannot be safely assigned to VMs or bound to vfio-pci (which needs an IOMMU group).

✅ RIGHT: "Disable VT-d (not VT-x) to fix GPU-Direct P2P"
   → IOMMU off = no DMA translation = direct P2P between GPU↔GPU/NIC
   → Containers still work perfectly (they don't use IOMMU)
   → You lose SR-IOV capability (acceptable if using host networking)

Verify Current State

# Check VT-x status
grep -E "vmx|svm" /proc/cpuinfo | head -1
# vmx = Intel VT-x available
# svm = AMD-V available
# (older kernels may show the flag even when the BIOS has disabled it)

# Check VT-d / IOMMU status
dmesg | grep -i -E "DMAR|AMD-Vi|IOMMU"
# "DMAR: IOMMU enabled" = VT-d active
# Nothing = VT-d likely disabled (or the boot messages have rotated out)
ls /sys/kernel/iommu_groups 2>/dev/null | wc -l
# 0 = no active IOMMU groups

# Check SR-IOV VFs available
lspci | grep "Virtual Function"
# Lists VFs if SR-IOV enabled + VFs created

# Quick status script
echo "=== Virtualization Status ==="
echo -n "VT-x/SVM: "
grep -qE "vmx|svm" /proc/cpuinfo && echo "ENABLED ✅" || echo "DISABLED ❌"
echo -n "VT-d/IOMMU: "
# Non-empty iommu_groups is more reliable than dmesg, which may need root
[ -n "$(ls -A /sys/kernel/iommu_groups 2>/dev/null)" ] && echo "ENABLED" || echo "DISABLED"
echo -n "SR-IOV VFs: "
lspci 2>/dev/null | grep -c "Virtual Function"

Common Issues

Disabled VT-d but SR-IOV stopped working

  • Cause: SR-IOV requires IOMMU for VF DMA translation, so this is expected
  • Fix: Choose one: VT-d on (with ACS override) for SR-IOV, or VT-d off (use host NIC)

Containers broken after disabling VT-x

  • Cause: Kata Containers (and gVisor when it runs on its KVM platform) require hardware virtualization; standard runc containers are unaffected
  • Fix: Never disable VT-x; disable VT-d instead for GPU performance

Confused by BIOS labels

  • Cause: BIOS vendors use different names for the same thing
  • Fix: Intel VT-d = Intel Directed I/O = IOMMU = DMAR. AMD-Vi = AMD IOMMU.

Best Practices

  1. Never disable VT-x on Kubernetes nodes: VM-based runtimes (Kata), KVM, and some security features depend on it, and it has no effect on GPU performance
  2. VT-d is safe to disable on dedicated GPU compute nodes (no SR-IOV needed)
  3. If you need SR-IOV VFs: keep VT-d on + iommu=pt + ACS override
  4. Label nodes by capability (gpu-direct: true vs sriov: true) for scheduling
  5. Document per-node BIOS profile: "why is VT-d off on these 8 nodes?"
  6. Separate node pools: SR-IOV nodes (VT-d on) vs bare GPU nodes (VT-d off)
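
The labeling and node-pool items can be tied to what the node actually reports. The sketch below assumes the split from item 6 (IOMMU active means the node belongs to the SR-IOV pool, otherwise to the bare GPU pool); `node_label` is a hypothetical helper, and the label keys are the ones named in item 4.

```shell
#!/bin/sh
# Hypothetical helper: choose a scheduling label from the node's IOMMU state.
# $1 (optional) = directory holding IOMMU groups; defaults to the sysfs path.
node_label() {
    groups_dir="${1:-/sys/kernel/iommu_groups}"
    # A non-empty directory means the kernel has an active IOMMU.
    if [ -d "$groups_dir" ] && [ -n "$(ls -A "$groups_dir" 2>/dev/null)" ]; then
        echo "sriov=true"
    else
        echo "gpu-direct=true"
    fi
}

# Print the command instead of running it, so it can be reviewed first.
# Assumes the Kubernetes node name matches the output of `uname -n`.
echo "kubectl label node $(uname -n) $(node_label) --overwrite"
```

Printing rather than executing keeps the sketch safe to run on any host; piping the output to `sh` on a machine with cluster access would apply the label.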

Key Takeaways

  • VT-x (CPU) ≠ VT-d (I/O/IOMMU) ≠ SR-IOV (NIC virtualization)
  • Standard containers need neither VT-x nor VT-d, so it is safe to disable IOMMU for GPU performance
  • SR-IOV requires VT-d: you can't have VFs without IOMMU
  • Disabling VT-d removes IOMMU overhead AND eliminates ACS blocking
  • The simple path for GPU training: VT-d OFF, SR-IOV OFF, use host InfiniBand directly
  • If SR-IOV needed: VT-d ON + iommu=pt + pcie_acs_override (slight overhead)
  • Never confuse VT-x with VT-d: disabling VT-x can break VM-based workloads in your cluster
#virtualization #iommu #sriov #bios #gpu-direct
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
