Networking advanced ⏱ 15 minutes K8s 1.28+

Fix SR-IOV 'Not Enough MMIO Resources' Error

Resolve the mlx5_core 'not enough MMIO resources for SR-IOV' error on OpenShift nodes with Mellanox ConnectX NICs. Covers BIOS settings, PCIe BAR

By Luca Berton • 📖 7 min read

💡 Quick Answer: The kernel error "not enough MMIO resources for SR-IOV" (pci_enable_sriov failed: -12 / ENOMEM) means the node's BIOS doesn't allocate enough PCIe MMIO/BAR address space for Virtual Functions. Fix by enabling "Above 4G Decoding" and increasing MMIO allocation in BIOS, then cold reboot.

The Problem

When the SR-IOV Network Operator tries to create VFs on a Mellanox ConnectX NIC, the kernel logs show:

mlx5_core 0000:06:00.0: not enough MMIO resources for SR-IOV
mlx5_core 0000:06:00.0: mlx5_sriov_enable:224:(pid 620392): pci_enable_sriov failed : -12
mlx5_core 0000:06:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(16), necvfs(0), active vports(17)
mlx5_core 0000:06:00.0: E-Switch: Disable: mode(LEGACY), nvfs(16), necvfs(0), active vports(1)

The error code -12 is ENOMEM: the kernel cannot map enough PCI BAR (Base Address Register) space for the requested Virtual Functions. This is a BIOS/firmware issue, not an OpenShift or driver problem.
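
To spot this signature quickly in a saved log, a small filter along the following lines can help. It is only a sketch: diagnose_sriov_log is a hypothetical helper name, and it matches nothing beyond the two message shapes shown in the log excerpt above.

```shell
# diagnose_sriov_log: hypothetical helper, not part of any tool.
# Reads kernel log text on stdin and reports whether it contains the
# SR-IOV MMIO shortage signature described above.
diagnose_sriov_log() {
  log=$(cat)
  if printf '%s\n' "$log" | grep -q 'not enough MMIO resources for SR-IOV'; then
    echo "MMIO shortage: firmware did not reserve enough BAR space for VFs"
  elif printf '%s\n' "$log" | grep -Eq 'pci_enable_sriov failed *: *-12([^0-9]|$)'; then
    echo "ENOMEM from pci_enable_sriov: likely an MMIO/BAR shortage"
  else
    echo "no SR-IOV MMIO error found"
  fi
}
```

On a node it could be fed from the kernel ring buffer, for example: dmesg | diagnose_sriov_log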

Why One Node Works and Another Doesn't

Node 1 (working):   BIOS allocates sufficient MMIO space → VFs created ✅
Node 2 (failing):   BIOS MMIO allocation too small → ENOMEM on pci_enable_sriov ❌

Same hardware model, same NICs, same OpenShift config, but different BIOS settings.
This happens when BIOS profiles are inconsistent across otherwise identical servers.

The Solution

Step 1: Stop the SR-IOV Retry Loop

Before touching BIOS, prevent the operator from repeatedly trying (and failing) to create VFs:

# Option A: Set numVfs to 0 for the failing node
# Edit the SriovNetworkNodePolicy to exclude the failing node temporarily
oc edit sriovnetworknodepolicy mellanox-rdma-policy \
  -n openshift-sriov-network-operator

# Or create a node-specific policy override:
cat <<YAML | oc apply -f -
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: disable-sriov-node2
  namespace: openshift-sriov-network-operator
spec:
  nodeSelector:
    kubernetes.io/hostname: "gpu-worker-02"
  numVfs: 0
  priority: 1    # Higher priority (lower number) overrides
  resourceName: mellanoxnics
  nicSelector:
    vendor: "15b3"
YAML

# Option B: Cordon the node to keep new workloads off it (this does not stop the operator's retries)
oc adm cordon gpu-worker-02

Step 2: Fix BIOS Settings

Access the node's BIOS/UEFI (via BMC/iLO/iDRAC/IPMI) and verify these settings:

Required BIOS Settings:
────────────────────────────────────────────────────────────────
Setting                          Required Value    Notes
────────────────────────────────────────────────────────────────
SR-IOV                           Enabled           Global SR-IOV toggle
Above 4G Decoding                Enabled           CRITICAL: maps BARs above 4 GB
PCIe ARI Support                 Enabled / Auto    Alternative Routing-ID Interpretation
IOMMU / VT-d / AMD-Vi            Enabled           Required for device passthrough
MMIO High Base                   Auto / Large      Some BIOSes expose an explicit MMIO size
MMIO Allocation                  Large             Vendor-specific label
Memory Mapped I/O above 4GB      Enabled           Same as Above 4G Decoding
SR-IOV Global Enable             Enabled           Some BIOSes separate global/per-slot toggles

Common BIOS Locations by Vendor:
────────────────────────────────────────────────────────────────
Dell (iDRAC):
  System BIOS → Integrated Devices → SR-IOV Global Enable
  System BIOS → Memory Settings → Memory Mapped I/O above 4GB

HPE (iLO):
  System Configuration → BIOS/Platform → PCIe Settings
  → SR-IOV, Above 4G Decoding, MMIO High Base

Lenovo (XClarity):
  UEFI Setup → System Settings → Devices and I/O Ports
  → PCIe SR-IOV Support, Above 4G Decoding

Supermicro:
  Advanced → PCIe/PCI/PnP Configuration
  → Above 4G Decoding, SR-IOV Support

Step 3: Cold Reboot

# IMPORTANT: Cold reboot (full power cycle), not warm reboot
# PCIe BAR allocation happens at POST; a warm reboot may not re-enumerate

# Via BMC/IPMI if available:
ipmitool -I lanplus -H bmc-gpu-worker-02.example.com \
  -U admin -P <password> chassis power cycle

# Or from OpenShift (warm reboot; less reliable for PCIe changes):
oc debug node/gpu-worker-02 -- chroot /host systemctl reboot

Step 4: Test VF Creation Manually

# After BIOS fix + cold reboot, test before re-enabling SR-IOV policy
oc debug node/gpu-worker-02
chroot /host

# Find the PF network device name
ls /sys/class/net/ | grep -E "ens|eno|enp"

# Try creating 1 VF first
echo 1 > /sys/class/net/ens1f0np0/device/sriov_numvfs
cat /sys/class/net/ens1f0np0/device/sriov_numvfs
# Should return: 1

# Check dmesg for errors
dmesg | tail -20 | grep -i "mmio\|sriov\|mlx5"
# Should NOT show "not enough MMIO resources"

# Clean up test VF
echo 0 > /sys/class/net/ens1f0np0/device/sriov_numvfs

Step 5: Gradual VF Enablement

# Don't jump straight to 16 VFs; scale up gradually:

# Phase 1: 1 PF × 1 VF
echo 1 > /sys/class/net/ens1f0np0/device/sriov_numvfs
dmesg | tail -5
# ✅ Success? Continue.

# Phase 2: 1 PF × 4 VFs
echo 0 > /sys/class/net/ens1f0np0/device/sriov_numvfs
echo 4 > /sys/class/net/ens1f0np0/device/sriov_numvfs
dmesg | tail -5
# ✅ Success? Continue.

# Phase 3: 1 PF × 16 VFs (full target)
echo 0 > /sys/class/net/ens1f0np0/device/sriov_numvfs
echo 16 > /sys/class/net/ens1f0np0/device/sriov_numvfs
dmesg | tail -5
# ✅ Success? Now test all PFs.

# Phase 4: All PFs × target VFs
# Repeat for each PF (ens1f1np1, ens2f0np0, etc.)

# Clean up; let the SR-IOV operator manage from here
echo 0 > /sys/class/net/ens1f0np0/device/sriov_numvfs
exit  # exit chroot
exit  # exit debug pod
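
The phases above can also be scripted. The following is a minimal sketch, assuming it runs inside the node's chroot with write access to sysfs; try_vf_counts is a hypothetical helper name. It resets the PF to 0 before each attempt, because the kernel refuses to change a non-zero VF count directly to another non-zero value.

```shell
# try_vf_counts: hypothetical helper. $1 is the PF's sysfs device directory
# (for example /sys/class/net/ens1f0np0/device); the remaining arguments are
# VF counts to try in order. Prints the largest count that was accepted and
# leaves the PF at 0 VFs so the operator can take over.
try_vf_counts() {
  dev=$1; shift
  best=0
  for n in "$@"; do
    echo 0 > "$dev/sriov_numvfs" || break       # reset before each new count
    if echo "$n" > "$dev/sriov_numvfs" 2>/dev/null &&
       [ "$(cat "$dev/sriov_numvfs")" = "$n" ]; then
      best=$n
    else
      break                                     # first failure marks the ceiling
    fi
  done
  echo 0 > "$dev/sriov_numvfs" 2>/dev/null || true
  echo "$best"
}
```

For example, try_vf_counts /sys/class/net/ens1f0np0/device 1 4 16 prints the highest count that was accepted; check dmesg afterwards if it prints less than the target.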

Step 6: Re-Enable SR-IOV Policy

# Remove the override policy
oc delete sriovnetworknodepolicy disable-sriov-node2 \
  -n openshift-sriov-network-operator

# Or uncordon the node
oc adm uncordon gpu-worker-02

# Watch the SR-IOV operator apply the policy
oc get sriovnetworknodestate gpu-worker-02 \
  -n openshift-sriov-network-operator -w

# Wait for sync to complete
# syncStatus: Succeeded

# Verify resources are registered
oc describe node gpu-worker-02 | grep -A5 -B2 mellanox
# Expected:
#   openshift.io/mellanoxnics: 16
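
Instead of watching the node state by hand, the wait can be scripted. This is a sketch rather than an operator feature: wait_for_sriov_sync is a hypothetical helper, and the attempt count and delay are arbitrary defaults. It polls the status.syncStatus field of the SriovNetworkNodeState until it reads Succeeded.

```shell
# wait_for_sriov_sync: hypothetical helper.
# $1 = node name, $2 = max attempts (default 60), $3 = seconds between
# attempts (default 10). Returns 0 once syncStatus is "Succeeded",
# 1 if the attempts run out first.
wait_for_sriov_sync() {
  node=$1; tries=${2:-60}; delay=${3:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    status=$(oc get sriovnetworknodestate "$node" \
      -n openshift-sriov-network-operator \
      -o jsonpath='{.status.syncStatus}' 2>/dev/null) || status=""
    if [ "$status" = "Succeeded" ]; then
      echo "synced"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $node (last status: ${status:-unknown})" >&2
  return 1
}
```

For example: wait_for_sriov_sync gpu-worker-02 && oc describe node gpu-worker-02 | grep mellanox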

Diagnostic Commands

# Check current MMIO allocation on a node
oc debug node/gpu-worker-02 -- chroot /host \
  lspci -vvv -s 06:00.0 2>/dev/null | grep -i "memory\|region\|bar"

# Compare BAR sizes between working and failing nodes
# Working node:
oc debug node/gpu-worker-01 -- chroot /host \
  lspci -vvv -s 06:00.0 | grep "Region"

# Failing node:
oc debug node/gpu-worker-02 -- chroot /host \
  lspci -vvv -s 06:00.0 | grep "Region"

# Check total VFs supported by hardware
oc debug node/gpu-worker-02 -- chroot /host \
  cat /sys/class/net/ens1f0np0/device/sriov_totalvfs
# Example: 127 (depends on NIC firmware configuration; if it covers your target, MMIO is the bottleneck)

# Check kernel messages for MMIO errors
oc debug node/gpu-worker-02 -- chroot /host \
  dmesg | grep -i "mmio\|bar\|sriov\|mlx5" | tail -30

# Check iommu groups
oc debug node/gpu-worker-02 -- chroot /host \
  find /sys/kernel/iommu_groups/ -type l | head -20
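
To make the working-versus-failing comparison easier to read, the Region lines can be reduced to their sizes. A minimal sketch, assuming the usual lspci -vvv layout in which sized regions end with a [size=...] tag; region_sizes is a hypothetical helper name.

```shell
# region_sizes: hypothetical helper. Reads `lspci -vvv` text on stdin and
# prints "<region number> <size>" for each Region line with a [size=...] tag.
# Region lines without a size tag are skipped.
region_sizes() {
  sed -n 's/^[[:space:]]*Region \([0-9][0-9]*\):.*\[size=\([^]]*\)].*$/\1 \2/p'
}
```

Pipe the lspci output from each node through region_sizes, save the two results, and diff them; a region that is smaller or missing on the failing node points at the BIOS allocation.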

Common Issues

BIOS changed but VFs still fail

  • Cause: A warm reboot may not re-enumerate PCIe, leaving BAR allocation unchanged
  • Fix: Full cold reboot (power off → power on); verify via BMC

VFs work for 1 PF but fail on second PF

  • Cause: Total MMIO space shared across all PCIe devices; not enough for multiple PFs
  • Fix: Reduce numVfs per PF, or increase MMIO High allocation further in BIOS
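
A rough estimate shows why several PFs can exhaust a window that is large enough for one. The sketch below assumes the VF BAR window for each PF is sized for sriov_totalvfs rather than for the requested numVfs, and it ignores alignment and the PF's own BARs. vf_mmio_mib is a hypothetical helper, and the 1 MiB per-VF BAR size in the example is an illustrative value, not a measured one.

```shell
# vf_mmio_mib: hypothetical helper for a back-of-the-envelope estimate.
# $1 = number of PFs, $2 = total VFs per PF (sriov_totalvfs),
# $3 = per-VF BAR size in MiB (assumed; read the real size from lspci).
# Prints the approximate MMIO demand in MiB.
vf_mmio_mib() {
  pfs=$1; total_vfs=$2; bar_mib=$3
  echo $((pfs * total_vfs * bar_mib))
}

vf_mmio_mib 4 127 1   # four PFs, 127 VFs each, 1 MiB per VF: prints 508
```

Under these assumptions a single PF needs about 127 MiB, while four PFs need about 508 MiB from the same shared pool.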

SR-IOV works after manual echo but operator fails

  • Cause: Operator applies to all PFs simultaneously; manual test was 1 PF
  • Fix: Reduce numVfs in policy; apply per-PF policies with lower VF counts

Different BIOS versions across identical servers

  • Cause: Fleet not uniformly provisioned; BIOS defaults differ by version
  • Fix: Standardize BIOS settings via the BMC Redfish API or a vendor management tool
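
Before standardizing, it helps to see exactly which settings differ. The sketch below assumes each node's BIOS attributes have already been exported to a text file with one name=value pair per line, for example from the Attributes object of the Redfish Bios resource. bios_attr_diff is a hypothetical helper, and the attribute names in the usage example are illustrative only; real names vary by vendor.

```shell
# bios_attr_diff: hypothetical helper.
# $1 = attribute file from the working node, $2 = file from the failing node.
# Each file holds one name=value pair per line. Prints attributes present in
# both files whose values differ. Values that contain "=" are not handled.
bios_attr_diff() {
  awk -F= '
    NR == FNR { good[$1] = $2; next }     # first file: remember each value
    ($1 in good) && good[$1] != $2 {      # second file: report mismatches
      printf "%s: working=%s failing=%s\n", $1, good[$1], $2
    }
  ' "$1" "$2"
}
```

For example, if the working node's file contains MmioAbove4Gb=Enabled and the failing node's file contains MmioAbove4Gb=Disabled, the function prints: MmioAbove4Gb: working=Enabled failing=Disabled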

Error returns after BIOS update

  • Cause: BIOS update reset settings to defaults
  • Fix: Re-apply SR-IOV BIOS settings; document in runbook for future updates

Best Practices

  1. Standardize BIOS across fleet: identical settings on all GPU/RDMA nodes
  2. Always cold reboot after BIOS PCIe changes: a warm reboot may not be enough
  3. Test VFs manually first: echo 1 > sriov_numvfs before re-enabling the operator
  4. Scale VFs gradually: 1 → 4 → 16 to find the MMIO ceiling
  5. Document BIOS profile: save BMC configuration for fleet reprovisioning
  6. Compare working vs failing: lspci -vvv Region output reveals BAR differences
  7. Monitor dmesg after SR-IOV policy changes: catch MMIO errors early

Key Takeaways

  • "not enough MMIO resources for SR-IOV" = BIOS doesn't allocate enough PCIe BAR space
  • Error code -12 (ENOMEM) on pci_enable_sriov confirms memory-mapped I/O shortage
  • Fix: Enable "Above 4G Decoding" + increase MMIO allocation in BIOS
  • Cold reboot required (power cycle, not warm reboot) for PCIe re-enumeration
  • One working node + one failing node with same hardware = BIOS config difference
  • Test manually (echo N > sriov_numvfs) before re-enabling SR-IOV operator
  • Gradual VF enablement (1 → 4 → 16) identifies the MMIO ceiling per node
  • Fleet consistency: standardize BIOS profiles across all GPU/RDMA nodes
#sriov #mmio #mellanox #bios #openshift
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.


This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
