SR-IOV Mixed NICs for GPU Nodes
Configure SR-IOV with mixed ConnectX-7 and ConnectX-6 NICs for RDMA data plane and management traffic on GPU worker nodes.
💡 Quick Answer: Use ConnectX-7 NICs for the RDMA/GPUDirect data plane (SR-IOV VFs with RDMA capability) and ConnectX-6 NICs for management/tenant traffic (SR-IOV VFs without RDMA). Create a separate SriovNetworkNodePolicy per NIC type, each with a different resourceName value.
The Problem
GPU worker nodes typically have mixed NIC generations. RDMA-capable ConnectX-7 handles GPUDirect and NCCL traffic, while ConnectX-6 handles management, tenant ingress, and storage. Using SR-IOV on both requires separate policies, resource names, and network attachments to avoid routing RDMA traffic through the wrong NIC.
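Mechanically, the split keys off the PCI device ID: vendor 15b3 is Mellanox, and the a2dc/101d device IDs distinguish ConnectX-7 from ConnectX-6 Dx. A minimal classification sketch; the sample lspci -n output is illustrative, and on a real node it would come from oc debug node/&lt;node&gt; -- chroot /host lspci -n -d 15b3::

```shell
#!/bin/sh
# Classify Mellanox PFs by PCI device ID so each PF lands in the right
# SriovNetworkNodePolicy. The sample `lspci -n` output below is illustrative.
classify() {
  case "$1" in
    15b3:a2dc) echo "cx7rdma" ;;   # ConnectX-7  -> RDMA data plane
    15b3:101d) echo "cx6mgmt" ;;   # ConnectX-6 Dx -> management
    *)         echo "unknown" ;;
  esac
}

sample_lspci='17:00.0 0200: 15b3:a2dc
17:00.1 0200: 15b3:a2dc
3b:00.0 0200: 15b3:101d
3b:00.1 0200: 15b3:101d'

echo "$sample_lspci" | while read -r addr _class id; do
  printf '%s -> %s\n' "$addr" "$(classify "$id")"
done
```

Each PCI address maps to exactly one policy, which is what keeps the two traffic classes on separate resource pools.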
The Solution
Identify NICs
# List NICs and their PCI addresses
oc debug node/gpu-worker-1 -- chroot /host lspci | grep -i mellanox
# 17:00.0 Ethernet controller: Mellanox ConnectX-7 (RDMA)
# 17:00.1 Ethernet controller: Mellanox ConnectX-7 (RDMA)
# 3b:00.0 Ethernet controller: Mellanox ConnectX-6 Dx (Mgmt)
# 3b:00.1 Ethernet controller: Mellanox ConnectX-6 Dx (Mgmt)
# Check RDMA capability
oc debug node/gpu-worker-1 -- chroot /host ibstat
# Port 1 of mlx5_0 (ConnectX-7): Active, Rate 400 Gb/s
# Port 1 of mlx5_2 (ConnectX-6): Active, Rate 100 Gb/s

SR-IOV Policy: ConnectX-7 (RDMA Data Plane)
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: cx7-rdma-policy
  namespace: openshift-sriov-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
    nvidia.com/gpu-sharing: "full"
  resourceName: cx7rdma
  numVfs: 8
  nicSelector:
    vendor: "15b3"
    deviceID: "a2dc"        # ConnectX-7
    pfNames: ["ens17f0", "ens17f1"]
  deviceType: netdevice
  isRdma: true              # Enable RDMA on VFs
  linkType: IB              # or ETH for RoCE
  mtu: 9000

SR-IOV Policy: ConnectX-6 (Management Traffic)
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: cx6-mgmt-policy
  namespace: openshift-sriov-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  resourceName: cx6mgmt
  numVfs: 16
  nicSelector:
    vendor: "15b3"
    deviceID: "101d"        # ConnectX-6 Dx
    pfNames: ["ens3bf0", "ens3bf1"]
  deviceType: netdevice
  isRdma: false             # No RDMA for management
  linkType: ETH
  mtu: 1500

Network Attachments
# RDMA network for NCCL / GPUDirect
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: rdma-net
  namespace: openshift-sriov-network-operator
spec:
  resourceName: cx7rdma
  networkNamespace: tenant-alpha
  ipam: |
    {
      "type": "host-local",
      "subnet": "192.168.100.0/24",
      "rangeStart": "192.168.100.10",
      "rangeEnd": "192.168.100.200"
    }
  capabilities: '{ "rdma": true }'
---
# Management network for tenant services
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: tenant-mgmt-net
  namespace: openshift-sriov-network-operator
spec:
  resourceName: cx6mgmt
  networkNamespace: tenant-alpha
  ipam: |
    {
      "type": "host-local",
      "subnet": "10.10.0.0/24",
      "rangeStart": "10.10.0.10",
      "rangeEnd": "10.10.0.200"
    }

Pod with Mixed NICs
apiVersion: v1
kind: Pod
metadata:
  name: training-job
  namespace: tenant-alpha
  annotations:
    k8s.v1.cni.cncf.io/networks: |
      [
        {
          "name": "rdma-net",
          "namespace": "tenant-alpha"
        },
        {
          "name": "tenant-mgmt-net",
          "namespace": "tenant-alpha"
        }
      ]
spec:
  containers:
  - name: trainer
    image: nvcr.io/nvidia/pytorch:24.03-py3
    resources:
      limits:
        nvidia.com/gpu: 8
        openshift.io/cx7rdma: 1   # RDMA VF from ConnectX-7
        openshift.io/cx6mgmt: 1   # Mgmt VF from ConnectX-6
    env:
      # NCCL uses the RDMA NIC
      - name: NCCL_IB_HCA
        value: "mlx5_0,mlx5_1"
      - name: NCCL_NET_GDR_LEVEL
        value: "5"
      # Storage traffic on the management NIC
      - name: NFS_INTERFACE
        value: "net2"

Verify VF Allocation
# Check VFs created
oc debug node/gpu-worker-1 -- chroot /host \
  cat /sys/class/net/ens17f0/device/sriov_numvfs
# 8

# Check RDMA devices
oc debug node/gpu-worker-1 -- chroot /host ibstat
# mlx5_0: ConnectX-7, RDMA Active
# mlx5_2: ConnectX-6, No RDMA

# Verify the pod got the correct VFs
oc exec -n tenant-alpha training-job -- ip link show
# net1: RDMA VF from CX-7 (192.168.100.x)
# net2: Mgmt VF from CX-6 (10.10.0.x)

graph TD
A[GPU Worker Node] --> B[ConnectX-7 Port 0]
A --> C[ConnectX-7 Port 1]
A --> D[ConnectX-6 Port 0]
A --> E[ConnectX-6 Port 1]
B --> F[8 VFs: cx7rdma RDMA enabled]
C --> G[8 VFs: cx7rdma RDMA enabled]
D --> H[16 VFs: cx6mgmt no RDMA]
E --> I[16 VFs: cx6mgmt no RDMA]
F --> J[NCCL and GPUDirect traffic]
G --> J
H --> K[Management and tenant traffic]
I --> K
J -->|InfiniBand or RoCE| L[400 Gb/s fabric]
K -->|Ethernet| M[25-100 Gb/s fabric]

Common Issues
- RDMA VFs not created → SR-IOV Global Enable must be on in the BIOS; verify the firmware supports SR-IOV on that NIC model
- Wrong NIC selected for NCCL → set NCCL_IB_HCA explicitly to the ConnectX-7 device names; without it, NCCL may pick the wrong NIC
- VF count exceeds maximum → ConnectX-7 supports up to 252 VFs per PF and ConnectX-6 up to 127, but the practical limit is usually 8-16 per PF
- MachineConfigPool not updating → the SR-IOV operator triggers the MCO; check oc get mcp for rollout status
- Mixed link types (IB + ETH) → each SriovNetworkNodePolicy must specify the correct linkType; don't mix IB and ETH in the same policy
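The VF-count issue can be caught before a policy is applied. A hedged sketch using the per-PF maximums quoted above (the limits come from this article; confirm them against your NIC firmware):

```shell
#!/bin/sh
# Sanity-check a policy's numVfs against the per-PF maximum for its deviceID.
# Limits (252 / 127) are the figures quoted above; verify against firmware.
max_vfs() {
  case "$1" in
    a2dc) echo 252 ;;   # ConnectX-7
    101d) echo 127 ;;   # ConnectX-6 Dx
    *)    echo 0   ;;
  esac
}

check_numvfs() {  # usage: check_numvfs <deviceID> <numVfs>
  limit=$(max_vfs "$1")
  if [ "$limit" -gt 0 ] && [ "$2" -le "$limit" ]; then
    echo "ok: $2 VFs within $1 limit ($limit)"
  else
    echo "error: $2 VFs exceeds $1 limit ($limit)"
    return 1
  fi
}

check_numvfs a2dc 8              # cx7-rdma-policy
check_numvfs 101d 200 || true    # demonstrates the failure path
check_numvfs 101d 16             # cx6-mgmt-policy
```

Running this against the numVfs values in each SriovNetworkNodePolicy before oc apply avoids a failed sync on the node.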
Best Practices
- Separate SR-IOV policies per NIC generation → a different resourceName per NIC type
- ConnectX-7 for RDMA (GPUDirect, NCCL); ConnectX-6 for management and storage
- Use isRdma: true only on NICs that need RDMA → unnecessary RDMA VFs waste resources
- Set NCCL_IB_HCA explicitly in the pod env → don't rely on NCCL auto-detection
- Match MTU to the fabric: 9000 for RDMA/jumbo frames, 1500 for management
- Label nodes with NIC capabilities for pod scheduling affinity
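For the node-labeling practice, the label value can be derived from the same ibstat output used earlier. A sketch; the nic.example.com label key is a made-up convention (not an OpenShift standard), and the ibstat line is the sample from this article:

```shell
#!/bin/sh
# Derive a NIC-capability label from ibstat-style output, then label the node.
# Label key nic.example.com/... is hypothetical; pick your own convention.
rate_from_ibstat() {
  # Extracts "400" from "Port 1 of mlx5_0 (ConnectX-7): Active, Rate 400 Gb/s"
  echo "$1" | sed -n 's/.*Rate \([0-9][0-9]*\) Gb\/s.*/\1/p'
}

line='Port 1 of mlx5_0 (ConnectX-7): Active, Rate 400 Gb/s'
rate=$(rate_from_ibstat "$line")
echo "oc label node gpu-worker-1 nic.example.com/rdma-rate-gbps=${rate}"
```

Training pods can then express a nodeAffinity on that label so they only schedule onto nodes with the 400 Gb/s RDMA fabric.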
Key Takeaways
- Mixed NIC generations need separate SR-IOV policies with distinct resourceName values
- ConnectX-7 (RDMA) for the data plane, ConnectX-6 for management → never mix traffic paths
- Pods request specific VF types via resource limits (openshift.io/cx7rdma)
- NCCL must be explicitly pointed at RDMA-capable NICs via NCCL_IB_HCA
- BIOS SR-IOV Global Enable is a prerequisite → without it, no VFs are created

