MOFED Driver Operator Build Kubernetes
Let the NVIDIA Network Operator build MOFED drivers on-node via DKMS. Kernel header detection, compile flags, and DTK integration for OpenShift.
💡 Quick Answer: Set forcePrecompiled: false in the NicClusterPolicy ofedDriver spec and the Network Operator will automatically compile MOFED kernel modules on each node using DKMS when no precompiled modules match the running kernel. The MOFED container detects the kernel version, pulls kernel headers from the node or a configured repo, compiles mlx5_core, mlx5_ib, and supporting modules, then loads them, all without manual intervention.
The Problem
Precompiled MOFED driver packages exist only for specific kernel versions. Nodes can end up without a usable precompiled match in several situations:
- Custom kernels (RT, low-latency, vendor-patched)
- Kernel updates between MOFED releases
- OpenShift z-stream updates that change the kernel
- Non-standard Linux distributions
- Air-gapped environments where precompiled images aren't mirrored
In these cases, the operator needs to build the drivers on the node itself.
The Solution
NicClusterPolicy: Let the Operator Build
apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  ofedDriver:
    image: mofed
    repository: nvcr.io/nvidia/mellanox
    version: 24.07-0.7.0.0
    # Key setting: allow on-node compilation
    forcePrecompiled: false
    # Compilation timeout (large modules take time)
    terminationGracePeriodSeconds: 600
    startupProbe:
      initialDelaySeconds: 30
      periodSeconds: 30
      failureThreshold: 60  # 30min total for compilation
    livenessProbe:
      initialDelaySeconds: 600  # Wait for compile to finish
      periodSeconds: 30
    env:
      # Unload inbox drivers before loading compiled ones
      - name: UNLOAD_STORAGE_MODULES
        value: "true"
      # Restore inbox if pod terminates
      - name: RESTORE_DRIVER_ON_POD_TERMINATION
        value: "true"
      # Custom DKMS compile flags (optional)
      - name: DKMS_EXTRA_FLAGS
        value: ""
How the Build Process Works
graph TD
START[MOFED Pod Starts] --> DETECT[Detect Running Kernel<br/>uname -r]
DETECT --> CHECK{Precompiled<br/>Modules Available?}
CHECK --> |Yes| LOAD[Load Precompiled<br/>Modules]
CHECK --> |No| HEADERS{Kernel Headers<br/>Available?}
HEADERS --> |Yes| BUILD[DKMS Build<br/>mlx5_core, mlx5_ib, ...]
HEADERS --> |No| FETCH[Fetch Headers<br/>from Repo/Node]
FETCH --> BUILD
BUILD --> UNLOAD[Unload Inbox Drivers<br/>rmmod mlx5_core]
LOAD --> UNLOAD
UNLOAD --> INSMOD[Load MOFED Modules<br/>modprobe mlx5_core]
INSMOD --> READY[MOFED Ready ✅]
HEADERS --> |Failed| FAIL[Pod CrashLoop ❌]
BUILD --> |Compile Error| FAIL
style BUILD fill:#FF9800,color:white
style READY fill:#4CAF50,color:white
style FAIL fill:#F44336,color:white
forcePrecompiled Behavior
| Setting | Behavior |
|---|---|
| forcePrecompiled: true | Only use precompiled modules. Fail if no match. |
| forcePrecompiled: false (default) | Try precompiled first. If no match, compile on-node via DKMS. |
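The flowchart and table above reduce to a small decision. The sketch below is a hypothetical shell model of that logic, not the operator's or the MOFED container's actual code. The function name driver_action and its yes/no arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical model of the forcePrecompiled decision described above.
# Arguments: $1 = forcePrecompiled setting (true|false)
#            $2 = does a precompiled module match the running kernel? (yes|no)
#            $3 = are kernel headers available or fetchable? (yes|no)
driver_action() {
    force_precompiled=$1
    precompiled_match=$2
    headers_available=$3

    if [ "$precompiled_match" = "yes" ]; then
        echo "load-precompiled"
    elif [ "$force_precompiled" = "true" ]; then
        echo "fail-no-precompiled-match"
    elif [ "$headers_available" = "yes" ]; then
        echo "build-on-node"
    else
        echo "fail-no-headers"
    fi
}

driver_action true  no  yes   # prints: fail-no-precompiled-match
driver_action false no  yes   # prints: build-on-node
driver_action false yes no    # prints: load-precompiled
```

The only case where the setting changes the outcome is the second branch: with no precompiled match, true fails immediately while false continues to the header check.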
Kernel Headers on Different Distros
The MOFED container needs kernel headers to compile. How they're provided varies:
RHEL/Rocky/CentOS:
# Headers typically at /usr/src/kernels/$(uname -r)
# MOFED container mounts host /usr/src and /lib/modules
OpenShift (RHCOS):
# RHCOS is immutable: no dnf/yum available
# The MOFED container uses Driver Toolkit (DTK) or
# extensions to get kernel-devel packages
# Network Operator handles this automatically
env:
  - name: DTK_OCP_VERSION
    value: "4.18"  # Operator auto-detects if omitted
Ubuntu:
# Headers at /usr/src/linux-headers-$(uname -r)
# Ensure linux-headers package installed on host
OpenShift: Driver Toolkit Integration
On OpenShift, the MOFED container uses the Driver Toolkit (DTK) base image, which includes kernel headers matching the cluster's RHCOS version:
spec:
  ofedDriver:
    image: mofed
    repository: nvcr.io/nvidia/mellanox
    version: 24.07-0.7.0.0
    forcePrecompiled: false
    # The operator automatically selects the correct DTK image
    # for the running OpenShift version's kernel
    # No manual DTK configuration needed
The build sequence on OpenShift:
- Operator queries the cluster version to determine the RHCOS kernel
- Pulls DTK image matching that kernel
- MOFED init container uses DTK for kernel-devel packages
- Compiles MOFED modules against the DTK headers
- Loads compiled modules into the running kernel
Monitor the Build Process
# Watch MOFED pod: the Init phase is the compile step
kubectl get pods -n nvidia-network-operator -l app=mofed -w
# NAME               READY   STATUS     RESTARTS
# mofed-gpu-node-1   0/1     Init:0/1   0          ← Compiling
# mofed-gpu-node-1   1/1     Running    0          ← Done
# Watch compilation logs in real-time
kubectl logs -n nvidia-network-operator mofed-gpu-node-1 -c mofed-container -f
# Checking for precompiled modules... NOT FOUND
# Kernel version: 5.14.0-427.40.1.el9_4.x86_64
# Starting DKMS build for MLNX_OFED 24.07-0.7.0.0
# Building mlx5_core... OK
# Building mlx5_ib... OK
# Building rdma_rxe... OK
# ...
# Build complete. Loading modules.
# Check compile duration
kubectl describe pod -n nvidia-network-operator mofed-gpu-node-1 | grep -A5 "State:"
Custom Compile Flags
env:
  # Add extra DKMS build flags
  - name: DKMS_EXTRA_FLAGS
    value: "--force"  # Force rebuild even if cached
  # Skip specific modules
  - name: MLNX_OFED_SRC_SKIP_MODULES
    value: "isert iser srp"  # Skip storage modules
  # Enable debug build
  - name: MLNX_OFED_DEBUG
    value: "1"
Precompiled vs On-Node Build
| Aspect | Precompiled | On-Node Build |
|---|---|---|
| Speed | Seconds (load only) | 5-15 minutes (compile) |
| Reliability | High (pre-tested) | Medium (compile can fail) |
| Kernel flexibility | Fixed set | Any kernel |
| Air-gapped | Must mirror images | Needs headers only |
| OpenShift | Limited kernel versions | DTK covers all RHCOS |
| Custom kernels | Not supported | Fully supported |
Air-Gapped Build Strategy
# In disconnected environments, the MOFED container
# needs access to kernel-devel packages
spec:
  ofedDriver:
    image: mofed
    repository: registry.example.com:8443/nvidia/mellanox
    version: 24.07-0.7.0.0
    forcePrecompiled: false
    env:
      # Point to internal yum/dnf repo for kernel-devel
      - name: ADDITIONAL_YUM_REPOS
        value: "http://repo.example.com/rhel9-kernel-devel"
# Or for OpenShift, the DTK image must be mirrored
# oc-mirror handles this with the GPU/Network Operator catalog
Troubleshooting Build Failures
# Check why build failed
kubectl logs -n nvidia-network-operator mofed-gpu-node-1 -c mofed-container | grep -i "error\|fail"
# Common: missing kernel headers
# "ERROR: Kernel source directory /usr/src/kernels/5.14.0-427.el9 not found"
# Fix: install kernel-devel on the node or ensure DTK is available
# Common: GCC version mismatch
# "ERROR: Compiler version mismatch"
# Fix: update MOFED version or pin kernel version
# Common: disk space
# "No space left on device"
# Fix: MOFED build needs ~2GB in /tmp, so ensure the node has space
# Force rebuild after fix
kubectl delete pod -n nvidia-network-operator mofed-gpu-node-1
# DaemonSet recreates it, which triggers a fresh build attempt
Validate After Build
# Confirm MOFED loaded (not inbox)
kubectl exec -n nvidia-network-operator mofed-gpu-node-1 -- ofed_info -s
# MLNX_OFED_LINUX-24.07-0.7.0.0
# Verify modules are MOFED-built (not inbox)
kubectl exec -n nvidia-network-operator mofed-gpu-node-1 -- \
  modinfo mlx5_core | grep -E "version|filename"
# version: 24.07-0.7.0.0
# filename: /lib/modules/.../updates/dkms/mlx5_core.ko
# Check RDMA working
kubectl exec -n nvidia-network-operator mofed-gpu-node-1 -- ibv_devinfo
# hca_id: mlx5_0
# transport: InfiniBand (0)
# fw_ver: 28.39.1002
# Bandwidth test between two nodes
# This starts the server side; to act as the client, run ib_write_bw in the
# MOFED pod on a second node with this node's address as the final argument
kubectl exec -n nvidia-network-operator mofed-gpu-node-1 -- \
  ib_write_bw -d mlx5_0 --report_gbits
Common Issues
Build takes >15 minutes: pod killed by liveness probe
Increase livenessProbe.initialDelaySeconds to 900+ and startupProbe.failureThreshold to 120 for slow nodes.
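To see what a given setting buys, the time allowed before a startup probe gives up can be estimated as initialDelaySeconds + periodSeconds × failureThreshold. The sketch below assumes these fields behave like standard Kubernetes probe settings; whether the operator passes every field through to the pod is not verified here, and startup_budget is a hypothetical helper name.

```shell
#!/bin/sh
# Approximate number of seconds a container has to start before the
# startup probe gives up:
#   initialDelaySeconds + periodSeconds * failureThreshold
startup_budget() {
    initial_delay=$1
    period=$2
    failure_threshold=$3
    echo $(( initial_delay + period * failure_threshold ))
}

startup_budget 30 30 60    # values from the policy above: prints 1830 (about 30 minutes)
startup_budget 30 30 120   # threshold raised to 120: prints 3630 (about 60 minutes)
```

Doubling failureThreshold roughly doubles the compile window without changing how often the probe runs.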
"Kernel source not found" on RHCOS
DTK image not available or not mirrored. In disconnected OpenShift, mirror the DTK image matching your cluster version: registry.redhat.io/openshift4/driver-toolkit-rhel9.
Compile succeeds but modules don't load: "symbol version mismatch"
Kernel was updated between build start and module load. Reboot the node to ensure a consistent kernel, then let MOFED rebuild.
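One quick way to confirm this situation is to compare the kernel release recorded in the module's vermagic string with the running kernel. The sketch below is a minimal check under that assumption. It catches a kernel release mismatch only, not every cause of a symbol version error, and vermagic_matches is a hypothetical helper name.

```shell
#!/bin/sh
# Compare the kernel release a module was built for (the first field of
# its vermagic string) with the running kernel release.
vermagic_matches() {
    vermagic=$1          # e.g. output of: modinfo -F vermagic mlx5_core
    running_kernel=$2    # e.g. output of: uname -r
    built_for=${vermagic%% *}   # keep the text before the first space
    [ "$built_for" = "$running_kernel" ]
}

# Sample values for illustration only
if vermagic_matches "5.14.0-427.40.1.el9_4.x86_64 SMP preempt mod_unload modversions" \
                    "5.14.0-427.40.1.el9_4.x86_64"; then
    echo "module matches running kernel"
else
    echo "module was built for a different kernel"
fi
```

On a node, the same check would be called as vermagic_matches "$(modinfo -F vermagic mlx5_core)" "$(uname -r)".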
Build succeeds on some nodes but not others
Different kernel versions across nodes (partial upgrade). Ensure all nodes in the MachineConfigPool (MCP) run the same kernel before MOFED deployment.
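A partial upgrade is easy to spot by counting how many nodes report each kernel version. The sketch below summarizes tab-separated node and kernel pairs. The kernel_summary helper and the sample node names are hypothetical, and the kubectl command in the comment is one way to produce the input from the nodes' reported kernelVersion.

```shell
#!/bin/sh
# Read "node<TAB>kernelVersion" lines on stdin and print each distinct
# kernel version preceded by the number of nodes running it.
kernel_summary() {
    awk -F '\t' 'NF >= 2 { count[$2]++ } END { for (k in count) print count[k], k }' | sort -rn
}

# Sample input for illustration; on a cluster, pipe in the output of:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\n"}{end}'
printf 'node-1\t5.14.0-427.40.1.el9_4.x86_64\nnode-2\t5.14.0-427.40.1.el9_4.x86_64\nnode-3\t5.14.0-427.42.1.el9_4.x86_64\n' | kernel_summary
```

More than one line of output means the nodes are not on a single kernel, so builds can succeed on some nodes and fail on others.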
Best Practices
- Set forcePrecompiled: false so the operator can fall back to compilation
- Increase startup/liveness timeouts: compilation takes 5-15 minutes
- Pin kernel versions during MOFED rollout: avoid kernel updates mid-deployment
- Mirror DTK image in air-gapped environments: MOFED on OpenShift needs DTK for headers
- Test build on one node first: label a single node, verify, then expand
- Monitor build logs: don't just wait for Running; check compile output
- Keep /tmp clean on nodes: DKMS needs ~2GB temp space
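The temp-space practice can be checked before a rollout. The sketch below compares the free space on the filesystem holding /tmp with a required amount, using the ~2GB figure from the list above as the threshold. The has_space helper is a hypothetical name.

```shell
#!/bin/sh
# Succeeds if the available space (in KiB) is at least the required
# amount (in GiB).
has_space() {
    available_kib=$1
    required_gib=$2
    [ "$available_kib" -ge $(( required_gib * 1024 * 1024 )) ]
}

# Available KiB on the filesystem holding /tmp (POSIX df output, 4th column)
avail=$(df -Pk /tmp | awk 'NR == 2 { print $4 }')
if has_space "$avail" 2; then
    echo "/tmp has at least 2 GiB free"
else
    echo "/tmp has less than 2 GiB free"
fi
```

Run this on the node itself, or from a pod that mounts the node's filesystem, so that df reports the node's free space.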
Key Takeaways
- forcePrecompiled: false lets the operator build MOFED drivers on-node when no precompiled modules match
- Build uses DKMS: detects kernel → fetches headers → compiles → loads modules
- OpenShift uses Driver Toolkit (DTK) for kernel headers on immutable RHCOS
- Compilation takes 5-15 minutes, so increase probe timeouts accordingly
- Precompiled is faster and more reliable; on-node build provides kernel flexibility
- In air-gapped environments, mirror kernel-devel repos or DTK images
- Always validate with ofed_info -s and modinfo mlx5_core after build
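The modinfo check can be scripted by looking at where the module file lives. The sketch below assumes, as in the sample modinfo output earlier, that a MOFED build lands under an updates/dkms directory. The is_dkms_module helper and the KERNEL placeholder in the sample paths are hypothetical.

```shell
#!/bin/sh
# Succeeds if a module file path sits under an updates/dkms directory,
# which is where the sample modinfo output shows the MOFED build.
is_dkms_module() {
    case "$1" in
        */updates/dkms/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical paths for illustration; on a node, pass
# "$(modinfo -F filename mlx5_core)" instead.
for path in \
    "/lib/modules/KERNEL/updates/dkms/mlx5_core.ko" \
    "/lib/modules/KERNEL/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz"; do
    if is_dkms_module "$path"; then
        echo "MOFED build: $path"
    else
        echo "inbox module: $path"
    fi
done
```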
