NFSoRDMA Worker Node Setup
Complete worker node setup for NFS over RDMA including kernel modules, NFS client configuration, PersistentVolume mounts, and RDMA transport verification.
💡 Quick Answer: Load the `xprtrdma` kernel module on workers, mount NFS with `-o rdma,port=20049`, and create PersistentVolumes targeting the RDMA mount. Verify the RDMA transport with `/proc/self/mountstats`.
The Problem
After configuring dedicated NICs and switch access mode for NFSoRDMA, the worker nodes need:
- RDMA kernel modules loaded: `xprtrdma` for the NFS client transport
- NFS client configured for RDMA: default NFS mounts use TCP, not RDMA
- PersistentVolumes using RDMA: Kubernetes workloads must use the RDMA-backed NFS mount
- Verification: confirming traffic actually uses RDMA rather than falling back to TCP
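Before any of this, it is worth confirming each worker actually exposes an RDMA device. The helper below is a sketch; the sysfs path argument exists only so the check can be exercised off-node, and on a real worker you would run it with no argument (for example via `oc debug node/<node> -- chroot /host`).

```shell
# Pre-flight sketch: does this node expose an RDMA device in sysfs?
# With no argument it inspects /sys/class/infiniband, where RDMA-capable
# NICs (InfiniBand or RoCE) register their devices.
check_rdma_devices() {
  dir="${1:-/sys/class/infiniband}"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "RDMA devices: $(ls "$dir" | tr '\n' ' ')"
    return 0
  fi
  echo "no RDMA devices found"
  return 1
}
```

If this reports no devices, fix the NIC and driver setup first; none of the steps below will produce an RDMA mount without one.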
The Solution
Step 1: Load RDMA Kernel Modules
Create a MachineConfig to load modules at boot (OpenShift) or a DaemonSet for vanilla Kubernetes:
```yaml
# OpenShift MachineConfig
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-rdma-modules
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/modules-load.d/rdma-nfs.conf
          mode: 0644
          contents:
            source: data:text/plain;charset=utf-8,xprtrdma%0Asvcrdma%0Ardma_ucm%0Ardma_cm%0Aib_ipoib
```

```yaml
# Vanilla Kubernetes DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: rdma-module-loader
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: rdma-modules
  template:
    metadata:
      labels:
        app: rdma-modules
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      hostNetwork: true
      hostPID: true
      initContainers:
        - name: load-modules
          image: registry.access.redhat.com/ubi9/ubi-minimal:latest
          securityContext:
            privileged: true
          command:
            - /bin/sh
            - -c
            - |
              nsenter -t 1 -m -u -i -n -- modprobe xprtrdma
              nsenter -t 1 -m -u -i -n -- modprobe svcrdma
              echo "RDMA modules loaded"
      containers:
        - name: sleep
          image: registry.access.redhat.com/ubi9/ubi-minimal:latest
          command: ["sleep", "infinity"]
```

Step 2: NFS Server RDMA Configuration
Ensure the NFS server exports over RDMA:
```ini
# /etc/nfs.conf on NFS server
[nfsd]
rdma=y
rdma-port=20049
vers3=n
vers4=y
vers4.1=y
vers4.2=y
```

```shell
# Restart NFS server
systemctl restart nfs-server

# Verify RDMA listening
cat /proc/fs/nfsd/portlist
# Should show: rdma 20049
```

Step 3: Manual RDMA Mount Test
Test on a worker before creating PVs:
```shell
# Mount NFS over RDMA
oc debug node/worker-0 -- chroot /host \
  mkdir -p /mnt/nfsordma
oc debug node/worker-0 -- chroot /host \
  mount -t nfs4 -o rdma,port=20049,vers=4.2 \
  10.90.0.1:/exports/data /mnt/nfsordma

# Verify RDMA transport
oc debug node/worker-0 -- chroot /host \
  grep -A5 "10.90.0.1" /proc/self/mountstats | grep xprt
# Should show: xprt: rdma ... (not tcp)
```
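Scripting this check is handy once there is more than one mount. The sketch below (a hypothetical helper, not part of the setup above) reads mountstats text from stdin and succeeds only if the section for the given server address shows an RDMA transport:

```shell
# Exit 0 if the mountstats text on stdin shows "xprt: rdma" for the mount
# whose device line contains the server address given as $1.
# mountstats groups stats into sections that each start with a "device" line.
uses_rdma() {
  awk -v srv="$1" '
    $1 == "device" && index($2, srv) > 0 { in_mount = 1; next }
    $1 == "device"                       { in_mount = 0 }
    in_mount && $1 == "xprt:" && $2 == "rdma" { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

Usage against a worker would then look like `oc debug node/worker-0 -- chroot /host cat /proc/self/mountstats | uses_rdma 10.90.0.1 && echo "RDMA in use"` (server address assumed from the examples above).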
```shell
# Benchmark
oc debug node/worker-0 -- chroot /host \
  dd if=/dev/zero of=/mnt/nfsordma/test bs=1M count=1024 oflag=direct
# Expected: 2-5 GB/s with RDMA vs 500 MB-1 GB/s with TCP
```

Step 4: PersistentVolume with RDMA NFS
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsordma-data
spec:
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.90.0.1
    path: /exports/data
  mountOptions:
    - rdma
    - port=20049
    - vers=4.2
    - hard
    - rsize=1048576
    - wsize=1048576
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data
  namespace: ai-workloads
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Ti
  volumeName: nfsordma-data
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training
  namespace: ai-workloads
spec:
  containers:
    - name: training
      image: nvcr.io/nvidia/pytorch:24.01-py3
      volumeMounts:
        - name: data
          mountPath: /data
      resources:
        limits:
          nvidia.com/gpu: 1   # extended resources must be set as limits
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: training-data
```

Step 5: Verify RDMA in Running Pods
```shell
# Check mount options inside the pod
oc exec gpu-training -n ai-workloads -- mount | grep nfs
# Should show: proto=rdma

# Check mountstats from the node
NODE=$(oc get pod gpu-training -n ai-workloads -o jsonpath='{.spec.nodeName}')
oc debug node/$NODE -- chroot /host \
  grep -A10 "10.90.0.1" /proc/self/mountstats
```

```mermaid
flowchart TD
    A[Worker Node Setup] --> B[Load xprtrdma module]
    B --> C[NNCP configures dedicated NIC]
    C --> D[Switch access mode VLAN 90]
    D --> E[NFS server RDMA port 20049]
    E --> F[PV with mountOptions rdma]
    F --> G[PVC bound to PV]
    G --> H[Pod mounts NFSoRDMA volume]
    H --> I[2-5 GB per s throughput]
    J[Verify] --> K[mountstats shows xprt rdma]
```

Common Issues
Mount succeeds but uses TCP

```shell
# Check if the xprtrdma module is loaded
oc debug node/worker-0 -- chroot /host lsmod | grep rdma

# Check the NFS server is listening on the RDMA port (from a worker)
oc debug node/worker-0 -- chroot /host \
  rpcinfo -p 10.90.0.1 | grep 20049

# Verify mountOptions include both 'rdma' AND 'port=20049'
oc get pv nfsordma-data -o yaml | grep -A5 mountOptions
```

Pod cannot mount NFS RDMA volume
```shell
# The kubelet performs NFS mounts, so it needs the RDMA modules on the node
# Verify modules are loaded on the node running the pod
NODE=$(oc get pod <pod> -o jsonpath='{.spec.nodeName}')
oc debug node/$NODE -- chroot /host lsmod | grep xprtrdma

# Check kubelet mount logs
oc debug node/$NODE -- chroot /host \
  journalctl -u kubelet --since "5 min ago" | grep -i nfs
```

rsize/wsize not applied
```yaml
# Ensure mountOptions are on the PV, not the PVC
mountOptions:
  - rdma
  - port=20049
  - rsize=1048576  # 1MB read buffer
  - wsize=1048576  # 1MB write buffer
  - vers=4.2
```

Best Practices
- Load RDMA modules at boot: use a MachineConfig (OpenShift) or a DaemonSet initContainer
- Always specify `port=20049`: the default NFS port (2049) uses TCP
- Use large rsize/wsize: 1MB buffers maximize RDMA throughput
- Set the `hard` mount option: prevents data corruption on transient network issues
- Verify with mountstats: confirm `xprt: rdma` after every mount
- Use NFSv4.2: required for server-side copy and other RDMA optimizations
- Benchmark before production: use `dd` or `fio` to confirm RDMA throughput (should be 2-5x TCP)
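The benchmarking advice can be wrapped in a small helper. This is a sketch: the mount point is an assumption, and it uses `conv=fsync` so the example also runs on filesystems without O_DIRECT support; on a real NFSoRDMA mount you would use `oflag=direct` (as in Step 3) to bypass the page cache.

```shell
# Sequential-write throughput sanity check for a mount point.
# Writes 64 MiB in 1 MiB blocks, prints dd's summary line, and cleans up.
# On the actual NFSoRDMA mount, replace conv=fsync with oflag=direct.
bench_write() {
  target="$1/benchfile.$$"
  dd if=/dev/zero of="$target" bs=1M count=64 conv=fsync 2>&1 | tail -n 1
  rm -f "$target"
}
```

Called as `bench_write /mnt/nfsordma` (path assumed from the examples above), the printed throughput should land in the 2-5 GB/s range if RDMA is in effect.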
Key Takeaways
- Worker nodes need the `xprtrdma` kernel module for the NFS RDMA client transport
- The NFS server must listen on port 20049 with `rdma=y` in nfs.conf
- PersistentVolumes use `mountOptions: [rdma, port=20049]` to enable RDMA transport
- Always verify with `/proc/self/mountstats`: look for `xprt: rdma`, not `xprt: tcp`
- NFSoRDMA delivers 2-5x the throughput of NFS over TCP, which is critical for AI/GPU workloads
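If you provision NFS volumes dynamically rather than through static PVs, the same mount options carry over to a StorageClass. The sketch below assumes the csi-driver-nfs provisioner (`nfs.csi.k8s.io`); server and share values mirror the examples above:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfsordma
provisioner: nfs.csi.k8s.io   # assumes csi-driver-nfs is installed
parameters:
  server: 10.90.0.1
  share: /exports/data
mountOptions:
  - rdma
  - port=20049
  - vers=4.2
  - hard
  - rsize=1048576
  - wsize=1048576
```

Any PVC referencing this class then gets the RDMA transport without a hand-crafted PV.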

