NFSoRDMA Persistent Volume for Kubernetes
Create PersistentVolumes and StorageClasses for NFSoRDMA storage with RDMA transport, optimized mount options, and ReadWriteMany access.
Quick Answer: Create a PersistentVolume with nfs.server pointing to the RDMA IP and mountOptions: [proto=rdma, port=20049, vers=4.1] to get NFS storage over RDMA with ReadWriteMany access.
The Problem
Standard NFS over TCP wastes CPU on protocol overhead and maxes out at 3-4 GB/s even on 100GbE. AI training workloads need shared storage with near-line-rate throughput and low latency. You need to expose NFSoRDMA storage as Kubernetes PersistentVolumes that pods can consume transparently.
The Solution
Create PVs with RDMA mount options that bypass TCP/IP, delivering 10-12 GB/s throughput with microsecond latency. Use static PVs for dedicated shares or the NFS CSI driver for dynamic provisioning.
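Before wrapping these options in a PersistentVolume, it can help to confirm that a node mounts the export by hand. The sketch below only prints the equivalent mount command and does not run it. The helper names join_opts and rdma_mount_cmd and the /mnt/rdma-test target directory are illustrative assumptions; the option list mirrors the one used in this recipe.

```shell
#!/bin/sh
# Join arguments into the comma-separated form that `mount -o` expects.
join_opts() {
    # Inside the subshell, "$*" joins the arguments with the first character of IFS.
    ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical helper: print (not run) a manual mount command for a node-side test.
# Arguments: NFS server RDMA IP, export path, local mount point.
rdma_mount_cmd() {
    server=$1 export_path=$2 target=$3
    opts=$(join_opts proto=rdma port=20049 vers=4.1 rsize=1048576 wsize=1048576 hard)
    printf 'mount -t nfs -o %s %s:%s %s\n' "$opts" "$server" "$export_path" "$target"
}

# /mnt/rdma-test is an assumed scratch directory on the node.
rdma_mount_cmd 10.100.0.1 /export/ai-data /mnt/rdma-test
```

If the printed command succeeds as root on an RDMA-capable node, the same options should carry over to the PV's mountOptions list.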
Static PV with RDMA Mount Options
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsordma-data
  labels:
    storage-type: rdma
    purpose: ai-training
spec:
  capacity:
    storage: 10Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.100.0.1  # RDMA interface IP of NFS server
    path: /export/ai-data
  mountOptions:
    - proto=rdma
    - port=20049  # NFSoRDMA port (not standard 2049)
    - vers=4.1
    - rsize=1048576  # 1MB read blocks
    - wsize=1048576  # 1MB write blocks
    - hard
    - timeo=600
    - retrans=2
    - nconnect=8  # 8 parallel RDMA connections
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ai-training-data
  namespace: ai-workloads
spec:
  storageClassName: ""  # Bind only to classless static PVs, even if the cluster has a default StorageClass
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Ti
  selector:
    matchLabels:
      storage-type: rdma
      purpose: ai-training
Multiple PVs for Different Workloads
# Dataset storage: large sequential reads
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsordma-datasets
  labels:
    storage-type: rdma
    purpose: datasets
spec:
  capacity:
    storage: 50Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.100.0.1
    path: /export/datasets
  mountOptions:
    - proto=rdma
    - port=20049
    - vers=4.1
    - rsize=1048576
    - wsize=1048576
    - hard
    - nconnect=16  # More connections for parallel reads
    - noatime  # Skip access time updates
---
# Checkpoint storage: frequent small writes
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsordma-checkpoints
  labels:
    storage-type: rdma
    purpose: checkpoints
spec:
  capacity:
    storage: 5Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.100.0.1
    path: /export/checkpoints
  mountOptions:
    - proto=rdma
    - port=20049
    - vers=4.1
    - rsize=1048576
    - wsize=1048576
    - hard
    - sync  # Sync writes for checkpoint integrity
    - nconnect=4
---
# Model output storage
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsordma-models
  labels:
    storage-type: rdma
    purpose: models
spec:
  capacity:
    storage: 2Ti
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.100.0.1
    path: /export/models
  mountOptions:
    - proto=rdma
    - port=20049
    - vers=4.1
    - rsize=1048576
    - wsize=1048576
    - hard
    - nconnect=8
PVCs for AI Training Pods
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datasets
  namespace: ai-workloads
spec:
  storageClassName: ""  # Bind only to classless static PVs, even if the cluster has a default StorageClass
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Ti
  selector:
    matchLabels:
      storage-type: rdma
      purpose: datasets
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: checkpoints
  namespace: ai-workloads
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Ti
  selector:
    matchLabels:
      storage-type: rdma
      purpose: checkpoints
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models
  namespace: ai-workloads
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Ti
  selector:
    matchLabels:
      storage-type: rdma
      purpose: models
Pod Using NFSoRDMA Volumes
apiVersion: v1
kind: Pod
metadata:
  name: training-job
  namespace: ai-workloads
spec:
  nodeSelector:
    feature.node.kubernetes.io/rdma: "true"
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.03-py3
      command: ["torchrun", "--nproc_per_node=8", "train.py"]
      resources:
        limits:
          nvidia.com/gpu: 8
      volumeMounts:
        - name: datasets
          mountPath: /data
          readOnly: true
        - name: checkpoints
          mountPath: /checkpoints
        - name: models
          mountPath: /models
  volumes:
    - name: datasets
      persistentVolumeClaim:
        claimName: datasets
    - name: checkpoints
      persistentVolumeClaim:
        claimName: checkpoints
    - name: models
      persistentVolumeClaim:
        claimName: models
Dynamic Provisioning with NFS CSI Driver
# Install NFS CSI driver
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system \
  --set driver.mountPermissions=0777

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfsordma-dynamic
provisioner: nfs.csi.k8s.io
parameters:
  server: 10.100.0.1
  share: /export/dynamic
mountOptions:
  - proto=rdma
  - port=20049
  - vers=4.1
  - rsize=1048576
  - wsize=1048576
  - hard
  - nconnect=8
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
---
# Dynamic PVC: CSI driver creates subdirectory automatically
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-rdma-storage
  namespace: ai-workloads
spec:
  storageClassName: nfsordma-dynamic
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 500Gi
Verify RDMA Mount
# Check mount is using RDMA transport
kubectl exec -n ai-workloads training-job -- mount | grep nfs
# 10.100.0.1:/export/datasets on /data type nfs4 (ro,proto=rdma,port=20049,...)
# Verify RDMA connections
kubectl exec -n ai-workloads training-job -- cat /proc/mounts | grep rdma
# Performance test from inside the pod
kubectl exec -n ai-workloads training-job -- \
  dd if=/data/test-file of=/dev/null bs=1M count=4096 iflag=direct
# Expected: 8-12 GB/s with RDMA + jumbo frames
# Check NFS stats
kubectl exec -n ai-workloads training-job -- nfsstat -c

graph TD
A[PersistentVolume] -->|proto=rdma port=20049| B[NFS Server RDMA IP]
C[PVC datasets] --> A
D[PVC checkpoints] --> E[PV checkpoints]
E -->|proto=rdma| B
F[Training Pod] -->|/data| C
F -->|/checkpoints| D
F -->|/models| G[PVC models]
H[StorageClass nfsordma-dynamic] -->|CSI Driver| I[Dynamic PV]
I -->|proto=rdma| B
J[Node Selector rdma=true] -->|Ensures| K[Pod on RDMA-capable node]
Common Issues
- Mount fails with proto=rdma: the node must have the RDMA kernel modules loaded (rdma_cm, ib_core) along with the NFS RDMA transport module (rpcrdma); verify with lsmod | grep rdma
- PVC stuck Pending: check that the label selectors match between PV and PVC; verify the PV is Available, not Released; if the cluster has a default StorageClass, set storageClassName: "" on the PVC so it can bind to a static PV
- Slow performance despite RDMA: check the nconnect value; verify jumbo frames (MTU 9000) end-to-end; check noatime for read-heavy workloads
- Permission denied on mount: NFS server exports must allow the node IPs; check /etc/exports on the NFS server
- Pod scheduled on non-RDMA node: add a nodeSelector with feature.node.kubernetes.io/rdma: "true" to ensure RDMA-capable placement
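The module check in the first item can be scripted so that it reports exactly which modules are absent. This is a minimal sketch, assuming the modules are built as loadable modules (built-in drivers do not appear in lsmod). The function name missing_rdma_modules is illustrative, and rpcrdma is included on the assumption that the node uses the in-kernel NFS RDMA client transport.

```shell
#!/bin/sh
# Read lsmod-style text on stdin and print each required module that is absent.
# Prints nothing when every module is present.
missing_rdma_modules() {
    # First column of lsmod output is the module name; the header line is harmless.
    loaded=$(awk '{ print $1 }')
    for mod in rdma_cm ib_core rpcrdma; do
        # -x: match the whole line, -F: fixed string, -q: no output
        printf '%s\n' "$loaded" | grep -qxF "$mod" || printf '%s\n' "$mod"
    done
}

# On a node: lsmod | missing_rdma_modules
```

An empty result means all three modules are loaded; any printed name is a candidate for modprobe on that node.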
Best Practices
- Use proto=rdma,port=20049: 20049 is the NFSoRDMA port, while the standard NFS port 2049 uses TCP
- Set nconnect=8 minimum, nconnect=16 for dataset reads with many workers
- Always use hard mounts: soft mounts can cause silent data corruption
- Separate PVs by purpose (datasets, checkpoints, models) for different mount options
- Use noatime for read-heavy workloads to reduce metadata overhead
- Use sync for checkpoint PVs to ensure write durability
- Label PVs with storage-type: rdma for selector-based PVC binding
- Add nodeSelector to pods to ensure scheduling on RDMA-capable nodes
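Several of these practices can be checked mechanically against the option string a mount actually ended up with. The sketch below is one possible lint, not an official tool: the function name check_rdma_opts is hypothetical, and it only checks for proto=rdma, port=20049, hard, and the absence of soft.

```shell
#!/bin/sh
# Lint a comma-separated NFS mount option string.
# Prints one line per problem, or "ok" when none are found.
# Returns 0 when clean and 1 otherwise.
check_rdma_opts() {
    opts=",$1,"   # pad with commas so each option is matched as a whole token
    status=0
    for want in proto=rdma port=20049 hard; do
        case $opts in
            *",$want,"*) ;;
            *) echo "missing $want"; status=1 ;;
        esac
    done
    case $opts in
        *",soft,"*) echo "soft mount in use"; status=1 ;;
    esac
    [ "$status" -eq 0 ] && echo ok
    return "$status"
}

# Inside a pod, the fourth field of /proc/mounts holds the options, e.g.:
#   check_rdma_opts "$(awk '$2 == "/data" { print $4 }' /proc/mounts)"
```

Matching whole tokens matters here: a naive substring test would accept port=2049 as if it were a prefix of port=20049, or miss the difference entirely.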
Key Takeaways
- NFSoRDMA PVs use proto=rdma,port=20049 mount options for RDMA transport
- ReadWriteMany access enables shared storage across multi-node training jobs
- Static PVs for dedicated shares, NFS CSI driver for dynamic provisioning
- nconnect multiplies parallelism, rsize/wsize=1048576 optimizes block size
- Always pair with jumbo frames (MTU 9000) and RDMA node selectors
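The jumbo-frame pairing in the last item can be spot-checked per node. This is a rough sketch that parses one line of ip -o link show output; the function names link_mtu and is_jumbo are illustrative, and the interface name in the usage comment is an assumption that will differ per cluster.

```shell
#!/bin/sh
# Extract the MTU value from `ip -o link show` output on stdin.
link_mtu() {
    # Find the "mtu" keyword and print the field that follows it.
    awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}

# Succeed only when the reported MTU is at least 9000.
is_jumbo() {
    mtu=$(link_mtu)
    [ -n "$mtu" ] && [ "$mtu" -ge 9000 ]
}

# On a node (ens1f0 is an assumed interface name):
#   ip -o link show dev ens1f0 | is_jumbo && echo "jumbo frames enabled"
```

This only covers the node's own interface; switches and the NFS server need the same MTU for the setting to hold end-to-end.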

