Networking · advanced · 15 minutes · K8s 1.28+

NFSoRDMA Jumbo Frames MTU Configuration

Configure 9000 MTU jumbo frames for NFSoRDMA interfaces using NNCP to maximize RDMA throughput on Kubernetes nodes.

By Luca Berton • 6 min read

💡 Quick Answer: Set MTU 9000 on the RDMA-capable NIC via NodeNetworkConfigurationPolicy (NNCP), verify end-to-end jumbo frame support on the switch fabric, and configure the NFS mount with rsize=1048576,wsize=1048576 to fully utilize the larger frame size.

The Problem

The default 1500-byte MTU forces RDMA traffic into many small packets. NFSoRDMA transfers large sequential blocks (1MB+), and at a 1500-byte MTU each 1MB block is split into roughly 700 to 1,000 frames, depending on the transport's per-packet payload. The extra per-packet processing wastes CPU and NIC cycles and reduces throughput by 30-40% compared to jumbo frames.
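To put numbers on the packet count, here is a small arithmetic sketch. It assumes RoCE v2 over IPv4 (the article names the NIC family but not the transport), where the RDMA path MTU is the largest of 256, 512, 1024, 2048, or 4096 bytes that fits inside the Ethernet MTU after headers. The function names and the 44-byte overhead figure (IPv4 20 + UDP 8 + BTH 12 + ICRC 4) are illustrative assumptions, not values taken from the recipe.

```shell
#!/usr/bin/env bash
# Largest RDMA path MTU (256..4096) that fits inside an Ethernet MTU.
# Assumption: RoCE v2 over IPv4, 44 bytes of IP/UDP/BTH/ICRC overhead per packet.
roce_path_mtu() {
  local eth_mtu=$1 overhead=44 best=0 size
  for size in 256 512 1024 2048 4096; do
    if (( size + overhead <= eth_mtu )); then
      best=$size
    fi
  done
  echo "$best"
}

# Packets needed to move one block of the given size (ceiling division).
packets_per_block() {
  local block_bytes=$1 payload=$2
  echo $(( (block_bytes + payload - 1) / payload ))
}

for mtu in 1500 9000; do
  pmtu=$(roce_path_mtu "$mtu")
  echo "Ethernet MTU $mtu -> RDMA path MTU $pmtu ->" \
       "$(packets_per_block 1048576 "$pmtu") packets per 1 MiB block"
done
```

Under these assumptions a 1 MiB transfer needs 1024 packets at MTU 1500 and 256 packets at MTU 9000, a fourfold reduction in per-packet work.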

The Solution

Enable 9000 MTU (jumbo frames) end-to-end: switch ports, host NICs, and NFS mount options. Every hop in the path must support the same MTU; a single 1500 MTU link breaks the entire chain.

NNCP for Jumbo Frame NIC

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: rdma-nic-mtu9000
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
    feature.node.kubernetes.io/rdma: "true"
  desiredState:
    interfaces:
      - name: ens4f0  # RDMA-capable NIC
        type: ethernet
        state: up
        mtu: 9000
        ipv4:
          enabled: true
          address:
            # Template placeholder: render a per-node value before applying (NNCP does not expand it)
            - ip: "{{ ansible_host_rdma_ip }}"
              prefix-length: 24
          dhcp: false
        ethtool:
          feature:
            rx-gro: true
            tx-gso: true
            rx-checksum: true
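Once the policy is written, it still has to be applied and checked. The following is a minimal sketch of that step, assuming the kubernetes-nmstate short names nncp and nnce are available in the cluster. The function name, the 5-minute timeout, and the example file name are hypothetical.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with "oc"); defaults to kubectl.
KUBECTL=${KUBECTL:-kubectl}

# Apply a policy file, wait for it to report Available, then list per-node results.
apply_nncp() {
  local file=$1 policy=$2
  "$KUBECTL" apply -f "$file" || return 1
  "$KUBECTL" wait "nncp/$policy" --for=condition=Available --timeout=5m || return 1
  # nnce = NodeNetworkConfigurationEnactment, one per node and policy
  "$KUBECTL" get nnce
}

# Example (hypothetical file name):
#   apply_nncp rdma-nic-mtu9000.yaml rdma-nic-mtu9000
```

The enactment list shows, node by node, whether the desired state was accepted, which is usually the quickest way to spot a single node that rejected the MTU change.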

Bonded RDMA Interface with Jumbo Frames

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: rdma-bond-mtu9000
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
    feature.node.kubernetes.io/rdma: "true"
  desiredState:
    interfaces:
      - name: bond-rdma
        type: bond
        state: up
        mtu: 9000
        ipv4:
          enabled: true
          address:
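            # Example address: a selector that matches several nodes would give each the same IP, so use per-node policies
            - ip: 10.100.0.11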
              prefix-length: 24
          dhcp: false
        link-aggregation:
          mode: 802.3ad
          options:
            miimon: "100"
            xmit_hash_policy: layer3+4
          port:
            - ens4f0
            - ens4f1
      # Member interfaces must also have MTU 9000
      - name: ens4f0
        type: ethernet
        state: up
        mtu: 9000
      - name: ens4f1
        type: ethernet
        state: up
        mtu: 9000
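The requirement that member interfaces carry at least the bond's MTU can be checked with a few lines of shell. This is a sketch only; the function name is hypothetical and the sysfs paths in the usage comment assume the interface names from the policy above.

```shell
#!/usr/bin/env bash
# Succeeds only if every member MTU is at least the bond MTU.
# Usage: bond_mtu_ok <bond_mtu> <member_mtu>...
bond_mtu_ok() {
  local bond_mtu=$1 m
  shift
  for m in "$@"; do
    if (( m < bond_mtu )); then
      echo "member MTU $m is below bond MTU $bond_mtu" >&2
      return 1
    fi
  done
  return 0
}

# On a node, the values can be read from sysfs, for example:
#   bond_mtu_ok "$(cat /sys/class/net/bond-rdma/mtu)" \
#               "$(cat /sys/class/net/ens4f0/mtu)" "$(cat /sys/class/net/ens4f1/mtu)"
```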

Per-Node Static IP with Jumbo Frames

# Worker node 1
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: rdma-mtu9000-worker1
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-1
  desiredState:
    interfaces:
      - name: ens4f0
        type: ethernet
        state: up
        mtu: 9000
        ipv4:
          enabled: true
          address:
            - ip: 10.100.0.11
              prefix-length: 24
          dhcp: false
    routes:
      config:
        - destination: 10.100.0.0/24
          next-hop-interface: ens4f0
---
# Worker node 2
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: rdma-mtu9000-worker2
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-2
  desiredState:
    interfaces:
      - name: ens4f0
        type: ethernet
        state: up
        mtu: 9000
        ipv4:
          enabled: true
          address:
            - ip: 10.100.0.12
              prefix-length: 24
          dhcp: false
    routes:
      config:
        - destination: 10.100.0.0/24
          next-hop-interface: ens4f0
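Writing one policy per node by hand gets repetitive beyond a couple of workers. One possible approach, sketched here under the assumption that only the hostname, IP, and interface name differ between nodes, is to render the manifest from a shell function. The function name is hypothetical, and the generated policy names (for example rdma-mtu9000-worker-1) differ slightly from the hand-written ones above.

```shell
#!/usr/bin/env bash
# Print an NNCP manifest for one node.
# Usage: render_nncp <hostname> <rdma_ip> [interface]
render_nncp() {
  local node=$1 ip=$2 nic=${3:-ens4f0}
  cat <<EOF
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: rdma-mtu9000-${node}
spec:
  nodeSelector:
    kubernetes.io/hostname: ${node}
  desiredState:
    interfaces:
      - name: ${nic}
        type: ethernet
        state: up
        mtu: 9000
        ipv4:
          enabled: true
          address:
            - ip: ${ip}
              prefix-length: 24
          dhcp: false
EOF
}

# Example: one policy per worker, addresses 10.100.0.11 onward (illustrative).
#   for i in 1 2 3 4; do render_nncp "worker-$i" "10.100.0.$((10 + i))"; echo '---'; done
```

The output can be piped to kubectl apply -f - or written to files and reviewed first.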

NFS Server MTU Configuration

# On the NFS server: set MTU 9000 on the RDMA interface
ip link set dev ens4f0 mtu 9000

# Persistent via /etc/sysconfig/network-scripts/ifcfg-ens4f0
# MTU=9000

# Or via NetworkManager
nmcli con mod rdma-conn 802-3-ethernet.mtu 9000
nmcli con up rdma-conn

# Verify
ip link show ens4f0 | grep mtu
# ens4f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 ...
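The grep-based check above relies on reading the output by eye. For scripted checks, a small parser can extract the MTU and compare it with the expected value. This is a sketch with hypothetical function names; it assumes the usual ip link output layout where the value follows the word mtu.

```shell
#!/usr/bin/env bash
# Read `ip link show <dev>` output on stdin and print the MTU value.
parse_mtu() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}

# Compare against the expected value; non-zero exit on mismatch.
# Usage: ip link show ens4f0 | expect_mtu 9000
expect_mtu() {
  local want=$1 got
  got=$(parse_mtu)
  if [ "$got" != "$want" ]; then
    echo "MTU is ${got:-unknown}, expected $want" >&2
    return 1
  fi
  return 0
}
```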

Switch Configuration (Example: Cisco Nexus)

! Enable jumbo frames on switch ports connected to RDMA NICs
! (some Nexus platforms set jumbo MTU for L2 ports through a network-qos policy instead)
interface Ethernet1/1-4
  description RDMA-NIC-Workers
  switchport mode access
  switchport access vlan 100
  ! Switch MTU slightly larger than 9000 to accommodate headers
  mtu 9216

! Verify
show interface Ethernet1/1 | include MTU
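The reason a 9216-byte switch MTU comfortably covers a 9000-byte host MTU is simple header arithmetic. The sketch below assumes standard Ethernet framing (14-byte header, 4-byte FCS, 4 bytes per 802.1Q tag); the function name is hypothetical.

```shell
#!/usr/bin/env bash
# On-wire Ethernet frame size for a given IP MTU.
# 14 bytes Ethernet header + 4 bytes FCS, plus 4 bytes per 802.1Q VLAN tag.
frame_size() {
  local mtu=$1 vlan_tags=${2:-0}
  echo $(( mtu + 14 + 4 + 4 * vlan_tags ))
}

frame_size 9000 0   # untagged frame: 9018 bytes
frame_size 9000 1   # one VLAN tag:   9022 bytes
```

Both results are below 9216, which leaves headroom for additional tags or encapsulation on trunk links.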

Verify End-to-End MTU

# From worker node β€” test jumbo frame path to NFS server
# -M do = don't fragment, -s 8972 = payload (9000 - 28 bytes IP/ICMP header)
ping -M do -s 8972 -c 3 10.100.0.1

# If this fails, a hop in the path has an MTU below 9000
# Try smaller payloads to narrow down the actual path MTU:
ping -M do -s 4000 -c 1 10.100.0.1
ping -M do -s 6000 -c 1 10.100.0.1
ping -M do -s 8000 -c 1 10.100.0.1

# Verify RDMA interface MTU on each worker
for node in worker-{1..4}; do
  echo "=== ${node} ==="
  ssh ${node} "ip link show ens4f0 | grep mtu"
done

# Verify NFS mount is using RDMA with proper rsize/wsize
mount | grep nfs
# 10.100.0.1:/export on /mnt/nfs type nfs (rw,proto=rdma,port=20049,rsize=1048576,wsize=1048576)
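The manual ping steps above narrow down the path MTU by trial and error. The same idea can be automated with a binary search, sketched here. The function names and the PROBE override are hypothetical; the default probe uses the same ping -M do invocation as above, and the 28 bytes added at the end are the IPv4 and ICMP headers.

```shell
#!/usr/bin/env bash
# Default probe: one non-fragmentable ping with the given ICMP payload size.
ping_probe() {
  local host=$1 payload=$2
  ping -M do -s "$payload" -c 1 -W 1 "$host" >/dev/null 2>&1
}

# Binary-search the largest ICMP payload that passes, then add the
# 28 bytes of IPv4 + ICMP headers to report the path MTU.
# PROBE can be replaced with another command for testing.
find_path_mtu() {
  local host=$1 lo=${2:-1472} hi=${3:-8972} mid
  local probe=${PROBE:-ping_probe}
  if ! "$probe" "$host" "$lo"; then
    echo "even payload $lo fails; check basic connectivity" >&2
    return 1
  fi
  while (( lo < hi )); do
    mid=$(( (lo + hi + 1) / 2 ))
    if "$probe" "$host" "$mid"; then
      lo=$mid
    else
      hi=$(( mid - 1 ))
    fi
  done
  echo $(( lo + 28 ))
}

# Example: find_path_mtu 10.100.0.1
```

A result of 9000 means the full jumbo path works; anything lower is the MTU of the most restrictive hop.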

NFS Mount with Jumbo Frame Optimization

# PV mount options optimized for jumbo frames
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfsordma-jumbo
spec:
  capacity:
    storage: 1Ti
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.100.0.1
    path: /export/data
  mountOptions:
    - proto=rdma
    - port=20049
    - vers=4.1
    - rsize=1048576   # 1MB read size, pairs well with jumbo frames
    - wsize=1048576   # 1MB write size
    - hard
    - timeo=600
    - retrans=2
    - nconnect=8      # Multiple connections for parallelism; confirm your kernel applies nconnect with proto=rdma
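Mount options in a PV are requests, and the effective values can differ after negotiation with the server. A quick way to confirm what the node actually got is to inspect /proc/mounts. The following is a sketch with a hypothetical function name; it assumes the standard /proc/mounts column order (device, mount point, type, options).

```shell
#!/usr/bin/env bash
# Read mount lines on stdin (for example from /proc/mounts) and check that the
# given mount point uses proto=rdma with the expected rsize and wsize.
# Usage: check_nfs_opts <mountpoint> [size] < /proc/mounts
check_nfs_opts() {
  local mnt=$1 size=${2:-1048576} opts want
  opts=$(awk -v m="$mnt" '$2 == m && $3 ~ /^nfs/ { print $4 }')
  if [ -z "$opts" ]; then
    echo "no NFS mount at $mnt" >&2
    return 1
  fi
  for want in proto=rdma "rsize=$size" "wsize=$size"; do
    case ",$opts," in
      *",$want,"*) ;;
      *) echo "missing option: $want" >&2; return 1 ;;
    esac
  done
  return 0
}
```

If the check reports a smaller rsize or wsize than requested, the server's maximum transfer size is the likely limit.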

Performance Comparison Script

#!/bin/bash
# Compare throughput: run once at MTU 1500 and once at MTU 9000, then compare the results
echo "=== Testing NFSoRDMA throughput ==="

# Sequential write test
echo "Sequential write (1MB blocks):"
dd if=/dev/zero of=/mnt/nfs/test-jumbo bs=1M count=4096 oflag=direct 2>&1 | tail -1

# Sequential read test
echo "Sequential read (1MB blocks):"
dd if=/mnt/nfs/test-jumbo of=/dev/null bs=1M iflag=direct 2>&1 | tail -1

# Cleanup
rm -f /mnt/nfs/test-jumbo

# Expected improvement with jumbo frames:
# 1500 MTU: ~3-4 GB/s (ConnectX-6 100GbE)
# 9000 MTU: ~5-6 GB/s (30-50% improvement)
# Reason: fewer packets, less CPU overhead, better RDMA efficiency

Path diagram (Mermaid source):

graph LR
    A[Worker Node] -->|MTU 9000| B[ConnectX NIC ens4f0]
    B -->|MTU 9000 RDMA| C[Switch Port MTU 9216]
    C -->|MTU 9000| D[NFS Server NIC]
    D -->|MTU 9000| E[NFS Server]
    
    F[1500 MTU anywhere?] -->|Breaks path| G[Fragmentation and drop]
    
    H[NNCP] -->|Configures| B
    I[rsize and wsize 1MB] -->|Optimized for| J[Jumbo frames]
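To turn two dd results from the comparison script into a percentage, a one-line helper is enough. This is a sketch; the function name is hypothetical and the sample inputs are round numbers taken from the ranges quoted in the script comments, not measurements.

```shell
#!/usr/bin/env bash
# Percentage gain of a new throughput figure over a baseline, rounded to a whole number.
# Usage: pct_gain <baseline_GBps> <new_GBps>
pct_gain() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (b - a) / a * 100 }'
}

pct_gain 4 5   # prints 25
pct_gain 4 6   # prints 50
```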

Common Issues

  • ping -M do -s 8972 fails: a hop in the path has MTU < 9000; check switch port MTU and any intermediate routers
  • NNCP stuck in Progressing: the NMState operator may time out if the NIC is in use; drain the node first with kubectl drain
  • Bond member MTU mismatch: all bond member interfaces must have MTU ≥ bond MTU; set member MTU before bond MTU
  • NFS mount falls back to TCP: the proto=rdma mount option requires the NFS server to listen on port 20049; verify with rpcinfo -p | grep 20049 or, on the server, cat /proc/fs/nfsd/portlist
  • Performance not improving: verify MTU end-to-end including the NFS server; check ethtool -S ens4f0 | grep drop for dropped frames
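For the dropped-frame check in the last item, grepping by eye works for a single node, but a total is easier to compare before and after an MTU change. The sketch below sums every counter whose name contains "drop" or "discard". Counter names vary by NIC driver, so the match pattern and the function name are assumptions.

```shell
#!/usr/bin/env bash
# Read `ethtool -S <dev>` output on stdin and print the total of all counters
# whose name contains "drop" or "discard". Counter names vary by driver.
sum_drops() {
  awk -F: 'tolower($1) ~ /drop|discard/ { total += $2 } END { print total + 0 }'
}

# Usage: ethtool -S ens4f0 | sum_drops
```

Recording the total before the change and again after a test run shows whether drops are growing under jumbo-frame load.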

Best Practices

  • Always verify MTU end-to-end before deploying workloads: node → switch → NFS server
  • Set switch MTU to 9216 (slightly larger) to accommodate L2 headers
  • Use nconnect=8 with jumbo frames for maximum parallelism
  • Set rsize=1048576,wsize=1048576: 1MB blocks align with jumbo frame efficiency
  • Configure MTU via NNCP (not manually) for persistence across reboots
  • Monitor ethtool -S counters for frame drops and errors after MTU change
  • Drain nodes before applying NNCP MTU changes to avoid disrupting running pods
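The drain-then-apply practice can be wrapped in one function so that the node is always returned to service. This is a sketch that fits the per-node policy layout (one policy per hostname); the function name and timeout are hypothetical, and the drain flags shown are common choices rather than requirements.

```shell
#!/usr/bin/env bash
KUBECTL=${KUBECTL:-kubectl}

# Drain a node, apply the policy file, wait for the policy, then uncordon.
# The node is uncordoned even if the apply or wait step fails.
mtu_change_on_node() {
  local node=$1 file=$2 policy=$3 rc=0
  "$KUBECTL" drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  if ! "$KUBECTL" apply -f "$file"; then
    rc=1
  elif ! "$KUBECTL" wait "nncp/$policy" --for=condition=Available --timeout=5m; then
    rc=1
  fi
  "$KUBECTL" uncordon "$node"
  return "$rc"
}

# Example (hypothetical file name):
#   mtu_change_on_node worker-1 rdma-mtu9000-worker1.yaml rdma-mtu9000-worker1
```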

Key Takeaways

  • Jumbo frames (MTU 9000) can improve NFSoRDMA throughput by an estimated 30-50%
  • Every link in the path must support the same MTU β€” one 1500 hop breaks it
  • NNCP manages MTU persistently on Kubernetes worker nodes
  • ping -M do -s 8972 is the definitive test for end-to-end jumbo support
  • Combine with rsize/wsize=1048576 and nconnect=8 for maximum performance
Written by Luca Berton

Principal Solutions Architect specializing in Kubernetes, AI/GPU infrastructure, and cloud-native platforms. Author of Kubernetes Recipes and creator of CopyPasteLearn courses.

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
