Tune NCCL Env Variables for RDMA & Ethernet
Apply safe NCCL environment variable profiles for RDMA-capable and Ethernet-only GPU clusters to maximize collective communication throughput.
💡 Quick Answer: Start with NCCL_DEBUG=INFO, set NCCL_SOCKET_IFNAME to the correct data interface, and enable or disable InfiniBand explicitly using NCCL_IB_DISABLE.
Use explicit NCCL environment configuration to reduce transport ambiguity and improve repeatability.
RDMA-Oriented Profile
NCCL_DEBUG=INFO
NCCL_IB_DISABLE=0
NCCL_SOCKET_IFNAME=eth0
Ethernet-Only Profile
NCCL_DEBUG=INFO
NCCL_IB_DISABLE=1
NCCL_SOCKET_IFNAME=eth0
Validation Loop
- Apply one profile.
- Run all_reduce_perf and keep logs.
- Compare bandwidth and error rates.
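The loop above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not a definitive harness: the function names `apply_nccl_profile` and `run_profile`, the profile names `rdma` and `ethernet`, and the log file naming are hypothetical. It also assumes `eth0` is the data interface (as in the profiles above) and that `all_reduce_perf` from nccl-tests is on `PATH`.

```shell
# Hypothetical helper: export one of the two profiles shown above.
apply_nccl_profile() {
  case "$1" in
    rdma)     export NCCL_IB_DISABLE=0 ;;
    ethernet) export NCCL_IB_DISABLE=1 ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
  export NCCL_DEBUG=INFO
  export NCCL_SOCKET_IFNAME=eth0   # assumption: eth0 is the data interface
}

# Apply one profile, run the benchmark, and keep the log for later comparison.
run_profile() {
  apply_nccl_profile "$1" || return 1
  # Assumption: all_reduce_perf (nccl-tests) is on PATH; sizes sweep 8 B to 1 GiB.
  all_reduce_perf -b 8 -e 1G -f 2 -g 1 2>&1 | tee "nccl-$1.log"
}
```

Running `run_profile rdma` and then `run_profile ethernet` would leave two logs whose bandwidth columns and error messages can be compared side by side.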
Best Practices
- Change one variable at a time when troubleshooting.
- Keep per-cluster baseline profiles under version control.
- Re-test after CNI, firmware, or driver upgrades.
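One way to keep a baseline under version control and spot drift is to dump the active NCCL_* variables and diff them against a committed file. The sketch below is illustrative only: `dump_nccl_env` and `check_against_baseline` are hypothetical names, and the baseline file is assumed to be a sorted list of `NAME=value` lines.

```shell
# Print the NCCL_* variables currently exported, sorted, one per line.
dump_nccl_env() {
  env | grep '^NCCL_' | sort
}

# Compare the current NCCL settings against a baseline file.
# Prints the differing lines; returns non-zero if anything differs.
# Assumption: the baseline file was produced by dump_nccl_env (sorted).
check_against_baseline() {
  dump_nccl_env | diff "$1" -
}
```

A baseline could be captured once with `dump_nccl_env > cluster-a.env` (hypothetical file name), committed, and checked with `check_against_baseline cluster-a.env` before each troubleshooting run so that only one variable differs at a time.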
NCCL Environment Tuning for RDMA and Ethernet
Configure NCCL environment variables to optimize GPU communication over RDMA (InfiniBand/RoCE) and Ethernet networks.
RDMA Configuration
apiVersion: v1
kind: Pod
metadata:
  name: nccl-rdma-training
spec:
  containers:
  - name: training
    image: nvcr.io/nvidia/pytorch:25.11-py3
    env:
    # Allow the IB/RoCE verbs transport (0 permits it; it does not disable the socket fallback)
    - name: NCCL_IB_DISABLE
      value: "0"
    # Select the IB HCAs to use
    - name: NCCL_IB_HCA
      value: "mlx5_0,mlx5_1"
    # Allow GPUDirect RDMA at any topological distance between GPU and NIC
    - name: NCCL_NET_GDR_LEVEL
      value: "SYS"
    # Tune buffer size for large messages (8 MiB)
    - name: NCCL_BUFFSIZE
      value: "8388608"
    # Use adaptive routing if the switch supports it
    - name: NCCL_IB_ADAPTIVE_ROUTING
      value: "1"
    resources:
      limits:
        nvidia.com/gpu: 8
        rdma/rdma_shared_device_a: 1
Ethernet/RoCE Configuration
env:
# Keep the IB verbs transport enabled; RoCE uses it over Ethernet
- name: NCCL_IB_DISABLE
  value: "0"
# Select the interface for TCP socket traffic (bootstrap and fallback)
- name: NCCL_SOCKET_IFNAME
  value: "eth0"
# RoCE GID index (often 3 for RoCEv2; verify on your NICs)
- name: NCCL_IB_GID_INDEX
  value: "3"
# Traffic class for RoCE packets (106 = DSCP 26 with ECN-capable bits set); must match the switch QoS/ECN configuration
- name: NCCL_IB_TC
  value: "106"
Debugging NCCL Transport Selection
# Enable detailed NCCL logging
export NCCL_DEBUG=INFO
export NCCL_DEBUG_SUBSYS=NET,INIT
# Verify RDMA is being used (look for "NET/IB" not "NET/Socket")
# Good: [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE
# Bad: [0] NCCL INFO NET/Socket : Using [0]eth0
Performance Comparison
| Transport | Bandwidth | Latency | Use Case |
|---|---|---|---|
| NVLink | 900 GB/s | <1 μs | Intra-node GPU-GPU |
| IB HDR | 200 Gb/s | 1-2 μs | Inter-node RDMA |
| RoCE v2 | 100 Gb/s | 2-5 μs | Inter-node Ethernet |
| TCP | 25-100 Gb/s | 10-50 μs | Fallback only |
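To confirm which row of the table a run actually falls under, the NET/IB versus NET/Socket log check can be scripted. The function below is a rough sketch: `nccl_transport` is a hypothetical name, and it assumes the log was captured with NCCL_DEBUG=INFO and contains "NET/IB : Using" or "NET/Socket : Using" lines like the ones quoted earlier. Log wording can differ between NCCL versions, so the patterns may need adjusting.

```shell
# Classify the inter-node transport recorded in an NCCL_DEBUG=INFO log.
# Prints "ib" if a "NET/IB : Using" line is present, "socket" if only a
# "NET/Socket : Using" line is present, and "unknown" otherwise.
# Matching on "Using" avoids counting failure lines that merely mention NET/IB.
nccl_transport() {
  if grep -q 'NET/IB *: *Using' "$1"; then
    echo ib
  elif grep -q 'NET/Socket *: *Using' "$1"; then
    echo socket
  else
    echo unknown
  fi
}
```

For example, `nccl_transport training.log` (hypothetical file name) printing `socket` on a cluster that should be using RDMA suggests the job fell back to TCP and the IB-related variables or device mounts need another look.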

