Diagnose NVIDIA Memory-Only Kernel Modules on OpenShift
Understand why lsmod shows NVIDIA modules loaded but modinfo fails, and how the GPU Operator's proprietary driver container inserts modules without installing .ko files on disk.
Quick Answer:
lsmod reads /proc/modules (in-memory state), while modinfo searches for .ko files on disk. Proprietary NVIDIA driver containers use insmod to load modules directly into memory without installing .ko files under /lib/modules/, causing modinfo to fail.
A confusing situation arises on OpenShift when lsmod shows NVIDIA modules loaded but modinfo cannot find them.
The Symptom
lsmod | grep nvidia_fs
# nvidia_fs 323584 0
modinfo nvidia_fs
# modinfo: ERROR: Module nvidia_fs not found.

The module is loaded and functioning, yet modinfo reports it does not exist.
Why This Happens
Two tools, two data sources:
| Tool | Data Source | What It Shows |
|---|---|---|
| lsmod | /proc/modules | Kernel memory: what is currently loaded |
| modinfo | /lib/modules/$(uname -r)/ | Disk: where .ko files are stored |
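To make the first row concrete, here is a minimal sketch that rebuilds lsmod's view straight from /proc/modules. The function names and the optional file argument are hypothetical; the argument exists only so the functions can be pointed at a saved copy of the file.

```shell
#!/usr/bin/env bash
# Sketch: reproduce lsmod's view directly from the kernel's in-memory list.
# The optional file argument is an illustrative override, not part of any tool.

# Print "name size refcount" per loaded module (lsmod's first three columns).
loaded_modules() {
    local modules_file="${1:-/proc/modules}"
    # /proc/modules fields: name size refcount dependents state address
    awk '{ print $1, $2, $3 }' "$modules_file"
}

# Succeed if the named module appears in the in-memory list.
is_loaded() {
    local name="$1" modules_file="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}
```

Neither function touches /lib/modules/, which is why a module can pass is_loaded while modinfo still reports it as missing.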
The NVIDIA GPU Operator's proprietary driver flow works like this:
- Extracts the .run installer inside the container
- Runs with --no-kernel-modules flag (skips on-disk installation)
- Uses insmod to directly insert .ko files from the container filesystem into the host kernel
- Does not copy .ko files to /lib/modules/ on the host
This leaves the kernel with loaded modules that have no backing file on the host disk.
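The gap comes from how the tools locate a module: insmod takes an explicit file path, while modprobe and modinfo resolve a module name through the indexes that depmod generates under /lib/modules/$(uname -r)/. The sketch below approximates that lookup against the text index modules.dep (kmod itself reads the binary modules.dep.bin written alongside it). The function name and the optional directory argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check whether a module name resolves through modules.dep.
# This approximates the name lookup; it is not how kmod is implemented.

in_modules_dep() {
    local name="$1" moddir="${2:-/lib/modules/$(uname -r)}"
    local pattern
    # Module names treat "-" and "_" as equivalent: nvidia_fs -> nvidia[-_]fs
    pattern=$(printf '%s' "$name" | sed 's/[-_]/[-_]/g')
    # Entry format: "path/to/name.ko[.xz|.gz|.zst]: dep1 dep2 ..."
    grep -Eq "(^|/)${pattern}\.ko(\.(xz|gz|zst))?:" "$moddir/modules.dep" 2>/dev/null
}
```

On an affected node, in_modules_dep nvidia_fs would be expected to fail even though the module is loaded, because nothing was installed for depmod to index.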
How to Confirm
oc debug node/<node-name>
chroot /host
# Check for .ko files on disk
find /lib/modules/$(uname -r) -name "nvidia*.ko" -o -name "nvidia*.ko.xz"
# Compare with loaded modules
lsmod | grep nvidia

If find returns fewer files than lsmod shows modules, those missing ones are memory-only.
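The manual comparison can be automated. The following is a rough sketch, meant to run inside the chroot /host shell, that prints each loaded module with a given name prefix that has no matching .ko file on disk. The function name and the two optional path arguments are hypothetical and only make the check easy to try against sample data.

```shell
#!/usr/bin/env bash
# Sketch: list loaded modules that have no backing .ko file on disk.
# Usage: memory_only_modules PREFIX [MODULES_FILE] [MODULE_DIR]

memory_only_modules() {
    local prefix="$1"
    local modules_file="${2:-/proc/modules}"
    local moddir="${3:-/lib/modules/$(uname -r)}"
    local name pattern
    # Take the names of loaded modules that start with the prefix.
    awk -v p="$prefix" 'index($1, p) == 1 { print $1 }' "$modules_file" |
    while read -r name; do
        # File names may use "-" where the module name uses "_".
        pattern=$(printf '%s' "$name" | sed 's/[-_]/[-_]/g')
        # Accept plain and compressed files (.ko, .ko.xz, .ko.gz, ...).
        if ! find "$moddir" -type f \
                \( -name "${pattern}.ko" -o -name "${pattern}.ko.*" \) \
                2>/dev/null | grep -q .; then
            printf '%s\n' "$name"
        fi
    done
}
```

Running memory_only_modules nvidia with the defaults prints nothing when every loaded NVIDIA module has a file under /lib/modules/$(uname -r)/.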
Impact
Memory-only modules cause problems when:
- GDS tries to load its own nvidia_fs.ko → insmod: File exists
- Module updates fail because there is nothing to replace on disk
- Debugging cannot inspect module metadata or version info
- depmod cannot track module dependencies
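When modinfo has no file to read, part of the metadata is still available from the kernel itself: each loaded module gets a directory under /sys/module/, and attributes such as version and srcversion appear there when the module declares them. A small sketch follows; the function name and the optional sysfs root argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the metadata the kernel exposes for a loaded module via sysfs.
# Attributes the module does not declare are skipped.

module_sysfs_info() {
    local name="$1" sysfs_root="${2:-/sys/module}"
    local dir="$sysfs_root/$name" attr
    [ -d "$dir" ] || { echo "no sysfs entry for module $name" >&2; return 1; }
    for attr in version srcversion refcnt; do
        if [ -r "$dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
        fi
    done
}
```

For example, module_sysfs_info nvidia_fs can help confirm which build is actually resident in memory. It is not a substitute for modinfo, which also reports fields such as parameters and dependencies.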
Resolution
Switch to the Open Kernel Module, which properly installs .ko files on disk:
oc edit clusterpolicy gpu-cluster-policy

spec:
  driver:
    kernelModuleType: open

After switching, reboot nodes and restart driver pods. Then verify:
# Both commands should succeed
lsmod | grep nvidia_fs
modinfo nvidia_fs
# .ko file exists on disk
ls -la /lib/modules/$(uname -r)/extra/nvidia*.ko

Why This Matters
Memory-only modules create invisible version mismatches and block GDS initialization. Switching to the open kernel module provides full on-disk module management, proper modinfo output, and compatibility with all GPU Operator features.

