Configuration intermediate ⏱ 15 minutes K8s 1.28+

Separate Worker and Infra MachineConfigPools

Create dedicated MachineConfigPools for infrastructure and GPU nodes. Isolate MCP rollout blast radius and control update order for different node types.

By Luca Berton • 📖 5 min read

💡 Quick Answer: Create separate MCPs by labeling nodes with custom roles and creating MachineConfigPool resources that select those labels. This lets you update infra nodes independently from GPU workers, with different maxUnavailable settings and pause controls.

The Problem

By default, all worker nodes share a single MCP. When you apply a MachineConfig to that pool, ALL workers update, including GPU nodes running expensive training jobs, ingress nodes handling production traffic, and storage nodes. You need to isolate these groups to control blast radius and update order.

The Solution

Step 1: Label Nodes

# Label infrastructure nodes
oc label node infra-1 node-role.kubernetes.io/infra=""
oc label node infra-2 node-role.kubernetes.io/infra=""

# Label GPU nodes
oc label node gpu-worker-1 node-role.kubernetes.io/gpu=""
oc label node gpu-worker-2 node-role.kubernetes.io/gpu=""
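
For more than a handful of nodes, the labeling step can be wrapped in a small helper. This is a minimal sketch: label_role is a hypothetical function name (not an oc subcommand), and the node names in the example are the ones assumed in Step 1.

```shell
# label_role ROLE NODE... (hypothetical helper, not part of oc)
# Applies the empty-valued label node-role.kubernetes.io/ROLE to each node.
label_role() {
  role="$1"
  shift
  for node in "$@"; do
    # --overwrite makes the call safe to repeat on an already-labeled node
    oc label node "$node" "node-role.kubernetes.io/${role}=" --overwrite
  done
}

# Example (assumed node names from Step 1):
#   label_role infra infra-1 infra-2
#   label_role gpu gpu-worker-1 gpu-worker-2
```

Keeping the role as a parameter means the same helper covers any additional custom pool you introduce later.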

Step 2: Create Custom MCPs

---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, infra]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""
  maxUnavailable: 1
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: gpu
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, gpu]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/gpu: ""
  maxUnavailable: 1
  paused: true   # GPU nodes update manually only
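
After applying the pools, it helps to confirm that each one picked up the nodes you labeled. The sketch below compares a pool's status.machineCount with the number you expect; pool_has_machines is a hypothetical helper name, and the expected counts are whatever matches your own cluster.

```shell
# pool_has_machines POOL EXPECTED (hypothetical helper)
# Succeeds when the pool's status.machineCount equals EXPECTED.
pool_has_machines() {
  pool="$1"
  expected="$2"
  actual="$(oc get mcp "$pool" -o jsonpath='{.status.machineCount}')"
  if [ "$actual" = "$expected" ]; then
    echo "${pool}: ${actual} machines, as expected"
  else
    echo "${pool}: expected ${expected} machines, found ${actual:-0}" >&2
    return 1
  fi
}

# Example (counts assumed from Step 1):
#   pool_has_machines infra 2
#   pool_has_machines gpu 2
```

A count of zero usually means the nodeSelector label and the node label do not match exactly.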

Step 3: Apply Role-Specific MachineConfigs

# Config for GPU nodes only
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-gpu-nvidia-settings
  labels:
    machineconfiguration.openshift.io/role: gpu  # Only targets GPU MCP
spec:
  config:
    ignition:
      version: 3.2.0
    # GPU-specific kernel params, NVIDIA driver settings, etc.

Step 4: Controlled Update Order

# 1. Update general workers first
oc get mcp worker -w
# Wait for UPDATED=True

# 2. Update infra nodes
oc get mcp infra -w
# Wait for UPDATED=True

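# 3. Update GPU nodes last (unpause)
oc patch mcp gpu --type merge -p '{"spec":{"paused":false}}'
oc get mcp gpu -w
# Wait for UPDATED=True, then pause the pool again
oc patch mcp gpu --type merge -p '{"spec":{"paused":true}}'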
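
If you would rather script the sequence than watch each pool by hand, oc wait can block on the Updated condition. The following is a sketch under assumptions: wait_for_pool and rollout_in_order are hypothetical helper names, the 30m default timeout is arbitrary, and the pool names are the ones used in this recipe. Note that oc wait returns immediately when the condition is already true, so run it only after the new MachineConfig has been rendered into the pools.

```shell
# wait_for_pool POOL [TIMEOUT] (hypothetical helper)
# Blocks until the pool reports condition Updated=True or the timeout expires.
wait_for_pool() {
  oc wait "mcp/$1" --for=condition=Updated --timeout="${2:-30m}"
}

# rollout_in_order (hypothetical helper)
# Waits for worker and infra, then unpauses gpu, waits, and pauses it again.
rollout_in_order() {
  wait_for_pool worker || return 1
  wait_for_pool infra || return 1
  oc patch mcp gpu --type merge -p '{"spec":{"paused":false}}' || return 1
  wait_for_pool gpu
  status=$?
  # Re-pause even if the wait failed, so the pool is not left unpaused by accident
  oc patch mcp gpu --type merge -p '{"spec":{"paused":true}}'
  return "$status"
}
```

Stopping at the first failed wait keeps a problem in the worker or infra pool from ever reaching the GPU nodes.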

Verify MCP Membership

oc get mcp -o custom-columns='NAME:.metadata.name,MACHINES:.status.machineCount,UPDATED_MACHINES:.status.updatedMachineCount,UPDATED:.status.conditions[?(@.type=="Updated")].status'
# NAME     MACHINES   UPDATED_MACHINES   UPDATED
# master   3          3                  True
# worker   4          4                  True
# infra    2          2                  True
# gpu      2          2                  True
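
The pool view shows counts but not which node landed where. One way to see per-node membership is to read the machineconfiguration.openshift.io/currentConfig annotation on each node and strip the rendered config name down to its pool. This is a sketch: node_pools is a hypothetical helper name, and it assumes rendered configs are named rendered-<pool>-<hash>.

```shell
# node_pools (hypothetical helper)
# Prints "NODE POOL" pairs, deriving the pool from the rendered config name,
# which is assumed to follow the pattern rendered-<pool>-<hash>.
node_pools() {
  oc get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,CONFIG:.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig' |
    awk '{
      pool = $2
      sub(/^rendered-/, "", pool)   # drop the fixed prefix
      sub(/-[0-9a-f]+$/, "", pool)  # drop the trailing hash
      print $1, pool
    }'
}
```

A GPU node that still reports the worker pool here has not yet been moved to its custom pool.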

Common Issues

Node in Two MCPs

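A node can only belong to one MCP. Nodes that keep the worker label and also match a custom pool are assigned to the custom pool, but if a node's labels match more than one custom MCP, the node shows Degraded. Ensure the nodeSelectors of your custom pools are mutually exclusive.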
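
A quick way to catch overlaps before they cause trouble is to list nodes that carry more than one custom role label. The helper below is a sketch; nodes_in_multiple_roles is a hypothetical name, and it assumes your custom pools select on node-role.kubernetes.io/<role> labels as in this recipe.

```shell
# nodes_in_multiple_roles ROLE... (hypothetical helper)
# Prints every node that carries more than one of the given role labels.
nodes_in_multiple_roles() {
  for role in "$@"; do
    # A selector with only a key matches nodes where the label exists
    oc get nodes -l "node-role.kubernetes.io/${role}" -o name
  done | sort | uniq -d
}

# Example: prints nothing when the infra and gpu selectors are exclusive
#   nodes_in_multiple_roles infra gpu
```

Pass only the custom roles; including worker would flag every node, since custom-pool nodes normally keep the worker label.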

Worker MachineConfigs Not Applied to Custom MCP

The machineConfigSelector must include the worker role so that the custom pool inherits the base worker configs:

machineConfigSelector:
  matchExpressions:
    - key: machineconfiguration.openshift.io/role
      operator: In
      values: [worker, gpu]   # Inherits worker configs + gpu-specific ones
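
To check that inheritance actually took effect, you can list the MachineConfigs that were merged into a pool's rendered config. This is a hedged sketch: pool_sources and inherits_worker are hypothetical helper names, and the check assumes the base worker MachineConfig is named 00-worker, which may differ on your cluster.

```shell
# pool_sources POOL (hypothetical helper)
# Lists the MachineConfigs merged into the pool's rendered config.
pool_sources() {
  oc get mcp "$1" \
    -o jsonpath='{range .spec.configuration.source[*]}{.name}{"\n"}{end}'
}

# inherits_worker POOL (hypothetical helper)
# Succeeds when 00-worker (assumed base worker config name) is among the sources.
inherits_worker() {
  pool_sources "$1" | grep -qx '00-worker'
}

# Example:
#   inherits_worker gpu && echo "gpu pool inherits the worker base configs"
```

If only role-specific entries such as 99-gpu-nvidia-settings appear in the list, the selector is missing the worker value.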

Best Practices

  • Always include worker in machineConfigSelector: inherit base OS configs
  • Use nodeSelector for mutual exclusivity: each node in exactly one custom MCP
  • Pause GPU MCP by default: unpause only during planned maintenance windows
  • Update in order: general workers → infra → GPU → masters
  • Set different maxUnavailable per MCP: aggressive for dev workers, conservative for GPU
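
Per-pool maxUnavailable can also be adjusted on a live pool with a merge patch. The sketch below assumes the field accepts either an integer or a percentage string; set_max_unavailable is a hypothetical helper name, and the values in the example are illustrative only.

```shell
# set_max_unavailable POOL VALUE (hypothetical helper)
# VALUE may be an integer ("1") or a percentage ("10%").
set_max_unavailable() {
  pool="$1"
  value="$2"
  case "$value" in
    *%) json_value="\"${value}\"" ;;  # percentages must be JSON strings
    *)  json_value="$value" ;;        # plain integers stay unquoted
  esac
  oc patch mcp "$pool" --type merge \
    -p "{\"spec\":{\"maxUnavailable\":${json_value}}}"
}

# Example (illustrative values):
#   set_max_unavailable gpu 1
#   set_max_unavailable worker 25%
```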

Key Takeaways

  • Custom MCPs isolate rollout blast radius by node type
  • Nodes must belong to exactly one MCP: use exclusive nodeSelectors
  • Include worker in machineConfigSelector to inherit base configs
  • Pause sensitive MCPs (GPU) and update them on your schedule
  • Controlled update order prevents cluster-wide disruption

This recipe is from Kubernetes Recipes, our 750-page practical guide with hundreds of production-ready patterns.
