OCI Container Image Internals on Kubernetes
Understand OCI container image internals: layers as tar archive diffs, image configuration JSON, content-addressable storage with SHA-256, and multi-platform images.
Quick Answer: An OCI container image is a content-addressable bundle: filesystem layers (compressed tar diffs), an image configuration JSON (platform, env, cmd, user), and a manifest tying them together via SHA-256 digests. On Kubernetes, containerd/CRI-O pulls manifests, downloads layers in parallel, unpacks them into an overlay filesystem, and applies the config as container runtime settings.
The Problem
- Developers treat images as black boxes: they can't debug layer bloat or config issues
- Multi-platform images (amd64/arm64) fail on the wrong architecture without a clear error
- Image pulls are slow, and it is unclear what is being downloaded or why layers cache
- Security scanners report vulnerabilities "in layer 3", and you need to know what that means
- Registry API errors (blob unknown, manifest invalid) are cryptic without understanding internals
The Solution
OCI Image Structure
┌────────────────────────────────────────────────────────────────────┐
│ Container Image (content-addressable)                              │
│                                                                    │
│ Image Layers                           Image Configuration         │
│ (tar archives with filesystem diffs)   (JSON document)             │
│                                                                    │
│ ┌──────────┐                           {                           │
│ │ Layer 0  │ ──sha256────────────────►   "architecture": "amd64",  │
│ └──────────┘                             "os": "linux",            │
│      │ Diff                                                        │
│ ┌──────────┐                             "rootfs": {               │
│ │ Layer 1  │ ──sha256────────────────►     "type": "layers",       │
│ └──────────┘                               "diff_ids": [           │
│      │ Diff                                  "sha256:c6f988f...",  │
│ ┌──────────┐                                 "sha256:5f70bf1..."   │
│ │ Layer 2  │ ──sha256────────────────►     ]                       │
│ └──────────┘                             },                        │
│      │ ...                               "config": {               │
│ ┌──────────┐                               "Cmd": ["/bin/my-app"], │
│ │ Layer N  │                               "Env": ["PATH=..."],    │
│ └──────────┘                               "User": "alice"         │
│                                          }                         │
│                                        }                           │
│                                                                    │
│ rootfs.diff_ids = SHA-256 of           sha256(config JSON)         │
│ UNCOMPRESSED tar archives              == Image ID                 │
└────────────────────────────────────────────────────────────────────┘

Image Layers Deep Dive
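The note at the bottom of the diagram above is easy to misread: rootfs.diff_ids hash the uncompressed tar, while the digests a registry serves are hashes of the compressed blob. The following is a minimal Python sketch (standard library only) that builds one tiny layer in memory and computes both values. The file name and content are made up for illustration, and gzip is assumed as the compression.

```python
import gzip
import hashlib
import io
import tarfile


def build_layer_tar(files):
    """Build an uncompressed tar archive in memory from {path: content}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, content in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            info.mtime = 0  # fixed timestamp keeps the archive reproducible
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def sha256_digest(data):
    """Return a digest string in the 'sha256:<hex>' form used by registries."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical layer content, for illustration only.
layer_tar = build_layer_tar({"etc/app.conf": b"debug=false\n"})
layer_blob = gzip.compress(layer_tar, mtime=0)  # what a registry would store

diff_id = sha256_digest(layer_tar)        # belongs in config rootfs.diff_ids
layer_digest = sha256_digest(layer_blob)  # belongs in manifest layers[].digest

print("diff_id     :", diff_id)
print("layer digest:", layer_digest)
```

The two values differ even though they describe the same files, which is why a digest copied from a manifest will not appear in the config's diff_ids list.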
# Inspect image layers (nginx:1.27 is multi-platform, so select one platform's manifest)
crane manifest --platform linux/amd64 nginx:1.27 | jq '.layers[]'
# {
#   "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
#   "digest": "sha256:a480a496ba95a...",
#   "size": 29150479
# }
# Each layer is a tar archive containing filesystem diffs:
# Layer 0: base OS (debian-slim) → /usr, /lib, /etc, /bin
# Layer 1: nginx binary + config → /etc/nginx/, /usr/sbin/nginx
# Layer 2: default site → /usr/share/nginx/html/
# Export and inspect a layer
crane blob nginx:1.27@sha256:a480a496ba95a... | tar -tzf - | head -20
# usr/
# usr/sbin/
# usr/sbin/nginx
# etc/nginx/
# etc/nginx/nginx.conf
# ...
# Layers are ADDITIVE: each adds, modifies, or deletes files on top of the previous one
# Deleted files use "whiteout" markers: .wh.<filename>

Image Configuration
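As the structure diagram above notes, the image ID is the SHA-256 of the config JSON. The hash covers the exact bytes of the config blob, not its logical content. The sketch below uses a hypothetical, heavily trimmed config to show that re-serializing the same JSON yields a different ID.

```python
import hashlib
import json

# Hypothetical, heavily trimmed config; real configs carry more fields.
config = {
    "architecture": "amd64",
    "os": "linux",
    "rootfs": {"type": "layers", "diff_ids": ["sha256:" + "0" * 64]},
    "config": {"Cmd": ["/bin/my-app"], "User": "alice"},
}


def image_id(config_bytes):
    """Image ID = SHA-256 over the exact bytes of the config blob."""
    return "sha256:" + hashlib.sha256(config_bytes).hexdigest()


compact = json.dumps(config, separators=(",", ":")).encode()
pretty = json.dumps(config, indent=2).encode()

# Same logical content, different bytes, therefore different IDs.
print(image_id(compact))
print(image_id(pretty))
print(json.loads(compact) == json.loads(pretty))  # True
```

In practice this means hashing the output of a pretty-printer such as jq will generally not reproduce the image ID; hash the raw blob instead.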
# Inspect image config
crane config nginx:1.27 | jq .
# Platform (which arch/OS this image runs on)
# {
#   "architecture": "amd64",
#   "os": "linux"
# }
# Filesystem (references to layers by uncompressed digest)
# {
#   "rootfs": {
#     "type": "layers",
#     "diff_ids": [
#       "sha256:c6f988f4874bb0add23a778f75...",  ← Layer 0 uncompressed
#       "sha256:5f70bf18a086007016e948b04a...",  ← Layer 1 uncompressed
#       "sha256:9a0ef0e3bc21a6b5..."             ← Layer 2 uncompressed
#     ]
#   }
# }
# Execution parameters (become container runtime settings)
# {
#   "config": {
#     "Cmd": ["nginx", "-g", "daemon off;"],
#     "Env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"],
#     "ExposedPorts": {"80/tcp": {}},
#     "User": "",
#     "WorkingDir": "",
#     "StopSignal": "SIGQUIT"
#   }
# }
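# The SHA-256 of this entire JSON == Image ID, recorded as config.digest in the manifest
crane manifest --platform linux/amd64 nginx:1.27 | jq -r '.config.digest'
# sha256:6db391d1c0cfb... ← this is sha256(config JSON)
# (crane digest prints the manifest or index digest, which is a different value)

Container Registry Architecture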
┌────────────────────────────────────────────────────────────────────┐
│ Container Registry                                                 │
│                                                                    │
│ Key API Endpoints                    <data-dir>/blobs/sha256/      │
│                                                                    │
│ POST   /v2/<repo>/blobs/uploads/   │ Multi-Platform Image          │
│ GET    /v2/<repo>/blobs/<digest>   │ ┌─────────────────────────┐   │
│ DELETE /v2/<repo>/blobs/<digest>   │ │ aaa... Image Index      │   │
│                                    │ │ ├─► bbb... Manifest     │   │
│ PUT    /v2/<repo>/manifests/       │ │ │   ├─► ccc... Config   │   │
│ GET    /v2/<repo>/manifests/       │ │ │   └─► ddd... Layer    │   │
│ DELETE /v2/<repo>/manifests/       │ │ └─► eee... Manifest     │   │
│                                    │ │     ├─► fff... Config   │   │
│ GET    /v2/<repo>/tags/list        │ │     └─► 111... Layer    │   │
│                                    │ └─────────────────────────┘   │
│                                    │                               │
│ Tag-to-Manifest mapping:           │ Single-Platform Image         │
│ :latest → sha256:222...            │ ┌─────────────────────────┐   │
│ :v1.2.3 → sha256:222...            │ │ 222... Manifest         │   │
│ :debug  → sha256:333...            │ │ ├─► 333... Config       │   │
│                                    │ │ └─► 444... Layer        │   │
│ All filenames = SHA-256 hashes     │ └─────────────────────────┘   │
└────────────────────────────────────────────────────────────────────┘

How Kubernetes Pulls Images
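The tree in the registry diagram above (tag → index → manifest → config and layers) is the path a runtime walks during a pull. The following is a rough in-memory sketch of that walk, with a Python dict standing in for the registry's blob directory. All names and blob contents are hypothetical, and platform matching is reduced to the architecture field.

```python
import hashlib
import json


def digest(data):
    """Content address of a blob, in 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


blobs = {}  # digest -> bytes, stand-in for <data-dir>/blobs/sha256/
tags = {}   # tag -> digest of an image index


def push_blob(data):
    """Store a blob under its own digest and return a descriptor for it."""
    d = digest(data)
    blobs[d] = data
    return {"digest": d, "size": len(data)}


# Populate the toy registry with a two-platform image (hypothetical content).
index = {"manifests": []}
for arch in ("amd64", "arm64"):
    layer = push_blob(("layer bytes for " + arch).encode())
    config = push_blob(json.dumps({"architecture": arch, "os": "linux"}).encode())
    manifest = push_blob(json.dumps({"config": config, "layers": [layer]}).encode())
    manifest["platform"] = {"architecture": arch, "os": "linux"}
    index["manifests"].append(manifest)
tags["v2.1.0"] = push_blob(json.dumps(index).encode())["digest"]


def fetch(d):
    """Read a blob and verify its content against the digest that named it."""
    data = blobs[d]
    if digest(data) != d:
        raise ValueError("digest mismatch for " + d)
    return data


def pull(tag, arch):
    """Walk tag -> index -> platform manifest -> config and layers."""
    idx = json.loads(fetch(tags[tag]))
    entry = next(m for m in idx["manifests"] if m["platform"]["architecture"] == arch)
    manifest = json.loads(fetch(entry["digest"]))
    config = json.loads(fetch(manifest["config"]["digest"]))
    layers = [fetch(layer["digest"]) for layer in manifest["layers"]]
    return config, layers


config, layers = pull("v2.1.0", "arm64")
print(config["architecture"], len(layers))  # arm64 1
```

Because every hop is named by a digest, a corrupted or substituted blob is detected at fetch time rather than at container start.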
# Pod spec triggers image pull
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: registry.example.com/myorg/app:v2.1.0
# What happens during pull:
# 1. Resolve tag → manifest digest (GET /v2/myorg/app/manifests/v2.1.0)
# 2. If multi-platform: select manifest matching node arch
# 3. Download config blob (GET /v2/myorg/app/blobs/sha256:config...)
# 4. Download layer blobs in parallel (GET /v2/myorg/app/blobs/sha256:layer...)
# 5. Verify SHA-256 of each downloaded blob
# 6. Unpack layers into overlay filesystem (lower dirs)
# 7. Apply config as container settings (Cmd, Env, User, etc.)

# Watch containerd pulling in real-time
crictl pull registry.example.com/myorg/app:v2.1.0
# Illustrative sequence of what the runtime does (crictl's own output is terser)
# Resolving manifest...
# Downloading sha256:a480a496... (29.1 MB) ← layer blob
# Downloading sha256:7b3a8c01... (1.2 MB) ← layer blob
# Downloading sha256:config... (4.2 KB) ← config blob
# Unpacking...
# Done: sha256:6db391d1c0cfb... ← image ID
# Verify what's cached (layers are shared across images!)
crictl images -v
# lists ID, repo tags, repo digests, and size for each cached image
# Check layer sharing
crictl inspecti registry.example.com/myorg/app:v2.1.0 | jq '.info.imageSpec.rootfs'

Multi-Platform Images (Image Index)
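Step 2 of the pull sequence above, selecting the manifest that matches the node, amounts to a filter over the index's manifests list. The sketch below is a simplified illustration, not the matching logic of any particular runtime: real runtimes also handle variant fallbacks and OS versions. The function name and error text are hypothetical.

```python
def select_manifest(index, os_name, arch, variant=None):
    """Return the digest of the first index entry matching the platform.

    Raises LookupError naming the available platforms when nothing matches.
    """
    entries = index.get("manifests", [])
    for entry in entries:
        platform = entry.get("platform", {})
        if platform.get("os") != os_name or platform.get("architecture") != arch:
            continue
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["digest"]
    available = sorted(
        "{}/{}".format(e["platform"]["os"], e["platform"]["architecture"])
        for e in entries
        if "platform" in e
    )
    raise LookupError(
        "no manifest for {}/{}; available: {}".format(os_name, arch, ", ".join(available))
    )


# Placeholder digests, abbreviated the same way as elsewhere in this article.
index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:bbb...", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:eee...", "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}

print(select_manifest(index, "linux", "arm64"))  # sha256:eee...
```

Listing the available platforms in the error is a deliberate choice here: it turns an architecture mismatch into a message that names the fix.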
# Image Index (fat manifest): points to per-platform manifests
crane manifest --platform all nginx:1.27 | jq .
# {
#   "schemaVersion": 2,
#   "mediaType": "application/vnd.oci.image.index.v1+json",
#   "manifests": [
#     {
#       "mediaType": "application/vnd.oci.image.manifest.v1+json",
#       "digest": "sha256:bbb...",
#       "size": 1234,
#       "platform": { "architecture": "amd64", "os": "linux" }
#     },
#     {
#       "mediaType": "application/vnd.oci.image.manifest.v1+json",
#       "digest": "sha256:eee...",
#       "size": 1234,
#       "platform": { "architecture": "arm64", "os": "linux" }
#     }
#   ]
# }
# The container runtime on the node (containerd/CRI-O) selects the manifest matching the node's platform
# Node label reflecting that platform: kubernetes.io/arch=amd64
# → pulls manifest sha256:bbb...
# → downloads only the layers referenced in that manifest

Build Multi-Platform for Kubernetes
# Build for multiple architectures
docker buildx create --name multiarch --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag registry.example.com/myorg/app:v2.1.0 \
  --push .
# Result: Image Index with 2 manifests, each with its own layers + config
# Kubernetes nodes pull only their architecture's layers

Debugging Image Issues on Kubernetes
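When tracing a file to a layer, the whiteout markers described in the layer deep dive above matter: a later layer can hide a file that an earlier layer still ships. The following is a small sketch that replays in-memory layers in order and reports where a path was written or deleted. It assumes plain tar layers whose paths have no leading "./", and it ignores opaque-directory whiteouts. The layer contents are hypothetical.

```python
import io
import tarfile


def make_layer(files):
    """Build an in-memory tar layer from {path: content}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, content in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def file_history(layers, target):
    """Return [(layer_index, 'written' | 'deleted')] events for one path."""
    directory, _, name = target.rpartition("/")
    whiteout = (directory + "/" if directory else "") + ".wh." + name
    events = []
    for i, blob in enumerate(layers):
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            names = tar.getnames()
        if target in names:
            events.append((i, "written"))
        if whiteout in names:
            events.append((i, "deleted"))
    return events


# Hypothetical three-layer image: layer 2 deletes what layer 0 added.
layers = [
    make_layer({"usr/lib/vulnerable-lib.so": b"v1"}),
    make_layer({"usr/bin/app": b"binary"}),
    make_layer({"usr/lib/.wh.vulnerable-lib.so": b""}),
]

print(file_history(layers, "usr/lib/vulnerable-lib.so"))
# [(0, 'written'), (2, 'deleted')]
```

A file that is deleted in a later layer is absent from the running container but still present in the earlier layer's blob, which is one reason a scanner can flag a library you believed was removed.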
# Image pull fails: check that the manifest exists
crane manifest registry.example.com/myorg/app:v2.1.0
# Wrong architecture: check which platforms are available
crane manifest --platform all registry.example.com/myorg/app:v2.1.0 | \
jq '.manifests[].platform'
# Layer size analysis (find bloat); --platform resolves a multi-platform tag to one manifest
crane manifest --platform linux/amd64 registry.example.com/myorg/app:v2.1.0 | \
  jq '.layers[] | {digest: .digest[:20], size: (.size/1048576 | round | tostring + " MB")}'
# Compare two tags (what changed?)
diff <(crane config registry.example.com/myorg/app:v2.0.0 | jq .) \
<(crane config registry.example.com/myorg/app:v2.1.0 | jq .)
# Find which layer added a file
for layer in $(crane manifest --platform linux/amd64 registry.example.com/myorg/app:v2.1.0 | jq -r '.layers[].digest'); do
  echo "=== $layer ==="
  crane blob "registry.example.com/myorg/app@$layer" | tar -tzf - | grep "vulnerable-lib"
done

Content-Addressable Storage
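Two consequences of content addressing are worth seeing in miniature: identical content is stored once no matter how many images reference it, and a tag is only a movable pointer to a digest. The class below is a toy model with hypothetical names, not how any particular registry is implemented.

```python
import hashlib


class BlobStore:
    """Toy content-addressable store: the key is always the content's hash."""

    def __init__(self):
        self.blobs = {}  # "sha256:<hex>" -> bytes
        self.tags = {}   # tag name -> digest (mutable pointer)

    def put(self, data):
        d = "sha256:" + hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(d, data)  # identical content is stored once
        return d

    def tag(self, name, d):
        if d not in self.blobs:
            raise KeyError("blob unknown: " + d)
        self.tags[name] = d


store = BlobStore()

# Two hypothetical images that share a base layer.
base_a = store.put(b"base OS layer")
base_b = store.put(b"base OS layer")
store.put(b"app layer A")
store.put(b"app layer B")
print(base_a == base_b)  # True: same bytes, same digest
print(len(store.blobs))  # 3 blobs stored, not 4

# Tags are mutable pointers; the old manifest stays reachable by digest.
m1 = store.put(b"manifest for build 1")
m2 = store.put(b"manifest for build 2")
store.tag("latest", m1)
store.tag("latest", m2)
print(store.tags["latest"] == m2, m1 in store.blobs)  # True True
```

The same deduplication is what lets a node skip downloading a base layer it already holds for another image.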
# Everything in a registry is stored by SHA-256 digest
# <data-dir>/blobs/sha256/
# aaa... ← Image Index JSON
# bbb... ← Manifest JSON (linux/amd64)
# ccc... ← Config JSON
# ddd... ← Layer tar.gz
# eee... ← Manifest JSON (linux/arm64)
# ...
# Tags are just pointers (mutable!)
# :latest → sha256:aaa...
# :v1.2.3 → sha256:aaa...
# Tags can be moved; digests are immutable
# Best practice for Kubernetes: pin by digest
# (use the manifest or index digest, as printed by crane digest)
containers:
  - name: app
    image: registry.example.com/myorg/app@sha256:6db391d1c0cfb...
    # Immutable: always gets exactly this image
    # Tags like :latest can change under you

Common Issues
ImagePullBackOff: manifest unknown
- Cause: Tag doesn't exist or was deleted; registry returns 404
- Fix: Verify with crane manifest <image>:<tag>; check tag spelling and registry URL
exec format error (wrong architecture)
- Cause: Image built for amd64, running on arm64 node (or vice versa)
- Fix: Build multi-platform image; or use nodeSelector to match image arch
Image pull slow (large layers)
- Cause: Base image too large; or layers not shared with other images on node
- Fix: Use smaller base (distroless, alpine); reorder Dockerfile for better layer caching
Layer cache not working (rebuilds everything)
- Cause: Dockerfile copies the source before installing dependencies, so any source change invalidates all subsequent layers
- Fix: Copy dependency files first (package.json, go.mod), install deps, then copy source
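The cache behaviour in the last issue follows from how builders key each step: roughly, a step's cache key depends on its parent's key plus its own instruction and inputs. The sketch below is a deliberately simplified model of that chaining. Real builders such as BuildKit compute keys differently, and the instructions and file contents here are hypothetical.

```python
import hashlib


def cache_keys(steps):
    """Chain cache keys: each key hashes the parent key, the instruction,
    and the content the instruction reads."""
    keys, parent = [], ""
    for instruction, content in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        h.update(content)
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused(before, after):
    """Count leading steps whose cache keys are unchanged between builds."""
    count = 0
    for a, b in zip(before, after):
        if a != b:
            break
        count += 1
    return count


deps, src_v1, src_v2 = b"package.json v1", b"source v1", b"source v2"

# Source copied first: a source edit invalidates the install step too.
bad_v1 = cache_keys([("COPY . .", src_v1 + deps), ("RUN npm install", b"")])
bad_v2 = cache_keys([("COPY . .", src_v2 + deps), ("RUN npm install", b"")])

# Dependencies first: only the final COPY is rebuilt.
good_v1 = cache_keys(
    [("COPY package.json .", deps), ("RUN npm install", b""), ("COPY . .", src_v1)]
)
good_v2 = cache_keys(
    [("COPY package.json .", deps), ("RUN npm install", b""), ("COPY . .", src_v2)]
)

print(reused(bad_v1, bad_v2))    # 0
print(reused(good_v1, good_v2))  # 2
```

Because each key includes its parent's key, one changed step changes every key after it, which is why the ordering of COPY and install steps matters.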
Best Practices
- Pin by digest in production: tags are mutable; digests guarantee exact content
- Small base images: distroless/alpine reduce pull time and attack surface
- Order Dockerfile for cache: dependencies before source code
- Multi-platform builds: support amd64 + arm64 for mixed clusters
- Non-root USER: set it in the image config; the restricted Pod Security Standard requires containers to run as non-root
- Scan per-layer: identify which layer introduced a vulnerability
- Use crane/skopeo: inspect images without pulling entire content
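The first practice, pinning by digest, is easy to check mechanically. The following is a small sketch that scans a Pod manifest (already parsed into a dict) for image references that do not end in a sha256 digest. The helper name and the sidecar image are hypothetical, and ephemeral containers are not covered.

```python
import re

# A digest-pinned reference ends in @sha256:<64 hex characters>.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(pod):
    """Return images in a Pod manifest (as a dict) that are not pinned by digest."""
    spec = pod.get("spec", {})
    containers = spec.get("initContainers", []) + spec.get("containers", [])
    return [c["image"] for c in containers if not DIGEST_RE.search(c["image"])]


pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {
        "containers": [
            {"name": "app", "image": "registry.example.com/myorg/app:v2.1.0"},
            {
                "name": "sidecar",
                "image": "registry.example.com/myorg/proxy@sha256:" + "ab" * 32,
            },
        ]
    },
}

print(unpinned_images(pod))  # ['registry.example.com/myorg/app:v2.1.0']
```

A check like this could run in CI before manifests are applied; enforcing it inside the cluster would need an admission policy, which is outside the scope of this sketch.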
Key Takeaways
- OCI image = layers (tar diffs) + config (JSON) + manifest (ties them together)
- Everything is content-addressable: filename = SHA-256 of content
- rootfs.diff_ids in config = SHA-256 of uncompressed layer tars
- Image ID = SHA-256 of the config JSON
- Tags are mutable pointers; digests are immutable references
- Multi-platform: Image Index → per-arch Manifests → per-arch Layers
- The container runtime selects the platform manifest matching the node's architecture (reflected in the kubernetes.io/arch label)
- Registry API: blobs (content), manifests (metadata), tags (human-readable pointers)
- Layer sharing across images reduces disk and network usage on nodes

