NXDOMAIN DNS Troubleshooting Kubernetes
Fix NXDOMAIN errors in Kubernetes. Debug CoreDNS failures, ndots configuration, search domain issues, and external DNS lookup problems.
Quick Answer: NXDOMAIN in Kubernetes usually means one of three things: (1) the ndots:5 default makes short names walk the cluster search domains before the FQDN is tried (append a trailing dot, as in api.example.com., to bypass them); (2) CoreDNS can't reach upstream DNS (check the kube-dns service and the CoreDNS pods); (3) the domain genuinely doesn't exist. Debug with kubectl exec <pod> -- nslookup api.example.com and check the CoreDNS logs.
The Problem
Pods get NXDOMAIN for domains that resolve fine outside the cluster:
- curl: (6) Could not resolve host: api.example.com
- nslookup: server can't find api.example.com: NXDOMAIN
- External API calls fail but the domain works from nodes
- Intermittent NXDOMAIN under load (DNS rate limiting)
The Solution
Step 1: Test DNS from the Pod
# Run a debug pod with DNS tools
kubectl run dns-debug --image=busybox:1.36 --restart=Never -- sleep 3600
# Test resolution
kubectl exec dns-debug -- nslookup api.example.com
# Server: 10.96.0.10
# Address: 10.96.0.10:53
# ** server can't find api.example.com: NXDOMAIN
# Try with trailing dot (FQDN, bypasses search domains)
kubectl exec dns-debug -- nslookup api.example.com.
# Name: api.example.com
# Address: 93.184.216.34
# Works! It's an ndots issue.
The ndots Problem
Kubernetes sets ndots:5 in /etc/resolv.conf by default:
kubectl exec dns-debug -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
With ndots:5, any name with fewer than 5 dots is treated as "not fully qualified" and is tried against every search domain first:
Query: api.example.com (2 dots, < 5)
1. api.example.com.default.svc.cluster.local → NXDOMAIN
2. api.example.com.svc.cluster.local → NXDOMAIN
3. api.example.com.cluster.local → NXDOMAIN
4. api.example.com → resolves (finally tries the real domain)
This creates 3 unnecessary NXDOMAIN queries before the real one.
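The ordering above can be reproduced offline. The following is a minimal sketch, assuming glibc-style resolver behavior; `expand_candidates` is a hypothetical helper name, it sends no DNS queries, and it lists the worst case (a real resolver stops at the first name that resolves).

```shell
# Sketch of glibc-style candidate ordering; it does not send any DNS queries.
# Usage: expand_candidates NAME NDOTS SEARCH_DOMAIN...
expand_candidates() (
  name=$1
  ndots=$2
  shift 2
  case $name in
    *.)
      # Trailing dot: fully qualified, the search list is skipped.
      printf '%s\n' "$name"
      ;;
    *)
      # Count the dots in the name.
      dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
      # Enough dots: the name itself is tried first.
      if [ "$dots" -ge "$ndots" ]; then printf '%s.\n' "$name"; fi
      for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
      done
      # Too few dots: the name itself is tried last.
      if [ "$dots" -lt "$ndots" ]; then printf '%s.\n' "$name"; fi
      ;;
  esac
)

expand_candidates api.example.com 5 \
  default.svc.cluster.local svc.cluster.local cluster.local
# api.example.com.default.svc.cluster.local.
# api.example.com.svc.cluster.local.
# api.example.com.cluster.local.
# api.example.com.
```

Calling it with an ndots value of 2 moves `api.example.com.` to the front of the list, and passing `api.example.com.` with the trailing dot prints only that one name.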
Fix: Reduce ndots or Use FQDN
# Option 1: Set ndots:2 in pod spec (recommended)
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
  containers:
    - name: app
      image: myapp:v1
# Option 2: Use trailing dot in application code
# api.example.com. (trailing dot = FQDN, no search domains)
Step 2: Check CoreDNS
# Is CoreDNS running?
kubectl get pods -n kube-system -l k8s-app=kube-dns
# NAME READY STATUS RESTARTS
# coredns-5d78c9869d-abc12 1/1 Running 0
# Check CoreDNS logs for errors
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
# Is the kube-dns service reachable?
kubectl get svc kube-dns -n kube-system
# NAME TYPE CLUSTER-IP PORT(S)
# kube-dns ClusterIP 10.96.0.10 53/UDP,53/TCP
# Test from node directly
kubectl debug node/worker-1 -it --image=busybox -- nslookup kubernetes.default.svc.cluster.local 10.96.0.10
Step 3: Check Upstream DNS
# CoreDNS forwards external queries to upstream
kubectl get configmap coredns -n kube-system -o yaml | grep -A5 forward
# forward . /etc/resolv.conf
# or
# forward . 8.8.8.8 8.8.4.4
# Check if upstream resolvers are reachable from CoreDNS
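# The stock CoreDNS image ships without a shell or cat, so kubectl exec into
# the CoreDNS pod typically fails. Read the node's file through a node debug
# pod instead (the host filesystem is mounted at /host):
kubectl debug node/worker-1 -it --image=busybox -- cat /host/etc/resolv.conf
# CoreDNS normally inherits the NODE's resolv.conf, so these are the upstream DNS servers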
# Enable CoreDNS debug logging
kubectl edit configmap coredns -n kube-system
# Add "log" plugin:
# .:53 {
#     log      # add this line
#     errors
#     ...
# }
kubectl rollout restart deployment coredns -n kube-system
# Watch queries in real-time
kubectl logs -n kube-system -l k8s-app=kube-dns -f
Step 4: Check NetworkPolicy
# Is DNS traffic blocked by NetworkPolicy?
# CoreDNS needs UDP/TCP port 53 from all pods
# Check if any NetworkPolicy blocks DNS
kubectl get networkpolicy -A -o yaml | grep -B10 "port: 53"
# If NetworkPolicy exists, ensure DNS is allowed:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: my-namespace
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
Common NXDOMAIN Patterns
| Symptom | Cause | Fix |
|---|---|---|
| External domain NXDOMAIN, trailing dot works | ndots:5 search domain issue | Set ndots:2 in dnsConfig |
| All DNS fails | CoreDNS pods down | Restart CoreDNS deployment |
| External fails, internal works | Upstream DNS unreachable | Check CoreDNS forward config |
| Intermittent NXDOMAIN | DNS rate limiting / conntrack | Increase CoreDNS replicas, use NodeLocal DNSCache |
| NXDOMAIN after NetworkPolicy | Egress DNS blocked | Allow UDP/TCP 53 to kube-system |
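The table can be turned into a rough first-pass triage. This is only a sketch: `classify_dns` is a hypothetical helper, it assumes the `dns-debug` pod from Step 1 exists, and it assumes `nslookup` exits non-zero when a lookup fails. It cannot tell a NetworkPolicy block from a CoreDNS outage, so treat its output as a pointer to the right row, not a diagnosis.

```shell
# Hypothetical helper: maps three probe results (0 = resolved, non-zero = failed)
# to the closest row of the symptom table.
# Usage: classify_dns INTERNAL_RC EXTERNAL_RC EXTERNAL_FQDN_RC
classify_dns() {
  if [ "$1" -ne 0 ] && [ "$3" -ne 0 ]; then
    echo "all DNS fails: check CoreDNS pods and NetworkPolicy egress"
  elif [ "$3" -ne 0 ]; then
    echo "external fails, internal works: check CoreDNS forward config"
  elif [ "$2" -ne 0 ]; then
    echo "trailing dot works: ndots search-domain issue"
  else
    echo "all probes resolved: if failures are intermittent, suspect load or conntrack"
  fi
}

# Example wiring (needs a cluster; names are taken from the steps above):
# kubectl exec dns-debug -- nslookup kubernetes.default.svc.cluster.local >/dev/null 2>&1; internal=$?
# kubectl exec dns-debug -- nslookup api.example.com  >/dev/null 2>&1; external=$?
# kubectl exec dns-debug -- nslookup api.example.com. >/dev/null 2>&1; fqdn=$?
# classify_dns "$internal" "$external" "$fqdn"
```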
Common Issues
Intermittent NXDOMAIN under high load
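The conntrack table is full or CoreDNS is overwhelmed. Deploy the NodeLocal DNSCache DaemonSet to cache DNS on every node. The upstream manifest at https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml is a template: download it and replace its __PILLAR__ placeholders (cluster DNS IP, cluster domain, local listen address) before running kubectl apply -f nodelocaldns.yaml.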
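The upstream NodeLocal DNSCache manifest is a template whose `__PILLAR__*__` placeholders need values before it is applied. A minimal sketch of that substitution follows, assuming kube-proxy runs in iptables mode; `render_nodelocaldns` is a hypothetical helper, and `cluster.local` and `169.254.20.10` are commonly used values that may differ in your cluster.

```shell
# Fill the placeholders of a NodeLocal DNSCache manifest read from stdin.
# Usage: render_nodelocaldns KUBE_DNS_IP CLUSTER_DOMAIN LOCAL_IP < nodelocaldns.yaml
render_nodelocaldns() {
  sed -e "s/__PILLAR__DNS__SERVER__/$1/g" \
      -e "s/__PILLAR__DNS__DOMAIN__/$2/g" \
      -e "s/__PILLAR__LOCAL__DNS__/$3/g"
}

# Example wiring (needs a cluster; domain and local IP are assumptions):
# kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')
# render_nodelocaldns "$kubedns" cluster.local 169.254.20.10 < nodelocaldns.yaml \
#   | kubectl apply -f -
```

Any placeholders left in the rendered output are, in iptables mode, filled in by the node-local-dns pods themselves. IPVS mode needs a different set of substitutions, so check the rendered manifest before applying it.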
NXDOMAIN for *.svc.cluster.local names
CoreDNS can't read the Kubernetes API. Check RBAC: kubectl get clusterrolebinding system:coredns.
Alpine/musl-based images have DNS issues
musl's resolver doesn't handle ndots the same way as glibc. Use ndots:1 or switch to Debian-based images.
Best Practices
- Set ndots:2 on pods that call external APIs; this avoids 3 unnecessary queries per lookup
- Use FQDN with trailing dot for external domains in config files
- Deploy NodeLocal DNSCache for clusters with >50 nodes
- Never block DNS egress without explicitly allowing port 53 to kube-system
- Monitor CoreDNS latency and error rate with Prometheus metrics
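The ndots practice can also be applied to a workload that is already running. The sketch below builds a JSON merge patch for the pod template; `ndots_patch` is a hypothetical helper and `my-app` is a placeholder Deployment name. Note that a merge patch replaces the whole `dnsConfig.options` list and triggers a rollout of the pods.

```shell
# Print a JSON merge patch that sets ndots on a workload's pod template.
# Usage: ndots_patch VALUE
ndots_patch() {
  printf '{"spec":{"template":{"spec":{"dnsConfig":{"options":[{"name":"ndots","value":"%s"}]}}}}}\n' "$1"
}

# Example wiring (needs a cluster; my-app is a placeholder name):
# kubectl patch deployment my-app --type merge -p "$(ndots_patch 2)"
```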
Key Takeaways
- NXDOMAIN for external domains is usually the ndots:5 search domain behavior
- Trailing dot (api.example.com.) forces FQDN lookup with no search domains
- Set dnsConfig.options.ndots: 2 on pods that primarily call external services
- CoreDNS pods, kube-dns service, and upstream DNS are the three failure points
- NetworkPolicy must explicitly allow egress to UDP/TCP 53 on kube-system
