30-second summary of today's CloudPro for you:
Running your app in Kubernetes doesn’t automatically make it highly available. This walkthrough shows how ReplicaSets handle pod failures, node loss, and unhealthy containers, and what really happens behind the scenes when things go wrong. Adapted from The Kubernetes Bible.
> 8-minute read
> Hands-on commands included
> Bonus at the end for readers like you
Cheers,
Editor-in-Chief
Let’s say you’ve got a stateless NGINX app deployed in a multi-node Kubernetes cluster using a ReplicaSet. You think you’re covered because there are 4 replicas. But then you delete a pod by mistake, lose an entire node, or end up with a container that’s running but no longer healthy.
In all three cases, you’re expecting automatic recovery. But it’s not magic. It’s the ReplicaSet (and sometimes liveness probes) doing the heavy lifting.
Let’s walk through all three failure modes and see what Kubernetes does.
This scenario demonstrates how a ReplicaSet restores deleted pods to maintain the desired number of replicas.
Here's a step-by-step walkthrough:
1. Define the ReplicaSet manifest: Save the following YAML as nginx-replicaset-example.yaml:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset-example
  namespace: rs-ns
spec:
  replicas: 4
  selector:
    matchLabels:
      app: nginx
      environment: test
  template:
    metadata:
      labels:
        app: nginx
        environment: test
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
        ports:
        - containerPort: 80
2. Create the namespace: This ensures all your resources are scoped properly.
kubectl create -f ns-rs.yaml
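The walkthrough doesn't show the contents of ns-rs.yaml. If you don't already have it, a minimal manifest for the rs-ns namespace referenced by the ReplicaSet above would look like this (save it as ns-rs.yaml):
apiVersion: v1
kind: Namespace
metadata:
  name: rs-ns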
3. Deploy the ReplicaSet: The manifest defines a ReplicaSet with 4 NGINX pods.
kubectl apply -f nginx-replicaset-example.yaml
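Before simulating any failures, it's worth confirming the ReplicaSet reports all four replicas as ready (an optional sanity check, not part of the original steps):
kubectl get rs nginx-replicaset-example -n rs-ns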
4. Delete a pod manually: Simulate a pod failure by deleting one of the running pods.
kubectl delete pod <pod-name> -n rs-ns
5. Verify that the ReplicaSet restores the pod: The controller detects the change and automatically spins up a new pod to maintain the desired count.
kubectl get pods -n rs-ns
kubectl describe rs/nginx-replicaset-example -n rs-ns
Within seconds, the ReplicaSet controller notices the missing pod and recreates it to meet the declared replica count.
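If you repeat the experiment, you can keep a watch running in a second terminal to see the replacement pod appear in real time (an optional extra, not part of the original steps):
kubectl get pods -n rs-ns -w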
Takeaway: ReplicaSets automatically maintain the desired number of pods, making recovery from manual deletions fast and hands-free.
This scenario demonstrates how ReplicaSets maintain high availability when a node goes down by rescheduling pods onto available nodes:
Here's a step-by-step walkthrough:
1. Expose your app with a Service:
kubectl apply -f nginx-service.yaml
This creates a service to access your app across pods.
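The contents of nginx-service.yaml aren't shown in the walkthrough. A minimal Service that selects the pods created above (matching their app and environment labels, exposing port 80) could look like this:
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: rs-ns
spec:
  selector:
    app: nginx
    environment: test
  ports:
  - port: 80
    targetPort: 80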
2. Forward traffic from your local machine to the Kubernetes Service:
kubectl port-forward svc/nginx-service 8080:80 -n rs-ns
curl localhost:8080
Run the curl in a second terminal, since port-forward keeps running in the foreground. This confirms your Service is working and traffic is flowing to the pods.
3. Check where the pods are currently running:
kubectl get pods -n rs-ns -o wide
This shows which node each pod is scheduled on.
4. Simulate node failure by cordoning and draining the node:
kubectl cordon kind-worker
Prevents new pods from being scheduled on this node.
kubectl drain kind-worker --ignore-daemonsets
Evicts all running pods from the node while ignoring daemonsets.
kubectl delete node kind-worker
Removes the node from the cluster to simulate a full node failure.
Within moments, the ReplicaSet detects the missing pods and spins up new ones on the remaining healthy nodes. Your Service automatically reroutes traffic to these new pods.
5. Verify that everything is still working:
kubectl get pods -n rs-ns -o wide
curl localhost:8080
You’ll see that traffic still flows and the app remains accessible without downtime. (If the pod behind your port-forward happened to be evicted, re-run the kubectl port-forward command: the forward is pinned to a single pod, while the Service itself keeps routing to the healthy replicas.)
Takeaway:
The ReplicaSet ensures that the desired number of pod replicas is always maintained, even when a node goes offline. It creates replacement pods automatically, and the scheduler places them on the remaining nodes, as long as there's sufficient capacity in your cluster.
Let’s see how Kubernetes handles an unhealthy container using liveness probes.
Here's a step-by-step walkthrough:
1. Add the following liveness probe to your ReplicaSet pod spec. It instructs the kubelet to start checking container health 2 seconds after the container starts and to repeat the check every 2 seconds:
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2
  periodSeconds: 2
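For context, here's roughly how that probe slots into the pod template of the earlier ReplicaSet manifest (a sketch; judging by the cleanup commands below, the book names this variant nginx-replicaset-livenessprobe-example):
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 2
          periodSeconds: 2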
2. Simulate an unhealthy container: Remove the default NGINX index page so the probe's HTTP check starts failing.
kubectl exec -it <pod-name> -n rs-ns -- rm /usr/share/nginx/html/index.html
3. Inspect the pod events:
kubectl describe pod <pod-name> -n rs-ns
You’ll see Liveness probe failed events, followed by automatic container restarts.
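You can also watch the restart count climb; the RESTARTS column increments each time the kubelet restarts the failed container:
kubectl get pod <pod-name> -n rs-ns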
Takeaway:
The kubelet, not the ReplicaSet, manages container health. But when used with ReplicaSets, probes help create a resilient system that self-heals when a container goes bad.
You can delete the ReplicaSet and its pods:
kubectl delete rs/nginx-replicaset-livenessprobe-example
Or just delete the controller, leaving pods untouched:
kubectl delete rs/nginx-replicaset-livenessprobe-example --cascade=orphan
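After the orphan delete, the pods keep running but no longer have an owner, so nothing will recreate them if they fail or are deleted. You can confirm they're still there with:
kubectl get pods -n rs-ns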
👋 This walkthrough was adapted from just one chapter of The Kubernetes Bible, Second Edition: a 720-page, hands-on guide to mastering Kubernetes across cloud and on-prem environments.
If you’re tackling real production workloads or preparing for certs like CKA/CKAD/CKS, the book dives deeper into everything from ReplicaSets and Deployments to StatefulSets, autoscaling, Helm, traffic routing, and advanced security practices.
For the next 72 hours, CloudPro readers get 30% off the ebook and 20% off print.
Sponsored:
Curious how AI is changing secure coding? Join Sonya Moisset from Snyk on Aug 28 to explore real-world strategies for protecting your AI-driven SDLC and earn a CPE credit while you're at it. Register now.
Want faster builds and better mobile apps? Learn proven CI/CD tips from Bitrise and Embrace experts to speed up development and ship higher-quality apps. Register here.
📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.
If you have any comments or feedback, just reply to this email.
Thanks for reading and have a great day!