In my case, I am running a GKE cluster on preemptible VM instances. A preemptible instance lives at most 24 hours and can be reclaimed and replaced at any moment. This requires every application running in the cluster to be highly available (HA) and able to keep functioning with all but one node removed from the cluster.
All of my deployments are highly available and have pod anti-affinity configured to spread pods across all nodes, so that on every node there's a pod that can continue serving traffic even if all other pods are removed along with preempted nodes. Yet I still faced frequent downtime when nodes were removed from and added to the cluster.
After a short investigation, I figured out that the problem was kube-dns: it didn't have anti-affinity configured and thus sometimes ended up with all of its pods on a single node. When that node was removed, all kube-dns pods were terminated and rescheduled onto another node. While they were being rescheduled, DNS resolution in the cluster was down, so every app that resolved other services by name - databases, for example - was down too. I needed to either enable anti-affinity for kube-dns, or replace it with CoreDNS with anti-affinity enabled.
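A quick way to check whether your cluster is exposed to this is to list the kube-dns pods together with the nodes they run on (assuming kubectl is pointed at your cluster):

```shell
# List kube-dns pods with the node each one is scheduled on.
# If the NODE column shows the same node for every pod, a single
# preemption takes out all of cluster DNS at once.
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
```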
I tried enabling anti-affinity for kube-dns first, but was unsuccessful: the GKE control plane automatically reconciles the kube-dns deployment and reverts any changes made to it. So the only option left was to replace kube-dns with CoreDNS.
Installing CoreDNS

1. Generate the deployment manifest
git clone https://github.com/coredns/deployment.git
cd deployment/kubernetes
./deploy.sh > coredns-deployment.yaml
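For reference, besides the Deployment and Service, the generated coredns-deployment.yaml contains a ConfigMap with the Corefile that controls CoreDNS behavior. With default deploy.sh settings it looks roughly like this (the exact plugin set may vary between deploy.sh versions):

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
```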
2. On line 110 of the generated coredns-deployment.yaml, replace the affinity block with a hard pod anti-affinity rule, so the scheduler never places two CoreDNS pods on the same node. The exact line number may differ between deploy.sh versions; indent the block to match the deployment's spec.template.spec. The generated manifest keeps the k8s-app: kube-dns label on CoreDNS pods for compatibility, which is why the selector below matches on it:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: kubernetes.io/hostname
3. On line 90, change replicas: 1 to the number of nodes you have in your cluster.
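If you don't have the node count handy, you can get it from kubectl:

```shell
# Count the nodes in the cluster; use this number for replicas.
kubectl get nodes --no-headers | wc -l
```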
4. Install CoreDNS
kubectl apply -f coredns-deployment.yaml
5. Scale down kube-dns deployment
Removing the deployment is not an option, as the Kubernetes control plane will automatically recreate it.
kubectl -n kube-system scale --replicas=0 deployment/kube-dns
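Before simulating a node failure, it's worth confirming that CoreDNS is actually answering queries. A throwaway busybox pod works well for this (busybox:1.28 is commonly used here because nslookup is broken in some later tags):

```shell
# Run a one-off pod and resolve an in-cluster name through CoreDNS.
# The pod is deleted automatically when the command exits.
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 \
  -- nslookup kubernetes.default
```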
You're all set! You can test the installation by simulating removal of one of your nodes.
kubectl cordon <your-node-name>
kubectl drain <your-node-name> --ignore-daemonsets
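While the node is drained, you can watch the CoreDNS pods and confirm that the ones on the remaining nodes stay Running, so DNS never goes fully down:

```shell
# CoreDNS pods keep the k8s-app=kube-dns label for compatibility,
# so the same selector shows them spread across the remaining nodes.
# -w streams updates; press Ctrl+C to stop watching.
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide -w
```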
Once you're done with tests, bring the node back online.
kubectl uncordon <your-node-name>