Cluster AutoScaler (CA)
The cluster autoscaler is available for standard clusters that are set up with public network connectivity. With the
ibm-iks-cluster-autoscaler plug-in, you can scale the worker pools in your IBM Cloud Kubernetes Service cluster automatically to increase or decrease the number of worker nodes in the worker pool based on the sizing needs of your scheduled workloads.
Set up the environment for the cluster that you want to autoscale, as explained in the
Environment Setup section.
Once the environment is ready, follow these steps:
Confirm that your IBM Cloud Identity and Access Management credentials are stored in the cluster.
    kubectl get secrets -n kube-system | grep storage-secret-store
The cluster autoscaler can scale only worker pools that have the ibm-cloud.kubernetes.io/worker-pool-id label. Check whether your worker pool has the required label.
    # To get the cluster name or ID
    ibmcloud ks cluster ls
    # To get the worker pool name or ID for your cluster
    ibmcloud ks worker-pool ls -c <cluster_name_or_ID>
    # To check the labels of the worker pool
    ibmcloud ks worker-pool get --cluster <cluster_name_or_ID> --worker-pool <worker_pool_name_or_ID> | grep Labels
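The label check above can be wrapped in a small helper so that it returns a clear yes or no. This is a minimal sketch: the function name has_pool_label is hypothetical, and it assumes only that the label key appears somewhere in the output of the worker-pool get command. The exact output layout may differ between CLI versions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin mentions the
# worker-pool-id label key that the cluster autoscaler requires.
has_pool_label() {
  # -F treats the key as a fixed string, -q suppresses output
  grep -qF 'ibm-cloud.kubernetes.io/worker-pool-id'
}

# Usage (needs the ibmcloud CLI and a logged-in session):
#   ibmcloud ks worker-pool get --cluster <cluster_name_or_ID> \
#     --worker-pool <worker_pool_name_or_ID> | has_pool_label \
#     && echo "label present" || echo "label missing"
```

Reading from stdin keeps the helper independent of the CLI, so it can be tried against saved output first.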
Install Helm v3 by following the instructions. You can skip this step if you already have Helm v3 installed.
Add and update the Helm repo where the cluster autoscaler Helm chart is.
    helm repo add iks-charts https://icr.io/helm/iks-charts
    helm repo update
Install the cluster autoscaler Helm chart in the kube-system namespace of your cluster. In the example command, the default worker pool is enabled for autoscaling with the Helm chart installation.
    helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --namespace kube-system --set workerpools.default.max=5,workerpools.default.min=3,workerpools.default.enabled=true
Here, the first worker pool, named default, is enabled for autoscaling with a maximum of 5 and a minimum of 3 worker nodes. To customize and learn more about the --set workerpools options, refer to this link. If you have more than one worker pool in the default resource group, use the worker pool ID instead of its name in the command above.
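When the pool name or the limits change, the value passed to --set can be built by a small helper instead of being edited by hand. This is a sketch under stated assumptions: the function name workerpool_set_value is hypothetical, and the key layout (workerpools.NAME.max, .min, .enabled) is copied from the example command above, so it may not match other versions of the chart.

```shell
#!/bin/sh
# Hypothetical helper: prints the --set value for one worker pool.
# Arguments: <pool_name_or_ID> <min> <max>
workerpool_set_value() {
  pool=$1 min=$2 max=$3
  # Reject a range where the minimum is larger than the maximum.
  if [ "$min" -gt "$max" ]; then
    echo "min ($min) must not exceed max ($max)" >&2
    return 1
  fi
  printf 'workerpools.%s.max=%s,workerpools.%s.min=%s,workerpools.%s.enabled=true\n' \
    "$pool" "$max" "$pool" "$min" "$pool"
}

# Usage (needs helm and access to the cluster):
#   helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler \
#     --namespace kube-system --set "$(workerpool_set_value default 3 5)"
```

Passing a worker pool ID instead of the name works the same way, because the helper only substitutes the first argument into the keys.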
Verify that the installation is successful.
Check that the cluster autoscaler pod is in a Running state.
    kubectl get pods --namespace=kube-system | grep ibm-iks-cluster-autoscaler
Check that the cluster autoscaler service is created.
    kubectl get service --namespace=kube-system | grep ibm-iks-cluster-autoscaler
The worker pool details are added to the cluster autoscaler config map. Verify that the config map is correct by checking that the workerPoolsConfig.json field is updated and that the workerPoolsConfigStatus field shows a success message.
    kubectl get cm iks-ca-configmap -n kube-system -o yaml
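The config map check can also be scripted rather than read by eye. The sketch below is not an official check: the function name config_status_ok is hypothetical, and it assumes that the success message contains the word "success" (in any case) on the same line as the workerPoolsConfigStatus field.

```shell
#!/bin/sh
# Hypothetical helper: reads the config map YAML on stdin and succeeds if
# the workerPoolsConfigStatus line contains the word "success" (any case).
# Assumption: the field name and its message share one line.
config_status_ok() {
  grep 'workerPoolsConfigStatus' | grep -qi 'success'
}

# Usage (needs kubectl and access to the cluster):
#   kubectl get cm iks-ca-configmap -n kube-system -o yaml | config_status_ok \
#     && echo "autoscaler config applied" || echo "check the config map"
```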
A pod is considered pending when insufficient compute resources exist to schedule the pod on a worker node. When the cluster autoscaler detects pending pods, the autoscaler scales up your worker nodes to meet the workload resource requests.
Deploy the application as explained in the Deploy Application to IKS section.
Create a deployment that exhausts the resources of the worker pool so that some of the pods stay in the pending state, which triggers the cluster autoscaler to scale up the worker pool. Run the following steps to increase the load using the hpa resource.
    # configure hpa
    kubectl autoscale deployment test --cpu-percent=25 --min=1 --max=30
    # modify yaml for ingress subdomain (macOS)
    sed -i '' 's#HOST#<YOUR_INGRESS_SUBDOMAIN>#' generate-load-ca.yaml
    # OR (Linux)
    sed -i 's#HOST#<YOUR_INGRESS_SUBDOMAIN>#' generate-load-ca.yaml
    kubectl create -f generate-load-ca.yaml
After some time, the step above should leave some of the pods in the pending state. Keep checking with the following commands.
    # to check pods and their state
    kubectl get pods
    # to check the number of replicas
    kubectl get hpa
Once some of the pods are in the pending state, you can verify whether the cluster autoscaler has triggered the addition of more worker nodes using the following command.
    # if it shows Workers > 3 then Cluster Autoscaler has been triggered successfully
    ibmcloud ks worker-pool ls --cluster <cluster_name_or_ID>
You can check the IBM Cloud Dashboard to confirm if worker nodes are created to meet the current demand. The dashboard will show something like this snapshot.
It takes a few minutes for a new worker node to become ready. You can also follow along with the pod deployment from the command line. You should see the pods transition from pending to running as nodes are scaled up. After some time, all five worker nodes will be up and running.
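To follow the pending-to-running transition from the command line, it can help to count the pending pods instead of scanning the full list. This is a minimal sketch: the function name count_pending is hypothetical, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE), where STATUS is the third column.

```shell
#!/bin/sh
# Hypothetical helper: reads "kubectl get pods --no-headers" output on stdin
# and prints how many pods have Pending in the STATUS (third) column.
count_pending() {
  awk '$3 == "Pending" { n++ } END { print n + 0 }'
}

# Usage (needs kubectl and access to the cluster); repeats every 30 seconds
# until no pod is pending:
#   while [ "$(kubectl get pods --no-headers | count_pending)" -gt 0 ]; do
#     sleep 30
#   done
```

The count should drop to 0 once the new worker nodes are ready and the scheduler has placed the waiting pods.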
The cluster autoscaler periodically scans the cluster to adjust the number of worker nodes within the worker pools. If the cluster autoscaler detects underutilized worker nodes, it scales down your worker nodes one at a time so that you have only the compute resources that you need.
Decrease the load using the following command.
kubectl delete -f generate-load-ca.yaml
When the load decreases, the number of pods also decreases, which frees up the worker nodes. After a short period of time, the cluster autoscaler detects that your cluster no longer needs all its compute resources and scales down the worker nodes one at a time. If you check the Kubernetes dashboard after some time, you might see that a node is being stopped.
If you check the Kubernetes dashboard again later, you can see that the nodes are being deleted.
We set 3 as the minimum number of worker nodes, which means that the cluster autoscaler does not scale down below three worker nodes even if you remove the workload that requested the extra resources. If you check the dashboard after some time, it will look like the snapshot below.
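The effect of the minimum and maximum settings can be pictured as clamping the node count to a range. The sketch below is an illustration only, not the cluster autoscaler's actual algorithm: the function name clamp_workers is hypothetical, and the values 3 and 5 are the ones used in the Helm command of this walkthrough.

```shell
#!/bin/sh
# Illustration only: clamps a desired node count to the [min, max] range.
# Arguments: <desired> <min> <max>
clamp_workers() {
  desired=$1 min=$2 max=$3
  if [ "$desired" -lt "$min" ]; then
    echo "$min"
  elif [ "$desired" -gt "$max" ]; then
    echo "$max"
  else
    echo "$desired"
  fi
}

clamp_workers 1 3 5   # workload removed: prints 3, the configured minimum
clamp_workers 8 3 5   # heavy load: prints 5, the configured maximum
```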