HPA (Horizontal Pod Autoscaler)

Metrics Server

Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.

Metrics Server collects resource metrics from kubelets and exposes them in the Kubernetes API server through the Metrics API, which is where the HorizontalPodAutoscaler reads them.

Metrics Server has specific requirements for cluster and network configuration. These requirements aren't the default for all cluster distributions.

Installation

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

By default, Metrics Server verifies the TLS certificate that each kubelet serves. Local or development clusters often use self-signed kubelet certificates, so for those you will need to download components.yaml and edit the Deployment in it before applying.

...
kind: Deployment
spec:
  template:
    spec:
      containers:
      - args:
        ...
        # Skip verification of kubelet serving certificates (testing only)
        - --kubelet-insecure-tls
...

HPA

A HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.

Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload.

If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or other similar resource) to scale back down.
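The scaling decision described above can be sketched in a few lines of Python. This is a simplified illustration of the rule given in the Kubernetes documentation (desired replicas = ceil(current replicas × current metric / target metric)), not the controller's actual code. The function name `desired_replicas` and the sample numbers are made up for this sketch, the 0.1 tolerance is the documented default, and stabilization windows and other refinements are deliberately left out.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the HPA scaling rule (hypothetical helper)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical load: 2 Pods averaging 150% CPU against a 75% target.
print(desired_replicas(2, 150, 75, min_replicas=2, max_replicas=5))  # 4
# Load drops to 30% across 4 Pods: scale back down, but not below the minimum.
print(desired_replicas(4, 30, 75, min_replicas=2, max_replicas=5))   # 2
```

Rounding up with `ceil` means the sketch errs on the side of more Pods, and the final clamp is what keeps the result between the configured minimum and maximum.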

Commands

To apply an HPA configuration:

kubectl apply -f hpa.yaml

To list the HPAs:

kubectl get hpa

To describe an HPA:

kubectl describe hpa <hpa-name>

To delete an HPA:

kubectl delete hpa <hpa-name>

Example

hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    name: myapp-deployment
    kind: Deployment
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 75
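
The value of targetCPUUtilizationPercentage is a percentage of the CPU that the Pods request, not of the node's capacity, so the containers in myapp-deployment need CPU requests set for this HPA to work. The Python sketch below shows roughly how such a utilization figure is derived. It assumes every Pod has the same CPU request, and the helper name `cpu_utilization_percent` and the millicore values are hypothetical.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Total CPU usage as a percentage of total CPU requested (sketch)."""
    total_request = sum(request_millicores)
    if total_request == 0:
        # Without CPU requests there is nothing to measure usage against.
        raise ValueError("Pods need CPU requests for utilization to be defined")
    return 100 * sum(usage_millicores) / total_request


# Hypothetical: two Pods, each requesting 200m CPU, using 180m and 160m.
utilization = cpu_utilization_percent([180, 160], [200, 200])
print(utilization)  # 85.0
```

With these made-up numbers the utilization is 85%, which is above the 75% target in hpa.yaml, so the HPA would add Pods.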
