Configuring a Horizontal Pod Autoscaler

This page explains how to use Horizontal Pod Autoscaler (HPA) to autoscale a Deployment using different types of metrics. You can use the same guidelines to configure an HPA for any scalable deployment object.

Before you begin

To prepare for this task, perform the following steps:

  • Ensure that you have enabled the Google Kubernetes Engine API.
  • Ensure that you have installed the Cloud SDK.
  • Set your default project ID:
    gcloud config set project [PROJECT_ID]
  • If you are working with zonal clusters, set your default compute zone:
    gcloud config set compute/zone [COMPUTE_ZONE]
  • If you are working with regional clusters, set your default compute region:
    gcloud config set compute/region [COMPUTE_REGION]
  • Update gcloud to the latest version:
    gcloud components update

API versions for HPA objects

When you use the Google Cloud Console, HPA objects are created using the autoscaling/v2beta1 API.

When you use kubectl to create or view information about an HPA, you can specify either the autoscaling/v1 API or the autoscaling/v2beta1 API.

  • apiVersion: autoscaling/v1 is the default, and allows you to autoscale based only on CPU utilization. To autoscale based on other metrics, using apiVersion: autoscaling/v2beta1 is recommended. The example in Autoscaling based on resource utilization uses apiVersion: autoscaling/v1.

  • apiVersion: autoscaling/v2beta1 is recommended for creating new HPA objects. It allows you to autoscale based on multiple metrics, including custom or external metrics. All other examples in this topic use apiVersion: autoscaling/v2beta1.

To check which API versions are supported, use the kubectl api-versions command.

When you view details about an HPA that uses apiVersion: autoscaling/v2beta1, specify that API version to see all of its metrics.

Create the example Deployment

Before you can create an HPA, you must create the workload it monitors. The examples in this topic apply different HPA configurations to the following nginx Deployment. Separate examples show an HPA based on resource utilization, based on a custom or external metric, and based on multiple metrics.

Save the following to a file called nginx.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        resources:
          # You must specify requests for CPU to autoscale
          # based on CPU utilization
          requests:
            cpu: "250m"

This manifest specifies a value for CPU requests. If you want to autoscale based on a resource's utilization as a percentage, you must specify requests for that resource. If you do not specify requests, you can autoscale based only on the absolute value of the resource's utilization, such as milliCPUs for CPU utilization.
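Percentage-based utilization is calculated against the container's request. The following Python sketch (a hypothetical helper for illustration, not part of any Kubernetes tooling) shows the arithmetic for the 250m request above:

```python
def utilization_percent(usage_millicores, request_millicores):
    """CPU utilization expressed as a percentage of the container's request."""
    return 100.0 * usage_millicores / request_millicores

# With a 250m request, 125m of actual usage is 50% utilization, which is
# the target used by the HPA examples in this topic.
print(utilization_percent(125, 250))  # 50.0
```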

To create the Deployment, apply the nginx.yaml manifest:

kubectl apply -f nginx.yaml

The Deployment has spec.replicas set to 3, so three Pods are deployed. You can verify this using the kubectl get deployment nginx command.

Each of the examples in this topic applies a different HPA to an example nginx Deployment.

Autoscaling based on resource utilization

This example creates an HPA object to autoscale the nginx Deployment when CPU utilization surpasses 50%, and ensures that there is always a minimum of 1 replica and a maximum of 10 replicas.

You can create an HPA that targets CPU using the Cloud Console, the kubectl apply command, or for average CPU only, the kubectl autoscale command.

Console

  1. Visit the GKE Workloads menu in Cloud Console.

  2. Click the name of the nginx Deployment.

  3. Expand the Actions menu and select Autoscale.

  4. Specify the following values:

    • Minimum number of Pods: 1
    • Maximum number of Pods: 10
    • Target CPU utilization in percent: 50
  5. Click Autoscale.

kubectl apply

Save the following YAML manifest as a file called nginx-hpa.yaml.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

To create the HPA, apply the manifest:

kubectl apply -f nginx-hpa.yaml
horizontalpodautoscaler.autoscaling/nginx created

kubectl autoscale

To create an HPA object that only targets average CPU utilization, you can use the kubectl autoscale command.

kubectl autoscale deployment nginx --cpu-percent=50 --min=1 --max=10

To get a list of HPA objects in the cluster, use the kubectl get hpa command:

kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/50%    1         10        3          61s
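Conceptually, the HPA controller computes a desired replica count as ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the configured bounds. The following Python sketch is an approximation of that formula (the real controller also applies a tolerance and stabilization window; the function name is illustrative):

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Approximation of the HPA scaling formula:
    desired = ceil(current * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(desired, max_replicas))

# 3 replicas at 0% CPU against a 50% target scales down toward minReplicas:
print(desired_replicas(3, 0, 50))    # 1
# 3 replicas at 100% CPU against a 50% target scales up:
print(desired_replicas(3, 100, 50))  # 6
```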

To get details about the HPA, you can use the Cloud Console or the kubectl command.

Console

  1. Visit the GKE Workloads menu in Cloud Console.

  2. Click the name of the nginx Deployment.

  3. View the HPA's configuration in the Autoscaler section of the page.

  4. View more details about autoscaling events in the Events tab.

kubectl get

To get details about the HPA, you can use kubectl get hpa with the -o yaml flag. The status field contains information about the current number of replicas and any recent autoscaling events.

kubectl get hpa nginx -o yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"ScaleDownStabilized","message":"recent
      recommendations were higher than current one, applying the highest recent recommendation"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"ValidMetricFound","message":"the
      HPA was able to successfully calculate a replica count from cpu resource utilization
      (percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"DesiredWithinRange","message":"the
      desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"0"}}]'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"maxReplicas":10,"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"nginx"},"targetCPUUtilizationPercentage":50}}
  creationTimestamp: "2019-10-30T19:42:43Z"
  name: nginx
  namespace: default
  resourceVersion: "220050"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/nginx
  uid: 70d1067d-fb4d-11e9-8b2a-42010a8e013f
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  targetCPUUtilizationPercentage: 50
status:
  currentCPUUtilizationPercentage: 0
  currentReplicas: 3
  desiredReplicas: 3

Before following the remaining examples in this topic, delete the HPA:

kubectl delete hpa nginx

When you delete an HPA, the number of replicas of the Deployment remains the same. A Deployment does not automatically revert to its state before the HPA was applied.

To learn more, see Deleting an HPA.

Autoscaling based on a custom or external metric

You can follow along with step-by-step tutorials to create HPAs for custom metrics and external metrics.

Autoscaling based on multiple metrics

This example creates an HPA that autoscales based on CPU utilization and a custom metric called packets_per_second.

If you followed the previous example and still have an HPA called nginx, delete it before following this example.

This example requires apiVersion: autoscaling/v2beta1. For more information about the available APIs, see API versions for HPA objects.

Save this YAML manifest as a file called nginx-multiple.yaml.

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 100Mi
  # Uncomment these lines if you create the custom packets_per_second metric and
  # configure your app to export the metric.
  # - type: Pods
  #   pods:
  #     metricName: packets_per_second
  #     targetAverageValue: 100

Apply the YAML manifest:

kubectl apply -f nginx-multiple.yaml
horizontalpodautoscaler.autoscaling/nginx created

When created, the HPA monitors the nginx Deployment for average CPU utilization, average memory usage, and (if you uncommented it) the custom packets_per_second metric. The HPA autoscales the Deployment based on the metric that produces the largest number of replicas.
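With multiple metrics, the controller computes a proposed replica count per metric and acts on the largest proposal, clamped to the configured bounds. A minimal Python sketch of this selection logic (illustrative only, not a Kubernetes API):

```python
import math

def desired_for_metric(current_replicas, current, target):
    """Per-metric proposal: ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current / target)

def desired_across_metrics(current_replicas, metrics,
                           min_replicas=1, max_replicas=10):
    """The HPA follows the metric proposing the most replicas."""
    proposals = [desired_for_metric(current_replicas, c, t)
                 for c, t in metrics]
    return max(min_replicas, min(max(proposals), max_replicas))

# CPU at 40% against a 50% target proposes 3 replicas; memory at 300Mi
# against a 100Mi average-value target proposes 9. The HPA scales to 9.
print(desired_across_metrics(3, [(40, 50), (300, 100)]))  # 9
```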

Viewing details about an HPA

To view an HPA's configuration and statistics, use kubectl describe hpa [HPA-NAME]. If your HPA uses apiVersion: autoscaling/v2beta1, use kubectl describe hpa.v2beta1.autoscaling [HPA-NAME] instead.

Each HPA's current status is shown in the Conditions field, and autoscaling events are listed in the Events field.

kubectl describe hpa nginx

If the HPA uses apiVersion: autoscaling/v2beta1 and is based on multiple metrics, the kubectl describe hpa command only shows the CPU metric. To see all metrics, use the kubectl describe hpa.v2beta1.autoscaling command instead.

kubectl describe hpa.v2beta1.autoscaling nginx
Name:                     nginx
Namespace:                default
Labels:                   <none>
Annotations:              autoscaling.alpha.kubernetes.io/conditions:
                            [{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-10-30T21:15:22Z","reason":"ReadyForNewScale","message":"recommended size...
                          autoscaling.alpha.kubernetes.io/current-metrics:
                            [{"type":"Resource","resource":{"name":"memory","currentAverageValue":"1998848"}},{"type":"Resource","resource":{"name":"cpu","currentAver...
                          autoscaling.alpha.kubernetes.io/metrics: [{"type":"Resource","resource":{"name":"memory","targetAverageValue":"100Mi"}}]
                          kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"s...
CreationTimestamp:        Wed, 30 Oct 2019 14:15:13 -0700
Reference:                Deployment/nginx
Target CPU utilization:   50%
Current CPU utilization:  0%
Min replicas:             1
Max replicas:             10
Deployment pods:          1 current / 1 desired
Events:                   <none>

Deleting an HPA

You can delete an HPA using the Cloud Console or the kubectl delete command.

Console

To delete the nginx HPA:

  1. Visit the GKE Workloads menu in Cloud Console.

  2. Click the name of the nginx Deployment.

  3. Expand the Actions menu and select Autoscale.

  4. Select Disable Autoscaler.

kubectl delete

To delete the nginx HPA, use the kubectl delete command:

kubectl delete hpa nginx

When you delete an HPA, the Deployment (or other scalable object) remains at its existing scale, and does not revert to the number of replicas in its original manifest. To manually scale the Deployment back to three Pods, you can use the kubectl scale command:

kubectl scale deployment nginx --replicas=3

Cleaning up

  1. Delete the HPA, if you have not done so:

    kubectl delete hpa nginx
    
  2. Delete the nginx Deployment:

    kubectl delete deployment nginx
    
  3. Optionally, delete the cluster.

What's next
