Autoscaling Deployments with Custom Metrics

This tutorial demonstrates how to automatically scale your Kubernetes Engine workloads based on custom metrics that your Kubernetes Pods export to Stackdriver. To learn how to autoscale workloads based on other metrics available in Stackdriver, see Autoscaling Deployments with External Metrics.

Objectives

To set up autoscaling with custom metrics on Kubernetes Engine, you must:

  1. Deploy the Custom Metrics Stackdriver Adapter.
  2. Export custom metrics to Stackdriver.
  3. Deploy a HorizontalPodAutoscaler (HPA) resource to scale your Deployment based on the custom metrics.

Before you begin

Take the following steps to enable the Kubernetes Engine API:
  1. Visit the Kubernetes Engine page in the Google Cloud Platform Console.
  2. Create or select a project.
  3. Wait for the API and related services to be enabled. This can take several minutes.
  4. Make sure that billing is enabled for your project.
Install the following command-line tools used in this tutorial:

  • gcloud is used to create and delete Kubernetes Engine clusters. gcloud is included in the Google Cloud SDK.
  • kubectl is used to manage Kubernetes, the cluster orchestration system used by Kubernetes Engine. You can install kubectl using gcloud:
    gcloud components install kubectl

Set defaults for the gcloud command-line tool

To save time typing your project ID and Compute Engine zone in gcloud commands, you can set defaults:
gcloud config set project PROJECT_ID
gcloud config set compute/zone us-central1-b

Create cluster and set up monitoring
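
The source doesn't include the cluster-creation commands for this section; a minimal sketch, assuming a cluster named metrics-autoscaling (a placeholder name):

```
# Create a Kubernetes Engine cluster in the default zone set earlier.
# The cluster name below is a placeholder; substitute your own.
gcloud container clusters create metrics-autoscaling

# Fetch credentials so kubectl commands target the new cluster.
gcloud container clusters get-credentials metrics-autoscaling
```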

Choosing a custom metric

There are two ways to autoscale with custom metrics:

  • You can export a custom metric from every Pod in the Deployment and target the average value per Pod.
  • You can export a custom metric from a single Pod outside of the Deployment and target the total value.

Within the configured minimum and maximum replica counts, the autoscaler adjusts the Deployment's Pods based on the metric value. A metric with a total target value should always be defined so that adding or removing replicas moves the metric's value closer to the target.

For example, consider scaling a frontend application based on a queries-per-second metric. When the metric value increases, the number of Pods should scale up, with each Pod then serving roughly the same amount of traffic as before. Exporting the queries-per-second value from each Pod and targeting an average value per Pod produces this behavior. However, exporting the total number of queries per second and setting a total target value for this metric doesn't, because increasing the number of Pods doesn't reduce total traffic.

Other metrics, such as average request latency, can be used directly with a total target value to scale Deployments, depending on the use case.
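
The HPA's core replica calculation helps explain why per-Pod metrics need an average target: the autoscaler scales the replica count in proportion to the ratio of the current metric value to the target. A sketch of the arithmetic, with illustrative numbers that are not from this tutorial:

```shell
# desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue)
current_replicas=3
current_avg=30   # e.g. observed queries-per-second per Pod
target_avg=20    # target average value configured in the HPA

# Integer ceiling division: (a + b - 1) / b
desired=$(( (current_replicas * current_avg + target_avg - 1) / target_avg ))
echo "$desired"  # 5 replicas
```

With each Pod serving 30 QPS against a 20 QPS target, the autoscaler grows the Deployment from 3 to 5 replicas, after which the per-Pod average falls back toward the target.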

Step 1: Deploy Custom Metrics Stackdriver Adapter

To grant Kubernetes Engine objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter.

To deploy the adapter in your cluster, run the following command:

kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/adapter-beta.yaml
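
One way to confirm the adapter is working (the Troubleshooting section below uses the same check) is to verify that the custom metrics API it serves is registered:

```
# The adapter registers the custom metrics API with the API server.
kubectl api-versions | grep custom.metrics.k8s.io
```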

Step 2: Export the metric to Stackdriver

You can export your metrics to Stackdriver either directly from your application, or by exposing them in Prometheus format and adding the Prometheus-to-Stackdriver adapter to your Pod's containers.

You can view the exported metrics in Metrics Explorer by searching for custom/[METRIC_NAME] (such as custom/foo).

Exporting metrics from the application

You can create custom metrics and export them directly to Stackdriver from your application code. To learn more, refer to Creating Custom Metrics in the Stackdriver Monitoring documentation. You can also take advantage of Stackdriver's auto-creation of custom metrics feature.

Your metric needs to meet the following requirements:

  • Metric kind must be GAUGE
  • Metric type can be either DOUBLE or INT64
  • Metric name must start with custom.googleapis.com/ prefix, followed by a simple name
  • Resource type must be "gke_container"
  • Resource labels must include:
    • pod_id set to Pod UID, which can be obtained via the Downward API
    • container_name set to an empty string ("")
    • project_id, zone, cluster_name, which can be obtained by your application from the metadata server. To get values, you can use Google Cloud's compute metadata client.
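
As an alternative to a metadata client library, these values can also be read over HTTP from the metadata server. A sketch, assuming it runs on a Kubernetes Engine node or Pod (the Metadata-Flavor header is required):

```
# Project ID of the project the node belongs to.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/project/project-id"

# Zone of the node (returned as projects/[NUM]/zones/[ZONE]).
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/zone"

# Cluster name, exposed as an instance attribute on Kubernetes Engine nodes.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/attributes/cluster-name"
```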

The following manifest file describes a Deployment that runs a single instance of a Go application that exports metrics using Stackdriver client libraries:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: custom-metric-sd
  name: custom-metric-sd
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      run: custom-metric-sd
  template:
    metadata:
      labels:
        run: custom-metric-sd
    spec:
      containers:
      - command: ["./direct-to-sd"]
        args: ["--metric-name=foo", "--metric-value=40", "--pod-id=$(POD_ID)"]
        image: gcr.io/google-samples/sd-dummy-exporter:latest
        name: sd-dummy-exporter
        resources:
          requests:
            cpu: 100m
        env:
          - name: POD_ID
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.uid
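
To deploy the exporter, save the manifest above to a local file (the filename here is a placeholder) and create it, then confirm the Pod is running:

```
kubectl create -f custom-metric-sd.yaml
kubectl get pods -l run=custom-metric-sd
```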

Exporting using Prometheus

You can expose metrics from your application in Prometheus format and deploy the Prometheus-to-Stackdriver adapter, which scrapes the metrics and exports them to Stackdriver. For examples of exposing metrics in Prometheus format, refer to the Kubernetes instrumentation guide.

Your metric needs to meet the following requirements:

  • Metric type must be Gauge
  • Metric name must not contain the custom.googleapis.com prefix

Deploy the Prometheus-to-Stackdriver adapter as a sidecar container in the Pod from which you export metrics, and pass the following flags to the adapter container:

  • pod-id: Set to the Pod UID, obtained via the Downward API
  • namespace-id: Set to the Pod's namespace, obtained via the Downward API
  • source=[COMPONENT]:http://localhost:[PORT], where [PORT] is the port on which your metrics are exposed (the component name can be left empty, as in the manifest below)
  • stackdriver-prefix=custom.googleapis.com

The following manifest file describes a Pod with a Go application that exposes metrics using the Prometheus client libraries and an adapter container:

apiVersion: v1
kind: Pod
metadata:
  name: custom-metric-prometheus-sd
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - ./prometheus-dummy-exporter --metric-name=foo --metric-value=40 --port=8080
    image: gcr.io/google-samples/prometheus-dummy-exporter:latest
    imagePullPolicy: Always
    name: prometheus-dummy-exporter
    resources:
      requests:
        cpu: 100m
  - name: prometheus-to-sd
    image: gcr.io/google-containers/prometheus-to-sd:v0.2.3
    command:
    - /monitor
    - --source=:http://localhost:8080
    - --stackdriver-prefix=custom.googleapis.com
    - --pod-id=$(POD_ID)
    - --namespace-id=$(POD_NAMESPACE)
    env:
    - name: POD_ID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
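
To deploy this Pod, save the manifest above to a local file (the filename is a placeholder), create it, and check the adapter container's logs to confirm metrics are being exported:

```
kubectl create -f custom-metric-prometheus-sd.yaml
kubectl logs custom-metric-prometheus-sd -c prometheus-to-sd
```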

Step 3: Create HorizontalPodAutoscaler object

Once you have exported metrics to Stackdriver, you can deploy an HPA to scale your Deployment based on them.

The following steps depend on how you chose to collect and export your metrics.

Autoscaling based on metrics from all Pods

The HPA computes the average of the metric across the Deployment's Pods and compares it to the target average value.

In the application-to-Stackdriver export example, the Deployment contains Pods that export the metric. The following manifest file describes a HorizontalPodAutoscaler object that scales the Deployment based on a target average value for the metric:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-sd
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: custom-metric-sd
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: foo
      targetAverageValue: 20
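
After creating the HPA, you can watch it read the metric and adjust the replica count:

```
# Show the current metric value, target, and any autoscaling events.
kubectl describe hpa custom-metric-sd

# Watch replica counts change over time.
kubectl get hpa custom-metric-sd --watch
```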

Autoscaling based on metrics from a single Pod

The HPA directly compares the value exposed by a single Pod to the specified target value. This Pod doesn't have to belong to the scaled workload.

In the Prometheus-to-Stackdriver export example, a single Pod exports the metric. The following manifest file describes a Deployment and an HPA that scales the Deployment based on a target value for the metric:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: dummy-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: dummy-deployment
  template:
    metadata:
      labels:
        k8s-app: dummy-deployment
    spec:
      containers:
      - name: long
        image: busybox
        command: ["/bin/sh",  "-c", "sleep 180000000"]
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: dummy-deployment-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: dummy-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Object
    object:
      target:
        kind: Pod
        name: custom-metric-prometheus-sd
      metricName: foo
      targetValue: 20

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

Delete your Kubernetes Engine cluster by running the following command:

gcloud container clusters delete [CLUSTER-NAME]

Troubleshooting

If you run into issues with this tutorial, try the following debugging steps:

  1. Run kubectl api-versions and verify that the custom.metrics.k8s.io/v1beta1 API is registered. If you don't see this API in the list, make sure that the Custom Metrics Stackdriver Adapter (deployed in Step 1) is running in the cluster.
  2. Visit Metrics Explorer and verify that your custom metric is being exported to Stackdriver. Look for metrics starting with custom.googleapis.com/[NAME]. If you don't see your metric listed:

    • Ensure that the exporter Deployment (deployed in Step 2) is running.
    • If you customized the service account of your nodes with --service-account, make sure it has the Monitoring Metric Writer IAM role (roles/monitoring.metricWriter).
    • If you customized the scope of your nodes with --scopes, make sure your nodes have the monitoring scope.
  3. Run kubectl describe hpa [HPA_NAME] and verify that your custom metric is being read by the HPA. If you see errors:

    • Ensure that the scaled Deployment (deployed in Step 2) is running.
    • If you customized the service account of your nodes with --service-account, make sure it has the Monitoring Viewer IAM role (roles/monitoring.viewer).
