This page discusses custom and external metrics, which Horizontal Pod Autoscaler can use to automatically increase or decrease the number of replicas of a given workload.
Unlike Vertical Pod Autoscaler, Horizontal Pod Autoscaler does not modify the workload's configured requests. Horizontal Pod Autoscaler scales only the number of replicas.
Custom and external metrics let a workload adapt to conditions other than those of the workload itself. Consider an application that pulls tasks from a queue and completes them. Your application might have a service-level objective (SLO) for the time to process a task, or for the number of pending tasks. If the queue is growing, running more replicas of the workload might help you meet your workload's SLO. If the queue is empty or is shrinking more quickly than expected, you might be able to save money by running fewer replicas while still meeting your workload's SLO.
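To make the queue example concrete, the following Python sketch applies the scaling rule described in the Kubernetes documentation for Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric value / target metric value). The function name, the replica bounds, and the sample numbers are hypothetical. The sketch leaves out the controller's tolerance band and stabilization behavior, so treat it as an illustration rather than the controller's actual implementation.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Return the replica count suggested by the documented HPA ratio rule.

    desired = ceil(current_replicas * current_value / target_value),
    clamped to [min_replicas, max_replicas]. Tolerance and stabilization
    windows are deliberately left out of this sketch.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical queue worker: the metric is pending tasks per replica,
# and the target is 10 pending tasks per replica.
print(desired_replicas(4, current_value=30, target_value=10, max_replicas=20))  # 12
print(desired_replicas(4, current_value=2, target_value=10))                    # 1
```

In the first call the queue is growing, so the rule suggests more replicas. In the second call the queue is nearly empty, so the rule suggests scaling down to the configured minimum.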
Custom metrics and external metrics differ from each other:
- A custom metric is reported from your application running in Kubernetes. To learn more, see Custom metrics in this topic.
- An external metric is reported from an application or service not running on your cluster, but whose performance impacts your Kubernetes application. For example, the metric could be reported from Cloud Monitoring or Pub/Sub.
Both custom and external metrics can also trigger scaling down. For example, a low number of tasks in a queue might indicate that the application is performing well and can be scaled down.
Your application can report a custom metric to Cloud Monitoring. You can configure Kubernetes to respond to these metrics and scale your workload automatically. For example, you can scale your application based on metrics such as queries per second, writes per second, network performance, latency when communicating with a different application, or other metrics that make sense for your workload.
A custom metric can be selected for any of the following:
- A particular node, Pod, or any Kubernetes object of any kind, including a CustomResourceDefinition (CRD).
- The average value for a metric reported by all Pods in a Deployment.
You can filter a given custom metric by label by adding a selector field set to the label's key and value. For example, you can set selector: "environment=prod" to aggregate only metric values that have the label environment=prod. The selector can be a logical combination of several label expressions. For more information, see Log-based metric labels in the Monitoring documentation.
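The effect of a label selector on aggregation can be sketched in plain Python. This illustrates only the semantics. The real filtering is done by the metrics pipeline, not by code like this. The helper names and sample data are hypothetical, and the sketch assumes a selector made of comma-separated key=value expressions that must all match.

```python
def parse_selector(selector):
    """Parse 'k1=v1,k2=v2' into a dict of required label values."""
    required = {}
    for expression in selector.split(","):
        key, sep, value = expression.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"unsupported label expression: {expression!r}")
        required[key] = value
    return required


def average_for_selector(samples, selector):
    """Average the values of samples whose labels satisfy every expression."""
    required = parse_selector(selector)
    matching = [
        value for labels, value in samples
        if all(labels.get(k) == v for k, v in required.items())
    ]
    if not matching:
        return None  # nothing to aggregate
    return sum(matching) / len(matching)


# Hypothetical per-Pod samples: (labels, queries per second).
samples = [
    ({"environment": "prod", "zone": "a"}, 120.0),
    ({"environment": "prod", "zone": "b"}, 80.0),
    ({"environment": "staging", "zone": "a"}, 5.0),
]
print(average_for_selector(samples, "environment=prod"))         # 100.0
print(average_for_selector(samples, "environment=prod,zone=a"))  # 120.0
```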
Before you can use custom metrics, you must enable Monitoring in your Google Cloud project and install the Stackdriver adapter on your cluster. After custom metrics are exported to Monitoring, Horizontal Pod Autoscaler can use them to trigger autoscaling events that change the shape of the workload.
Custom metrics must be exported from your application in a specific format. The Monitoring UI includes a metric auto-creation tool to help you automatically create custom metrics. If you use the auto-creation tool to create custom metrics, Monitoring detects them automatically.
For more details, see the tutorial about autoscaling Deployments with Custom Metrics.
If you need to scale your workload based on the performance of an application or service outside of Kubernetes, you can configure an external metric. For example, you might need to increase the capacity of your application to ingest messages from Pub/Sub if the number of undelivered messages is trending upward.
The external application needs to export the metric to a Monitoring instance that the cluster can access. The trend of each metric over time causes Horizontal Pod Autoscaler to change the shape of the workload automatically.
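As a rough illustration of the Pub/Sub case, the following Python sketch builds an autoscaling/v2 HorizontalPodAutoscaler object with a single External metric and prints it as JSON, which kubectl also accepts. The object name, Deployment name, subscription ID, target value, and replica bounds are hypothetical. The metric name and the label key are assumptions about how the adapter exposes the Pub/Sub undelivered-messages metric, so verify them against your own cluster before use.

```python
import json


def external_metric_hpa(name, deployment, metric_name, match_labels,
                        target_average, min_replicas=1, max_replicas=5):
    """Build an autoscaling/v2 HorizontalPodAutoscaler dict for one external metric."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "External",
                "external": {
                    "metric": {
                        "name": metric_name,
                        "selector": {"matchLabels": match_labels},
                    },
                    # AverageValue divides the metric total by the replica count.
                    "target": {
                        "type": "AverageValue",
                        "averageValue": str(target_average),
                    },
                },
            }],
        },
    }


hpa = external_metric_hpa(
    name="pubsub-worker",        # hypothetical HPA name
    deployment="pubsub-worker",  # hypothetical Deployment name
    # Assumed metric name and label key; check what your adapter exposes.
    metric_name="pubsub.googleapis.com|subscription|num_undelivered_messages",
    match_labels={"resource.labels.subscription_id": "my-subscription"},
    target_average=2,
)
print(json.dumps(hpa, indent=2))
```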
For more details, see the tutorial about autoscaling Deployments with External Metrics.
To import metrics to Monitoring, you can either:
- Export metrics from the application using the Cloud Monitoring API, or
- Configure the application to emit metrics in Prometheus format. Then, run the Prometheus-to-Stackdriver adapter. This is a small open-source sidecar container that scrapes the metrics, translates them to the Monitoring format, and pushes them to the Monitoring API.
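For the second option, the following sketch shows what a single gauge looks like in the Prometheus text exposition format, using only the Python standard library. The metric name, help text, and label are hypothetical, and help-text escaping is omitted. A real application would more likely use a Prometheus client library and serve this text over HTTP for the adapter to scrape.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline in a label value."""
    return (str(value).replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("\n", "\\n"))


def render_gauge(name, help_text, value, labels=None):
    """Render one gauge sample in the Prometheus text exposition format."""
    label_text = ""
    if labels:
        pairs = ",".join(
            f'{key}="{escape_label_value(val)}"'
            for key, val in sorted(labels.items())
        )
        label_text = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{label_text} {value}\n"
    )


# Hypothetical metric: number of tasks waiting in the queue.
print(render_gauge("queue_pending_tasks", "Tasks waiting in the queue.", 42,
                   {"environment": "prod"}), end="")
```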
For more information, see Creating metrics in the Monitoring documentation.