Minimum instances (services)

This page describes how to enable idle instances for your service using the minimum instances setting.

By default, Cloud Run scales the number of instances of a service in and out based on the number of incoming requests. However, if your service requires reduced latency, especially when scaling from zero active instances, you can change this default behavior by specifying a minimum number of container instances to be kept warm and ready to serve requests. Refer to General development tips for more details on this optimization.

Cloud Run removes idle instances, that is, instances that are not serving requests. With minimum instances set, Cloud Run keeps at least that number of instances running, even if they are not serving requests. Active instances above min-instances might become idle if they stop receiving requests.

For example, if min-instances is 10, and the number of active instances is 0, then the number of idle instances is 10. When the number of active instances increases to 6, then the number of idle instances decreases to 4.
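The arithmetic in this example can be written out as a small sketch. It is a simplified model, not Cloud Run's actual autoscaler: it assumes active instances count toward the minimum first, and it ignores instances above the minimum that are temporarily idle. The function name `idle_instances` is hypothetical.

```python
def idle_instances(min_instances: int, active_instances: int) -> int:
    """Return the idle instances kept warm only to satisfy the minimum.

    Simplified model (an assumption, not Cloud Run's real scheduler):
    active instances count toward the minimum first, so the idle count is
    whatever is left of the minimum, never below zero.
    """
    if min_instances < 0 or active_instances < 0:
        raise ValueError("instance counts must be non-negative")
    return max(min_instances - active_instances, 0)


print(idle_instances(10, 0))   # 10
print(idle_instances(10, 6))   # 4
print(idle_instances(10, 12))  # 0
```

The last call covers the case where traffic needs more instances than the minimum, so none are held idle purely because of the setting.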

Billing

Instances kept running using the minimum instances feature incur billing costs. Because these charges are predictable, Google recommends purchasing a committed use discount.

Minimum instances and always-allocated CPU

You can configure CPU to be always-allocated if you need CPU outside of requests.

Minimum instances restarts

Minimum instances can be restarted at any time.

Revisions and minimum instances

Minimum instances are started only if the revision is addressable. A revision is addressable if either of the following is true:

  • It receives a percentage of the traffic
  • It has been assigned a revision tag
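The addressability rule above can be expressed as a simple predicate. This is only an illustrative sketch: the `Revision` class and its fields are hypothetical and are not Cloud Run API types.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Revision:
    """Hypothetical, simplified view of a revision (not a Cloud Run API type)."""
    name: str
    traffic_percent: int = 0
    tags: Tuple[str, ...] = ()


def is_addressable(revision: Revision) -> bool:
    """A revision is addressable if it receives traffic or has a revision tag."""
    return revision.traffic_percent > 0 or len(revision.tags) > 0


print(is_addressable(Revision("svc-00001", traffic_percent=100)))  # True
print(is_addressable(Revision("svc-00002", tags=("canary",))))     # True
print(is_addressable(Revision("svc-00003")))                       # False
```

Under this model, only the first two revisions would have their minimum instances started.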

Setting and updating minimum instances

Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.

By default, container instances have min-instances turned off, with a setting of 0. You can change this default using the Google Cloud console, the gcloud command line, a YAML file, or Terraform when you create a new service or deploy a new revision:

Console

  1. Go to Cloud Run

  2. If you are configuring a new service, click Create Service. If you are configuring an existing service, click the service, then click Edit and Deploy New Revision.

  3. If you are configuring a new service, fill out the initial service settings page as desired, then click Container, Networking, Security to expand the service configuration page.

  4. Click the Container tab.

  5. In the field labelled Minimum number of instances, specify the desired number of container instances to be kept warm, ready to receive requests.

  6. Click Create or Deploy.

Command line

You can update the minimum instances setting of a given service by using the following command:

gcloud run services update SERVICE --min-instances MIN-VALUE

Replace

  • SERVICE with the name of your service
  • MIN-VALUE with the desired number of container instances to be kept warm, ready to receive requests. Specify default to clear any minimum instance setting.

You can also set the minimum instances value during deployment using the command:

gcloud run deploy --image IMAGE_URL --min-instances MIN-VALUE

Replace

  • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest
  • MIN-VALUE with the desired number of container instances to be kept warm, ready to receive requests. Specify default to clear any minimum instance setting.

YAML

You can download and view existing service configurations using the gcloud run services describe --format export command, which yields cleaned results in YAML format. You can then modify the fields described below and upload the modified YAML using the gcloud run services replace command. Make sure you only modify fields as documented.

  1. To view and download the configuration:

    gcloud run services describe SERVICE --format export > service.yaml
  2. Update the autoscaling.knative.dev/minScale annotation:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/minScale: 'MIN-INSTANCE'
          name: REVISION

    Replace

    • SERVICE with the name of your Cloud Run service
    • MIN-INSTANCE with the desired number of instances to be kept warm, ready to receive requests.
    • REVISION with a new revision name, or delete the name line if it is present. If you supply a new revision name, it must meet the following criteria:
      • Starts with SERVICE-
      • Contains only lowercase letters, numbers and -
      • Does not end with a -
      • Does not exceed 63 characters
  3. Replace the service with its new configuration using the following command:

    gcloud run services replace service.yaml
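If you supply your own revision name, you may want to check it against the naming criteria listed in step 2 before running the replace command. The following is a minimal sketch of such a check. The function name is hypothetical, and Cloud Run itself may enforce additional rules that are not listed on this page.

```python
import re


def is_valid_revision_name(revision: str, service: str) -> bool:
    """Check a revision name against the four criteria listed in step 2."""
    return (
        revision.startswith(service + "-")                      # starts with SERVICE-
        and re.fullmatch(r"[a-z0-9-]+", revision) is not None   # lowercase letters, numbers, -
        and not revision.endswith("-")                          # does not end with -
        and len(revision) <= 63                                 # at most 63 characters
    )


print(is_valid_revision_name("hello-v2", "hello"))  # True
print(is_valid_revision_name("hello-", "hello"))    # False
print(is_valid_revision_name("hello-V2", "hello"))  # False
```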

Terraform

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

Add the following to a google_cloud_run_service resource in your Terraform configuration, under the template attribute. Replace 1 with your service's desired minimum number of instances.

metadata {
  annotations = {

    # Max instances
    # https://cloud.google.com/run/docs/configuring/max-instances
    "autoscaling.knative.dev/maxScale" = 10

    # Min instances
    # https://cloud.google.com/run/docs/configuring/min-instances
    "autoscaling.knative.dev/minScale" = 1

    # If true, garbage-collect CPU once a request finishes
    # https://cloud.google.com/run/docs/configuring/cpu-allocation
    "run.googleapis.com/cpu-throttling" = false
  }
}

View minimum instances settings

To view the current minimum instances settings for your Cloud Run service:

Console

  1. Go to Cloud Run

  2. Click the service you are interested in to open the Service details page.

  3. Click the Revisions tab.

  4. In the details panel on the right, the minimum instances setting is listed under the Container tab.

Command line

  1. Use the following command:

    gcloud run services describe SERVICE
  2. Locate the minimum instances setting in the returned configuration.
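To locate the setting programmatically, you could read the autoscaling.knative.dev/minScale annotation from the parsed configuration. The sketch below assumes the configuration has already been loaded into a Python dictionary with the same layout as the YAML shown earlier on this page, and it falls back to 0, the documented default, when the annotation is absent. The function name and the sample data are hypothetical.

```python
import json

MIN_SCALE_ANNOTATION = "autoscaling.knative.dev/minScale"


def min_instances_from_config(service: dict) -> int:
    """Return the minimum instances value from a parsed service configuration.

    Assumes the spec.template.metadata.annotations layout shown in the YAML
    example; returns 0 when the annotation is not set.
    """
    annotations = (
        service.get("spec", {})
        .get("template", {})
        .get("metadata", {})
        .get("annotations", {})
    )
    return int(annotations.get(MIN_SCALE_ANNOTATION, 0))


# Hypothetical sample data standing in for a real service configuration.
sample = json.loads("""
{
  "apiVersion": "serving.knative.dev/v1",
  "kind": "Service",
  "metadata": {"name": "SERVICE"},
  "spec": {
    "template": {
      "metadata": {
        "annotations": {"autoscaling.knative.dev/minScale": "3"}
      }
    }
  }
}
""")

print(min_instances_from_config(sample))  # 3
print(min_instances_from_config({}))      # 0
```

One way to obtain such a dictionary is to parse the output of the describe command run with gcloud's general --format json flag, on the assumption that the JSON output mirrors the YAML export layout.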