Using minimum instances

By default, Cloud Run scales the number of container instances to match the number of incoming requests, which includes scaling in to zero when there is no traffic. However, if your service requires reduced latency and you want to limit the number of cold starts, you can change this default behavior by specifying a minimum number of container instances to be kept warm and ready to serve requests.

Instances kept running in this way do incur billing costs.

This page describes how to enable idle instances for your service using the minimum instances setting.

Revisions and minimum instances

Minimum instances are started only if the revision is addressable. A revision is addressable if either of the following is true:

  • It receives a percentage of the traffic.
  • It has been assigned a revision tag.
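
The either/or rule above can be sketched as a small shell check. The function name is_addressable and its two arguments are hypothetical and are not part of gcloud; the sketch only illustrates the condition, assuming you already know a revision's traffic percentage and tag.

```shell
# Hypothetical helper: succeeds (exit status 0) if a revision is addressable.
#   $1 = traffic percentage assigned to the revision (integer, 0-100)
#   $2 = revision tag (empty string if the revision has no tag)
is_addressable() {
  traffic_percent="$1"
  tag="$2"
  # Addressable if it receives any traffic OR has a revision tag.
  if [ "$traffic_percent" -gt 0 ] || [ -n "$tag" ]; then
    return 0
  fi
  return 1
}

# A tagged revision with 0% traffic is still addressable,
# so its minimum instances would be started.
if is_addressable 0 "canary"; then
  echo "minimum instances will be started"
fi
```

A revision with no traffic and no tag fails the check, which matches the statement that its minimum instances are not started.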

Setting and updating minimum instances

Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.

By default, the minimum instances setting is turned off, with a value of 0. You can change this default using the Cloud Console, the gcloud command line, or a YAML file when you create a new service or deploy a new revision:

Console

  1. Go to Cloud Run

  2. Click Create Service if you are configuring a new service you are deploying to. If you are configuring an existing service, click on the service, then click Edit and Deploy New Revision.

  3. Under Advanced Settings, click Container.

  4. In the field labelled Minimum number of instances, specify the desired number of container instances to be kept warm, ready to receive requests.

  5. Click Create or Deploy.

Command line

You can update the minimum instances setting of a given service by using the following command:

gcloud beta run services update SERVICE --min-instances MIN-VALUE

Replace

  • SERVICE with the name of your service.
  • MIN-VALUE with the desired number of container instances to be kept warm, ready to receive requests. Specify default to clear any minimum instance setting.

You can also set minimum instances during deployment using the following command:

gcloud beta run deploy --image IMAGE_URL --min-instances MIN-VALUE

Replace

  • IMAGE_URL with a reference to the container image, for example, gcr.io/myproject/my-image:latest
  • MIN-VALUE with the desired number of container instances to be kept warm, ready to receive requests. Specify default to clear any minimum instance setting.
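
As a minimal sketch of how the placeholders fit together, the following shell function checks MIN-VALUE before assembling the update command shown above. The function name build_update_command and the service name my-service are hypothetical, and the function only prints the command rather than running it, so it can be tried without a project. The validation rule (a non-negative integer or the word default) is an assumption drawn from the description of MIN-VALUE on this page.

```shell
# Hypothetical helper: validates MIN-VALUE, then prints the update command
# from this page instead of running it.
#   $1 = SERVICE, $2 = MIN-VALUE
build_update_command() {
  service="$1"
  min_value="$2"
  case "$min_value" in
    default) ;;            # "default" clears any minimum instance setting
    ''|*[!0-9]*)           # empty, or contains a character that is not a digit
      echo "error: MIN-VALUE must be a non-negative integer or 'default'" >&2
      return 1 ;;
  esac
  echo "gcloud beta run services update $service --min-instances $min_value"
}

# Prints the command for keeping two instances warm.
build_update_command my-service 2
```

Printing the command first makes it easy to review the values before you run it yourself.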

YAML

You can download and view existing service configuration using the gcloud run services describe --format export command, which yields cleaned results in YAML format. You can then modify the fields described below and upload the modified YAML using the gcloud beta run services replace command. Make sure you only modify fields as documented.

  1. To view and download the configuration:

    gcloud run services describe SERVICE --format export > service.yaml

  2. Update the autoscaling.knative.dev/minScale annotation:

    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/minScale: 'MIN-INSTANCE'

    Replace

    • MIN-INSTANCE with the desired number of instances to be kept warm, ready to receive requests.

  3. Replace the service with its new configuration using the following command:

    gcloud beta run services replace service.yaml
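
Before uploading the file, it can help to confirm that the edit in step 2 landed where you expect. The sketch below reads the annotation back from a YAML file with awk. The function name get_min_scale is hypothetical, and the sample file is created only so the example runs without gcloud; in practice you would point it at the service.yaml exported in step 1. It assumes the annotation sits on a single line in the form shown in step 2.

```shell
# Hypothetical helper: prints the minScale value found in a service YAML
# file, or prints nothing if the annotation is absent.
get_min_scale() {
  awk -F': *' '/autoscaling\.knative\.dev\/minScale:/ {
    gsub(/[^0-9]/, "", $2)   # strip quotes and whitespace around the number
    print $2
  }' "$1"
}

# Demo on a sample file that mirrors the structure from step 2.
tmpfile="$(mktemp)"
cat > "$tmpfile" <<'EOF'
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: '2'
EOF
get_min_scale "$tmpfile"
rm -f "$tmpfile"
```

An empty result suggests the annotation is missing or was placed on a different line format than this sketch expects.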