Setting a maximum number of container instances

This page describes how to set the maximum number of container instances that can be used for your Cloud Run service. Specifying maximum instances in Cloud Run allows you to limit the scaling of your service in response to incoming requests. Use this setting to control your costs or to limit the number of connections to a backing service, such as a database.

Note that to specify a maximum number of instances greater than 1000 for Cloud Run (fully managed), you must first request a quota increase.

For more information on the way Cloud Run autoscales container instances, refer to Instance autoscaling.

Setting and updating maximum instances

Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.

By default, a service can scale up to 1000 container instances. You can change this default using the Cloud Console, the gcloud command line, or a YAML file when you create a new service or deploy a new revision:

Console

  1. Go to Cloud Run

  2. If you are configuring a new service, click CREATE SERVICE. If you are configuring an existing service, click the service, then click EDIT & DEPLOY NEW REVISION.

  3. Click SHOW ADVANCED SETTINGS > CONTAINER.

  4. In the field labelled Maximum number of instances, specify the desired maximum number of container instances, using any integer value from 1 to 1000, or more if you requested a quota increase.

  5. Click Create or Deploy.

Command line

You can update the maximum number of container instances of a given service by using the following command:

gcloud run services update SERVICE --max-instances MAX-VALUE

Replace

  • SERVICE with the name of your service.
  • MAX-VALUE with the desired maximum number of container instances, using any integer value from 1 to 1000, or more if you requested a quota increase. Specify default to clear any maximum instance setting.
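
As an illustration of the placeholders above, the following is a minimal sketch of a dry-run helper that checks MAX-VALUE before building the update command. The function name max_instances_cmd and the service name my-service are hypothetical and are not part of gcloud. The check only accepts a positive integer or the word default. It does not know your quota, so a value above 1000 still requires an approved quota increase.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): validate a maximum-instances
# value and print the matching update command instead of running it.
max_instances_cmd() {
  service=$1
  max_value=$2
  case $max_value in
    default) ;;                 # "default" clears any maximum instance setting
    ''|*[!0-9]*|0*)             # reject empty, non-numeric, zero, or zero-padded values
      echo "invalid MAX-VALUE: $max_value" >&2
      return 1 ;;
  esac
  echo "gcloud run services update $service --max-instances $max_value"
}

# Dry run for a hypothetical service named my-service.
max_instances_cmd my-service 25
# prints: gcloud run services update my-service --max-instances 25
```

Printing the command rather than executing it lets you review the value first. To apply it, run the printed command yourself.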

You can also set the maximum number of container instances during deployment using the command:

gcloud run deploy --image gcr.io/PROJECT-ID/IMAGE --max-instances MAX-VALUE

Replace

  • PROJECT-ID with your Google Cloud project ID.
  • IMAGE with the name of your image.
  • MAX-VALUE with the desired maximum number of container instances, using any integer value from 1 to 1000, or more if you requested a quota increase. Specify default to clear any maximum instance setting.

YAML

You can download and view existing service configuration using the gcloud run services describe --format export command, which yields cleaned results in YAML format. You can then modify the fields described below and upload the modified YAML using the gcloud beta run services replace command. Make sure you only modify fields as documented.

  1. To view and download the configuration:

    gcloud run services describe SERVICE --format export > service.yaml

  2. Update the autoscaling.knative.dev/maxScale annotation:

    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/maxScale: 'MAX-INSTANCE'

    Replace

    • MAX-INSTANCE with the desired maximum number of container instances.
  3. Replace the service with its new configuration using the following command:

    gcloud beta run services replace service.yaml
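
The three steps above can also be scripted. The following is a minimal sketch that assumes the exported service.yaml already contains the autoscaling.knative.dev/maxScale annotation. It does not add the annotation if it is missing. The helper name set_max_scale and the service name my-service are hypothetical. The gcloud calls are left as comments because they need an authenticated environment and a deployed service.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the maxScale annotation value in an exported
# service YAML file. Assumes the annotation line already exists in the file.
set_max_scale() {
  file=$1
  max_instance=$2
  tmp="$file.tmp"
  # Keep the indentation and the key; replace only the value after the colon.
  sed "s|^\([[:space:]]*autoscaling\.knative\.dev/maxScale:\).*|\1 '$max_instance'|" "$file" > "$tmp" &&
    mv "$tmp" "$file"
}

# Example flow for a hypothetical service named my-service:
#   gcloud run services describe my-service --format export > service.yaml
#   set_max_scale service.yaml 50
#   gcloud beta run services replace service.yaml
```

Only the annotation line is touched, which keeps the change within the fields documented above. Review service.yaml before running the replace command.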