Setting maximum concurrent requests per instance (services)

To understand the maximum concurrent requests per instance setting, read the concept document.

Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.

For Cloud Run services, you can set maximum concurrent requests per instance when you create a new service or deploy a new revision, using the Google Cloud console, the gcloud command line, a YAML file, or Terraform:

Console

  1. Go to Cloud Run

  2. If you are configuring a new service, click Create Service. If you are configuring an existing service, click the service, then click Edit and Deploy New Revision.

  3. If you are configuring a new service, fill out the initial service settings page as desired, then click Container, connections, security to expand the service configuration page.

  4. Click the Container tab.

  5. Set the desired maximum concurrent requests per instance value in the text box Maximum requests per container.

  6. Click Create or Deploy.

Command line

To set maximum concurrent requests per instance, use the following command:

gcloud run services update SERVICE --concurrency CONCURRENCY

Replace

  • SERVICE with the name of your service.
  • CONCURRENCY with the maximum number of concurrent requests per container instance. For example, the following command sets a maximum of 1 concurrent request:

    gcloud run services update SERVICE --concurrency 1

Changing the maximum concurrent requests per instance of a service creates a new revision that captures this setting.

To revert to the default maximum concurrent requests per instance (80), use the following command:

gcloud run services update SERVICE --concurrency default

Replace SERVICE with the name of the service you are configuring.
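
If you script this update instead of typing it by hand, it can help to assemble the arguments in one place and reject obviously invalid values before gcloud is called. The following Python sketch is one way to do that. The function names build_update_command and update_concurrency are hypothetical, and the check only confirms that the value is a positive integer or the literal default; it does not enforce any upper limit that Cloud Run applies.

```python
import subprocess


def build_update_command(service: str, concurrency) -> list[str]:
    """Return the argument list for `gcloud run services update`.

    `concurrency` is either a positive integer or the string "default".
    """
    if not service:
        raise ValueError("service name must not be empty")
    if concurrency != "default":
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(concurrency, bool) or not isinstance(concurrency, int):
            raise ValueError("concurrency must be an int or 'default'")
        if concurrency < 1:
            raise ValueError("concurrency must be at least 1")
    return [
        "gcloud", "run", "services", "update", service,
        "--concurrency", str(concurrency),
    ]


def update_concurrency(service: str, concurrency) -> None:
    """Run the command. Requires an installed, authenticated gcloud CLI."""
    subprocess.run(build_update_command(service, concurrency), check=True)
```

Calling build_update_command("my-service", 1) returns the same arguments as the example command above, and passing "default" reproduces the revert command. Only update_concurrency actually invokes gcloud.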

YAML

You can download and view an existing service configuration using the gcloud run services describe --format export command, which yields cleaned results in YAML format. You can then modify the fields described below and upload the modified YAML using the gcloud run services replace command. Make sure you only modify fields as documented.

  1. To view and download the configuration:

    gcloud run services describe SERVICE --format export > service.yaml
  2. Update the containerConcurrency attribute:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
          name: REVISION
        spec:
          containerConcurrency: CONCURRENCY
          containers:
          - image: IMAGE_URL

    Replace

    • SERVICE with the name of your Cloud Run service
    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest
    • CONCURRENCY with the maximum number of concurrent requests per container instance.
    • REVISION with a new revision name, or delete the name field if it is present. If you supply a new revision name, it must meet the following criteria:
      • Starts with SERVICE-
      • Contains only lowercase letters, numbers and -
      • Does not end with a -
      • Does not exceed 63 characters
  3. Replace the service with its new configuration using the following command:

    gcloud run services replace service.yaml
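
The revision name criteria listed in step 2 are easy to get wrong by hand, so you may want to check a candidate name before running the replace command. The following Python sketch encodes only the four rules stated above. The function name is_valid_revision_name is hypothetical, and Cloud Run may apply further checks that this sketch does not cover.

```python
import re

# Only lowercase letters, digits, and hyphens are allowed.
_ALLOWED_CHARS = re.compile(r"[a-z0-9-]+")


def is_valid_revision_name(service: str, revision: str) -> bool:
    """Check a revision name against the four documented criteria."""
    return (
        revision.startswith(service + "-")               # starts with SERVICE-
        and _ALLOWED_CHARS.fullmatch(revision) is not None
        and not revision.endswith("-")                   # no trailing hyphen
        and len(revision) <= 63                          # length limit
    )
```

For a service named hello, the name hello-v2 passes, while hello-V2 (uppercase letter), hello- (trailing hyphen), and other-v2 (wrong prefix) are rejected.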

Terraform

Add the following to a google_cloud_run_service resource in your Terraform configuration, under template.spec. Replace 80 with your desired maximum number of concurrent requests.

# Maximum concurrent requests
# https://cloud.google.com/run/docs/configuring/concurrency
container_concurrency = 80

To apply your Terraform configuration in a Google Cloud project, complete the following steps:

  1. Launch Cloud Shell.
  2. Set the Google Cloud project where you want to apply the Terraform configuration:
    export GOOGLE_CLOUD_PROJECT=PROJECT_ID
    
  3. Create a directory and open a new file in that directory. The filename must have the .tf extension, for example main.tf:
    mkdir DIRECTORY && cd DIRECTORY && nano main.tf
    
  4. Copy the sample into main.tf.
  5. Review and modify the sample parameters to apply to your environment.
  6. Save your changes by pressing Ctrl-x and then y.
  7. Initialize Terraform:
    terraform init
  8. Review the configuration and verify that the resources that Terraform is going to create or update match your expectations:
    terraform plan

    Make corrections to the configuration as necessary.

  9. Apply the Terraform configuration by running the following command and entering yes at the prompt:
    terraform apply

    Wait until Terraform displays the "Apply complete!" message.

  10. Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.

View concurrency settings

To view the current concurrency settings for your Cloud Run service:

Console

  1. Go to Cloud Run

  2. Click the service you are interested in to open the Service details page.

  3. Click the Revisions tab.

  4. In the details panel at the right, the concurrency setting is listed under the Container tab.

Command line

  1. Use the following command:

    gcloud run services describe SERVICE
  2. Locate the concurrency setting in the returned configuration.
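
To read the setting programmatically rather than by eye, one option is to request JSON output by adding --format json to the describe command and extract the field in a script. The following Python sketch assumes the JSON mirrors the YAML structure shown earlier, with the value at spec.template.spec.containerConcurrency. The function name extract_concurrency is hypothetical, and the sample input is a made-up fragment, not real command output.

```python
import json
from typing import Optional


def extract_concurrency(describe_json: str) -> Optional[int]:
    """Return containerConcurrency from service JSON, or None if absent."""
    service = json.loads(describe_json)
    # Walk spec -> template -> spec, tolerating missing levels.
    revision_spec = service.get("spec", {}).get("template", {}).get("spec", {})
    return revision_spec.get("containerConcurrency")


# Illustrative input only; real output contains many more fields.
sample = '{"spec": {"template": {"spec": {"containerConcurrency": 80}}}}'
print(extract_concurrency(sample))  # prints 80
```

The function returns None when the field is missing, so a caller can tell "not present in the output" apart from an explicit value.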