Concurrency
Cloud Run functions supports handling multiple concurrent requests on a single function instance. This can help avoid cold starts: an instance that is already warm can process several requests at once instead of a new instance being started for each one, which reduces overall latency.
When concurrency is enabled, Cloud Run functions does not provide isolation between concurrent requests processed by the same function instance. In such cases, you must ensure that your function code is safe to execute concurrently. Note the following considerations about certain language runtimes:
Node.js is inherently single-threaded. To take advantage of concurrency, use JavaScript's asynchronous code style, which is idiomatic in Node.js. See Asynchronous flow control in the official Node.js documentation for details.
We recommend starting with a low concurrency value, such as 8, and then increasing it gradually. Starting with a value that is too high can cause unintended behavior when the instance runs short of resources such as memory or CPU.
For Python 3.8 and later, supporting high concurrency per function instance requires enough threads to handle the concurrency. We recommend that you set a runtime environment variable so that the threads value is equal to the concurrency value, for example, THREADS=8.
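Because requests on the same instance share process state, any module-level data that your code modifies needs protection. The following is a minimal sketch of that idea in Python, not a description of how the runtime itself is implemented: `handle_request` and `simulate` are hypothetical names, and a thread pool of 8 workers stands in for 8 concurrent requests.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Module-level state is shared by every request that this instance handles.
_request_count = 0
_count_lock = threading.Lock()


def handle_request(payload: str) -> str:
    """Hypothetical handler that stands in for real function code."""
    global _request_count
    # Guard the read-modify-write so concurrent requests cannot interleave.
    with _count_lock:
        _request_count += 1
        seen = _request_count
    return f"{payload}:{seen}"


def simulate(concurrency: int, total_requests: int) -> list:
    """Run total_requests handler calls on a pool of `concurrency` threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(handle_request, ["req"] * total_requests))


if __name__ == "__main__":
    results = simulate(concurrency=8, total_requests=200)
    # Every request observed a distinct counter value.
    print(len(results), len(set(results)))
```

Without the lock, two requests could read the same counter value before either writes it back. Keeping per-request data in local variables avoids the problem entirely where that is an option.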
By default, function instances handle only one request at a time. You can change this behavior per function by setting a concurrency value as shown in the next section.
Set a concurrency value
The default concurrency value is 1. You can set a function's concurrency value to override the default value. The concurrency value represents the maximum number of concurrent requests that a single instance of the function can handle.
A concurrency value greater than 1 results in your function code being executed concurrently on a single instance. The maximum concurrency value is 1000, though we recommend starting with a lower value and working your way up. Setting a concurrency value greater than 1 requires the function to have 1 or more vCPUs. See Memory limits for the default memory and vCPU values.
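These constraints can be checked before you deploy. The sketch below is illustrative only: `check_concurrency` is a hypothetical helper, not part of any Google Cloud library, and the lower bound of 1 is an assumption based on the default value.

```python
MAX_CONCURRENCY = 1000   # maximum stated in this section
MIN_CONCURRENCY = 1      # assumption: the default value is treated as the minimum


def check_concurrency(concurrency: int, vcpus: float) -> None:
    """Raise ValueError if the settings break the rules described above."""
    if not MIN_CONCURRENCY <= concurrency <= MAX_CONCURRENCY:
        raise ValueError(
            f"concurrency must be between {MIN_CONCURRENCY} and {MAX_CONCURRENCY}"
        )
    # Concurrency above 1 requires at least one full vCPU.
    if concurrency > 1 and vcpus < 1:
        raise ValueError("concurrency greater than 1 requires 1 or more vCPUs")


if __name__ == "__main__":
    check_concurrency(8, vcpus=1)        # passes silently
    try:
        check_concurrency(8, vcpus=0.5)  # too little CPU for concurrency > 1
    except ValueError as err:
        print(err)
```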
You can set concurrency for a function in either of the following ways:
- Cloud Run functions: gcloud CLI or Google Cloud console.
- Cloud Run: gcloud CLI or Google Cloud console.
Set concurrency using Cloud Run functions
gcloud
You can set a concurrency value using the gcloud CLI by deploying a function with the --concurrency flag:
gcloud functions deploy YOUR_FUNCTION_NAME \
  --gen2 \
  --concurrency=CONCURRENCY_VALUE \
  FLAGS...
Where CONCURRENCY_VALUE is the maximum number of concurrent requests allowed per container instance. Leave concurrency unspecified to receive the server default value.
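If you script deployments, the command can be assembled programmatically. The following sketch only builds the argument list and does not run anything. `build_deploy_command` is a hypothetical helper, and using `--update-env-vars` to pass the THREADS variable recommended for Python is an assumption that does not come from this section. A real deployment also needs the other flags that FLAGS... stands for.

```python
import shlex


def build_deploy_command(function_name: str, concurrency: int,
                         set_threads: bool = False) -> list:
    """Return the gcloud argument list for deploying with a concurrency value."""
    args = [
        "gcloud", "functions", "deploy", function_name,
        "--gen2",
        f"--concurrency={concurrency}",
    ]
    if set_threads:
        # Assumption: match the Python thread count to the concurrency value
        # through a runtime environment variable.
        args.append(f"--update-env-vars=THREADS={concurrency}")
    return args


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(build_deploy_command("my-function", 8, set_threads=True)))
```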
Console
To set a concurrency value using the Google Cloud console:
- Go to the Cloud Run functions Overview page in the Google Cloud console.
- Click the name of your function to go to its Function details page.
- Click Edit.
- Expand the Runtime, build... section at the end of the page and click the Runtime tab.
- Under Concurrency, enter a concurrency value in the field labeled Maximum concurrent requests per instance.
- Click Next.
- Click Deploy. This is a necessary step for your changes to go into effect.
Set concurrency using Cloud Run
gcloud
To set a concurrency value using the gcloud CLI, update the underlying Cloud Run service and specify the --concurrency flag:
gcloud run services update YOUR_FUNCTION_NAME --concurrency CONCURRENCY_VALUE
Where CONCURRENCY_VALUE is the maximum number of concurrent requests allowed per container instance. Leave concurrency unspecified to receive the server default value.
Console
To set a concurrency value using the Google Cloud console:
- Go to the Cloud Run functions Overview page in the Google Cloud console.
- Click the name of your function to go to its Function details page.
- In the pane labeled Powered by Cloud Run, click the name of your function to go to the underlying Cloud Run service's Service details page.
- Click Edit & deploy new revision at the top of the page.
- Open the Container tab.
- Enter a concurrency value in the field labeled Maximum concurrent requests per instance.
- If the value you supplied for Maximum concurrent requests per instance is greater than 1, scroll down the page and open the Containers section. Make sure that the CPU field contains a value of 1 or more.
- Click Deploy. This is a necessary step for your changes to go into effect.
Cloud Run functions builds on the concurrency support provided by Cloud Run. To learn more, see Maximum concurrent requests per instance (services) in the Cloud Run documentation.