Code execution in Cloud Run

A key advantage of using Cloud Run to host AI agents is that it isolates code in its secure execution environment. By building a code sandbox tool and running it in your Cloud Run container, you can execute application code in any programming language you choose.

Secure two-layer sandboxing

Cloud Run isolates all instances by using a two-layer sandbox that consists of a hardware-backed layer equivalent to individual VMs (x86 virtualization) and a software kernel layer. For more information, see Security design overview.

When you deploy your code, Cloud Run confines the code within the sandboxing environment. This isolation lets you run untrusted code, such as code generated by a large language model (LLM), with greater security. When you execute untrusted code, restrict IAM permissions on your Cloud Run service and use VPC firewall rules to prevent your code from making calls to the internet.

Code execution modes

Cloud Run provides the following modes for code execution:

  • Asynchronous execution: to avoid disrupting the main application flow, execute tasks asynchronously by using a Cloud Run job for longer-running or background tasks. For example, execute a Cloud Run job that uploads code to Cloud Storage, installs the required dependencies, and then processes and stores the results back in Cloud Storage.

  • Synchronous execution: for processes that require an immediate response, use a Cloud Run service. A Cloud Run service has a maximum request timeout of one hour, which gives your code a significant amount of time to run. To limit each instance to processing one request at a time, set the concurrency value to 1. The service can receive the code to execute in the request body, return the result in the response, and then terminate the container instance.

    The following image shows the two modes of code execution:

    For asynchronous execution, code is uploaded to Cloud Storage and then executed by a Cloud Run job. The job sends the results of the execution to Cloud Storage to be stored. For synchronous execution, the agent sends a request to the Cloud Run service, with the code in the request body. The service has a one-hour timeout and a concurrency of `1`. The Cloud Run service passes the code to a service instance, which processes it and returns a response to the agent.
    Figure 1. Code execution modes in Cloud Run
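
The synchronous mode described above can be sketched as a small HTTP handler that reads code from the request body, runs it in a separate process, and returns the captured output in the response. This is a minimal sketch, not a reference implementation. It assumes that the submitted code is Python and that the container image includes a Python interpreter. The names `run_untrusted`, `SandboxHandler`, `serve`, and `MAX_RUN_SECONDS` are hypothetical, and the 30-second limit is an arbitrary illustrative value that must stay below the request timeout configured on the service.

```python
import json
import os
import subprocess
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical per-run limit; keep it below the service's request timeout.
MAX_RUN_SECONDS = 30


def run_untrusted(code: str, timeout_s: float = MAX_RUN_SECONDS) -> dict:
    """Run Python source in a separate interpreter process and capture its output."""
    try:
        proc = subprocess.run(
            # -I starts the child interpreter in isolated mode, so it ignores
            # PYTHON* environment variables and the user site directory.
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return {"exit_code": None, "stdout": "", "stderr": "timed out", "timed_out": True}
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "timed_out": False,
    }


class SandboxHandler(BaseHTTPRequestHandler):
    """Treats the POST body as source code and returns the run result as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        code = self.rfile.read(length).decode("utf-8")
        body = json.dumps(run_untrusted(code)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve() -> None:
    """Listen on the port that Cloud Run passes in the PORT environment variable."""
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), SandboxHandler).serve_forever()
```

In this sketch, the container entrypoint would call `serve()`. The child process and its timeout only bound a single run inside the container. The isolation itself still comes from the Cloud Run sandbox, together with the IAM and network restrictions that you configure on the service.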

What's next