This documentation is for the latest version of Cloud Run for Anthos, which uses Anthos fleets and Anthos Service Mesh.

The previous version has been archived, but its documentation remains available for existing users.

Resource model

The following diagram shows the Cloud Run for Anthos resource model:

Cloud Run for Anthos services and revisions

The diagram shows a Google Cloud project containing two Cloud Run for Anthos services, Service A and Service B, each of which has several revisions.

In the diagram, Service A is receiving many requests, causing several container instances to start and run. Service B is not currently receiving requests, so no container instances are started.

Cloud Run for Anthos services

The service is the main resource of Cloud Run for Anthos. Each service is located in a specific GKE cluster namespace.

A given Google Cloud project can run many services in different regions or GKE clusters.

Each service exposes a unique endpoint and automatically scales the underlying infrastructure to handle incoming requests.
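Under the hood, a Cloud Run for Anthos service is represented by a Knative Service object in the cluster. As a minimal sketch, a manifest for a service might look like the following (the service name, namespace, and image path are placeholders, not values from this page):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: service-a          # hypothetical service name
  namespace: default       # the GKE cluster namespace the service lives in
spec:
  template:
    spec:
      containers:
        - image: gcr.io/my-project/hello:latest  # hypothetical container image
```

Applying this manifest with `kubectl apply` creates the service, which then exposes its endpoint and scales with traffic.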

Cloud Run for Anthos revisions

Each deployment to a service creates a revision. A revision consists of a specific container image together with its environment settings, such as environment variables, memory limits, and the concurrency value.

Revisions are immutable: once a revision has been created, it cannot be modified. For example, when you deploy a container image to a new Cloud Run for Anthos service, the first revision is created. If you then deploy a different container image to that same service, a second revision is created. If you subsequently set an environment variable, a third revision is created, and so on.
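To illustrate, any change to the service's template produces a new revision. Continuing the sketch above, adding an environment variable to the template (the variable name and value are hypothetical) would create a new revision when applied, while the earlier revisions remain unchanged:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: service-a          # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: gcr.io/my-project/hello:latest  # hypothetical container image
          env:
            - name: GREETING   # adding or changing this creates a new revision
              value: "hello"
```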

Requests are automatically routed to the latest healthy revision as soon as possible. You can also split traffic between revisions as desired.
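A traffic split is declared in the service's `traffic` block. As a sketch, the following fragment (with hypothetical, auto-generated revision names) sends 80% of requests to a newer revision and 20% to an older one:

```yaml
spec:
  traffic:
    - revisionName: service-a-00002  # hypothetical newer revision
      percent: 80
    - revisionName: service-a-00001  # hypothetical older revision
      percent: 20
```

The percentages must add up to 100; requests are distributed across the listed revisions accordingly.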

Cloud Run for Anthos container instances

Each revision that receives requests is automatically scaled to the number of container instances needed to handle them. A single container instance can serve many requests at the same time; the concurrency setting specifies the maximum number of requests that can be sent in parallel to a given container instance.
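In a Knative Service manifest, the concurrency setting corresponds to the `containerConcurrency` field on the revision template. As a sketch (the value 10 is an arbitrary example), the following fragment limits each container instance of the resulting revision to at most 10 simultaneous requests:

```yaml
spec:
  template:
    spec:
      containerConcurrency: 10  # max parallel requests per container instance
```

With this setting, an eleventh concurrent request triggers the autoscaler to add another container instance rather than being queued on the same one.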