The following diagram shows the Cloud Run resource model for services:
The diagram shows a Google Cloud project containing three Cloud Run services, Service A, Service B, and Service C, each of which has several revisions.
In the diagram, Service A is receiving many requests, which results in the startup and running of several instances, each running a single container. Note that Service B is not currently receiving requests, so no instance is started yet. Service C is running multiple containers per instance within each revision: note that only the ingress container receives the request. Every instance with multiple containers scales as an independent unit.
Cloud Run services
The service is the main resource of Cloud Run. Each service is located in a specific Google Cloud region. For redundancy and failover, services are automatically replicated across multiple zones within that region. A given Google Cloud project can run many services in different regions.
Each service exposes a unique endpoint and automatically scales the underlying infrastructure to handle incoming requests.
Cloud Run revisions
Each deployment to a service creates a revision. A revision consists of one or more container images, along with environment settings such as environment variables, memory limits, and the concurrency value.
Revisions are immutable: once a revision has been created, it cannot be modified. For example, when you deploy a container image to a new Cloud Run service, the first revision is created. If you then deploy a different container image to that same service, a second revision is created. If you subsequently set an environment variable, a third revision is created, and so on.
Requests are automatically routed as soon as possible to the latest healthy service revision.
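The revision behavior described above can be illustrated with a small in-memory model. This is only a sketch, not the Cloud Run API: the `Revision` and `Service` classes, the `deploy` and `serving_revision` methods, the `healthy` flag, and the image names are all hypothetical names chosen for illustration. It shows that every change produces a new frozen revision and that routing picks the latest healthy one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """Hypothetical stand-in for a revision: an immutable configuration snapshot."""
    number: int
    image: str
    env: tuple = ()
    healthy: bool = True  # set at creation time here, purely for simulation


class Service:
    """Hypothetical stand-in for a service that accumulates revisions."""

    def __init__(self, name):
        self.name = name
        self.revisions = []

    def deploy(self, image=None, env=None, healthy=True):
        """Create a new revision; settings not given are copied from the previous one."""
        previous = self.revisions[-1] if self.revisions else None
        if previous is None and image is None:
            raise ValueError("the first deployment needs a container image")
        new_image = image if image is not None else previous.image
        if env is not None:
            new_env = tuple(sorted(env.items()))
        else:
            new_env = previous.env if previous else ()
        revision = Revision(len(self.revisions) + 1, new_image, new_env, healthy)
        self.revisions.append(revision)
        return revision

    def serving_revision(self):
        """Return the latest healthy revision, or None if there is none."""
        for revision in reversed(self.revisions):
            if revision.healthy:
                return revision
        return None


service = Service("hello")
service.deploy(image="example/app:v1")                  # first revision
service.deploy(image="example/app:v2")                  # new image, second revision
service.deploy(env={"MODE": "prod"})                    # new setting, third revision
service.deploy(image="example/app:v3", healthy=False)   # simulated unhealthy revision
print(len(service.revisions))             # 4
print(service.serving_revision().number)  # 3
```

Because `Revision` is a frozen dataclass, assigning to a field of an existing revision raises an error, which mirrors the rule that a revision cannot be modified after creation.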
Cloud Run jobs
Each job is located in a specific Google Cloud region and executes one or more containers to completion. A job consists of one or multiple independent tasks that are executed in parallel in a given job execution. Each task runs one container and can retry it if it fails.
Cloud Run job executions
When a job is executed, a job execution is created in which all job tasks are started. All tasks in a job execution must complete successfully for the job execution to be successful. You can set timeouts on tasks and specify the number of retries in case of task failure. If any task exceeds its maximum number of retries, that task is marked as failed and the job execution is marked as failed. By default, tasks execute in parallel up to a maximum of 100, but you can specify a lower maximum if any of your backing resources require it.
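The success rule for a job execution can be sketched as a local simulation. This is not how Cloud Run schedules tasks; it only mirrors the rules stated above using a thread pool. The function names `run_task` and `run_execution`, the `max_retries` default, and the `flaky` and `broken` workloads are hypothetical. Task timeouts are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task, index, max_retries):
    """Run one task; after a failure, retry it up to max_retries more times."""
    for attempt in range(max_retries + 1):
        try:
            task(index, attempt)
            return True
        except Exception:
            continue  # failed attempt, try again if retries remain
    return False  # retries exhausted: the task is marked as failed


def run_execution(task, task_count, max_retries=3, parallelism=100):
    """Run all tasks with bounded parallelism.

    Returns (succeeded, per_task_results); the execution succeeds only if
    every task succeeds. The max_retries default is an illustrative value.
    """
    workers = max(1, min(parallelism, task_count))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(
            pool.map(lambda i: run_task(task, i, max_retries), range(task_count))
        )
    return all(results), results


def flaky(index, attempt):
    # Hypothetical workload: task 2 fails on its first attempt only.
    if index == 2 and attempt == 0:
        raise RuntimeError("transient failure")


def broken(index, attempt):
    # Hypothetical workload: task 1 fails on every attempt.
    if index == 1:
        raise RuntimeError("always fails")


print(run_execution(flaky, task_count=4, max_retries=1))
# (True, [True, True, True, True])
print(run_execution(broken, task_count=3, max_retries=2))
# (False, [True, False, True])
```

Lowering `parallelism` in this sketch corresponds to specifying a lower maximum when a backing resource, such as a database with a limited number of connections, cannot handle many tasks at once.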
Cloud Run instances
Each revision receiving requests is automatically scaled to the number of instances needed to handle all these requests. Note that the ingress container within an instance can receive many requests at the same time. With the concurrency setting, you can set the maximum number of requests that can be sent in parallel to a given instance.
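As a rough worked example of how the concurrency setting relates to instance count: if each instance accepts at most `concurrency` requests in parallel, then serving `n` simultaneous requests takes at least ceil(n / concurrency) instances. The sketch below computes only that lower bound. The function name and the sample numbers are hypothetical, and the actual autoscaler may start more instances than this minimum.

```python
def min_instances_needed(in_flight_requests, concurrency):
    """Smallest number of instances able to hold the given parallel requests.

    This is plain arithmetic on the concurrency limit, not the autoscaling
    algorithm itself.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if in_flight_requests < 0:
        raise ValueError("in_flight_requests cannot be negative")
    # Integer ceiling division avoids floating-point rounding.
    return (in_flight_requests + concurrency - 1) // concurrency


print(min_instances_needed(0, 80))    # 0, no requests means no instance is required
print(min_instances_needed(80, 80))   # 1
print(min_instances_needed(81, 80))   # 2
print(min_instances_needed(250, 80))  # 4
print(min_instances_needed(5, 1))     # 5, one request per instance
```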