Lifecycle of a container on Cloud Run
Developer Advocate, Google Cloud
Editor's note: Today's post comes from Wietse Venema, a software engineer and trainer at Binx.io and the author of the O’Reilly book about Google Cloud Run. In today's post, Wietse shares how understand the full container lifecycle, and the possible state transitions within it, so you can make the most of Cloud Run.
Serverless platform Cloud Run runs and autoscales your container-based application. You can make the most of this platform when you understand the full container lifecycle and the possible state transitions within it. Let’s review the states, from starting to stopped.
First, some context for those who have never heard of Cloud Run before (if you have, skip down to “Starting a Container”). The developer workflow on Cloud Run is a straightforward, three-step process:
- Write your application using your favorite programming language. Your application should start an HTTP server.
- Build and package your application into a container image.
- Deploy the container image to Cloud Run.
Once you deploy your container image, you’ll get a unique HTTPS endpoint back. Cloud Run then starts your container on demand to handle requests and ensures that all incoming requests are handled by dynamically adding and removing containers. Explore the hands-on quickstart to try it out for yourself.
It’s important to understand the distinction between a container image and a container. A container image is a package with your application and everything it needs to run; it’s the archive you store and distribute. A container represents the running processes of your application.
You can build and package your application into a container image in multiple ways. Docker gives you low-level control and flexibility. Jib and Buildpacks offer a higher-level, hands-off experience. You don’t need to be a container expert to be productive with Cloud Run, but if you are, Cloud Run won’t be in your way. Choose the containerization method that works best for you and your project.
Starting a Container
When a container starts, the following happens:
- Cloud Run creates the container’s root filesystem by materializing the container image.
- Once the container filesystem is ready, Cloud Run runs the entrypoint program of the container (your application).
- While your application is starting, Cloud Run continuously probes port 8080 to check whether your application is ready. (You can change the port number if you need to.)
- Once your application starts accepting TCP connections, Cloud Run forwards incoming HTTP requests to your container.
Remember, Cloud Run can only deploy container images that are stored in a Docker repository on Artifact Registry. However, it doesn’t pull the entire image from there every time it starts a new container. That would be needlessly slow.
Instead, Cloud Run pulls your container image from Artifact Registry only once, when you deploy a new version (called a revision on Cloud Run). It then makes a copy of your container image and stores it internally.
The internal storage is fast, ensuring that your image size is not a bottleneck for container startup time. Large images load as quickly as small ones. That’s useful to know if you’re trying to improve cold start latency. A cold start happens when a request comes in and no containers are available to handle it. In this case, Cloud Run will hold the request while it starts a new container.
If you want to be sure a container is always available to handle requests, configure minimum instances, which will help reduce the number of cold starts.
Because Cloud Run copies the image, you won’t get into trouble if you accidentally delete a deployed container image from Artifact Registry. The copy ensures that your Cloud Run service will continue to work.
When a container is not handling any requests, it is considered idle. On a traditional server, you might not think twice about this. But on Cloud Run, this is an important state:
- An idle container is free. You’re only billed for the resources your container uses when it is starting, handling requests (with a 100ms granularity), or shutting down.
- An idle container’s CPU is throttled to nearly zero. This means your application will run at a really slow pace. That makes sense, considering this is CPU time you’re not paying for.
When your container’s CPU is throttled, however, you can’t reliably perform background tasks on your container. Take a look at Cloud Tasks if you want to reliably schedule work to be performed later.
When a container handles a request after being idle, Cloud Run will unthrottle the container’s CPU instantly. Your application — and your user — won’t notice any lag.
Cloud Run can keep idle containers around longer than you might expect, too, in order to handle traffic spikes and reduce cold starts. Don’t count on it, though. Idle containers can be shut down at any time.
If your container is idle, Cloud Run can decide to stop it. By default, a container just disappears when it is shut down.
However, you can build your application to handle a SIGTERM signal (a Linux kernel feature). The SIGTERM signal warns your application that shutdown is imminent. That gives the application 10 seconds to clean things up before the container is removed, such as closing database connections or flushing buffers with data you still need to send somewhere else. You can learn how to handle SIGTERMs on Cloud Run so that your shutdowns will be graceful rather than abrupt.
So far, I’ve looked at Cloud Run’s happy state transitions. What happens if your application crashes and stops while it is handling requests?
When Things Go Wrong
Under normal circumstances, Cloud Run never stops a container that is handling requests. However, a container can stop suddenly in two cases: if your application exits (for instance due to an error in your application code) or if the container exceeds the memory limit.
If a container stops while it is handling requests, it takes down all its in-flight requests at that time: Those requests will fail with an error. While Cloud Run is starting a replacement container, new requests might have to wait. That’s something you’ll want to avoid.
You can avoid running out of memory by configuring memory limits. By default, a container gets 256MB of memory on Cloud Run, but you can increase the allocation to 4GB. Keep in mind, though, if your application allocates too much memory, Cloud Run will also stop the container without a SIGTERM warning.
In this post, you learned about the entire lifecycle of a container on Cloud Run, from starting to serving and shutting down. Here are the highlights:
- Cloud Run stores a local copy of your container image to load it really fast when it starts a container.
- A container is considered idle when it is not serving requests. You’re not paying for idle containers, but their CPU is throttled to nearly zero. Idle containers can be shut down.
- With SIGTERM you can shut down gracefully, but it’s not guaranteed to happen. Watch your memory limits and make sure errors don’t crash your application.
I’m a software engineer and trainer at Binx.io and the author of the O’Reilly book about Google Cloud Run (read the full chapter outline). Connect with me on Twitter: @wietsevenema (open DMs).