Cloud Run reliability guide

Last reviewed 2023-08-08 UTC

Cloud Run is a managed, serverless compute platform for deploying containerized applications. It abstracts away the underlying infrastructure so that users can focus on building applications.

Best practices

  • Cloud Run general tips - how to implement a Cloud Run service, start containers quickly, use global variables, and improve container security.
  • Load testing best practices - how to load test Cloud Run services, including addressing concurrency problems before load testing, managing the maximum number of instances, choosing the best region for load testing, and ensuring services scale with load.
  • Instance scaling - how to scale and limit container instances and minimize response time by keeping some instances idle instead of stopping them.
  • Using minimum instances - how to specify the minimum number of container instances kept ready to serve. When this value is set appropriately high, it reduces the number of cold starts and so lowers average response time.
  • Optimizing Java applications for Cloud Run - understand the tradeoffs of some optimizations for Cloud Run services written in Java, and reduce startup time and memory usage.
  • Optimizing Python applications for Cloud Run - optimize the container image by improving efficiency of the WSGI server, and optimize applications by reducing the number of threads and executing startup tasks in parallel.
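
Two of the tips listed above, using global variables and executing startup tasks in parallel, can be illustrated with a minimal Python sketch. It is not taken from the linked guides. The names load_config, open_client, get_state, and handle_request are hypothetical placeholders, and the stand-in tasks return dummy values rather than calling any real service. The sketch assumes that a container instance serves many requests, so state stored in a module-level variable is built once per instance and then reused.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical startup tasks (assumptions for illustration only); in a real
# service these might load configuration or open a database client.
def load_config():
    return {"greeting": "hello"}

def open_client():
    return object()

# Module-level (global) state: created once per container instance and
# reused by every later request handled by that instance.
_state = None
_lock = threading.Lock()

def get_state():
    """Return the per-instance state, initializing it on first use."""
    global _state
    if _state is None:
        with _lock:
            # Re-check inside the lock so concurrent requests initialize only once.
            if _state is None:
                # Run the independent startup tasks in parallel.
                with ThreadPoolExecutor(max_workers=2) as pool:
                    config_future = pool.submit(load_config)
                    client_future = pool.submit(open_client)
                    _state = {
                        "config": config_future.result(),
                        "client": client_future.result(),
                    }
    return _state

def handle_request(name):
    """Hypothetical request handler that reuses the cached state."""
    state = get_state()
    return f"{state['config']['greeting']}, {name}"
```

Calling handle_request repeatedly within one process reuses the same cached objects, so only the first request in an instance pays the initialization cost. Whether lazy initialization on the first request or eager initialization at container startup is preferable depends on the service; the linked guides discuss the tradeoffs.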