How Instances are Managed

Instances are the computing units that App Engine uses to automatically scale your application. At any given time, your application can be running on one instance or many instances, with requests being spread across all of them.

Instances that use manual or basic scaling should run indefinitely, but uptime is not guaranteed. Hardware or software failures can cause early termination or frequent restarts without warning, and they can take considerable time to resolve.

All flexible instances are restarted on a weekly basis. During restarts, critical, backwards-compatible updates are automatically rolled out to the underlying operating system. Your application's image will remain the same across restarts.

Health checking

App Engine sends periodic health check requests to confirm that an instance has been successfully deployed, and to check that a running instance maintains a healthy status. Each health check must be answered within a specified time interval. An instance is unhealthy when it fails to respond to a specified number of consecutive health check requests. An unhealthy instance will not receive any client requests, but health checks will still be sent. If an unhealthy instance continues to fail to respond to a predetermined number of consecutive health checks, it will be restarted.
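
The state transitions described above can be modeled with a small counter of consecutive failures. The sketch below is only an illustration, not App Engine's implementation: the class name `HealthTracker`, the option names, and the threshold values are all hypothetical. It also assumes that a single successful response restores healthy status and that the restart threshold counts all consecutive failures, neither of which the text specifies.

```javascript
// Illustrative model of the health check logic described above.
// HealthTracker, unhealthyThreshold, and restartThreshold are hypothetical
// names and values; they are not part of any App Engine API.
class HealthTracker {
  constructor({ unhealthyThreshold = 2, restartThreshold = 4 } = {}) {
    this.unhealthyThreshold = unhealthyThreshold;
    this.restartThreshold = restartThreshold;
    this.consecutiveFailures = 0;
  }

  // Record one health check result (true if the instance responded in time)
  // and return the resulting state.
  record(responded) {
    // Assumption: one successful response resets the failure count.
    this.consecutiveFailures = responded ? 0 : this.consecutiveFailures + 1;
    return this.state();
  }

  state() {
    if (this.consecutiveFailures >= this.restartThreshold) return 'restart';
    if (this.consecutiveFailures >= this.unhealthyThreshold) return 'unhealthy';
    return 'healthy';
  }

  // An unhealthy instance receives no client requests but is still checked.
  acceptsClientRequests() {
    return this.state() === 'healthy';
  }
}
```

With these assumed thresholds, two missed checks in a row mark the instance unhealthy, and four in a row trigger a restart.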

There are two types of health checks: updated and legacy. Updated health check requests are enabled by default and have default threshold values. You can customize health checking by adding an optional health check section to your app's app.yaml file. You can also disable health checks entirely.

Whichever type of health check you decide to use, a healthy application should respond with an HTTP status code of 200.
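
As a minimal sketch of what responding with a 200 can look like in a Node.js app, the handler below uses only the built-in `http` module. The path `/healthz` and the function names are hypothetical; the actual path depends on the type of health check and on your app.yaml settings. The fallback port of 8080 is also an assumption.

```javascript
const http = require('http');

// Hypothetical health check path, chosen for illustration only.
const HEALTH_PATH = '/healthz';

// Answer health checks with HTTP 200; serve everything else as normal traffic.
function handleRequest(req, res) {
  const path = req.url.split('?')[0]; // ignore any query string
  if (path === HEALTH_PATH) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok');
    return;
  }
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from the application');
}

// Start listening on the port from the PORT environment variable,
// falling back to 8080 (an assumption) when it is unset.
function startServer(port = Number(process.env.PORT) || 8080) {
  return http.createServer(handleRequest).listen(port);
}
```

Calling `startServer()` from the application's entry point starts the server. Keeping the health handler cheap and free of external dependencies makes it more likely to answer within the check's time limit.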

Monitoring resource usage

The Instances page of the GCP Console provides visibility into how your instances are performing. You can see the memory and CPU usage of each instance, uptime, number of requests, and other statistics. You can also manually initiate the shutdown process for any instance.

Instance location

Instances are automatically located by geographical region according to the project settings.

Instance scaling

While an application is running, incoming requests are routed to an existing or new instance of the appropriate service/version. Each active version must have at least one instance running, and the scaling type of a service/version controls how additional instances are created. Scaling settings are configured in the app.yaml file. There are two scaling types:

Manual scaling
A service with manual scaling runs a specified number of resident instances continuously, regardless of the load level. This supports tasks such as complex initializations and applications that rely on the state of memory over time.
Automatic scaling
A service with automatic scaling uses dynamic instances that are created based on request rate, response latencies, and other application metrics. However, if you specify a minimum number of idle instances, that many instances run as resident instances, and any additional instances are dynamic.