An Overview of App Engine

Services, versions, and instances

At the highest level, an App Engine application is made up of one or more services, which can be configured to use different runtimes and to operate with different performance settings. Services let developers factor large applications into logical components that can share App Engine features and communicate in a secure fashion.

A deployed service behaves like a microservice. By using multiple services you can deploy your app as a set of microservices, which is a popular design pattern.

An app that handles customer requests might include separate services to handle other tasks, such as:

  • API requests from mobile devices
  • Internal, admin-like requests
  • Backend processing such as billing pipelines and data analysis

Each service consists of source code and configuration files. The files used by a service represent a version of the service. When you deploy a service, you always deploy a specific version of the service. Having versions for each of your services allows you to roll back with a single click in the GCP Console, or to use traffic splitting to gradually increase traffic to the newly deployed version of a service.

Each service and each version must have a name. A name can contain numbers, letters, and hyphens. It cannot be longer than 63 characters and cannot start or end with a hyphen. Choose a unique name for each service and each version. Don't reuse names between services and versions.
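
The naming rules above can be checked before you deploy. The following is a minimal sketch that assumes only the constraints stated here (letters, numbers, and hyphens; at most 63 characters; no leading or trailing hyphen). The helper name `isValidName` is hypothetical and is not part of any App Engine API, and App Engine may enforce additional rules that this check does not cover.

```javascript
// Hypothetical helper: checks a service or version name against the
// naming rules described above. Not part of any App Engine API.
const NAME_PATTERN = /^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$/;
const MAX_NAME_LENGTH = 63;

function isValidName(name) {
  return (
    typeof name === 'string' &&
    name.length > 0 &&
    name.length <= MAX_NAME_LENGTH &&
    NAME_PATTERN.test(name)
  );
}

console.log(isValidName('mobile-api')); // true
console.log(isValidName('-admin'));     // false (starts with a hyphen)
```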

While running, a particular version will have one or more instances. By default, App Engine scales the number of running instances up and down to match the load, which keeps performance consistent while minimizing idle instances and reducing cost.

[Figure: Hierarchy graph of services, versions, and instances]

Features such as Datastore and Task Queues are shared by all the services in an application. Every service, version, and instance has its own unique URI. Incoming user requests are routed to an instance of a particular service/version according to URL addressing conventions and an optional customized dispatch file.

Instance scaling

While an application is running, incoming requests are routed to an existing or new instance of the appropriate service/version. The scaling type of a service/version controls how instances are created. Scaling settings are configured in the app.yaml file. There are two scaling types:

Manual scaling
A service with manual scaling continuously runs the exact number of instances you specify, regardless of the load level. This suits workloads that need complex initialization and applications that rely on in-memory state over time.
Automatic scaling
Automatic scaling is based on request rate, response latencies, and other application metrics.

Monitoring resource usage

The Instances page of the GCP Console provides visibility into how instances are performing. You can see the memory and CPU usage of each instance, uptime, number of requests, and other statistics. You can also manually initiate the shutdown process for any instance.


When you write to stdout or stderr, entries appear in the GCP Console Logs page.
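
As a small illustration of the point above, the sketch below writes one line to stdout and one to stderr from Node.js. The helper name `formatLogLine` and the message format are assumptions made for this example, not a format that App Engine requires.

```javascript
// Hypothetical helper: builds a single-line log message.
function formatLogLine(level, message) {
  return `[${level}] ${message}`;
}

// console.log writes to stdout; console.error writes to stderr.
console.log(formatLogLine('INFO', 'request handled'));
console.error(formatLogLine('ERROR', 'upstream call failed'));
```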

Communication between services

Services can communicate with other services and external applications in various ways. The simplest approach is to send HTTP requests to a service by including its name (<service-name>) in the URL. If you require secure communication between services, you can authorize requests.
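
A minimal sketch of the HTTP approach, assuming a Node.js runtime with a global `fetch` (Node.js 18 or later). The helper names `buildServiceUrl` and `callService` are hypothetical, and the target service's base URL is passed in by the caller rather than derived here. The sketch sends an unauthenticated GET request, so it does not cover authorizing requests.

```javascript
// Hypothetical helper: builds the full request URL for another service.
// `baseUrl` is the target service's URL, supplied by the caller (for
// example through configuration); it is not derived here.
function buildServiceUrl(baseUrl, path, query = {}) {
  const url = new URL(path, baseUrl);
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Sends a GET request to the other service and parses a JSON response.
// Assumes a global fetch is available (Node.js 18 or later).
async function callService(baseUrl, path, query) {
  const response = await fetch(buildServiceUrl(baseUrl, path, query));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

A caller might then use something like `await callService(billingUrl, '/invoices', { customer: 42 })`, where `billingUrl` and the `/invoices` path are placeholders for your own service.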

Services can also communicate via Cloud Pub/Sub. Pub/Sub provides reliable asynchronous many-to-many messaging between processes, including those running on App Engine. These processes can be individual instances of your application, services, or even external applications.
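
To make the many-to-many pattern concrete, here is a minimal in-memory sketch of topic-based publish/subscribe. It is a stand-in written for illustration only: it does not use the Cloud Pub/Sub client library, it delivers messages synchronously inside a single process, and it offers none of the reliability of the real service. The class and method names (`InMemoryTopic`, `subscribe`, `publish`) are hypothetical.

```javascript
// Illustrative in-memory model of a topic with several subscribers.
// This is NOT the Cloud Pub/Sub API; all names here are hypothetical.
class InMemoryTopic {
  constructor() {
    this.subscribers = new Map(); // subscriber name -> handler function
  }

  // Registers a handler that receives every message published afterwards.
  subscribe(name, handler) {
    this.subscribers.set(name, handler);
  }

  // Delivers the message to every subscriber and returns how many received it.
  publish(message) {
    for (const handler of this.subscribers.values()) {
      handler(message);
    }
    return this.subscribers.size;
  }
}

const topic = new InMemoryTopic();
const received = [];
topic.subscribe('billing', (msg) => received.push(`billing:${msg}`));
topic.subscribe('analytics', (msg) => received.push(`analytics:${msg}`));
topic.publish('order-created');
console.log(received); // [ 'billing:order-created', 'analytics:order-created' ]
```

Each subscriber gets its own copy of the message, so publishers and subscribers do not need to know about each other.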

Services and external applications can also share data stored in databases such as Cloud Datastore, Cloud SQL, or third-party databases.


There are limits to the number of services, versions, and instances (for services with manual scaling) for each application:

Description                                          Limit
Maximum services per application                     5
Maximum versions per application                     5 *
Maximum instances per version with manual scaling    20

*Backend services, such as a backend service that is used by an external HTTP load balancer, can count towards your maximum versions limit.

Not all projects have the above limits. As your use of Google Cloud Platform expands over time, your limits may increase accordingly. If you expect a notable upcoming increase in usage, you can proactively request adjustments from the App Engine Quotas page in the GCP Console.
