Services, versions, and instances
At the highest level, an App Engine application is made up of one or more services, which can be configured to use different runtimes and to operate with different performance settings. Services let developers factor large applications into logical components that can share App Engine features and communicate in a secure fashion.
An app that handles customer requests might include separate services to handle other tasks, such as:
- API requests from mobile devices
- Internal, admin-like requests
- Backend processing such as billing pipelines and data analysis
Each service consists of source code and configuration files. The files used by a service represent a version of the service. When you deploy a service, you always deploy a specific version of the service. Having versions for each of your services allows you to roll back with a single click in the GCP Console, or to use traffic splitting to gradually increase traffic to the newly deployed version of a service.
Each service and each version must have a name. A name can contain numbers, letters, and hyphens. It cannot be longer than 63 characters and cannot start or end with a hyphen. Choose a unique name for each service and each version. Don't reuse names between services and versions.
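The naming rules above can be checked mechanically. The following is a minimal sketch, not an official validator: the function names `is_valid_name` and `find_reused_names` are hypothetical, and because the text does not say whether uppercase letters are allowed, the pattern accepts both cases as an assumption.

```python
import re

# Letters, digits, and hyphens only; must not start or end with a hyphen.
# Assumption: both upper- and lowercase ASCII letters are accepted.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
MAX_NAME_LENGTH = 63


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the length and character rules."""
    return len(name) <= MAX_NAME_LENGTH and _NAME_RE.fullmatch(name) is not None


def find_reused_names(service_names, version_names):
    """Return names used both as a service name and as a version name."""
    return sorted(set(service_names) & set(version_names))
```

For example, `is_valid_name("my-service")` passes, while `is_valid_name("-bad")` fails because of the leading hyphen.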
While running, a particular version will have one or more instances. By default, App Engine scales the number of running instances up and down to match the load, providing consistent performance for your app while minimizing idle instances and reducing cost.
Features such as Datastore and Task Queues are shared by all the services in an application. Every service, version, and instance has its own unique URI, for example, v1.my-service.my-app.appspot.com. Incoming user requests are routed to an instance of a particular service/version according to URL addressing conventions and an optional customized dispatch file.
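To make the addressing convention concrete, here is a small sketch that assembles a hostname in the version.service.app form shown in the example URI. The helper name `build_hostname` and its parameters are hypothetical. Leaving out the version or the service simply drops that label, which is an assumption about how shorter addresses are formed.

```python
def build_hostname(app_id, service=None, version=None, domain="appspot.com"):
    """Join the optional version and service labels with the app ID.

    Labels are ordered version, service, app ID, matching the example
    v1.my-service.my-app.appspot.com.
    """
    labels = [label for label in (version, service, app_id) if label]
    return ".".join(labels + [domain])
```

Calling `build_hostname("my-app", service="my-service", version="v1")` reproduces the example hostname from the text.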
While an application is running, incoming requests are routed to an existing or new instance of the appropriate service/version. The scaling type of a service/version controls how instances are created. Scaling settings are configured in the app.yaml file. There are two scaling types:
- A service with manual scaling continuously runs exactly the same number of instances regardless of the load level. This supports tasks such as complex initializations and applications that rely on the state of memory over time.
- A service with automatic scaling creates and removes instances based on request rate, response latencies, and other application metrics.
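The difference between the two scaling types can be illustrated with a toy model. This is not App Engine's scaling algorithm: it considers request rate only, and `capacity_per_instance`, `min_instances`, and `max_instances` are hypothetical parameters chosen for the illustration.

```python
import math


def manual_instance_count(configured_instances, requests_per_second):
    """Manual scaling: the instance count is fixed and ignores the load."""
    return configured_instances


def automatic_instance_count(requests_per_second, capacity_per_instance,
                             min_instances=1, max_instances=20):
    """Toy automatic scaling: size the fleet to the request rate.

    A real autoscaler would also weigh response latencies and other
    metrics; this sketch uses request rate alone.
    """
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Clamp the result to the configured bounds.
    return max(min_instances, min(max_instances, needed))
```

In this model the manual count stays the same whether the service is idle or saturated, while the automatic count tracks the load within its bounds.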
Monitoring resource usage
The Instances page of the GCP Console provides visibility into how instances are performing. You can see the memory and CPU usage of each instance, uptime, number of requests, and other statistics. You can also manually initiate the shutdown process for any instance.
Communication between services
Services can communicate with other services and external applications in various ways. The simplest approach is to send HTTP requests to a service by including its name in the URL: <service-name>.<app-id>.appspot.com. If you require secure communication between services, you can authorize requests.
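As a rough sketch of the simplest approach, the following builds a request addressed to another service by name, using only the Python standard library. The helper names, the `https` scheme, and the bearer-token `Authorization` header are assumptions made for illustration; the text does not specify how requests are authorized.

```python
import urllib.request


def build_service_request(service_name, app_id, path="/", token=None):
    """Build a request for <service-name>.<app-id>.appspot.com.

    Assumption: HTTPS is used, and an optional bearer token is sent
    as one possible way to authorize the request.
    """
    url = f"https://{service_name}.{app_id}.appspot.com{path}"
    headers = {}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def call_service(request, timeout=10.0):
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Separating request construction from sending keeps the URL logic easy to inspect without making a network call.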
There are limits to the number of services, versions, and instances (for services with manual scaling) for each application:
|Limit|Value|
|Maximum services per application|5|
|Maximum versions per application|5 *|
|Maximum instances per version with manual scaling|20|
*Backend services, such as a backend service that is used by an external HTTP load balancer, can count towards your maximum versions limit.
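A deployment script could compare planned usage against these limits before deploying. The sketch below is hypothetical: the dictionary keys and the function `limit_violations` are illustrative names, and the default values are simply those from the table, so a project with different limits should pass its own.

```python
# Default values copied from the table above; individual projects may differ.
DEFAULT_LIMITS = {
    "services": 5,
    "versions": 5,
    "manual_scaling_instances_per_version": 20,
}


def limit_violations(usage, limits=DEFAULT_LIMITS):
    """Return {name: (used, limit)} for every limit that usage exceeds."""
    return {
        name: (usage[name], limit)
        for name, limit in limits.items()
        if usage.get(name, 0) > limit
    }
```

An empty result means the planned usage fits within the limits that were supplied.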
Not all projects have the above limits. As your use of Google Cloud Platform expands over time, your limits may increase accordingly. If you expect a notable upcoming increase in usage, you can proactively request adjustments from the App Engine Quotas page in the GCP Console.