App Engine and Services
At the highest level, an App Engine application is made up of one or more services. Services let developers factor large applications into logical components. These components can share App Engine features, such as Memcache, and communicate securely. If desired, each one can be configured to use a different runtime and to operate with different performance settings.
An app that handles customer requests might include separate services to handle other tasks, such as:
- API requests from mobile devices
- Internal, admin-like requests
- Backend processing such as billing pipelines and data analysis
Versions and instances
Each service consists of source code and a configuration file. The files used by a service represent a version of the service. When you deploy a service, you always deploy a specific version of the service. Having versions for each of your services allows you to roll back with a single click in the GCP Console, or to use traffic splitting to gradually increase traffic to the newly deployed version of a service.
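The idea of traffic splitting can be sketched in a few lines of Python. This is only an illustrative model, not App Engine's implementation: the function name `pick_version`, the use of a hashed user ID, and the version names are all assumptions made for the example. It shows how a fixed fraction of users can be routed consistently to a newly deployed version.

```python
import hashlib
from typing import Mapping


def pick_version(user_id: str, split: Mapping[str, float]) -> str:
    """Map a user ID to a version name according to fractional weights.

    Hypothetical helper: `split` maps version names to fractions that
    must sum to 1, for example {"v1": 0.9, "v2": 0.1}.
    """
    if abs(sum(split.values()) - 1.0) > 1e-9:
        raise ValueError("traffic fractions must sum to 1")

    # Hash the ID to a stable point in [0, 1) so that the same user
    # always lands on the same version.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64

    # Walk the versions in a fixed order and return the first one whose
    # cumulative share covers the point.
    cumulative = 0.0
    for version in sorted(split):
        cumulative += split[version]
        if point < cumulative:
            return version
    return sorted(split)[-1]  # guard against floating-point rounding


# Send roughly 10% of users to the new version "v2".
print(pick_version("alice", {"v1": 0.9, "v2": 0.1}))
```

Raising the fraction for `v2` step by step models a gradual rollout, and setting it back to 0 models a rollback.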
Each service and each version must have a name. Choose a unique name for each service and each version. Don't reuse names between services and versions.
While running, a particular version will have one or more instances. By default, App Engine scales the number of running instances up and down to match the load. This keeps performance consistent while minimizing idle instances, which reduces cost.
The diagram below illustrates the hierarchy of a running App Engine application:
Scaling types and instance classes
When you upload a version of a service, the configuration file specifies a scaling type and instance class that apply to every instance of that version. The scaling type controls how instances are created. The instance class determines compute resources (memory size and CPU speed) and pricing. There are three scaling types: manual, basic, and automatic. The available instance classes depend on the scaling type.
- Automatic scaling: Automatic scaling is based on request rate, response latencies, and other application metrics.
- Manual scaling: A service with manual scaling runs continuously, allowing you to perform complex initialization and rely on the state of its memory over time.
- Basic scaling: A service with basic scaling will create an instance when the application receives a request. The instance will be turned down when the app becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity.
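The basic scaling lifecycle described above can be modeled with a small simulation. This is a simplified sketch under stated assumptions: it follows a single instance, ignores how long each request takes, and the function name `count_instance_startups` and the timeout value are hypothetical rather than taken from App Engine.

```python
def count_instance_startups(request_times, idle_timeout):
    """Count how many times one basic-scaling instance would be started.

    request_times: request arrival times in seconds (any order).
    idle_timeout: assumed seconds of inactivity before the instance
        is turned down (a hypothetical parameter for this sketch).
    """
    startups = 0
    last_request = None
    for t in sorted(request_times):
        # An instance is created if none is running: either this is the
        # first request, or the previous instance was turned down after
        # sitting idle for longer than the timeout.
        if last_request is None or t - last_request > idle_timeout:
            startups += 1
        last_request = t
    return startups


# Two bursts of activity separated by a long quiet period.
print(count_instance_startups([0, 10, 400, 405], idle_timeout=300))  # prints 2
```

Intermittent work produces few startups and little idle time, which is why basic scaling suits it. A manually scaled service, by contrast, would keep its instances running throughout.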
This table compares the performance features of the three scaling types:
|Feature||Automatic scaling||Manual scaling||Basic scaling|
|Deadlines||60-second deadline for HTTP requests, 10-minute deadline for task queue tasks.||Requests can run for up to 24 hours. A manually-scaled instance can choose to ...||Same as manual scaling.|
|Background threads||Not allowed||Allowed||Allowed|
|Residence||Instances are evicted from memory based on usage patterns.||Instances remain in memory, and state is preserved across requests. When instances are restarted, an ...||Instances are evicted based on the ...|
|Startup and shutdown||Instances are created on demand to handle requests and automatically turned down when idle.||Instances are sent a start request automatically by App Engine in the form of an empty GET request to ...||Instances are created on demand to handle requests and automatically turned down when idle, based on the ...|
|Instance addressability||Instances are anonymous.||Instance "i" of version "v" of service "s" is addressable at the URL: ...||Same as manual scaling.|
|Scaling||App Engine scales the number of instances automatically in response to processing volume. This scaling factors in the ...||You configure the number of instances of each version in that service's configuration file. The number of instances usually corresponds to the size of a dataset being held in memory or the desired throughput for offline work. You can adjust the number of instances of a manually-scaled version very quickly, without stopping instances that are currently running, using the Modules API ...||A service with basic scaling is configured by setting the maximum number of instances in the ...|
|Free daily usage quota||28 instance-hours||8 instance-hours||8 instance-hours|
Communication between services
Every service, version, and instance has its own unique URI, for example,
v1.my-service.my-app.appspot.com. Incoming user requests are routed to an
instance of a particular service/version according to URL addressing
conventions and an optional customized dispatch file.
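As a rough illustration of this addressing scheme, the following Python sketch builds and parses hostnames of the form shown in the example above. The helper names `version_hostname` and `parse_version_hostname` are hypothetical, and the sketch covers only the version-level pattern that appears in the example; it does not attempt instance-level addresses or custom dispatch rules.

```python
def version_hostname(version, service, app_id, domain="appspot.com"):
    """Build a hostname such as v1.my-service.my-app.appspot.com."""
    labels = [version, service, app_id]
    for label in labels:
        # Each part must be a single non-empty DNS label.
        if not label or "." in label:
            raise ValueError("invalid hostname label: %r" % (label,))
    return ".".join(labels + [domain])


def parse_version_hostname(hostname, domain="appspot.com"):
    """Split a version-level hostname into (version, service, app_id)."""
    suffix = "." + domain
    if not hostname.endswith(suffix):
        raise ValueError("hostname is not under %s" % domain)
    labels = hostname[: -len(suffix)].split(".")
    if len(labels) != 3 or not all(labels):
        raise ValueError("expected version.service.app_id before the domain")
    version, service, app_id = labels
    return version, service, app_id


host = version_hostname("v1", "my-service", "my-app")
print(host)                          # v1.my-service.my-app.appspot.com
print(parse_version_hostname(host))  # ('v1', 'my-service', 'my-app')
```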
You can also pass requests between services and from services to external endpoints using the URL Fetch API.
All the services in an application share the state of the Datastore and Memcache services. They can also collaborate by assigning work to one another through Task Queues. To access these shared services, use the corresponding App Engine APIs. Calls to these APIs are automatically mapped to the application’s namespace.
The maximum number of services and versions that you can deploy depends on your app's pricing:
|Limit||Free app||Paid app|
|Maximum services per app||5||105|
|Maximum versions per app||15||210|
There is also a limit to the number of instances for each service with basic or manual scaling:
|Maximum instances per manual/basic scaling version|
|Free app||Paid app US||Paid app EU|
|20||25 (200 for