A Python App Engine application is configured by a file named app.yaml, which specifies the runtime, CPU, memory, network, and disk resources, automatic or manual scaling, and other general settings. The following is an example of an app.yaml file for a Python application:

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app
runtime_config:
  python_version: 3
The syntax of app.yaml is the YAML format. YAML supports comments: a line that begins with a pound (#) character is ignored:

# This is a comment.
URL and file path patterns use POSIX extended regular expression syntax, excluding collating elements and collation classes. Back-references to grouped matches (e.g. \1) are supported, as are these Perl extensions: \w \W \s \S \d \D.
The app.yaml file can include the following general settings; note that some of them are required:

runtime
Required. The name of the App Engine language runtime used by this application. To specify Python, use python.
env
Required. Selects a full implementation of Python 2.7 or 3.x. Use flex to select the flexible environment.
service
Required if creating a service. Optional for the default service. Each service and each version must have a name. A name can contain numbers, letters, and hyphens. It cannot be longer than 63 characters and cannot start or end with a hyphen. Choose a unique name for each service and each version. Don't reuse names between services and versions.
Note: Services were previously called "modules."
skip_files
Optional. Specifies which files in your application directory are not uploaded. For example, to skip files whose names end in .bak:

skip_files:
- ^(.*/)?\.bak$
You can use the runtime_config section to select a specific version of Python:

runtime_config:
  python_version: <version number>
The valid values for python_version are:

- 3, which uses the latest supported Python 3.x release, currently 3.5.2.
- 3.4, which uses Python 3.4.2.
- 3.5, which uses Python 3.5.2.
- 2, which uses Python 2.7.9.
You don't need to include the
runtime_config section if you are using
Python 2, which is the default.
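For example, a complete minimal configuration that pins the runtime to Python 3.5 might look like this (the version number is illustrative; any of the values listed above works the same way):

```yaml
runtime: python
env: flex
runtime_config:
  python_version: 3.5
```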
The flexible runtime environment
There are separate sections in the configuration file for specifying network settings, compute resources, and health checking behavior.
You can specify network settings in your app.yaml configuration file, for example:

network:
  instance_tag: TAG_NAME
  name: NETWORK_NAME
  subnetwork_name: SUBNETWORK_NAME
  forwarded_ports:
    - PORT
    - HOST_PORT:CONTAINER_PORT
    - PORT/tcp
    - HOST_PORT:CONTAINER_PORT/udp
You can use the following options when configuring network settings:

instance_tag
Optional. A tag with that name is assigned to each instance of the service when it is created. Tags can be useful in scripts and firewall rules that target a group of instances.

name
Every VM instance in the flexible environment is assigned to a Google Compute Engine network when it is created. Use this setting to specify a network name. Give the short name, not the resource path (for example, default rather than projects/PROJECT_ID/global/networks/default).

subnetwork_name
Optional. You can segment your network and use a custom subnetwork. Ensure that the network name is also specified.

forwarded_ports
Optional. You can forward ports from your instance (HOST_PORT) to the Docker container (CONTAINER_PORT). Ports can be forwarded for tcp, udp, or both.
Advanced network configuration
You can segment your Compute Engine network into subnetworks. This allows you to enable VPN scenarios, such as accessing databases within your corporate network.
To enable subnetworks for your App Engine application:
1. Add the network name and subnetwork name to your app.yaml file, as specified above.
2. To establish a VPN, create a gateway and a tunnel for a custom subnet network.
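A minimal network section for step 1 might look like the following (the network and subnetwork names are placeholders for resources you have already created in your project):

```yaml
network:
  name: my-vpn-network        # short network name, not the full resource path
  subnetwork_name: my-subnet  # custom subnetwork within that network
```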
Port forwarding allows for direct connections to the Docker container on your instances. This traffic can travel over any protocol. Port forwarding is intended to help with situations where you might need to attach a debugger or profiler.
By default, incoming traffic from outside your network is not allowed through the Google Cloud Platform firewall.
After you have specified port forwarding in your
app.yaml file, you must add a
firewall rule that allows traffic from the ports you want opened.
For example, if you want to forward TCP traffic from port 2222:

1. Include the port in your app's entrypoint:

   entrypoint: gunicorn -b :$PORT -b :2222 main:app

2. In the network settings of your app.yaml file, include:

   network:
     forwarded_ports:
       - 2222/tcp
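As noted above, you also need a firewall rule that opens the forwarded port. With the gcloud CLI, that might look like the following sketch (the rule name and target tag are placeholders; adjust them for your project):

```shell
# Allow inbound TCP traffic on port 2222 to instances tagged TAG_NAME.
gcloud compute firewall-rules create allow-port-2222 \
    --allow=tcp:2222 \
    --target-tags=TAG_NAME
```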
These settings control the computing resources. App Engine assigns a machine type based on the amount of CPU and memory you've specified. The machine is guaranteed to have at least the level of resources you've specified; it might have more.

You can specify up to eight tmpfs volumes in the resource settings. Workloads that require shared memory can use tmpfs, which can also improve file system I/O.
resources:
  cpu: 2
  memory_gb: 2.3
  disk_size_gb: 10
  volumes:
  - name: ramdisk1
    volume_type: tmpfs
    size_gb: 0.5
You can use the following options when configuring resource settings:

cpu
The number of cores; it must be 1 or an even number between 2 and 32. Default: 1 core.

memory_gb
RAM in GB. The requested memory for your application, which does not include the ~0.4 GB of memory that is required for the overhead of some processes. Each CPU core requires a total memory between 0.9 and 6.5 GB. To calculate the requested memory:

memory_gb = cpu * [0.9 - 6.5] - 0.4

For the example above, where you have specified 2 cores, you can request between 1.4 and 12.6 GB. The total amount of memory available to the application is set by the runtime environment as the environment variable GAE_MEMORY_MB.

disk_size_gb
Size in GB. The minimum is 10 GB and the maximum is 10240 GB. Default: 10 GB.
name
Required if using volumes. Name of the volume. Names must be unique and between 1 and 63 characters. Characters can be lowercase letters, numbers, or dashes. The first character must be a letter, and the last character cannot be a dash. The volume is mounted in the app container as /mnt/NAME.

volume_type
Required if using volumes. Must be tmpfs.

size_gb
Required if using volumes. Size of the volume, in GB. The minimum is 0.001 GB and the maximum is the amount of memory available in the application container and on the underlying device. Google does not add additional RAM to your system to satisfy the disk requirements. RAM allocated for tmpfs volumes is subtracted from the memory available to the app container. The precision is system dependent.
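The memory formula above can be sanity-checked with a few lines of Python (a sketch; the 0.9–6.5 GB per-core range and 0.4 GB overhead are the values quoted in this document):

```python
def memory_bounds_gb(cpu):
    """Return the (min, max) memory_gb you can request for a given core
    count, per the formula memory_gb = cpu * [0.9 - 6.5] - 0.4."""
    return cpu * 0.9 - 0.4, cpu * 6.5 - 0.4

# For the 2-core example, the requestable range is 1.4 to 12.6 GB.
low, high = memory_bounds_gb(2)
print(round(low, 1), round(high, 1))  # prints: 1.4 12.6
```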
Periodic health check requests are used to confirm that a VM instance has been successfully deployed, and to check that a running instance maintains a healthy status. Each health check must be answered within a specified time interval. An instance is unhealthy when it fails to respond to a specified number of consecutive health check requests. An unhealthy instance will not receive any client requests, but health checks will still be sent. If an unhealthy instance continues to fail to respond to a predetermined number of consecutive health checks, it will be restarted.
Health check requests are enabled by default, with default threshold values. You can customize VM health checking by adding an optional health check section to your configuration file:
health_check:
  enable_health_check: True
  check_interval_sec: 5
  timeout_sec: 4
  unhealthy_threshold: 2
  healthy_threshold: 2
You can use the following options with health checks:

enable_health_check
Enable/disable health checks. Health checks are enabled by default. To disable health checking, set to False. Default: True.

check_interval_sec
Time interval between checks. Default: 1 second.

timeout_sec
Health check timeout interval. Default: 1 second.

unhealthy_threshold
An instance is unhealthy after failing this number of consecutive checks. Default: 1 check.

healthy_threshold
An unhealthy instance becomes healthy again after successfully responding to this number of consecutive checks.

restart_threshold
When the number of failed consecutive health checks exceeds this threshold, the instance is restarted. Default: 300 checks.
Service scaling settings
The keys used to control scaling of a service depend on the type of scaling you assign to a service:
If you do not specify any scaling, then automatic scaling is selected by default.
You can set automatic scaling in the
app.yaml file. For example:
service: my-service
runtime: python
env: flex
automatic_scaling:
  min_num_instances: 5
  max_num_instances: 20
  cool_down_period_sec: 120  # default value
  cpu_utilization:
    target_utilization: 0.5
When you use automatic scaling you must specify the minimum and maximum number of instances. The other settings are optional.
automatic_scaling
Automatic scaling is assumed by default. Include this line if you are going to specify any of the automatic scaling settings.

max_num_instances
Default is 20. Specifies the maximum number of instances that each version of your app can scale up to. The maximum number of instances in your project is limited by your project's resource quota.

cool_down_period_sec
The number of seconds that the autoscaler should wait before it starts collecting information from a new instance. This prevents the autoscaler from collecting information while the instance is initializing, during which the collected usage would not be reliable. The cool-down period must be greater than or equal to 60 seconds. The default is 120 seconds.

cpu_utilization
This header is required if you are going to specify the target CPU utilization.

target_utilization
Target CPU utilization (default 0.5). CPU use is averaged across all running instances and is used to decide when to reduce or increase the number of instances.
Manual scaling
You can set manual scaling in the app.yaml file. For example:

service: my-service
runtime: python
env: flex
manual_scaling:
  instances: 5
The following table lists the settings you can use with manual scaling:
manual_scaling
Required to enable manual scaling for a service.

instances
The number of instances to assign to the service at the start. This number can later be altered by using the Modules API.
Defining environment variables
You can define environment variables in app.yaml to make them available to the app:

env_variables:
  MY_VAR: 'my value'

It is then possible to get these values in your code using os.environ.
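For example, reading the MY_VAR value defined above from Python (a sketch; the fallback default is only for runs outside App Engine, where the variable may be unset):

```python
import os

# MY_VAR is populated from app.yaml's env_variables section when deployed;
# the second argument to get() is a fallback for local runs.
my_var = os.environ.get('MY_VAR', 'my value')
print(my_var)
```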