Creating Environments

This page explains how to create a Cloud Composer environment and override default Airflow environment settings during the creation process.

A Cloud Composer environment runs the Apache Airflow software. When creating a new environment in a Google Cloud Platform (GCP) project, you can specify several parameters, such as the Compute Engine machine type or the number of nodes in the cluster.

Before you begin

  • Enable the Cloud Composer API.

  • The following permission is required to create Cloud Composer environments: environments.create. For more information, see Cloud Composer Access Control.

  • By default, Cloud Composer environments run as the Compute Engine default service account. During environment creation, you can specify a custom service account. At a minimum, the service account requires the permissions that the composer.worker role provides to access resources in the Cloud Composer environment. You might see some additional Google-owned service accounts in your project’s IAM policy or in GCP Console. For information about the types and roles available, see Service Accounts.

  • If your custom service account needs to access other resources in your Google Cloud Platform project during task execution, you can add the required permissions to the service account. Alternatively, you can provide the relevant credentials as an Airflow connection and then reference the connection in the operator.

  • To use Shared VPC with Cloud Composer, you must configure the subnetwork with secondary ranges named composer-pods and composer-services to support Alias IPs. For information, see Configuring Shared VPC.

    Shared VPC is in Beta and requires the gcloud beta composer environments create command.

  • Most gcloud composer commands require a location. You can specify the location by using the --location flag or by setting the default location.

  • Some Airflow configurations are preconfigured for Cloud Composer, and you cannot change them.

  • It takes up to one hour to deploy the Airflow web interface and complete the environment creation process.
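
As a rough sketch, the preparation steps above might look like the following on the command line. The project ID, location, and service account name are hypothetical placeholders, and the run helper is an illustrative addition that prints each command unless DRY_RUN=0 is set. None of these values come from this page.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own project and service account.
PROJECT_ID="my-project"
SERVICE_ACCOUNT="composer-env@my-project.iam.gserviceaccount.com"
LOCATION="us-central1"

# Print each command by default; set DRY_RUN=0 to execute it instead.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Enable the Cloud Composer API in the project.
run gcloud services enable composer.googleapis.com --project "$PROJECT_ID"

# Set a default location so that later commands can omit --location.
run gcloud config set composer/location "$LOCATION"

# Grant the custom service account the composer.worker role.
run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member "serviceAccount:$SERVICE_ACCOUNT" \
  --role roles/composer.worker
```

Reviewing the printed commands first and then rerunning with DRY_RUN=0 keeps the sketch free of side effects until you are ready.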

Creating a new environment

To create a Cloud Composer environment:

Console

  1. Open the Create Environment page in the Google Cloud Platform Console.

  2. Enter a name for your environment.

    The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and cannot end with a hyphen.

  3. Under Node configuration, specify the settings for the Google Kubernetes Engine cluster. If you do not specify a setting, the default is used.

    • Node count: The number of Google Kubernetes Engine nodes used to run the environment. The default is 3 nodes. The node count is the only Google Kubernetes Engine cluster setting that you can change after environment creation.
    • Location: (Required) The Compute Engine region where the environment is created.
    • Zone suffix: The Compute Engine zone where the virtual machine instances that run Apache Airflow are created. A random zone within the location is selected if unspecified.
    • Machine type: The Compute Engine machine type used for cluster instances. The machine type determines the number of CPUs and the amount of memory for your environment. Cloud Composer supports Compute Engine standard machine types. The default machine type is n1-standard-1.
    • Disk size: The disk size in GB used for the node VM instances. The minimum size is 20 GB. The default size is 100 GB.
    • OAuth Scopes: The set of Google API scopes made available on all node VM instances. The default is https://www.googleapis.com/auth/cloud-platform and must be included in the list of specified scopes.
    • Network ID: The Virtual Private Cloud network ID that is used for machine communications. The network ID is required to specify a subnetwork. The default network is used if unspecified. For Shared VPC, use the gcloud beta composer environments create command.
    • Subnetwork ID: The Compute Engine subnetwork ID that is used for machine communications.
    • Service account: The Google Cloud Platform service account to be used by the node VM instances. The default Compute Engine service account is used if unspecified.
    • Tags: The list of instance tags applied to all the node VM instances. Tags are used to identify valid sources or targets for network firewalls. Each tag within the list must comply with RFC 1035.
    • Python version: The Python version to use for your environment. Supported versions are Python 2 and Python 3. The default version is 2. Python 3 support is in Beta and available through the GCP Console and the v1beta1 API.

  4. (Optional) To change or override the default values in the Airflow configuration file (airflow.cfg), click Add Airflow configuration property.

  5. (Optional) To configure environment variables, click Add environment variable. See Environment Variables for requirements.

  6. (Optional) To add a label, click Add labels.

    Label keys and label values can only contain letters, numbers, dashes, and underscores. Label keys must start with a letter or number.

  7. Click Create.
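
The naming rule in step 2 can be checked before you submit the form or run a command. The following is a minimal sketch that assumes a POSIX shell with grep; the function name is_valid_env_name is a hypothetical helper, not part of any Cloud Composer tooling, and it only inspects single-line names.

```shell
#!/bin/sh
# Return 0 if the name satisfies the rule from step 2, non-zero otherwise:
# a lowercase letter, then up to 62 lowercase letters, digits, or hyphens,
# with no trailing hyphen (63 characters at most).
is_valid_env_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z]([a-z0-9-]{0,61}[a-z0-9])?$'
}

# Try a few candidate names and report the result for each.
for name in test-environment Test-Environment bad-name- 1st-env; do
  if is_valid_env_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: invalid"
  fi
done
```

LC_ALL=C is set so that the range a-z matches only ASCII lowercase letters regardless of the caller's locale.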

gcloud

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    OTHER_ARGUMENTS

The following parameters are required:

  • ENVIRONMENT_NAME is the name of the environment.
  • LOCATION is the Compute Engine region where the environment is located.

The following arguments are optional:

  • airflow-configs is a list of SECTION_NAME-PROPERTY_NAME=VALUE Airflow configuration overrides. The section name and property name must be separated by a hyphen.
  • disk-size is the disk size in GB used for the node VMs. The minimum size is 20 GB. The default disk size is 100 GB.
  • env-variables is a list of NAME=VALUE environment variables that are set on the Airflow scheduler, worker, and webserver processes.
  • labels are user-specified labels that are attached to the environment and its resources.
  • machine-type is the Compute Engine machine type used for cluster instances. The machine type determines the number of CPUs and the amount of memory for your environment. Cloud Composer supports Compute Engine standard machine types. The default machine type is n1-standard-1.
  • network is the Virtual Private Cloud network used for machine communications. The network is required to specify a subnetwork. The default network is used if unspecified. When using Shared VPC, the network's relative resource name must be provided using the format projects/HOST_PROJECT_ID/global/networks/NETWORK_ID. For Shared VPC subnetwork requirements, see subnetwork below.
  • node-count is the number of GKE nodes used to run the environment. The default node count is 3. The node count is the only Google Kubernetes Engine cluster setting that you can change after environment creation.
  • oauth-scopes is the set of Google API scopes made available on all of the node VMs. The default OAuth scope is https://www.googleapis.com/auth/cloud-platform and must be included in the list of scopes if specified.
  • subnetwork is the Compute Engine subnetwork to which the environment is connected. For Shared VPC, you must configure the subnetwork with secondary ranges named composer-pods and composer-services to support Alias IPs. The subnetwork name must also be specified as a relative resource name using the format projects/HOST_PROJECT_ID/regions/REGION_ID/subnetworks/SUBNET_ID. Shared VPC is in Beta and requires the gcloud beta composer environments create command.
  • service-account is the Google Cloud Platform service account to be used by the node VM instances. The default Compute Engine service account is used if unspecified.
  • tags is the list of instance tags applied to all the node VMs. Tags are used to identify valid sources or targets for network firewalls. Each tag within the list must comply with RFC 1035.

The following example creates an environment in the us-central1 region that uses the n1-standard-2 machine type with a beta environment label:

gcloud composer environments create test-environment \
    --location us-central1 \
    --zone us-central1-f \
    --machine-type n1-standard-2 \
    --labels env=beta  

The following Shared VPC example creates an environment in the host project. The environment is in the us-central1 region and uses the n1-standard-2 machine type with a beta environment label:

gcloud beta composer environments create host-project-environment \
    --network vpc-network-name --subnetwork vpc-subnetwork-name \
    --location us-central1 \
    --zone us-central1-f \
    --machine-type n1-standard-2 \
    --labels env=beta
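
For Shared VPC, the network and subnetwork arguments expect relative resource names in the formats given in the argument list. The sketch below shows one way to assemble them. The host project, region, and resource IDs are hypothetical placeholders, the two helper functions are illustrative, and the final command is printed for review without being run.

```shell
#!/bin/sh
# Hypothetical placeholders for the host project and its network resources.
HOST_PROJECT_ID="host-project"
REGION_ID="us-central1"
NETWORK_ID="vpc-network-name"
SUBNET_ID="vpc-subnetwork-name"

# Build projects/HOST_PROJECT_ID/global/networks/NETWORK_ID.
network_path() {
  printf 'projects/%s/global/networks/%s\n' "$1" "$2"
}

# Build projects/HOST_PROJECT_ID/regions/REGION_ID/subnetworks/SUBNET_ID.
subnetwork_path() {
  printf 'projects/%s/regions/%s/subnetworks/%s\n' "$1" "$2" "$3"
}

NETWORK="$(network_path "$HOST_PROJECT_ID" "$NETWORK_ID")"
SUBNETWORK="$(subnetwork_path "$HOST_PROJECT_ID" "$REGION_ID" "$SUBNET_ID")"

# Print the assembled command instead of running it.
cat <<EOF
gcloud beta composer environments create host-project-environment \\
    --location $REGION_ID \\
    --network $NETWORK \\
    --subnetwork $SUBNETWORK
EOF
```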

API

To create a new Cloud Composer environment with the Cloud Composer REST API, construct an environments.create API request, filling in the Environment resource with your configuration information.
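
A request of this kind might be assembled as follows. This is a sketch under stated assumptions: the field names (name, config.nodeCount, labels) and the v1 endpoint URL reflect the editor's understanding of the Environment resource and should be checked against the API reference, and the project, location, and environment values are hypothetical placeholders. The create_environment function is defined but never called, so running the block only prints the JSON body.

```shell
#!/bin/sh
# Hypothetical placeholders.
PROJECT_ID="my-project"
LOCATION="us-central1"
ENVIRONMENT_NAME="test-environment"

# Print a minimal Environment resource as JSON.
# Arguments: project ID, location, environment name.
environment_body() {
  cat <<EOF
{
  "name": "projects/$1/locations/$2/environments/$3",
  "config": {
    "nodeCount": 3
  },
  "labels": {
    "env": "beta"
  }
}
EOF
}

# Send the request with curl, using an access token from gcloud.
# Defined but not invoked here, so the sketch has no side effects.
create_environment() {
  environment_body "$PROJECT_ID" "$LOCATION" "$ENVIRONMENT_NAME" |
    curl -sS -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      --data-binary @- \
      "https://composer.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION/environments"
}

environment_body "$PROJECT_ID" "$LOCATION" "$ENVIRONMENT_NAME"
```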

Configuring email notifications

To receive notifications, configure your environment variables to send email through the SendGrid email service.

  1. If you haven't already, sign up with SendGrid via the Google Cloud Platform Console and create an API key. As a Google Cloud Platform developer, you can start with 12,000 free emails per month.

  2. In the GCP Console, open the Create Environment page.

  3. Under Node configuration, click Add environment variable.

  4. Enter the following environment variables:

    Name                  Value
    SENDGRID_MAIL_FROM    The From: email address, such as noreply-composer@.
    SENDGRID_API_KEY      Your SendGrid API key.
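
The same two variables could also be supplied from the command line through the env-variables argument. The following is a minimal sketch: the From address and API key are hypothetical placeholders, the helper function is illustrative, and the command is printed for review without being run.

```shell
#!/bin/sh
# Placeholders; supply your own From address and SendGrid API key.
MAIL_FROM="noreply-composer@example.com"
API_KEY="YOUR_SENDGRID_API_KEY"

# Join the two variables into the NAME=VALUE,NAME=VALUE form
# that the env-variables argument takes.
sendgrid_env_variables() {
  printf 'SENDGRID_MAIL_FROM=%s,SENDGRID_API_KEY=%s\n' "$1" "$2"
}

# Print the command instead of running it.
echo gcloud composer environments create test-environment \
    --location us-central1 \
    --env-variables "$(sendgrid_env_variables "$MAIL_FROM" "$API_KEY")"
```

Because the API key is a credential, avoid committing a script that contains the real value.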

Overriding Airflow configurations when creating an environment

When you create or update an environment, you can override Apache Airflow configuration properties. Some properties are blocked.

Console

  1. Open the Create Environment page.

  2. Under Airflow configuration overrides, click Add Airflow configuration override.

  3. Enter the Section, Key, and new Value for the configuration.

gcloud

To override Airflow configurations when you create an environment:

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --airflow-configs=KEY=VALUE,KEY=VALUE,...

where:

  • ENVIRONMENT_NAME is the name of the environment.
  • LOCATION is the Compute Engine region where the environment is located.
  • KEY is the configuration section and the property name separated by a hyphen, such as scheduler-print_stats_interval, and VALUE is the value to assign to that property.

For example:

gcloud composer environments create test-environment \
    --location us-central1 \
    --airflow-configs=core-load_examples=True,webserver-dag_orientation=TB

The command terminates when the operation is finished. To avoid waiting, use the --async flag. See the 'gcloud composer environments update' reference page for additional examples.
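
When there are many overrides, the comma-separated value can be generated to avoid mistakes in the hyphen and equals-sign placement. The sketch below assumes a POSIX shell; airflow_override and join_overrides are hypothetical helper names, and the final command is printed for review without being run.

```shell
#!/bin/sh
# Format one override as SECTION-PROPERTY=VALUE.
airflow_override() {
  printf '%s-%s=%s' "$1" "$2" "$3"
}

# Join any number of already formatted overrides with commas.
join_overrides() {
  out=""
  for item in "$@"; do
    out="${out:+$out,}$item"
  done
  printf '%s\n' "$out"
}

CONFIGS="$(join_overrides \
  "$(airflow_override core load_examples True)" \
  "$(airflow_override webserver dag_orientation TB)")"

# Print the command instead of running it.
echo gcloud composer environments create test-environment \
    --location us-central1 \
    --airflow-configs="$CONFIGS"
```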

API

To override Airflow properties during the creation of the Cloud Composer environment with the Cloud Composer REST API, fill in the Environment resource's optional airflowConfigOverrides field when constructing the environments.create request.
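
As an illustration, the fragment of the request body that carries the overrides might look like the output of the sketch below. The nesting under config.softwareConfig reflects the editor's understanding of the Environment resource and should be confirmed against the API reference. The keys use the same hyphen-separated SECTION-PROPERTY form as the gcloud argument, and overrides_fragment is a hypothetical helper name.

```shell
#!/bin/sh
# Print an Environment fragment whose airflowConfigOverrides map
# uses SECTION-PROPERTY keys with string values.
overrides_fragment() {
  cat <<'EOF'
{
  "config": {
    "softwareConfig": {
      "airflowConfigOverrides": {
        "core-load_examples": "True",
        "webserver-dag_orientation": "TB"
      }
    }
  }
}
EOF
}

overrides_fragment
```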
