This page describes how to use Airflow connections.
Airflow connections enable you to access resources in Google Cloud projects from a Cloud Composer environment. You create Airflow connection IDs to store information, such as logins and hostnames, and your workflows reference the connection IDs. Airflow connections are the recommended way to store secrets and credentials used in workflows.
Airflow connections enable you to store the connection information that is required for a Cloud Composer environment to communicate with other APIs, such as Google Cloud projects, other cloud providers, or third-party services.
An Airflow connection can store details such as credentials, hostnames, or additional API parameters. Each connection has an associated ID that you can use in workflow tasks to reference the preset details. We recommend that you use Airflow connections to store secrets and credentials for workflow tasks.
The Google Cloud connection type enables Google Cloud integrations.
Fernet key and secured connections
When you create a new environment, Cloud Composer generates a unique, permanent fernet key for the environment and secures connection extras by default. You can view the fernet_key in the Airflow Configuration. For information about how connections are secured, see Securing Connections.
Using the default connections
By default, Cloud Composer configures the following Airflow connections for Google Cloud Platform:
bigquery_default
google_cloud_default
google_cloud_datastore_default
google_cloud_storage_default
You can use these connections from your DAGs by using the default connection ID. The following example uses the BigQueryOperator with the default connection.
Airflow 2
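The original code sample is not reproduced here; the following is a minimal Airflow 2 sketch of a query task that relies on the default connection. It uses BigQueryInsertJobOperator from the Google provider package, a current replacement for BigQueryOperator; the DAG ID, task ID, and query are illustrative.

import datetime
from airflow import models
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with models.DAG(
    "example_default_connection",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    task_default = BigQueryInsertJobOperator(
        task_id="query_with_default_connection",
        configuration={"query": {"query": "SELECT 1", "useLegacySql": False}},
        # gcp_conn_id is not set, so the google_cloud_default connection is used.
    )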
Airflow 1
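For Airflow 1, a comparable sketch uses the contrib BigQueryOperator, which falls back to the bigquery_default connection when no connection ID is given; again, the DAG ID, task ID, and query are illustrative.

import datetime
from airflow import models
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

with models.DAG(
    "example_default_connection",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    task_default = BigQueryOperator(
        task_id="query_with_default_connection",
        sql="SELECT 1",
        use_legacy_sql=False,
        # No connection ID is set, so the bigquery_default connection is used.
    )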
You can also specify the connection ID explicitly when you create the operator.
Airflow 2
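The original sample is not shown here; this Airflow 2 sketch passes one of the default connection IDs listed above explicitly through the gcp_conn_id argument. The DAG ID, task ID, and query are illustrative.

import datetime
from airflow import models
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with models.DAG(
    "example_explicit_connection",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    task_explicit = BigQueryInsertJobOperator(
        task_id="query_with_explicit_connection",
        gcp_conn_id="google_cloud_default",  # connection ID passed explicitly
        configuration={"query": {"query": "SELECT 1", "useLegacySql": False}},
    )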
Airflow 1
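In Airflow 1, the contrib BigQueryOperator accepts the connection ID through the bigquery_conn_id argument; this sketch names the bigquery_default connection explicitly, with illustrative DAG ID, task ID, and query.

import datetime
from airflow import models
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

with models.DAG(
    "example_explicit_connection",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    task_explicit = BigQueryOperator(
        task_id="query_with_explicit_connection",
        bigquery_conn_id="bigquery_default",  # connection ID passed explicitly
        sql="SELECT 1",
        use_legacy_sql=False,
    )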
Accessing resources in another project
The recommended way to allow your Cloud Composer environment to access resources in Google Cloud projects is by using the default connections and by assigning the appropriate Identity and Access Management permissions to the service account associated with your environment.
The following sections provide examples of how to allow reads and writes to Cloud Storage buckets in your-storage-project for a Cloud Composer environment deployed in the project your-composer-project.
Determining the service account associated with your environment
Console
- In the Google Cloud console, open the Environments page.
- In the Name column, click the name of the environment to open its Environment details page.
- Note the Service account. This value is an email address, such as service-account-name@your-composer-project.iam.gserviceaccount.com.
gcloud
Enter the following command and replace the VARIABLES with appropriate values:
gcloud composer environments describe ENVIRONMENT_NAME \
    --location LOCATION \
    --format="get(config.nodeConfig.serviceAccount)"
The output shows an address, such as service-account-name@your-composer-project.iam.gserviceaccount.com.
Granting the appropriate IAM permissions to the service account
To allow reads and writes to Cloud Storage buckets in your-storage-project, grant the roles/storage.objectAdmin role to the service account associated with your Cloud Composer environment.
Console
In the Google Cloud console, open the IAM & Admin page for your storage project.
Click Add members.
In the Add members dialog, specify the full email address of the service account associated with your Cloud Composer environment.
In the Select a role drop-down list, select the appropriate role. For this example, select the Storage > Object Admin role.
Click Add.
gcloud
Use the gcloud projects add-iam-policy-binding command to add project-level IAM permissions. Replace the VARIABLES with appropriate values:
gcloud projects add-iam-policy-binding YOUR_STORAGE_PROJECT \
    --member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
    --role=roles/storage.objectAdmin
After the appropriate permissions are granted, you can access resources in the your-storage-project project with the same default Airflow connections that you use to access resources in the your-composer-project project.
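For example, once the role is granted, a task in your environment can write to a bucket owned by your-storage-project through the default connection. The following is a minimal sketch assuming Airflow 2 and the Google provider package; the DAG ID, bucket names, and object paths are hypothetical.

import datetime
from airflow import models
from airflow.providers.google.cloud.transfers.gcs_to_gcs import GCSToGCSOperator

with models.DAG(
    "example_cross_project_access",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    copy_to_other_project = GCSToGCSOperator(
        task_id="copy_to_other_project",
        source_bucket="bucket-in-your-composer-project",      # hypothetical
        source_object="data/report.csv",                      # hypothetical
        destination_bucket="bucket-in-your-storage-project",  # hypothetical
        destination_object="data/report.csv",
        # gcp_conn_id is not set, so google_cloud_default is used; access is
        # governed by the IAM role granted to the environment's service account.
    )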
Creating new Airflow connections
Before you begin
We recommend that you grant the appropriate IAM permissions to the service account associated with your Cloud Composer environment and use the default connections in your DAG definitions. Follow the steps in this section only if you are unable to do so.
Creating a connection to another project
The following steps provide examples of how to allow reads and writes to Cloud Storage buckets in your-storage-project for a Cloud Composer environment deployed in the project your-composer-project.
Create a service account in your-storage-project:
- In the Google Cloud console, go to the Service Accounts page.
- Select your project.
- Click Create Service Account.
- In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.
- Optional: In the Service account description field, enter a description for the service account.
- Click Create and continue.
- Click the Select a role field and select a role you want to grant to the service account, such as Storage > Object Admin.
- Click Done to finish creating the service account.
Do not close your browser window. You will use it in the next step.
Download a JSON key for the service account you just created:
- In the Google Cloud console, click the email address for the service account that you created.
- Click Keys.
- Click Add key, then click Create new key.
- Click Create. A JSON key file is downloaded to your computer. Make sure to store the key file securely, because it can be used to authenticate as your service account. You can move and rename this file however you would like.
- Click Close.
Create a new connection:
Airflow UI
Access the Airflow web interface for your Cloud Composer environment.
In the Airflow web interface, open the Admin > Connections page.
To open the new connection form, click the Create tab.
Create a new connection:
- To choose a connection ID, fill out the Conn Id field, such as my_gcp_connection. Use this ID in your DAG definition files.
- In the Conn Type field, select the Google Cloud Platform option.
- Enter a value for the Project Id that corresponds to the project that your service account belongs to.
Do one of the following:
- Copy the service account JSON key file that you downloaded into the data/ directory of your environment's Cloud Storage bucket. Then, in Keyfile Path, enter the local file path on the Airflow worker to the JSON keyfile's location, such as /home/airflow/gcs/data/keyfile.json.
- In Keyfile JSON, copy the contents of the service account JSON key file that you downloaded. Users with access to Airflow connections through the CLI or Web UI can read credentials stored in keyfile_dict. To secure these credentials, we recommend that you use Keyfile Path and use a Cloud Storage ACL to restrict access to the key file.
- Enter a value in the Scopes field. It is recommended to use https://www.googleapis.com/auth/cloud-platform as the scope and to use IAM permissions on the service account to limit access to Google Cloud resources.
- To create the connection, click Save.
gcloud
Enter the following command:
Airflow 1
gcloud composer environments run \
    ENVIRONMENT_NAME \
    --location LOCATION \
    connections -- --add \
    --conn_id=CONNECTION_ID \
    --conn_type=google_cloud_platform \
    --conn_extra '{"extra__google_cloud_platform__CMD_ARGS": "...", "extra__google_cloud_platform__CMD_ARGS": "...", ...}'
Airflow 2
gcloud composer environments run \
    ENVIRONMENT_NAME \
    --location LOCATION \
    connections add -- \
    CONNECTION_ID \
    --conn-type=google_cloud_platform \
    --conn-extra '{"extra__google_cloud_platform__CMD_ARGS": "...", "extra__google_cloud_platform__CMD_ARGS": "...", ...}'
where:
- ENVIRONMENT_NAME is the name of the environment.
- LOCATION is the region where the environment is located.
- CONNECTION_ID is an identifier for the connection. Use lower-case characters, and separate words with underscores.
- CMD_ARGS are the following:
  - project is a project ID. Only extra__google_cloud_platform__project is required.
  - key_path is a local file path on the Airflow worker to a JSON keyfile, such as /home/airflow/gcs/data/keyfile.json. If provided, also requires scope. Use key_path or keyfile_dict, not both.
  - keyfile_dict is a JSON object that specifies the contents of the JSON keyfile that you downloaded. If provided, also requires scope. Use keyfile_dict or key_path, not both. Users with access to Airflow connections through the CLI or Web UI can read credentials stored in keyfile_dict. To secure these credentials, we recommend that you use key_path and apply a Cloud Storage ACL to restrict access to the key file.
  - scope is a comma-separated list of OAuth scopes.
For example:
Airflow 1
gcloud composer environments run test-environment \
    --location us-central1 connections -- --add \
    --conn_id=my_gcp_connection --conn_type=google_cloud_platform \
    --conn_extra '{"extra__google_cloud_platform__project": "your-storage-project", "extra__google_cloud_platform__key_path": "/home/airflow/gcs/data/keyfile.json", "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform"}'
Airflow 2
gcloud composer environments run test-environment \
    --location us-central1 connections add -- \
    my_gcp_connection \
    --conn-type=google_cloud_platform \
    --conn-extra '{"extra__google_cloud_platform__project": "your-storage-project", "extra__google_cloud_platform__key_path": "/home/airflow/gcs/data/keyfile.json", "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform"}'
Using a new Airflow connection
To use the connection you created, set it as the corresponding connection ID argument when you construct a Google Cloud Airflow operator.
Airflow 2
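The original sample is not reproduced here; this Airflow 2 sketch passes the my_gcp_connection ID created above through gcp_conn_id, so the task authenticates as the service account stored in that connection. The DAG ID, task ID, and query are illustrative.

import datetime
from airflow import models
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with models.DAG(
    "example_new_connection",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    task_other_project = BigQueryInsertJobOperator(
        task_id="query_in_storage_project",
        gcp_conn_id="my_gcp_connection",  # the connection created above
        configuration={"query": {"query": "SELECT 1", "useLegacySql": False}},
    )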
Airflow 1
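For Airflow 1, a comparable sketch sets the same connection ID on the contrib BigQueryOperator; the DAG ID, task ID, and query are illustrative.

import datetime
from airflow import models
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

with models.DAG(
    "example_new_connection",  # hypothetical DAG ID
    start_date=datetime.datetime(2024, 1, 1),
    schedule_interval=None,
) as dag:
    task_other_project = BigQueryOperator(
        task_id="query_in_storage_project",
        bigquery_conn_id="my_gcp_connection",  # the connection created above
        sql="SELECT 1",
        use_legacy_sql=False,
    )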