This page explains how to connect to a Cloud SQL instance that runs the Airflow database of your Cloud Composer environment and run SQL queries.
For example, you might want to run queries directly on the Airflow database, make database backups, gather statistics based on the database content, or retrieve any other custom information from the database.
For a Cloud Composer environment, one way to do this is to connect to the Airflow database from a VM in your environment's GKE cluster.
Get the name and region of your environment's cluster
In the Google Cloud console, go to the Environments page.
Select your environment.
Go to the Environment configuration tab.
The GKE cluster item lists the name and the region of your environment's GKE cluster.
For example, if the value of this parameter is projects/example-project/locations/us-central1/clusters/us-central1-composer-exam-e-23125af4-gke, then the cluster name is us-central1-composer-exam-e-23125af4-gke and the cluster region is us-central1.
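If you prefer the command line, you can get the same value with gcloud. This is a sketch that assumes an environment named example-environment in us-central1; substitute your own environment name and location.
gcloud composer environments describe example-environment \
--location us-central1 \
--format="value(config.gkeCluster)"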
Get the database connection parameters
Airflow 2
- Access the Airflow web interface for your environment.
In the Airflow web interface, go to Admin > Configurations.
Find the sql_alchemy_conn parameter. Get the username, password, and database name from the value of this parameter.
For example, if the value of this parameter is postgresql+psycopg2://root:example-password@127.0.0.1:3306/composer-1-17-0-airflow-2-0-1-400fa094, then the username is root, the password is example-password, and the database name is composer-1-17-0-airflow-2-0-1-400fa094.
Airflow 1
Cloud Composer 2 supports only Airflow 2.
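If you want to extract these values in a script instead of reading them from the UI, a minimal shell sketch like the following can parse a connection string of this form. The connection string shown is the sample value from above, and the sed patterns assume the usual user:password@host:port/database layout.
# Sample connection string copied from the Airflow UI.
CONN="postgresql+psycopg2://root:example-password@127.0.0.1:3306/composer-1-17-0-airflow-2-0-1-400fa094"
# Extract the username, password, and database name from the URI.
USERNAME=$(echo "$CONN" | sed -E 's|^[^:]+://([^:]+):.*$|\1|')
PASSWORD=$(echo "$CONN" | sed -E 's|^[^:]+://[^:]+:([^@]+)@.*$|\1|')
DB_NAME=$(echo "$CONN" | sed -E 's|^.*/([^/]+)$|\1|')
echo "username=$USERNAME password=$PASSWORD database=$DB_NAME"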
Get the database endpoint address
Public IP
In the Google Cloud console, go to the Kubernetes Engine > Gateways, Services & Ingress page:
Select the airflow-sqlproxy-service service for your cluster. You can use the name of the cluster or the listed IP range for endpoints to look up the correct item.
On the Service details page, find the IP address of a serving pod. It is listed in the Serving pods section, in the Endpoints column.
Private IP - PSC
In the Google Cloud console, go to the Network services > Private Service Connect page:
Click the endpoint with the name of your environment. For example, it can be named us-central1-example-environ-d9cb394f-psc.
Your environment connects to the database through the IP address specified in the IP address row.
Private IP - VPC peerings
In the Google Cloud console, go to the Kubernetes Engine > Gateways, Services & Ingress page:
Select the airflow-sqlproxy-service service for your cluster. You can use the name of the cluster or the listed IP range for endpoints to look up the correct item.
On the Service details page, find the IP address of a serving pod. It is listed in the Serving pods section, in the Endpoints column.
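As an alternative to the console, in the Public IP and VPC peerings cases you can look up the airflow-sqlproxy-service endpoints with kubectl. This sketch assumes the example cluster name and region used earlier in this guide and that you have access to the cluster; the namespace of the service can vary, so the lookup searches all namespaces.
gcloud container clusters get-credentials us-central1-composer-exam-e-23125af4-gke \
--region us-central1
kubectl get endpoints --all-namespaces | grep airflow-sqlproxy-service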
Create a VM instance
Create a new VM instance with the following parameters:
Region. Select the same region as your cluster.
Zone. Select any zone in the same region as your cluster.
Boot disk. Select a Linux distribution, for example, Ubuntu 21.04 LTS.
Specify an IP address in the cluster's IP range:
- Expand the Management, security, disks, networking, sole tenancy item.
- Go to the Networking tab.
- In Network interfaces, expand the default interface item.
- Expand Show alias IP ranges.
- In Subnet range, select the gke-services IP range for your cluster. For example, gke-europe-west3-composer-exam7b206598-gke-services-115b26e7. You obtained the name of the cluster in the previous step.
- In Alias IP range, specify an IP address for the VM instance that is in the IP address range of the specified subnet range.
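The console steps above can also be approximated from the command line. The following is only a sketch: the instance name, zone, subnet, alias range name, and alias IP address are assumptions, and the aliases syntax may need adjusting for your network.
# Hypothetical example; replace the subnet, range name, and alias IP with values from your cluster's network.
gcloud compute instances create airflow-db-client \
--zone us-central1-a \
--image-family ubuntu-2204-lts \
--image-project ubuntu-os-cloud \
--network-interface "subnet=default,aliases=gke-services:10.4.0.5/32"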
Connect to the VM instance and install the SQL client package
Airflow 2
Connect to the VM instance that you created in the previous step.
Install the postgresql-client package:
sudo apt-get update
sudo apt-get install postgresql-client
Airflow 1
Cloud Composer 2 supports only Airflow 2.
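One way to open a shell on the VM from your workstation is gcloud compute ssh. The instance name and zone below are the assumed values from the VM creation sketch earlier; substitute your own.
gcloud compute ssh airflow-db-client --zone us-central1-a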
Connect to the Airflow database
While connected to the VM instance, run the following command to connect to the Airflow database:
Airflow 2
psql --user=USERNAME --password \
--host=DB_ENDPOINT \
--port=3306 \
DB_NAME
Airflow 1
Cloud Composer 2 supports only Airflow 2.
Replace the following with the values obtained in the previous steps:
- USERNAME with the username.
- DB_ENDPOINT with the IP address of the database endpoint that you obtained earlier in this guide.
- DB_NAME with the database name.
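For example, with the sample username and database name from earlier in this guide and an assumed database endpoint address of 10.0.0.5, the command looks like this (psql prompts for the password):
psql --user=root --password \
--host=10.0.0.5 \
--port=3306 \
composer-1-17-0-airflow-2-0-1-400fa094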
Run SQL queries
A prompt appears. You can run SQL queries in this prompt. For example:
SELECT * FROM dag LIMIT 10;
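You can also run a single query non-interactively with psql -c. PGPASSWORD is a standard psql environment variable; the connection values below are the sample ones used throughout this guide, with an assumed endpoint address.
PGPASSWORD=example-password psql --user=root \
--host=10.0.0.5 \
--port=3306 \
composer-1-17-0-airflow-2-0-1-400fa094 \
-c "SELECT * FROM dag LIMIT 10;"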
Dump database contents and transfer them to a bucket
To dump database contents into a file, run the following command:
Airflow 2
pg_dump --user=USERNAME --password \
--host=DB_ENDPOINT \
--port=3306 \
DB_NAME > DUMP_FILE
Airflow 1
Cloud Composer 2 supports only Airflow 2.
Replace:
- USERNAME with the username.
- DB_ENDPOINT with the IP address of the database endpoint.
- DB_NAME with the database name.
- DUMP_FILE with the name of the dump file. For example, dump.sql.
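For example, with the sample values from this guide and an assumed endpoint address of 10.0.0.5:
pg_dump --user=root --password \
--host=10.0.0.5 \
--port=3306 \
composer-1-17-0-airflow-2-0-1-400fa094 > dump.sql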
To transfer the dump file to a Cloud Storage bucket:
gcloud storage cp DUMP_FILE BUCKET_ADDRESS
Replace:
- DUMP_FILE with the name of the dump file.
- BUCKET_ADDRESS with the address of the bucket. For example, gs://europe-west3-db-dump.
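For example, with the sample file name and bucket address shown above:
gcloud storage cp dump.sql gs://europe-west3-db-dump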
As an alternative, you can transfer the dump file to your local machine with gcloud compute scp.
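For example, the following sketch copies the dump file from the VM to the current directory on your machine; the instance name and zone are the assumed values used earlier in this guide.
gcloud compute scp airflow-db-client:~/dump.sql . --zone us-central1-a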