VPC Service Controls enable organizations to define a perimeter around Google Cloud resources to mitigate data exfiltration risks.
Cloud Composer environments can be deployed within a service perimeter. By configuring your environment with VPC Service Controls, you can keep sensitive data private while taking advantage of the fully-managed workflow orchestration capabilities of Cloud Composer.
VPC Service Controls support for Cloud Composer means that:
- Cloud Composer can now be selected as a secured service inside a VPC Service Controls perimeter.
- All underlying resources used by Cloud Composer are configured to support VPC Service Controls architecture and follow its rules.
Deploying Cloud Composer environments with VPC Service Controls gives you:
- Reduced risk of data exfiltration.
- Protection against data exposure due to misconfigured access controls.
- Reduced risk of malicious users copying data to unauthorized Google Cloud resources, or external attackers accessing Google Cloud resources from the internet.
Airflow web server in VPC Service Controls mode
In VPC Service Controls mode, Cloud Composer runs two instances of the Airflow web server. Identity-Aware Proxy load balances user traffic between these instances. Airflow web servers run in "read-only" mode, which means:
- DAG serialization is enabled. As a result, the Airflow web server does not parse DAG definition files.
- Plugins are not synced to the web server, so you cannot modify or extend the web server functionality with plugins.
- The Airflow web server uses a container image that is pre-built by the Cloud Composer service. If you install PyPI packages in your environment, these packages are not installed on the web server container image.
We recommend protecting access to the Airflow web server with network ACLs. You can specify the IP ranges that can access the Airflow web server for a new or an existing environment.
Creating a service perimeter
See Creating a service perimeter to learn how to create and configure service perimeters. Make sure to select Cloud Composer as one of the services secured within the perimeter.
Creating environments in a perimeter
Deploying Cloud Composer inside a perimeter requires additional steps. When creating your Cloud Composer environment:
- Enable the Access Context Manager API and the Cloud Composer API for your project. See Enabling APIs for reference.
- Add the following services to the perimeter for maximum protection of your environment: Cloud SQL, Pub/Sub, Monitoring, Cloud Storage, GKE, Container Registry, Artifact Registry, and Compute Engine.
- Use version composer-1.10.4 or later.
- Make sure that DAG serialization is enabled. If your environment uses Cloud Composer version 1.15.0 or later, serialization is enabled by default.
- Create a new Cloud Composer environment with Private IP enabled. This setting must be configured during environment creation.
- When creating your environment, remember to configure access to the Airflow web server. For maximum protection, only allow access to the web server from specific IP ranges. For details, see the "Configure web server network access" step in Creating a new environment.
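The version, Private IP, and DAG serialization requirements above can be checked mechanically before you create an environment. The following is a minimal sketch, assuming the image version is available as a string of the form composer-X.Y.Z-airflow-A.B.C. The helper names `parse_composer_version` and `check_vpc_sc_requirements` are hypothetical and are not part of any Cloud Composer API.

```python
import re

# Thresholds taken from the requirements listed above.
MIN_VPC_SC_VERSION = (1, 10, 4)             # "composer-1.10.4 or later"
SERIALIZATION_DEFAULT_VERSION = (1, 15, 0)  # serialization is on by default from 1.15.0


def parse_composer_version(image_version):
    """Extract (major, minor, patch) from a string such as 'composer-1.15.0-airflow-1.10.14'."""
    match = re.match(r"composer-(\d+)\.(\d+)\.(\d+)", image_version)
    if match is None:
        raise ValueError(f"unrecognized image version: {image_version!r}")
    return tuple(int(part) for part in match.groups())


def check_vpc_sc_requirements(image_version, private_ip, dag_serialization=None):
    """Return a list of unmet requirements; the list is empty when all checks pass.

    dag_serialization=None means the setting was not set explicitly, so the
    version default applies.
    """
    version = parse_composer_version(image_version)
    problems = []
    if version < MIN_VPC_SC_VERSION:
        problems.append("use composer-1.10.4 or later")
    if not private_ip:
        problems.append("enable Private IP at environment creation")
    if dag_serialization is None:
        dag_serialization = version >= SERIALIZATION_DEFAULT_VERSION
    if not dag_serialization:
        problems.append("enable DAG serialization")
    return problems
```

A call such as `check_vpc_sc_requirements("composer-1.15.0-airflow-1.10.14", private_ip=True)` returns an empty list, while an older version without Private IP returns one entry per unmet requirement.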
Configuring existing environments with VPC Service Controls
You can add the project containing your environment to the perimeter if:
- You created the perimeter as described in the previous section.
- Your environments are Private IP environments.
- Your environments have DAG serialization enabled.
Installing PyPI packages
In the default VPC Service Controls configuration, Cloud Composer only supports installing PyPI packages from private repositories that are reachable from the private IP address space of the VPC network. The recommended configuration for this process is to set up a private PyPI repository, populate it with vetted packages used by your organization, then configure Cloud Composer to install Python dependencies from a private repository.
It's also possible to install PyPI packages from repositories outside the private IP space. Follow these steps:
- Configure Cloud NAT to allow Cloud Composer running in the private IP space to connect with external PyPI repositories.
- Configure your firewall rules to allow outbound connections from the Composer cluster to the repository.
When using this setup, make sure you understand the risks of using external repositories. Be sure that you trust the content and integrity of any external repositories, because these connections could potentially be used as an exfiltration vector.
Network configuration checklist
Your VPC network must be configured properly to create Cloud Composer environments inside a perimeter. Make sure to follow the configuration requirements listed below.
Firewall rules
Navigate to the VPC network -> Firewall section in the Google Cloud console, and verify that the following firewall rules are configured.
- Configure DNS service in your VPC as described in VPC Service Controls support for Cloud DNS. As an alternative, you can allow egress from GKE Node IP range to anywhere on port 53.
- Allow ingress and egress traffic from GKE Node IP range to GKE Node IP range, all ports.
- Allow ingress and egress traffic between GKE Node IP range and Pods IP range, all ports.
- Allow ingress and egress traffic between GKE Node IP range and Services IP range, all ports.
- Allow ingress and egress traffic between GKE Pods and Services IP ranges, all ports.
- Allow egress from GKE Node IP range to GKE Master IP range, all ports.
- Allow egress from GKE Node IP range to 199.36.153.4/30, port 443 (restricted.googleapis.com).
- Allow ingress from GCP Health Checks 130.211.0.0/22, 35.191.0.0/16 to the Node IP range, TCP ports 80 and 443.
- Allow egress from the Node IP range to GCP Health Checks, TCP ports 80 and 443.
- Allow egress from GKE Node IP range to Web server IP range, TCP ports 3306 and 3307.
See Using firewall rules to learn how to check, add, and update rules for your VPC network. Use Connectivity Tool to validate the connectivity between IP ranges mentioned above.
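To reason about the checklist before touching real firewall rules, it can help to express it as data. The following is a rough sketch that models each rule as an allowed (source, destination, ports) flow and tests whether a given connection is covered. It uses only the Python standard library and does not call any Google Cloud API. The function names are hypothetical, the DNS rule on port 53 is left out, and protocol matching is ignored for brevity.

```python
import ipaddress

ALL_PORTS = None                     # marker meaning "all ports"
RESTRICTED_VIP = "199.36.153.4/30"   # restricted.googleapis.com
HEALTH_CHECKS = ["130.211.0.0/22", "35.191.0.0/16"]


def required_flows(nodes, pods, services, master, web_server):
    """List the flows from the checklist as (source, destination, ports) tuples."""
    flows = [
        (nodes, nodes, ALL_PORTS),
        (nodes, pods, ALL_PORTS), (pods, nodes, ALL_PORTS),
        (nodes, services, ALL_PORTS), (services, nodes, ALL_PORTS),
        (pods, services, ALL_PORTS), (services, pods, ALL_PORTS),
        (nodes, master, ALL_PORTS),
        (nodes, RESTRICTED_VIP, {443}),
        (nodes, web_server, {3306, 3307}),
    ]
    for health_range in HEALTH_CHECKS:
        flows.append((health_range, nodes, {80, 443}))  # ingress to nodes
        flows.append((nodes, health_range, {80, 443}))  # egress from nodes
    return [
        (ipaddress.ip_network(src), ipaddress.ip_network(dst), ports)
        for src, dst, ports in flows
    ]


def is_covered(flows, source_ip, destination_ip, port):
    """Return True if some flow in the checklist permits this connection."""
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(destination_ip)
    return any(
        src in src_net and dst in dst_net and (ports is ALL_PORTS or port in ports)
        for src_net, dst_net, ports in flows
    )
```

You pass your own CIDR ranges to `required_flows`, then query individual connections with `is_covered`. This only mirrors the checklist; it does not inspect the rules actually configured in your VPC network.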
Connectivity to the restricted.googleapis.com endpoint
Configure connectivity to the restricted.googleapis.com endpoint:
- Verify the existence of a DNS mapping from *.googleapis.com to restricted.googleapis.com.
- DNS *.gcr.io should resolve to 199.36.153.4/30, similarly to the googleapis.com endpoint. To do that, create a new zone with the following records:
  CNAME *.gcr.io -> gcr.io.
  A gcr.io. -> 199.36.153.4, 199.36.153.5, 199.36.153.6, 199.36.153.7
- DNS *.pkg.dev should resolve to 199.36.153.4/30, similarly to the googleapis.com endpoint. To do that, create a new zone with the following records:
  CNAME *.pkg.dev -> pkg.dev.
  A pkg.dev. -> 199.36.153.4, 199.36.153.5, 199.36.153.6, 199.36.153.7
- DNS *.composer.cloud.google.com should resolve to 199.36.153.4/30, similarly to the googleapis.com endpoint. To do that, create a new zone with the following records:
  CNAME *.composer.cloud.google.com -> composer.cloud.google.com.
  A composer.cloud.google.com. -> 199.36.153.4, 199.36.153.5, 199.36.153.6, 199.36.153.7
For more information, see Setting up private connectivity to Google APIs and services.
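The three zones described in this section follow the same pattern: a wildcard CNAME that points to the zone apex, and an A record at the apex that lists every address in 199.36.153.4/30. As an illustration only, the sketch below derives those records with the Python standard library. The function name `restricted_vip_records` and the plain-dictionary record layout are assumptions made for this example, and nothing here creates records in Cloud DNS.

```python
import ipaddress

# The restricted.googleapis.com range named in this section.
RESTRICTED_RANGE = ipaddress.ip_network("199.36.153.4/30")
DOMAINS = ["gcr.io", "pkg.dev", "composer.cloud.google.com"]


def restricted_vip_records(domain):
    """Return the wildcard CNAME and apex A record for one domain."""
    fqdn = domain.rstrip(".") + "."  # fully qualified form with trailing dot
    # Iterating the network yields all four addresses in the /30 block.
    addresses = [str(address) for address in RESTRICTED_RANGE]
    return [
        {"name": "*." + fqdn, "type": "CNAME", "rrdatas": [fqdn]},
        {"name": fqdn, "type": "A", "rrdatas": addresses},
    ]


if __name__ == "__main__":
    for domain in DOMAINS:
        for record in restricted_vip_records(domain):
            print(record["name"], record["type"], ", ".join(record["rrdatas"]))
```

Enumerating the range with `ipaddress` instead of typing the four addresses by hand keeps the A records consistent across all three zones.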
VPC Service Controls logs
When troubleshooting environment creation issues, you can analyze audit logs generated by VPC Service Controls.
In addition to other log messages, you can check logs for information about the cloud-airflow-prod@system.gserviceaccount.com and service-PROJECT_ID@cloudcomposer-accounts.iam.gserviceaccount.com service accounts that configure components of your environments.
The Cloud Composer service uses the cloud-airflow-prod@system.gserviceaccount.com service account to manage tenant project components of your environments.
The service-PROJECT_ID@cloudcomposer-accounts.iam.gserviceaccount.com service account, also known as the Composer Service Agent Service Account, manages environment components in service and host projects.
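When searching the audit logs for these two accounts, a query filter saves some manual scanning. The sketch below only assembles a filter string. It assumes that VPC Service Controls audit entries carry the `VpcServiceControlAuditMetadata` metadata type and that the caller is recorded in `protoPayload.authenticationInfo.principalEmail`; verify both against the audit logging documentation before relying on them. The function name is hypothetical.

```python
# Assumed metadata type of VPC Service Controls audit log entries.
VPC_SC_METADATA_TYPE = (
    "type.googleapis.com/google.cloud.audit.VpcServiceControlAuditMetadata"
)


def composer_vpc_sc_log_filter(project_id):
    """Build a log filter for the two Cloud Composer service accounts.

    project_id fills the PROJECT_ID placeholder in the service agent address.
    """
    accounts = [
        "cloud-airflow-prod@system.gserviceaccount.com",
        f"service-{project_id}@cloudcomposer-accounts.iam.gserviceaccount.com",
    ]
    principals = " OR ".join(f'"{account}"' for account in accounts)
    # Separate lines in a filter are combined with an implicit AND.
    return (
        f'protoPayload.metadata.@type="{VPC_SC_METADATA_TYPE}"\n'
        f"protoPayload.authenticationInfo.principalEmail=({principals})"
    )
```

The returned string can be pasted into a log query editor, or narrowed further with a timestamp restriction.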
Limitations
- All VPC Service Controls network constraints also apply to your Cloud Composer environments. See the VPC Service Controls documentation for details.
- Displaying a rendered template with functions in the web UI with DAG serialization enabled is supported for environments running Cloud Composer version 1.12.0 or later and Airflow version 1.10.9 or later.
- Setting the async_dagbag_loader flag to True is not supported while DAG serialization is enabled.
- Enabling DAG serialization disables all Airflow web server plugins, as they could risk the security of the VPC network where Cloud Composer is deployed. This doesn't impact the behaviour of scheduler or worker plugins, including Airflow operators, sensors, etc.
- When Cloud Composer is running inside a perimeter, access to public PyPI repositories is restricted. See Installing Python dependencies to learn how to install PyPI modules in Private IP mode.