This page provides an overview of the features and capabilities of Cloud Composer.
Cloud Composer is a managed Apache Airflow service that helps you create, schedule, monitor, and manage workflows.
A Cloud Composer environment is a wrapper around Apache Airflow. Cloud Composer creates the following components for each environment:
- Web server: The web server runs the Apache Airflow web interface, and Cloud Identity-Aware Proxy protects the interface. For more information, see Airflow Web Interface.
- Database: The database holds the Apache Airflow metadata.
- Cloud Storage bucket: Cloud Composer associates a Cloud Storage bucket with the environment. The associated bucket stores the DAGs, logs, custom plugins, and data for the environment. For more information about the storage bucket for Cloud Composer, see Cloud Storage.
To access and manage your Airflow environments, you can use the following Airflow-native tools:
- Web interface: You can access the Airflow web interface from the Google Cloud Platform Console or by direct URL with the appropriate permissions. For information, see Airflow Web Interface.
- Command-line tools: After you install the Cloud SDK, you can run gcloud beta composer environments commands to issue Airflow CLI commands to Cloud Composer environments. For information, see Airflow Command-line Interface.
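For example, with the Cloud SDK installed, commands might look like the following sketch. The environment name `example-environment` and the location `us-central1` are placeholders; substitute your own values.

```shell
# List the Cloud Composer environments in the given location.
gcloud beta composer environments list --locations us-central1

# Issue an Airflow CLI subcommand (here, list_dags) against an environment.
gcloud beta composer environments run example-environment \
    --location us-central1 list_dags
```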
In addition to native tools, the Cloud Composer REST or RPC APIs provide programmatic access to your Airflow environments. For more information, see APIs & References.
In general, the configuration that Cloud Composer provides for Apache Airflow is the same as that of a locally hosted Airflow deployment. Some Airflow configuration properties are preconfigured in Cloud Composer, and you cannot change them. You specify other configuration properties when you create or update your environment. For more information, see Airflow Configurations.
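As a sketch, overriding a changeable Airflow configuration property on an existing environment might look like the following. The environment name, location, and property shown are placeholder assumptions; check the Airflow Configurations reference for which properties can be overridden.

```shell
# Override an Airflow configuration property on an existing environment.
# Overrides use the form SECTION-PROPERTY=VALUE.
gcloud beta composer environments update example-environment \
    --location us-central1 \
    --update-airflow-configs=core-dags_are_paused_at_creation=True
```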
Airflow DAGs (workflows)
An Apache Airflow DAG is a workflow: a collection of tasks and the dependencies between them. Cloud Composer uses Cloud Storage to store DAGs. To add or remove DAGs from your Cloud Composer environment, you add or remove them from the Cloud Storage bucket associated with the environment. After you move DAGs to the storage bucket, they are automatically added to and scheduled in your environment.
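For illustration, a minimal DAG file might look like the following sketch. The DAG name, task IDs, and schedule are placeholders, and the example assumes the Airflow 1.x API that Cloud Composer environments provide.

```python
# example_dag.py - upload to the dags/ folder of the environment's bucket.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'airflow',
    'start_date': datetime(2018, 1, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

with DAG('example_dag',
         default_args=default_args,
         schedule_interval=timedelta(days=1)) as dag:
    extract = BashOperator(task_id='extract', bash_command='echo extract')
    load = BashOperator(task_id='load', bash_command='echo load')

    # The >> operator declares a task dependency: load runs after extract.
    extract >> load
```

Copying this file into the dags/ folder of the environment's storage bucket is enough for the Airflow scheduler to pick it up.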
In addition to scheduling DAGs, you can trigger DAGs manually or in response to events, such as changes that occur in the associated Cloud Storage bucket. For more information, see Triggering DAGs.
You can install custom plugins, such as custom, in-house Apache Airflow operators, hooks, sensors, or interfaces, into your Cloud Composer environment. For more information, see Cloud Composer Plugins.
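As a sketch, a custom plugin module placed in the plugins/ folder of the environment's bucket might register an in-house operator like this. All names here are hypothetical, and the example assumes the Airflow 1.x plugin API.

```python
# my_plugin.py - upload to the plugins/ folder of the environment's bucket.
from airflow.models import BaseOperator
from airflow.plugins_manager import AirflowPlugin


class EchoOperator(BaseOperator):
    """Hypothetical in-house operator that logs a fixed message."""

    def execute(self, context):
        self.log.info('EchoOperator ran')


class MyCompanyPlugin(AirflowPlugin):
    """Registers the in-house operator with Airflow."""
    name = 'my_company_plugin'
    operators = [EchoOperator]
```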
You can install Python dependencies from the Python Package Index in your environment, or if the dependencies are not in the package index, you can use the plugins feature. For more information, see Installing Python Dependencies.
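For example, installing a package from the Python Package Index into an existing environment might look like the following sketch. The environment name, location, and package version are placeholder assumptions.

```shell
# Install a PyPI package into the environment's Python interpreter.
gcloud beta composer environments update example-environment \
    --location us-central1 \
    --update-pypi-package=scipy==1.1.0
```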
You manage security at the GCP project level and can assign Cloud Identity and Access Management (IAM) roles that prevent individual users from modifying or creating environments. If someone does not have access to your project or does not have an appropriate Cloud Composer IAM role, that person cannot access any of your environments. For more information, see Cloud Composer Access Control.
Logging and monitoring
You can view Airflow logs that are associated with single DAG tasks in the following locations:
- The Airflow web interface
- The logs folder in the associated Cloud Storage bucket
Cloud Composer also provides streaming logs. You can access the streaming logs in the Logs Viewer in the Google Cloud Platform Console and by using Stackdriver. For information about using Stackdriver, see Monitoring Cloud Composer Environments.
Cloud Composer also provides audit logs, such as Admin Activity audit logs, for your GCP projects. For information, see Viewing Audit Logs.