In a normal Cloud Composer environment, Directed Acyclic Graphs (DAGs) are processed continuously by the Airflow scheduler and web server. You can improve the reliability and performance of the Airflow web server by enabling DAG serialization, which forces the scheduler to process DAG files before they are sent to the web server.
How it works
Without DAG serialization, DAGs are processed simultaneously by the scheduler and web server, and the web server loads the entire DAG bag as soon as it starts. Enabling DAG serialization forces the scheduler to parse all DAG files before the web server starts, storing the results in a serialized DAG table. The web server then loads each DAG on demand from the table for processing. Serializing DAGs in this way reduces the web server's CPU and memory usage, especially when processing a large number of DAGs.
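The flow described above can be pictured with a small toy model. This is only a conceptual sketch, not Airflow's actual schema or code: the table layout, the function names scheduler_serialize and webserver_load, and the example DAG IDs are all hypothetical.

```python
import json
import sqlite3

# Toy model only: an in-memory table standing in for the serialized DAG table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, data TEXT)")


def scheduler_serialize(dag_id, tasks):
    """Scheduler side: parse once, then store a JSON snapshot of the DAG."""
    payload = json.dumps({"dag_id": dag_id, "tasks": tasks})
    conn.execute(
        "INSERT OR REPLACE INTO serialized_dag (dag_id, data) VALUES (?, ?)",
        (dag_id, payload),
    )


def webserver_load(dag_id):
    """Web server side: fetch one DAG on demand, without parsing DAG files."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# The scheduler stores every DAG; the web server reads only the one it needs.
scheduler_serialize("example_etl", ["extract", "transform", "load"])
scheduler_serialize("example_report", ["query", "email"])

dag = webserver_load("example_report")
print(dag["tasks"])
```

In this sketch the web server never touches the DAG files themselves. It reads a single row per request, which is the property that keeps its CPU and memory usage low when there are many DAGs.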
Prerequisites and limitations
DAG serialization can only be enabled on Cloud Composer environments that use Composer version 1.8.2 or newer and Airflow version 1.10.3 or newer. See the Cloud Composer version list for all available versions.
DAG serialization can't be enabled at the same time as asynchronous DAG loading.
Enabling DAG serialization disables all Airflow web server plugins for Cloud Composer. This doesn't affect scheduler or worker plugins, including Airflow operators, sensors, and so on.
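The version and compatibility rules above can be expressed as a simple pre-flight check. The sketch below is illustrative only: the helper names parse_version and can_enable_dag_serialization are hypothetical, and it assumes plain dotted numeric version strings such as "1.10.3".

```python
MIN_COMPOSER = (1, 8, 2)
MIN_AIRFLOW = (1, 10, 3)


def parse_version(text):
    """Turn a dotted version string like '1.10.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def can_enable_dag_serialization(composer_version, airflow_version,
                                 async_dag_loading=False):
    """Return True if the prerequisites listed above are all met."""
    if async_dag_loading:
        # Serialization and asynchronous DAG loading are mutually exclusive.
        return False
    return (
        parse_version(composer_version) >= MIN_COMPOSER
        and parse_version(airflow_version) >= MIN_AIRFLOW
    )


print(can_enable_dag_serialization("1.8.2", "1.10.3"))
print(can_enable_dag_serialization("1.8.2", "1.10.2"))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.10.3" sorts before "1.9.0", which would give the wrong answer.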
Enabling DAG serialization
To enable DAG serialization, you must specify the following configuration parameters:
[core] store_serialized_dags and [core] store_dag_code, both set to True, turn on serialization.
[core] min_serialized_dag_update_interval controls how frequently the serialized DAG is updated in the database, while [scheduler] dag_dir_list_interval controls how frequently removed DAGs are deleted from the database. We recommend setting both intervals to 30 seconds, as a high update frequency can negatively impact performance.
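As a rough illustration, the overrides can be assembled into the section-key=value form accepted by the --update-airflow-configs flag of gcloud composer environments update. In this sketch the environment name and location are hypothetical placeholders, the True values for the two store_* options are an assumption based on them being boolean Airflow options, and the command is only printed, never executed.

```python
# Overrides keyed by (section, option). The 30-second values follow the
# recommendation above; the True values are an assumption (see lead-in).
OVERRIDES = {
    ("core", "store_serialized_dags"): "True",
    ("core", "store_dag_code"): "True",
    ("core", "min_serialized_dag_update_interval"): "30",
    ("scheduler", "dag_dir_list_interval"): "30",
}


def to_gcloud_flag(overrides):
    """Format overrides as a single --update-airflow-configs flag."""
    pairs = ",".join(
        f"{section}-{option}={value}"
        for (section, option), value in overrides.items()
    )
    return f"--update-airflow-configs={pairs}"


def build_update_command(environment, location, overrides):
    """Build (but do not run) the gcloud command as an argument list."""
    return [
        "gcloud", "composer", "environments", "update", environment,
        "--location", location,
        to_gcloud_flag(overrides),
    ]


# Hypothetical environment name and location, for illustration only.
command = build_update_command("example-environment", "us-central1", OVERRIDES)
print(" ".join(command))
```

Keeping the overrides in one dictionary makes it easy to review the four values together before applying them.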
Overriding Airflow configurations
There are two ways to override Airflow configurations:
On the Environments page:
During environment creation, following the instructions in Creating environments.
After environment creation, as shown in the image below.
Disabling DAG serialization
To disable DAG serialization, use Airflow configuration overrides to set [core] store_serialized_dags and [core] store_dag_code to False.
To learn more about DAG serialization, read the relevant article in the Airflow documentation.