Deploying a self-managed Airflow web server

This page describes how to deploy the Airflow web server to a Cloud Composer environment's Kubernetes cluster. Use this guide if you:

  • Require control over where the Airflow web server is deployed.
  • Have a DAG that must be imported from a consistent set of IP addresses, such as for authentication with on-premises systems.

Before you begin

  • Be familiar with how to deploy a workload to Google Kubernetes Engine.
  • Install the Cloud SDK.
  • Create a Cloud Composer environment.
  • The self-managed web server runs on the environment's worker nodes. Their machine type has fewer vCPUs than the default web server, and the nodes are shared with the Airflow workers. Depending on load, you might need to increase the number of worker nodes.
  • Cloud Composer garbage collection can remove older images. As a best practice, each time you install a package or upgrade versions, synchronize the web server image path and tag with the image that the scheduler and workers are running. To do so, retrieve the image name from the scheduler pod configuration and use that value to update your self-managed web server.
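
Once the web server has been deployed as described later on this page, the image synchronization from the last point could be scripted roughly as follows. This is a minimal sketch, not an official procedure. The function name sync_webserver_image and its arguments are hypothetical, the deployment is assumed to be named airflow-webserver as in this guide, and the container names must be taken from your own pod configuration.

```shell
# Hypothetical helper: copy the image of one container in the scheduler pod
# to one container of the airflow-webserver deployment.
# Arguments: NAMESPACE SCHEDULER_POD SOURCE_CONTAINER TARGET_CONTAINER
sync_webserver_image() {
  sync_ns="$1"; sync_pod="$2"; sync_src="$3"; sync_dst="$4"

  # Read the image of the named container from the scheduler pod.
  sync_image="$(kubectl get pod -n "$sync_ns" "$sync_pod" \
      -o jsonpath="{.spec.containers[?(@.name==\"$sync_src\")].image}")" || return 1

  if [ -z "$sync_image" ]; then
    echo "no image found for container $sync_src" >&2
    return 1
  fi

  # Point the web server container at the same image.
  kubectl set image -n "$sync_ns" deployment/airflow-webserver \
      "$sync_dst=$sync_image"
}

# Example call (all four values are placeholders):
#   sync_webserver_image NAMESPACE airflow-scheduler-1a2b3c-x0yz \
#       SOURCE_CONTAINER TARGET_CONTAINER
```

Passing the container names explicitly avoids overwriting the image of any other container that happens to run in the same pod.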

Determine the Cloud Composer environment's GKE cluster

Use the gcloud composer environments describe command to show the properties of a Cloud Composer environment, including the GKE cluster.

The cluster is listed as the gkeCluster property. Also note the zone where the cluster is deployed, for example us-central1-b. The zone is the last part of the location property (config > nodeConfig > location).

gcloud composer environments describe ENVIRONMENT_NAME \
    --location LOCATION 


  • ENVIRONMENT_NAME is the name of the environment.
  • LOCATION is the Compute Engine region where the environment is located.

The rest of this document refers to the cluster as ${GKE_CLUSTER} and the zone as ${GKE_LOCATION}.
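
If you prefer to set these two variables from a script, the following sketch shows one possible way. It assumes that the gkeCluster and location properties are slash-separated resource paths whose last component is the cluster name and the zone respectively, and that they can be read with the format expressions config.gkeCluster and config.nodeConfig.location. The function names are hypothetical.

```shell
# Print the final component of a slash-separated resource path.
last_path_component() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: set GKE_CLUSTER and GKE_LOCATION for an environment.
# Arguments: ENVIRONMENT_NAME LOCATION
# Assumes gcloud is installed and authenticated.
lookup_gke_cluster() {
  lookup_cluster_path="$(gcloud composer environments describe "$1" \
      --location "$2" --format='value(config.gkeCluster)')" || return 1
  lookup_zone_path="$(gcloud composer environments describe "$1" \
      --location "$2" --format='value(config.nodeConfig.location)')" || return 1

  GKE_CLUSTER="$(last_path_component "$lookup_cluster_path")"
  GKE_LOCATION="$(last_path_component "$lookup_zone_path")"
}

# Example call (placeholders as in the command above):
#   lookup_gke_cluster ENVIRONMENT_NAME LOCATION
#   echo "${GKE_CLUSTER} ${GKE_LOCATION}"
```

Check the printed values against the describe output before you use them in later commands.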

Connect to the GKE cluster

Use gcloud to fetch the cluster credentials so that the kubectl command can connect to the cluster.

gcloud container clusters get-credentials ${GKE_CLUSTER} --zone ${GKE_LOCATION}

Get the pod configuration for the scheduler

The Airflow web server uses the same Docker image as the Airflow scheduler, so get the configuration of the scheduler pod to use as a starting point.

kubectl get pods --all-namespaces

Look for a pod with a name like airflow-scheduler-1a2b3c-x0yz. Get the configuration for the scheduler pod and write it to airflow-webserver.yaml.

kubectl get pod -n NAMESPACE airflow-scheduler-1a2b3c-x0yz -o yaml > airflow-webserver.yaml

where NAMESPACE is the namespace in which the scheduler pod runs, such as composer-1-7-2-airflow-1-9-0-4d5e6f.
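
Instead of reading the pod list by eye, you could filter it. The sketch below assumes the default column layout of kubectl get pods --all-namespaces, where the first column is the namespace and the second is the pod name. The function name find_scheduler_pod is hypothetical.

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# "NAMESPACE POD_NAME" for the first pod whose name starts with
# airflow-scheduler-. Prints nothing if no such pod is listed.
find_scheduler_pod() {
  awk '$2 ~ /^airflow-scheduler-/ { print $1, $2; exit }'
}

# Example call:
#   kubectl get pods --all-namespaces | find_scheduler_pod
```

The two printed values can then be substituted for NAMESPACE and the pod name in the kubectl get pod command above.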

Create the web server deployment configuration

Modify airflow-webserver.yaml in a plain text editor to create a web server deployment configuration.

  1. Replace the apiVersion, kind, and metadata sections with the following deployment configuration. Do not delete the original spec section. You use it in a later step.

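    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: airflow-webserver
      labels:
        run: airflow-webserver
    spec:
      replicas: 1
      selector:
        matchLabels:
          run: airflow-webserver
      strategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 1
        type: RollingUpdate
      template:
        metadata:
          labels:
            run: airflow-webserver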
  2. Replace airflow-scheduler with airflow-webserver in labels and names. Note: the web server container image does not change. The same image is used for workers, the scheduler, and the web server.

  3. Delete the status section and all sections that are nested inside it.

  4. Indent the original spec section so that spec is a key for the template section.

  5. Replace - scheduler with - webserver in the - args: section.

  6. Replace the livenessProbe section with one that polls the health endpoint.

            livenessProbe:
              exec:
                command:
                - curl
                - localhost:8080/_ah/health
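
The two text substitutions in steps 2 and 5 can be sketched with sed. This only covers those two steps; the structural changes in steps 1, 3, 4, and 6 still need a manual edit. The function name is hypothetical, and the filter skips any line containing image: so that the container image stays unchanged, as step 2 requires.

```shell
# Read a pod configuration on stdin and write it to stdout with
# airflow-scheduler renamed to airflow-webserver (except on image: lines)
# and the "- scheduler" argument replaced by "- webserver".
rename_scheduler_to_webserver() {
  sed -e '/image:/!s/airflow-scheduler/airflow-webserver/g' \
      -e 's/^\([[:space:]]*\)- scheduler[[:space:]]*$/\1- webserver/'
}

# Example call (writes to a new file so the original stays untouched):
#   rename_scheduler_to_webserver < airflow-webserver.yaml > airflow-webserver.new.yaml
```

Review the result with a diff before deploying it, because the pattern matches plain text and does not understand YAML structure.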

Create the web server service configuration

Create a service configuration file called airflow-webserver-service.yaml.

apiVersion: v1
kind: Service
metadata:
  name: airflow-webserver-service
  labels:
    run: airflow-webserver
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    run: airflow-webserver
  sessionAffinity: None
  type: ClusterIP

Deploy the web server

  1. Deploy the web server pod.

    kubectl create -n NAMESPACE -f airflow-webserver.yaml
  2. Deploy the web server service.

    kubectl create -n NAMESPACE -f airflow-webserver-service.yaml
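
After both objects are created, you might want to wait until the deployment has rolled out before you connect. A minimal sketch, assuming the deployment name airflow-webserver from this guide; the function name and the default timeout of 300s are arbitrary choices.

```shell
# Hypothetical helper: block until the airflow-webserver deployment has
# finished rolling out, or until the timeout expires.
# Arguments: NAMESPACE [TIMEOUT]   (TIMEOUT defaults to 300s)
wait_for_webserver() {
  kubectl rollout status -n "$1" deployment/airflow-webserver \
      --timeout="${2:-300s}"
}

# Example call:
#   wait_for_webserver NAMESPACE 120s
```

The command exits with a non-zero status if the rollout does not complete in time, which makes it usable as a gate in a script.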

Connect to the web server

Because the service is of type ClusterIP, the web server is not accessible from outside the Kubernetes cluster without a proxy or port forwarding.

  1. Find the web server pod.

    kubectl get pods --all-namespaces
  2. Forward the web server port to your local machine.

    kubectl -n NAMESPACE port-forward airflow-webserver-1a2b3cd-0x9yz 8080:8080
  3. Open the Airflow web server in your web browser at http://localhost:8080/admin/.
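
Steps 1 and 2 can be combined by selecting the pod through its label rather than its generated name. The sketch below assumes the pod carries the label run=airflow-webserver from the deployment configuration in this guide; the function name is hypothetical.

```shell
# Hypothetical helper: find the first pod labelled run=airflow-webserver in
# the given namespace and forward local port 8080 to it.
# Arguments: NAMESPACE
forward_webserver() {
  fw_ns="$1"

  # Look up the generated pod name by label.
  fw_pod="$(kubectl get pods -n "$fw_ns" -l run=airflow-webserver \
      -o jsonpath='{.items[0].metadata.name}')" || return 1

  if [ -z "$fw_pod" ]; then
    echo "no web server pod found in namespace $fw_ns" >&2
    return 1
  fi

  # Runs in the foreground until interrupted.
  kubectl -n "$fw_ns" port-forward "$fw_pod" 8080:8080
}

# Example call:
#   forward_webserver NAMESPACE
```

Selecting by label keeps the command working after the pod is recreated with a new name suffix.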

What's next