Create a GKE cluster and deploy a workload using Terraform


In this quickstart, you learn how to create a Google Kubernetes Engine (GKE) Autopilot cluster and deploy a workload using Terraform.

Infrastructure as Code (IaC) is the practice of managing and provisioning software infrastructure resources using code. Terraform is a popular open source IaC tool that supports a wide range of cloud services, including GKE. As a GKE platform administrator, you can use Terraform to standardize the configuration of your Kubernetes clusters and streamline your DevOps workflows. To learn more, see Terraform support for GKE.

Objectives

  • Create an IPv6 Virtual Private Cloud (VPC) network
  • Create a GKE Autopilot cluster
  • Deploy a workload on your cluster
  • Expose the workload using a Service

Before you begin

Take the following steps to enable the Kubernetes Engine API:

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the GKE API.

    Enable the API

  5. Make sure that you have the following roles on the project: roles/container.admin, roles/compute.networkAdmin, roles/iam.serviceAccountUser

    Check for the roles

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM
    2. Select the project.
    3. In the Principal column, find the row that has your email address.

      If your email address isn't in that column, then you do not have any roles.

    4. In the Role column for the row with your email address, check whether the list of roles includes the required roles.

    Grant the roles

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM
    2. Select the project.
    3. Click Grant access.
    4. In the New principals field, enter your email address.
    5. In the Select a role list, select a role.
    6. To grant additional roles, click Add another role and add each additional role.
    7. Click Save.
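
If you prefer the command line, the console steps above have gcloud equivalents. The following is a minimal sketch, not part of the original quickstart: the function names are hypothetical helpers, and PROJECT_ID and EMAIL are placeholders that you pass as arguments. The check only sees roles granted directly to your user account, so access inherited through a group or through a broader role such as Owner is reported as missing.

```shell
# Sketch: gcloud equivalents of the console steps above.
# Function names are illustrative; pass your own project ID and email.

# Enable the Kubernetes Engine API on a project.
enable_gke_api() {
  gcloud services enable container.googleapis.com --project "$1"
}

# Print the roles bound directly to a user on a project, one per line.
list_user_roles() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:user:$2" \
    --format="value(bindings.role)"
}

# Report which of the roles required by this quickstart are present.
# Returns non-zero if at least one role is missing.
check_required_roles() {
  roles=$(list_user_roles "$1" "$2") || return 1
  missing=0
  for role in roles/container.admin roles/compute.networkAdmin roles/iam.serviceAccountUser; do
    if printf '%s\n' "$roles" | grep -qxF "$role"; then
      echo "ok: $role"
    else
      echo "missing: $role"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_required_roles PROJECT_ID EMAIL` prints one line per required role. To grant a missing role from the command line, you can typically use `gcloud projects add-iam-policy-binding PROJECT_ID --member="user:EMAIL" --role=ROLE`.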

You should also be familiar with the basics of Terraform.

Prepare the environment

In this tutorial, you use Cloud Shell to manage resources hosted on Google Cloud. Cloud Shell is preinstalled with the software you need for this tutorial, including Terraform, kubectl, and the Google Cloud CLI.

  1. Launch a Cloud Shell session from the Google Cloud console by clicking Activate Cloud Shell. This launches a session in the bottom pane of the Google Cloud console.

    Cloud Shell runs on a virtual machine that already has service credentials associated with it, so you don't have to set up or download a service account key.

  2. Before you run commands, set your default project in the gcloud CLI using the following command:

    gcloud config set project PROJECT_ID
    

    Replace PROJECT_ID with your project ID.

  3. Clone the GitHub repository:

    git clone https://github.com/terraform-google-modules/terraform-docs-samples.git --single-branch
    
  4. Change to the working directory:

    cd terraform-docs-samples/gke/quickstart/autopilot
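
Cloud Shell ships with the tools used here, but if you follow along from another machine it can help to confirm they are installed first. The following is a minimal sketch; `check_tools` is a hypothetical helper name, not something provided by the repository.

```shell
# Sketch: confirm that the CLIs this tutorial relies on are on PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For this tutorial you might run `check_tools gcloud terraform kubectl git`.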
    

Review the Terraform files

The Google Cloud Provider is a plugin that lets you manage and provision Google Cloud resources using Terraform, HashiCorp's Infrastructure as Code (IaC) tool. It serves as a bridge between Terraform configurations and the Google Cloud APIs, allowing you to define infrastructure resources, such as virtual machines and networks, in a declarative manner.

  1. Review the cluster.tf file:

    cat cluster.tf
    

    The output is similar to the following:

    resource "google_compute_network" "default" {
      name = "example-network"
    
      auto_create_subnetworks  = false
      enable_ula_internal_ipv6 = true
    }
    
    resource "google_compute_subnetwork" "default" {
      name = "example-subnetwork"
    
      ip_cidr_range = "10.0.0.0/16"
      region        = "us-central1"
    
      stack_type       = "IPV4_IPV6"
      ipv6_access_type = "INTERNAL"
    
      network = google_compute_network.default.id
      secondary_ip_range {
        range_name    = "services-range"
        ip_cidr_range = "192.168.0.0/24"
      }
    
      secondary_ip_range {
        range_name    = "pod-ranges"
        ip_cidr_range = "192.168.1.0/24"
      }
    }
    
    resource "google_container_cluster" "default" {
      name = "example-autopilot-cluster"
    
      location                 = "us-central1"
      enable_autopilot         = true
      enable_l4_ilb_subsetting = true
    
      network    = google_compute_network.default.id
      subnetwork = google_compute_subnetwork.default.id
    
      ip_allocation_policy {
        stack_type                    = "IPV4_IPV6"
        services_secondary_range_name = google_compute_subnetwork.default.secondary_ip_range[0].range_name
        cluster_secondary_range_name  = google_compute_subnetwork.default.secondary_ip_range[1].range_name
      }
    
      # Setting `deletion_protection` to `true` ensures that the cluster
      # cannot be accidentally deleted by Terraform.
      deletion_protection = false
    }

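    This file describes the following resources:

    • A VPC network with internal IPv6 (ULA) enabled and no automatically created subnetworks.
    • A dual-stack (IPv4 and IPv6) subnetwork in us-central1 with secondary IP ranges for Services and Pods.
    • A dual-stack Autopilot cluster in us-central1 that uses this network and subnetwork.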

  2. Review the app.tf file:

    cat app.tf
    

    The output is similar to the following:

    data "google_client_config" "default" {}
    
    provider "kubernetes" {
      host                   = "https://${google_container_cluster.default.endpoint}"
      token                  = data.google_client_config.default.access_token
      cluster_ca_certificate = base64decode(google_container_cluster.default.master_auth[0].cluster_ca_certificate)
    
      ignore_annotations = [
        "^autopilot\\.gke\\.io\\/.*",
        "^cloud\\.google\\.com\\/.*"
      ]
    }
    
    resource "kubernetes_deployment_v1" "default" {
      metadata {
        name = "example-hello-app-deployment"
      }
    
      spec {
        selector {
          match_labels = {
            app = "hello-app"
          }
        }
    
        template {
          metadata {
            labels = {
              app = "hello-app"
            }
          }
    
          spec {
            container {
              image = "us-docker.pkg.dev/google-samples/containers/gke/hello-app:2.0"
              name  = "hello-app-container"
    
              port {
                container_port = 8080
                name           = "hello-app-svc"
              }
    
              security_context {
                allow_privilege_escalation = false
                privileged                 = false
                read_only_root_filesystem  = false
    
                capabilities {
                  add  = []
                  drop = ["NET_RAW"]
                }
              }
    
              liveness_probe {
                http_get {
                  path = "/"
                  port = "hello-app-svc"
    
                  http_header {
                    name  = "X-Custom-Header"
                    value = "Awesome"
                  }
                }
    
                initial_delay_seconds = 3
                period_seconds        = 3
              }
            }
    
            security_context {
              run_as_non_root = true
    
              seccomp_profile {
                type = "RuntimeDefault"
              }
            }
    
            # Toleration is currently required to prevent perpetual diff:
            # https://github.com/hashicorp/terraform-provider-kubernetes/pull/2380
            toleration {
              effect   = "NoSchedule"
              key      = "kubernetes.io/arch"
              operator = "Equal"
              value    = "amd64"
            }
          }
        }
      }
    }
    
    resource "kubernetes_service_v1" "default" {
      metadata {
        name = "example-hello-app-loadbalancer"
        annotations = {
          "networking.gke.io/load-balancer-type" = "Internal" # Remove to create an external load balancer
        }
      }
    
      spec {
        selector = {
          app = kubernetes_deployment_v1.default.spec[0].selector[0].match_labels.app
        }
    
        ip_family_policy = "RequireDualStack"
    
        port {
          port        = 80
          target_port = kubernetes_deployment_v1.default.spec[0].template[0].spec[0].container[0].port[0].name
        }
    
        type = "LoadBalancer"
      }
    
      depends_on = [time_sleep.wait_service_cleanup]
    }
    
    # Provide time for Service cleanup
    resource "time_sleep" "wait_service_cleanup" {
      depends_on = [google_container_cluster.default]
    
      destroy_duration = "180s"
    }

    This file describes the following resources:

    • A Deployment with a sample container image.
    • A Service of type LoadBalancer. The Service exposes the Deployment on port 80. To expose your application to the internet, configure an external load balancer by removing the networking.gke.io/load-balancer-type annotation.

Create a cluster and deploy an application

  1. In Cloud Shell, run this command to verify that Terraform is available:

    terraform
    

    The output should be similar to the following:

    Usage: terraform [global options] <subcommand> [args]
    
    The available commands for execution are listed below.
    The primary workflow commands are given first, followed by
    less common or more advanced commands.
    
    Main commands:
      init          Prepare your working directory for other commands
      validate      Check whether the configuration is valid
      plan          Show changes required by the current configuration
      apply         Create or update infrastructure
      destroy       Destroy previously-created infrastructure
    
  2. Initialize Terraform:

    terraform init
    
  3. Plan the Terraform configuration:

    terraform plan
    
  4. Apply the Terraform configuration:

    terraform apply
    

    When prompted, enter yes to confirm actions. This command might take several minutes to complete. The output is similar to the following:

    Apply complete! Resources: 6 added, 0 changed, 0 destroyed.
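
The plan and apply steps above can also be tied together with a saved plan file, so that Terraform applies exactly the changes you reviewed. The following is a minimal sketch; the function names and the default file name `tfplan` are arbitrary choices for this example. Applying a saved plan does not prompt for confirmation.

```shell
# Sketch: review a plan, then apply that exact plan.
# "tfplan" is an arbitrary default file name chosen for this example.

# Write the execution plan to a file so that it can be reviewed and reused.
save_plan() {
  terraform plan -out="${1:-tfplan}"
}

# Apply a previously saved plan file (no interactive confirmation).
apply_plan() {
  terraform apply "${1:-tfplan}"
}

# List the resource addresses that Terraform tracks after the apply.
list_managed_resources() {
  terraform state list
}
```

Run `save_plan`, read the proposed changes, and then run `apply_plan`. Afterward, `list_managed_resources` shows the addresses defined in cluster.tf and app.tf.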
    

Verify the cluster is working

Do the following to confirm your cluster is running correctly:

  1. Go to the Workloads page in the Google Cloud console:

    Go to Workloads

  2. Click the example-hello-app-deployment workload. The Pod details page displays. This page shows information about the Pod, such as annotations, containers running on the Pod, Services exposing the Pod, and metrics including CPU, Memory, and Disk usage.

  3. Go to the Services & Ingress page in the Google Cloud console:

    Go to Services & Ingress

  4. Click the example-hello-app-loadbalancer LoadBalancer Service. The Service details page displays. This page shows information about the Service, such as the Pods associated with the Service and the ports that the Service uses.

  5. In the External endpoints section, click the IPv4 link or the IPv6 link to view your Service in the browser. The output is similar to the following:

    Hello, world!
    Version: 2.0.0
    Hostname: example-hello-app-deployment-5df979c4fb-kdwgr
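
You can run a similar check from Cloud Shell instead of the console. The following is a minimal sketch that assumes the resource names and region from cluster.tf and app.tf are unchanged and that the workload is in the default namespace; `verify_workload` is a hypothetical helper name.

```shell
# Sketch: check the cluster and workload from the command line.
# Names and region are taken from cluster.tf and app.tf in this tutorial.
verify_workload() {
  # Fetch cluster credentials so that kubectl can reach the cluster.
  gcloud container clusters get-credentials example-autopilot-cluster \
    --region us-central1 || return 1

  # Wait for the Deployment to finish rolling out (up to two minutes).
  kubectl rollout status deployment/example-hello-app-deployment \
    --timeout=120s || return 1

  # Show the Service, including the load balancer addresses once assigned.
  kubectl get service example-hello-app-loadbalancer
}
```

Because the Service uses an internal load balancer by default, its addresses are reachable only from inside the VPC network.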
    

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

In Cloud Shell, run the following command to delete the Terraform resources:

    terraform destroy --auto-approve

If you see an error message similar to "The network resource 'projects/PROJECT_ID/global/networks/example-network' is already being used by 'projects/PROJECT_ID/global/firewalls/example-network-yqjlfql57iydmsuzd4ot6n5v'", do the following:

  1. Delete the firewall rules:

    gcloud compute firewall-rules list --filter="NETWORK:example-network" --format="table[no-heading](name)" | xargs gcloud --quiet compute firewall-rules delete
    
  2. Re-run the Terraform command:

    terraform destroy --auto-approve
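
After the destroy completes, you might want to confirm that nothing is left behind. The following is a minimal sketch; the function names are hypothetical, and it assumes you run it from the same working directory so that Terraform can read its state file.

```shell
# Sketch: confirm that the tutorial's resources are gone.

# Succeeds only if Terraform no longer tracks any resources.
confirm_cleanup() {
  remaining=$(terraform state list) || return 1
  if [ -n "$remaining" ]; then
    echo "Terraform still tracks:"
    echo "$remaining"
    return 1
  fi
  echo "Terraform state is empty"
}

# Prints the tutorial cluster's name if it still exists in the project.
list_leftover_clusters() {
  gcloud container clusters list \
    --filter="name=example-autopilot-cluster" \
    --format="value(name)"
}
```

If `confirm_cleanup` reports remaining resources, re-run the destroy command. Empty output from `list_leftover_clusters` suggests the cluster was deleted.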
    

What's next