Provisioning Anthos clusters with Terraform

This tutorial walks you through provisioning Anthos clusters and Anthos components using the Google Cloud modules for Terraform. Throughout this tutorial, you use Terraform modules to progressively install and configure Anthos components.

This tutorial is for developers and operators who want to automate infrastructure provisioning for Anthos. The document assumes that you are familiar with Terraform and Google Cloud.

The following diagram shows the components used in this tutorial:

Architectural components used to provision Anthos clusters using Terraform.

In the preceding diagram, there is a platform with two Google Kubernetes Engine (GKE) clusters that are registered within an Anthos environ. The clusters sync with Git repositories and apply Kubernetes resources that are contained in those repositories using Anthos Config Management. Anthos Service Mesh is installed in multiple clusters that are attached to a production environ.

Objectives

  • Review Anthos-related Terraform modules.
  • Configure new Anthos components by writing Terraform scripts.
  • Review cluster organization best practices.
  • Provision platform resources.

Costs

This tutorial uses billable components of Google Cloud.

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Cleaning up.

Before you begin

  1. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  3. In the Cloud Console, activate Cloud Shell.

    Activate Cloud Shell

Preparing your environment

  1. In Cloud Shell, create a working directory called anthos-terraform and change into it:

    mkdir -p anthos-terraform/bin
    cd anthos-terraform
    export WORK_DIR=$PWD
    export PATH=$PATH:$WORK_DIR/bin
    

    All remaining commands in this tutorial are run within the anthos-terraform directory.

  2. Set your local environment and replace YOUR_PROJECT_ID with the name of the project that you're using:

    export PROJECT_ID=YOUR_PROJECT_ID
    gcloud config set core/project $PROJECT_ID
    
  3. Download and install terraform, kpt, and kustomize:

    TERRAFORM_VERSION=0.14.5
    wget -q https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip 
    unzip terraform_${TERRAFORM_VERSION}_linux_amd64.zip
    chmod +x terraform 
    sudo mv -f terraform /usr/local/bin 
    rm -rf terraform_${TERRAFORM_VERSION}_linux_amd64.zip
    
    curl -o kpt "https://storage.googleapis.com/kpt-dev/latest/linux_amd64/kpt"
    chmod +x kpt
    mv ./kpt $WORK_DIR/bin
    
    curl -o install_kustomize.sh "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" 
    chmod +x install_kustomize.sh && ./install_kustomize.sh
    mv ./kustomize $WORK_DIR/bin
    

Creating base Terraform resources

Terraform uses providers and provider resources to provision infrastructure. Providers are the translation layer between a vendor's API implementation and Terraform resources. A Google Cloud provider lets you run scripts against Google Cloud services. The Google Cloud provider resources cover the breadth of Google Cloud products and services, letting teams implement most use cases that they might come across.

In addition to the provider resources, Google creates and maintains a set of best practice modules that are designed to reduce the implementation time for platform teams. These modules provide patterns and implementations for common use cases. These best practices are implemented as Terraform modules, which are reusable implementations similar to programming functions. The Terraform modules for Google Cloud are available both on the Terraform site and on GitHub.

This tutorial uses Anthos-related modules for Terraform with a local state configuration and various defaults. This tutorial doesn't depict a full production deployment. We recommend that you customize your deployment to meet your architecture needs.

In the following steps, you use GKE modules to provision clusters and enable Anthos on those clusters.

To keep your Terraform scripts flexible and extensible, you can use variables to separate environment-specific values, such as the locations where the GKE clusters reside, from the scripts themselves. First, you define the variables in the variables.tf file. Then, you provide values for those variables.

  1. In Cloud Shell, create a variables.tf file and include the following variable definitions:

    cat << EOF > variables.tf
    variable "project_id" {
      description = "The project ID to host the cluster in"
    }
    
    variable "primary_region" {
      description = "The primary region to be used"
    }
    variable "primary_zones" {
      description = "The primary zones to be used"
    }
    
    variable "secondary_region" {
      description = "The secondary region to be used"
    }
    variable "secondary_zones" {
      description = "The secondary zones to be used"
    }
    EOF
    
  2. Create the terraform.tfvars file and assign the following variables:

    cat << EOF > terraform.tfvars
    primary_region   = "us-central1"
    primary_zones    = ["us-central1-a"]
    secondary_region = "us-west1"
    secondary_zones  = ["us-west1-b"]
    EOF
    

    For this tutorial, the primary region and zone are us-central1 and us-central1-a. The secondary region and zone are us-west1 and us-west1-b. For more information, see Geography and regions.

  3. Create the main.tf file and include the following details about the Google Cloud provider:

    cat << EOF > main.tf
    provider "google" {
      project = var.project_id
      region  = var.primary_region
    }
    
    data "google_client_config" "current" {}
    
    data "google_project" "project" {
      project_id = var.project_id
    }
    
    output "project" {
      value = data.google_client_config.current.project
    }
    EOF
    

    This file defines the provider and creates data and output that can serve as input for other module values.

  4. To enable the Google Cloud APIs, create an apis.tf file with the following definitions:

    cat << EOF > apis.tf
    module "project-services" {
      source  = "terraform-google-modules/project-factory/google//modules/project_services"
    
      project_id  = data.google_client_config.current.project
      disable_services_on_destroy = false
      activate_apis = [
        "compute.googleapis.com",
        "iam.googleapis.com",
        "container.googleapis.com",
        "cloudresourcemanager.googleapis.com",
        "anthos.googleapis.com",
        "cloudtrace.googleapis.com",
        "meshca.googleapis.com",
        "meshtelemetry.googleapis.com",
        "meshconfig.googleapis.com",
        "iamcredentials.googleapis.com",
        "gkeconnect.googleapis.com",
        "gkehub.googleapis.com",
        "monitoring.googleapis.com",
        "logging.googleapis.com"
    
      ]
    }
    EOF
    

    Various modules and submodules are available for provisioning the different services within Google Cloud. This file uses the project_services submodule, which is located under its parent, the project-factory module. The project-factory module is part of the Terraform modules for Google Cloud.

    The following APIs are enabled: Compute Engine, Identity and Access Management, GKE, Resource Manager, Anthos, Cloud Trace, Anthos Service Mesh, Connect, Cloud Monitoring, and Cloud Logging.

  5. Download all required providers:

    terraform init
    
  6. Test and review your configuration:

    terraform plan -var project_id=${PROJECT_ID}
    

    The output provides feedback about your configuration without applying any changes to your environment. Review the changes to check for any errors. The output is similar to the following:

    Plan: 14 to add, 0 to change, 0 to destroy.
    ------------------------------------------------------------------------
    Note: You didn't specify an "-out" parameter to save this plan, so Terraform
    can't guarantee that exactly these actions will be performed if
    "terraform apply" is subsequently run.
    
  7. Apply the configuration:

    terraform apply -var project_id=${PROJECT_ID}
    

    The apis.tf file enabled your APIs, but no additional resources were defined or created.
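The data sources defined earlier in main.tf can also be surfaced as outputs for use outside Terraform. As an illustrative sketch (this output is an addition for demonstration, not part of the tutorial's files), you could expose the project number, which the cluster definitions in the next section use to build the Anthos Service Mesh mesh_id label:

```hcl
# Illustrative addition: expose the project number retrieved by the
# google_project data source. The mesh_id label that the cluster
# modules set later has the form "proj-<PROJECT_NUMBER>".
output "project_number" {
  value = data.google_project.project.number
}
```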

Creating your clusters

In this section, you provision the Anthos architecture by defining two clusters: a primary and a secondary cluster.

The Google Cloud provider includes base resources to create your infrastructure. However, the Terraform modules for Google Cloud provide best practice implementations and reusable modules to help you build architectures. Instead of using the GKE resource directly from the Google Cloud provider to build your clusters, this section shows you how to build your clusters from a module provided in the terraform-google-kubernetes-engine repository.

  1. To define your two clusters, in Cloud Shell create a clusters.tf file with the following information:

    cat << EOF > clusters.tf
    
    # Primary Cluster
    module "primary-cluster" {
      name                    = "primary"
      project_id              = module.project-services.project_id
      source                  = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
      version                 = "13.0.0"
      regional                = false
      region                  = var.primary_region
      network                 = "default"
      subnetwork              = "default"
      ip_range_pods           = ""
      ip_range_services       = ""
      zones                   = var.primary_zones
      release_channel         = "REGULAR"
      cluster_resource_labels = { "mesh_id" : "proj-\${data.google_project.project.number}" }
      node_pools = [
        {
          name         = "default-node-pool"
          autoscaling  = false
          auto_upgrade = true
    
          node_count   = 5
          machine_type = "e2-standard-4"
        },
      ]
    
    }
    
    # Secondary Cluster
    module "secondary-cluster" {
      name                    = "secondary"
      project_id              = module.project-services.project_id
      source                  = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
      version                 = "13.0.0"
      regional                = false
      region                  = var.secondary_region
      network                 = "default"
      subnetwork              = "default"
      ip_range_pods           = ""
      ip_range_services       = ""
      zones                   = var.secondary_zones
      release_channel         = "REGULAR"
      cluster_resource_labels = { "mesh_id" : "proj-\${data.google_project.project.number}" }
    
      node_pools = [
        {
          name         = "default-node-pool"
          autoscaling  = false
          auto_upgrade = true
    
          node_count   = 5
          machine_type = "e2-standard-4"
        },
      ]
    
    }
    EOF
    

    This example uses many of the defaults for a demonstration cluster that aren't appropriate for a production environment. To build a production-ready cluster, you can use the more advanced best practice implementations.

    This example also uses the kubernetes-engine module. The module requires various inputs such as networks, IP addresses, and locations. The value for project_id is provided by the project-services module, which was implemented earlier to ensure that the APIs are enabled before you build the cluster. The definition also uses the release_channel input to ensure that the cluster is always upgraded through the Regular release channel.

    The configuration for node_pools specifies five nodes of the e2-standard-4 machine type. The cluster_resource_labels input sets a label on the cluster to integrate with Anthos Service Mesh. This label indicates which service mesh the clusters belong to. Because there is only one mesh per project, the label is the same on both clusters.

  2. Provision the clusters:

    terraform init
    terraform plan -var project_id=${PROJECT_ID}
    terraform apply -var project_id=${PROJECT_ID}
    
  3. To review the results of the provisioning, go to the Kubernetes Clusters page.

    Go to Clusters page
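The node_pools definition in clusters.tf pins each pool to a fixed node_count. If you want the pools to scale with load instead, the kubernetes-engine module's node pool maps also accept autoscaling inputs. The following is a hedged sketch of one possible variant; verify the input names against the module version that you pin:

```hcl
# Hypothetical autoscaling variant of the default node pool.
# When autoscaling is enabled, min_count and max_count bound the pool
# size instead of the fixed node_count used in this tutorial.
node_pools = [
  {
    name         = "default-node-pool"
    autoscaling  = true
    auto_upgrade = true

    min_count    = 3
    max_count    = 10
    machine_type = "e2-standard-4"
  },
]
```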

Registering with Anthos environ

Environs are a Google Cloud concept for logically organizing clusters and other resources. They let you use and manage multi-cluster capabilities and apply consistent policies across your systems. Environs form a crucial part of how enterprise multi-cluster functionality works in Anthos.

To register your clusters with the Anthos environ, a central construct for managing features and clusters, you can use the hub submodule.

  1. To register your clusters with the environ, create a hub.tf file and include the following definitions:

    cat << EOF > hub.tf
    
    module "hub-primary" {
      source           = "terraform-google-modules/kubernetes-engine/google//modules/hub"
    
      project_id       = data.google_client_config.current.project
      cluster_name     = module.primary-cluster.name
      location         = module.primary-cluster.location
      cluster_endpoint = module.primary-cluster.endpoint
      gke_hub_membership_name = "primary"
      gke_hub_sa_name = "primary"
    }
    
    module "hub-secondary" {
      source           = "terraform-google-modules/kubernetes-engine/google//modules/hub"
    
      project_id       = data.google_client_config.current.project
      cluster_name     = module.secondary-cluster.name
      location         = module.secondary-cluster.location
      cluster_endpoint = module.secondary-cluster.endpoint
      gke_hub_membership_name = "secondary"
      gke_hub_sa_name = "secondary"
    }
    EOF
    
  2. Apply the configuration:

    terraform init
    terraform plan -var project_id=${PROJECT_ID}
    terraform apply -var project_id=${PROJECT_ID}
    

Enabling Anthos Service Mesh

Anthos Service Mesh provides traffic management, security, and observability for microservices within GKE. To enable Anthos Service Mesh on a cluster, you can use the asm submodule. This module installs and enables Anthos Service Mesh on your clusters.

  1. To configure Anthos Service Mesh, in Cloud Shell create an asm.tf file and include the following definitions:

    cat << EOF > asm.tf
    
    module "asm-primary" {
      source           = "terraform-google-modules/kubernetes-engine/google//modules/asm"
      version          = "13.0.0"
      project_id       = data.google_client_config.current.project
      cluster_name     = module.primary-cluster.name
      location         = module.primary-cluster.location
      cluster_endpoint = module.primary-cluster.endpoint
    
      asm_dir          = "asm-dir-\${module.primary-cluster.name}"
    
    }
    
    module "asm-secondary" {
      source           = "terraform-google-modules/kubernetes-engine/google//modules/asm"
      version          = "13.0.0"
      project_id       = data.google_client_config.current.project
      cluster_name     = module.secondary-cluster.name
      location         = module.secondary-cluster.location
      cluster_endpoint = module.secondary-cluster.endpoint
    
      asm_dir = "asm-dir-\${module.secondary-cluster.name}"
    
    }
    
    EOF
    

    For cluster-related variables, such as cluster_name, you reference the project from the Google Cloud provider, and you reference the cluster variables from the cluster modules.

  2. To install Anthos Service Mesh on your clusters, apply the configuration:

    terraform init
    terraform plan -var project_id=${PROJECT_ID}
    terraform apply -var project_id=${PROJECT_ID}
    
  3. To view the resources that Anthos Service Mesh deployed, in the Console, go to the Workloads page.

    Go to Workloads page
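The cluster attributes referenced in asm.tf, such as name and location, can also be exposed as Terraform outputs so that wrapper scripts can fetch credentials for the meshed clusters. These output names are illustrative additions, not part of the tutorial's files:

```hcl
# Illustrative additions: surface cluster coordinates so that a script
# can run, for example,
# gcloud container clusters get-credentials <name> --zone <location>.
output "primary_cluster_name" {
  value = module.primary-cluster.name
}

output "primary_cluster_location" {
  value = module.primary-cluster.location
}
```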

Enabling configuration management

In this section, you enable Anthos Config Management on the clusters. Anthos Config Management can sync assets from a Git repository and ensure that the assets are applied to multiple clusters.

You enable Anthos Config Management on your clusters and configure it to sync with a public sample repository. In a production environment, you can sync to your own private repository.

  1. In Cloud Shell, download the config-management-operator.yaml configuration file:

    gsutil cp gs://config-management-release/released/latest/config-management-operator.yaml config-management-operator.yaml
    
  2. Add the Anthos Config Management variables to the existing variables.tf file:

    cat << EOF >> variables.tf
    
    variable "acm_repo_location" {
      description = "The location of the git repo ACM will sync to"
    }
    variable "acm_branch" {
      description = "The git branch ACM will sync to"
    }
    variable "acm_dir" {
      description = "The directory in git ACM will sync to"
    }
    EOF
    

    To synchronize with a Git repository, Anthos Config Management needs to know the repository location, the branch, and the directory to sync from.

  3. To configure the values for the repository, add the Anthos Config Management variables to the existing terraform.tfvars file:

    cat << EOF >> terraform.tfvars
    
    acm_repo_location   = "https://github.com/GoogleCloudPlatform/csp-config-management/"
    acm_branch          = "1.0.0"
    acm_dir             = "foo-corp"
    EOF
    
  4. Create an acm.tf file with the following definition for the submodule from the terraform-google-kubernetes-engine repository:

    cat << EOF > acm.tf
    
    module "acm-primary" {
      source           = "github.com/terraform-google-modules/terraform-google-kubernetes-engine//modules/acm"
    
      project_id       = data.google_client_config.current.project
      cluster_name     = module.primary-cluster.name
      location         = module.primary-cluster.location
      cluster_endpoint = module.primary-cluster.endpoint
    
      operator_path    = "config-management-operator.yaml"
      sync_repo        = var.acm_repo_location
      sync_branch      = var.acm_branch
      policy_dir       = var.acm_dir
    }
    
    module "acm-secondary" {
      source           = "github.com/terraform-google-modules/terraform-google-kubernetes-engine//modules/acm"
    
      project_id       = data.google_client_config.current.project
      cluster_name     = module.secondary-cluster.name
      location         = module.secondary-cluster.location
      cluster_endpoint = module.secondary-cluster.endpoint
    
      operator_path    = "config-management-operator.yaml"
      sync_repo        = var.acm_repo_location
      sync_branch      = var.acm_branch
      policy_dir       = var.acm_dir
    }
    EOF
    

    The acm submodule requires a few key inputs. The first section of the submodule identifies the GKE cluster on which you want to enable Anthos Config Management. In this case, you pass in the cluster_name, location, and cluster_endpoint values from the cluster modules.

  5. Apply the Anthos Config Management configuration:

    terraform init
    terraform plan -var project_id=${PROJECT_ID}
    terraform apply -var project_id=${PROJECT_ID}
    

    When Terraform finishes provisioning Anthos Config Management, the clusters sync to the repository. This process takes a few minutes.

  6. To see the updates that Anthos Config Management is making, you can watch for the new namespaces that it creates:

    1. Get credentials for the primary cluster:

      gcloud container clusters get-credentials primary --zone us-central1-a
      
    2. Watch for the namespaces that are created and managed by Anthos Config Management:

      watch kubectl get ns -l app.kubernetes.io/managed-by=configmanagement.gke.io
      

      The output is similar to the following:

      NAME               STATUS   AGE
      audit              Active   18m
      shipping-dev       Active   18m
      shipping-prod      Active   18m
      shipping-staging   Active   18m
      

    To exit this watch process, press Control+C.
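The acm modules above sync to a public sample repository over HTTPS. To sync from a private repository instead, as suggested at the start of this section, the acm submodule also accepts a secret_type input. The following hedged sketch shows one possible shape; the repository URL, branch, and directory are placeholders, and you should verify the inputs against the module version that you use:

```hcl
# Hypothetical private-repository variant of the acm-primary module.
# With secret_type = "ssh", the module generates an SSH key pair for
# the in-cluster Git sync; you must register the public key with your
# Git provider before the sync can succeed.
module "acm-primary" {
  source           = "github.com/terraform-google-modules/terraform-google-kubernetes-engine//modules/acm"

  project_id       = data.google_client_config.current.project
  cluster_name     = module.primary-cluster.name
  location         = module.primary-cluster.location
  cluster_endpoint = module.primary-cluster.endpoint

  operator_path    = "config-management-operator.yaml"
  sync_repo        = "git@github.com:example-org/example-config-repo.git"
  sync_branch      = "main"
  policy_dir       = "policies"
  secret_type      = "ssh"
}
```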

Cleaning up

The easiest way to eliminate billing is to delete the Cloud project that you created for the tutorial. Alternatively, you can delete the individual resources.

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next