Managing your Google Kubernetes Engine workloads with Spotinst Elastigroup

By: Amit Bar Oz, Solutions Architect, Spotinst

This tutorial shows you how to import a Google Kubernetes Engine (GKE) cluster to Spotinst Elastigroup. Elastigroup lets you run your container environments on preemptible VMs at a production-grade level while saving up to 80% on your compute expenses. Elastigroup replaces your instances seamlessly, without downtime. On top of that, Elastigroup provides a serverless autoscaling experience and optimizes the cost of the infrastructure your workloads run on, without requiring you to manage or scale a cluster of Compute Engine instances yourself.

With Elastigroup, you can create a GKE cluster with multiple machine types and sizes out of the box. Elastigroup uses smart Tetris scaling and bin-packing algorithms that are driven by container resource requirements rather than node utilization, thus allowing the cluster to be highly utilized. On top of that, Elastigroup manages a dynamic headroom capacity that ensures capacity is available for new workloads at any time so you don't have to wait for a new instance to launch when creating a new deployment.

The following diagram shows a GKE cluster integrated with Elastigroup. Once a new deployment is processed, Elastigroup ensures it has sufficient resources.

Diagram of a GKE cluster integrated with Elastigroup

This tutorial assumes you are familiar with the following:

  • GKE
  • Compute Engine
  • Cloud Shell
  • Kubernetes
  • Docker
  • Linux

Objectives

  • Create a GKE cluster.
  • Create a Spotinst account.
  • Connect your GCP account to your Spotinst account.
  • Import your GKE cluster to Elastigroup.
  • Update the capacity of your GKE node pools.

Costs

The Spotinst billing model works such that you pay only for what you save; no savings means no costs. The savings are the difference between the price you pay for the preemptible VMs and the price you would have paid for the equivalent on-demand instances. Spotinst charges 20% of those savings.
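For example, if running the equivalent on-demand instances would have cost $1,000 and you pay $200 for the preemptible VMs, your savings are $800 and the Spotinst charge is $160 (20% of $800).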

Spotinst provides a 14-day free trial to try your workloads on preemptible VMs.

This tutorial uses the following billable components of Google Cloud Platform:

  • Google Kubernetes Engine
  • Compute Engine

You can use the pricing calculator to generate a cost estimate based on your projected usage.

Before you begin

  1. Select or create a GCP project.

    GO TO THE MANAGE RESOURCES PAGE

  2. Make sure that billing is enabled for your project.

    LEARN HOW TO ENABLE BILLING

  3. Ensure that you have enabled the Google Kubernetes Engine API.

    ENABLE THE GKE API
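
    If you prefer the command line, you can also enable the API with gcloud. This is a minimal sketch, assuming the Cloud SDK is installed, you are authenticated, and your project is set:

    # Enable the Kubernetes Engine API for the active project.
    gcloud services enable container.googleapis.com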

Creating a GKE cluster

Spotinst Elastigroup supports zonal and regional GKE clusters. If you already have a GKE cluster, skip to the next section. For the purposes of this tutorial, you create a zonal GKE cluster with Cloud Shell. The cluster spans three node zones, with one instance in each zone.

  1. In the Google Cloud Platform Console, go to the GKE Clusters page.

    GO TO THE CLUSTERS PAGE

  2. Click Activate Cloud Shell.

    Activate Cloud Shell

  3. Set the gcloud CLI project. Replace [PROJECT_NAME] with your GCP project ID.

    gcloud config set project [PROJECT_NAME]
    
  4. Create the cluster.

    gcloud container clusters create gke-spotinst-tutorial \
        --zone us-central1-a \
        --node-locations us-central1-a,us-central1-b,us-central1-c \
        --num-nodes=1
    

    After a few minutes, the cluster consists of three instances, one in each of the three zones. The output looks like this:

    GCP Console of created cluster
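
    You can also verify the cluster from Cloud Shell by fetching its credentials and listing its nodes; a quick sketch using the name and zone from the previous command:

    gcloud container clusters get-credentials gke-spotinst-tutorial --zone us-central1-a
    kubectl get nodes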

Creating a Spotinst account

You need to create a Spotinst account in order to set up the integration.

  1. Go to the Spotinst signup page.

  2. Enter your details and click Continue to Create Password.

  3. Enter your password, select the Privacy Policy and Terms of Service checkbox and click Sign Up.

  4. A verification email is sent to your email address. Verify your email address to finish setting up your account, and then log in to your Spotinst account.

Syncing your GCP account with your Spotinst account

You authenticate Spotinst with your GCP project in order to allow Spotinst to interact with your GKE cluster.

  1. Go to the Spotinst console.

  2. On the Welcome page, click GCP.

    GCP selection button

  3. In the GCP Console, go to the IAM & Admin page.

    GO TO THE IAM & ADMIN PAGE

  4. Click Add.

  5. In the Email field, enter spotinst@spotit-today.iam.gserviceaccount.com, and then select the Editor role. This Spotinst service account email address enables your Spotinst account to manage your project's resources. (You can also grant the role from the command line, as shown after these steps.)

  6. Go back to the Spotinst console. Enter your GCP Project ID, and then click Save.

    GCP project authentication
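
If you prefer the command line, the same role can be granted with gcloud. This is a sketch only; replace [PROJECT_ID] with your GCP project ID:

    gcloud projects add-iam-policy-binding [PROJECT_ID] \
        --member="serviceAccount:spotinst@spotit-today.iam.gserviceaccount.com" \
        --role="roles/editor"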

Syncing your GKE cluster with Elastigroup

To sync GKE with Elastigroup, you first connect your GKE cluster to Elastigroup, and then you configure Elastigroup to manage your node capacity.

Connect your GKE cluster to Elastigroup

  1. In the Spotinst Console, click Elastigroups and then click Create.

  2. Click GKE.

    GKE selection button

  3. On the General tab, complete the following steps and then click Next.

    1. In the Name field of Elastigroup, enter gke-spotinst-tutorial.
    2. In the Location Type drop-down list, select Zonal.
    3. In the Cluster Location drop-down list, select the zone of your GKE cluster, us-central1-a.
    4. In the Cluster Name drop-down list, select the GKE cluster you want to import. For the purposes of this tutorial, select gke-spotinst-tutorial.
    5. In the Node Pool drop-down list, select default-pool.

      Elastigroup general configuration of node pool

The configuration of your GKE cluster is imported into Elastigroup and you can now configure Elastigroup.

Configure Elastigroup

In this section, you configure Elastigroup to manage nodes connected to your GKE cluster.

  1. After clicking Next in the previous section, review the imported group configurations:

    1. The selected Node Zones are where you want to launch your GKE instances. In this tutorial, they are us-central1-a, us-central1-b, and us-central1-c. Make sure to add only zones that are listed in your GKE cluster details.

      To verify your cluster details, go to the GCP Console, select the relevant cluster, and check its node zones.

      GKE cluster details

      • Optional: To add node zones, edit your GKE cluster and select the zones in the Additional zones section.

        Additional zone selection

    2. In the Capacity fields, enter your group capacity.

      • In the Target field, enter 3. This field specifies the number of instances that are launched when you create the group.
      • In the Minimum field, enter 1. This field specifies the minimum number of instances in the group.
      • In the Maximum field, enter 5. This field specifies the maximum number of instances in the group.

        While working with Spotinst Autoscaler, it is important to specify different values for the Minimum and Maximum fields to allow the autoscaler to scale up and down the group.

    3. In the Machine Types drop-down list, select as many instance types as possible to allow Spotinst Kubernetes AutoScaler to automatically resize clusters based on the demands of the workloads you want to run. For the purposes of this tutorial, select n1-standard-2, n1-standard-4, n1-highmem-2, and n2-highcpu-4.

    4. In the Image drop-down list, select Container-Optimized OS.

      Elastigroup configuration of node zones

  2. To configure autoscaling, complete the following steps:

    1. To enable the AutoScaler, select the Enabled checkbox.
    2. The Headroom Configuration lets the AutoScaler keep spare capacity allocated at all times, so new deployments can run without waiting for new instances to launch. Each headroom unit consists of CPU and memory units. Click Automatically. The unit size is then determined by the most frequent pod size in the cluster.

      Enable autoscaling

  3. Click the Advanced section to expand it, and then complete the following steps:

    1. In the Disk Type drop-down list, select Standard Persistent Disk.
    2. In the Disk Size field, enter 100.
    3. Leave the Custom Labels fields blank. They are the Kubernetes labels for the cluster nodes.
    4. The cluster Metadata information is automatically imported. Click Next.

      Advanced configuration

  4. In the Connectivity section, install the Spotinst Kubernetes Controller in the cluster.

    1. Copy your Spotinst token.
    2. In the GCP Console, go to your GKE cluster, and click Connect.
    3. Click Run in Cloud Shell.

      Run in Cloud Shell button

    4. Grant yourself admin privileges.

      kubectl create clusterrolebinding gke-spotinst-tutorial \
          --clusterrole=cluster-admin --user=[USER_EMAIL]
      

      Where:

      • [USER_EMAIL] represents your email address for GCP.
    5. Install the Spotinst Kubernetes controller on the cluster.

      #!/usr/bin/env bash
      curl -fsSL http://spotinst-public.s3.amazonaws.com/integrations/kubernetes/cluster-controller/scripts/init.sh | \
          SPOTINST_TOKEN=[ENTER_YOUR_TOKEN_HERE] \
          SPOTINST_ACCOUNT=[ENTER_YOUR_SPOTINST_ACCOUNT_HERE] \
          SPOTINST_CLUSTER_IDENTIFIER=gke-spotinst-tutorial \
          bash
      

      Where:

      • [ENTER_YOUR_TOKEN_HERE] represents the Spotinst token you copied in the previous step.
      • [ENTER_YOUR_SPOTINST_ACCOUNT_HERE] represents your Spotinst account number, which is listed on the Connectivity page.
    6. Go back to the Spotinst console to test the cluster's connectivity. Click Test and allow a few minutes for the controller to start.
    7. After passing the connectivity test, click Next.
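
      If the connectivity test does not pass right away, you can check from Cloud Shell whether the controller deployment was created. This is a quick sketch that uses the deployment name also referenced in the cleanup section of this tutorial:

      kubectl get deployment spotinst-kubernetes-cluster-controller \
          --namespace=kube-system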
  5. In the Review section, review your cluster configuration and add "fallbackToOd": true to allow the Spotinst algorithm to fall back to an on-demand instance in case there are no available preemptible VMs.

    1. Click to turn on Edit Mode.

      Edit toggle button

    2. In the Strategy section, enter "fallbackToOd": true.

      Fall back flag set to true

  6. To create Elastigroup, click Create. Wait a few minutes until you see the preemptible VMs registered to your GKE cluster under the Nodes section in your GCP Console.

  7. After a few minutes, run the following command in Cloud Shell:

    kubectl get nodes
    

The preemptible VMs are now registered to your GKE cluster:

GCP Console output of preemptible VMs registration

Updating your GKE node pool capacity

Now that Elastigroup is managing the instances for your GKE cluster, you remove the instances created by the node pools connected to your GKE cluster.

  1. In the GCP Console, go to the GKE Clusters page.

    GO TO THE CLUSTERS PAGE

  2. Click Edit.

  3. In the Node Pools section, set the number of nodes to 0 in all of your node pools, and then click Done.

    Defining node pool value

    The on-demand instances connected to the cluster are drained and removed from the cluster. In the GKE cluster nodes view you can see that the on-demand instances are cordoned and are later removed from the cluster.

    Draining of instances

    The output looks like this:

    GCP Console output of drained instances
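
    If you prefer the command line, you can also resize the node pool with gcloud. A sketch, assuming the default-pool name and the zone used earlier in this tutorial (on newer gcloud versions the flag is --num-nodes instead of --size):

    gcloud container clusters resize gke-spotinst-tutorial \
        --node-pool default-pool --size 0 --zone us-central1-a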

After this process is finished, you have a GKE cluster backed with Elastigroup as the infrastructure manager.

As you can see on the Instances tab of Elastigroup, there are three nodes of n1-standard-2 type in three different zones.

Distribution of nodes in zones

Seeing the AutoScaler in action

When working with an AutoScaler in a containerized environment, it is recommended that you specify hard resource limits for your deployments to make sure that no single pod consumes all of an instance's resources and affects other pods running on the same instance.

Note: The scaling process might take some time due to the cooldown between scaling activities.

  1. In Cloud Shell, create an nginx-deployment.yaml file with the following deployment, which creates one replica of Nginx with resource limits of 0.5 vCPU and 512 MB of memory.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.15.4
            ports:
            - containerPort: 80
            resources:
              limits:
                memory: 512Mi
                cpu: 500m
    
  2. Apply the deployment.

    kubectl apply -f nginx-deployment.yaml
    
  3. Create the nginx-service.yaml file. This file creates a LoadBalancer resource in your GCP project that directs traffic to your deployment.

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-nlb
    spec:
      selector:
        app: nginx
      type: LoadBalancer
      ports:
      - protocol: TCP
        port: 80
        targetPort: 80

  4. Apply the service.

    kubectl apply -f nginx-service.yaml
    
  5. Get the load balancer EXTERNAL-IP address.

    kubectl get svc
    

    At first, the output might look like this:

    GCP Console output of pending IP address

    After a few minutes, the service's EXTERNAL-IP displays:

    GCP Console output of the EXTERNAL-IP address

  6. In your browser, go to the EXTERNAL-IP address. You now see a running Nginx server.

    Nginx server welcome message
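
    You can also check from Cloud Shell with curl. A sketch, replacing [EXTERNAL_IP] with the address from the previous step:

    curl http://[EXTERNAL_IP]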

  7. Run the following command to see the pods running on top of the cluster:

    kubectl get pods
    

    The output looks like this:

    GCP Console output of pods running on top of the cluster

    To see the nodes in the cluster, wait 10 minutes without touching the cluster and then run the following command:

    kubectl get nodes
    

    The output looks like this:

    GCP Console output after 10 minutes

    As you can see, there are two nodes at the moment: one is Ready and one is marked as Disabled (in the process of being scaled down). The Spotinst AutoScaler recognized that there were idle resources and therefore triggered a scale-down activity.

    In the Spotinst Console, go to Elastigroup view > log. The output looks like this:

    GCP Console output of scaledown activity

    On the Instances tab, there is one instance with the Nginx pod on top of it.

    One instance with Nginx pod

  8. Scale up this deployment to 10 replicas.

    kubectl scale deployment.v1.apps/nginx-deployment --replicas=10
    

    To see that there are pending pods in the cluster, run the following command:

    kubectl get pods
    

    The output looks like this:

    GCP Console output of pending pods

    In the Spotinst console, go to Elastigroup view > log to see that the AutoScaler recognized the needs of the pending pods and triggered a scale up activity.

    GCP Console output of Elastigroup scale up

    To see all the pods that are scheduled, run the following command:

    kubectl get pods
    

    The output looks like this:

    GCP Console output of scheduled pods

    To see the three nodes in the cluster, run the following command:

    kubectl get nodes
    

    The output looks like this:

    GCP Console output of three nodes in the cluster

    In the Spotinst Console, go to Elastigroup view > Instances. On the Instances tab you can see the instances that were launched, and that there are two different instance types running in the cluster.

    Launched instances in Elastigroup

  9. Scale the deployment back to one replica.

    kubectl scale deployment.v1.apps/nginx-deployment --replicas=1
    

    It takes a few minutes for the cluster to scale down. The Elastigroup log shows the following entries:

    Elastigroup logs of scale down activity

    Eventually the cluster scales down to one instance.

    GCP Console output of one instance

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

Delete the project

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Reconfigure the node pool and remove Elastigroup

  1. Reconfigure the node pool capacity of the GKE cluster to its original size.

    gcloud container clusters resize gke-spotinst-tutorial \
        --node-pool [POOL_NAME] --size [SIZE]
    

    Where:

    • [POOL_NAME] represents the name of the node pool.
    • [SIZE] represents the number of nodes.
  2. Delete the Spotinst Controller deployment.

    kubectl delete deployments spotinst-kubernetes-cluster-controller \
        --namespace=kube-system
    
  3. Delete the Elastigroup you created. The pods are rescheduled on top of the on-demand instances specified in the node pools.

  4. If you created a new GKE cluster for this integration, you can remove it as well.
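
If you keep the cluster, you may also want to delete the Nginx deployment and service that you created earlier, which removes the external load balancer. A sketch using the manifest files from this tutorial:

    kubectl delete -f nginx-service.yaml -f nginx-deployment.yaml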
