Cleaning up Compute Engine instances at scale

Author(s): @hbougdal, @jpatokal | Published: 2020-10-14

Hicham Bougdal and Jani Patokallio | Google

Contributed by Google employees.

This tutorial offers a simple and scalable serverless mechanism to automatically delete (garbage-collect) Compute Engine virtual machine (VM) instances after a specified amount of time.

Some cases in which this may be useful:

  • Developers or testers create one-off VM instances for testing a feature, but they might not always remember to manually delete the instances.
  • Workflows can require dynamically starting a large number of Compute Engine worker instances to perform a certain task. A best practice is to have instances delete themselves after the task is complete, but ensuring that this always happens can be difficult if the task is distributed or some workers stop because of errors.

How it works

The following diagram shows a high-level overview of the solution:

[Figure: High-level overview of the solution]

Each Compute Engine instance in scope is assigned two labels:

  • ttl (time to live): The number of minutes after creation when the VM is no longer needed and can be deleted.
  • env: Marks the instance as part of a pool of VMs that is checked regularly. Instances in the pool are deleted when their TTL is reached.
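
As a rough illustration of how the TTL label can drive a deletion decision, the following sketch compares an instance's age with its TTL. The function name `isExpired` and the shape of the `instance` object are assumptions made for this example; they are not taken from the tutorial's repository.

```javascript
// Hypothetical helper: decides whether an instance has outlived its TTL.
// `instance` is assumed to look like:
//   { labels: { env: 'test', ttl: '2' }, creationTimestamp: '<RFC 3339 string>' }
function isExpired(instance, now = new Date()) {
  const raw = (instance.labels || {}).ttl;
  // Label values are strings; accept only a plain non-negative integer.
  if (typeof raw !== 'string' || !/^\d+$/.test(raw)) {
    return false; // No usable ttl label: leave the instance alone.
  }
  const ttlMinutes = Number(raw);
  const createdAt = new Date(instance.creationTimestamp);
  const ageMinutes = (now.getTime() - createdAt.getTime()) / 60000;
  // Delete only when the age is strictly greater than the TTL.
  return ageMinutes > ttlMinutes;
}

// Example with a fixed "current time" so that the result is reproducible.
const now = new Date('2020-10-14T10:05:00Z');
const created = '2020-10-14T10:00:00Z';
console.log(isExpired({ labels: { env: 'test', ttl: '2' }, creationTimestamp: created }, now)); // true
console.log(isExpired({ labels: { env: 'test', ttl: '10' }, creationTimestamp: created }, now)); // false
```

In this sketch, an instance with a missing or malformed ttl label is never treated as expired, which errs on the side of keeping VMs.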

The overall flow is the following:

  1. A Cloud Scheduler cron job is triggered regularly (for example, every 5 minutes). The Cloud Scheduler configuration specifies the label of the pool of VMs to target, using the following format: '{"label":"env=test"}'
  2. When the cron job is triggered, Cloud Scheduler pushes a message with the label payload to a Pub/Sub topic.
  3. A Cloud Function is subscribed to the Pub/Sub topic. Each time the function is triggered, it does the following:
    1. Reads the payload of the Pub/Sub message and extracts the label.
    2. Filters all of the Compute Engine instances that have the label.
    3. Iterates through the instances and does the following:
      1. Reads the value of the TTL label for each instance.
      2. Calculates the difference between the current time and the creation time of each instance.
      3. If the difference is greater than the TTL, then the instance is deleted. If not, nothing is done.
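
The first two steps that the function performs (reading the payload and selecting instances by label) can be sketched as follows. This is a minimal illustration, not the code from the repository: the helper names `parseTargetLabel` and `filterByLabel` are hypothetical, the sketch assumes that the Pub/Sub-triggered event carries the message body as a base64-encoded string in `data`, and it filters an in-memory list where a real function would query the Compute Engine API.

```javascript
// Hypothetical helper: extracts the target label from a Pub/Sub-triggered event.
// Assumes that `event.data` holds the message body as a base64-encoded string.
function parseTargetLabel(event) {
  const body = JSON.parse(Buffer.from(event.data, 'base64').toString('utf8'));
  if (typeof body.label !== 'string') {
    throw new Error('Payload must contain a "label" string');
  }
  const separator = body.label.indexOf('=');
  if (separator <= 0) {
    throw new Error('Label must have the form key=value');
  }
  return {
    key: body.label.slice(0, separator),
    value: body.label.slice(separator + 1),
  };
}

// Hypothetical helper: keeps only the instances that carry the target label.
// A real function would ask the Compute Engine API for this list instead.
function filterByLabel(instances, label) {
  return instances.filter((vm) => (vm.labels || {})[label.key] === label.value);
}

// Example: simulate the message that Cloud Scheduler publishes.
const event = { data: Buffer.from('{"label":"env=test"}').toString('base64') };
const label = parseTargetLabel(event);
const pool = filterByLabel(
  [
    { name: 'cleanup-test', labels: { env: 'test', ttl: '2' } },
    { name: 'prod-db', labels: { env: 'prod' } },
  ],
  label
);
console.log(pool.map((vm) => vm.name)); // [ 'cleanup-test' ]
```

Rejecting a malformed payload early means that a misconfigured scheduler job fails loudly instead of silently matching no instances.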

Costs

This tutorial uses the following Google Cloud components:

  • Compute Engine
  • Cloud Scheduler
  • Pub/Sub
  • Cloud Functions

To estimate the cost of running this sample, assume that you run a single f1-micro Compute Engine instance for a total of 15 minutes on one day while you test the sample, after which you delete the project, releasing all resources.

Use the pricing calculator to generate a cost estimate based on this projected usage.

Cloud Scheduler is free for up to 3 jobs per month.

New Google Cloud users may be eligible for a free trial.

Before you begin

  1. If you don’t already have one, create a Google Account.

  2. Create a Google Cloud project: In the Cloud Console, select Create Project.

  3. Enable billing for the project.

  4. Open Cloud Shell.

  5. Create an App Engine app, which is required by Cloud Scheduler:

    gcloud app create --region=us-central
    
  6. Enable the APIs used by this tutorial:

    gcloud services enable appengine.googleapis.com cloudbuild.googleapis.com \
      cloudfunctions.googleapis.com cloudscheduler.googleapis.com compute.googleapis.com \
      pubsub.googleapis.com
    

Set up the automated cleanup code

You run the commands in this section in Cloud Shell.

  1. Clone the GitHub repository:

    git clone https://github.com/GoogleCloudPlatform/community
    
  2. Change directories to the cleaning-up-at-scale directory:

    cd community/tutorials/cleaning-up-at-scale
    

    The exact path depends on where you cloned the repository.

  3. Create the Pub/Sub topic that you will push messages to:

    gcloud pubsub topics create unused-instances
    

    You can verify that the Pub/Sub topic has been created with the following command:

    gcloud pubsub topics list
    

    Topics also appear on the Pub/Sub Topics page in the Cloud Console.

  4. Deploy the Cloud Function that will monitor the Pub/Sub topic and clean up instances:

    gcloud functions deploy clean-unused-instances --trigger-topic=unused-instances \
      --runtime=nodejs12 --entry-point=cleanUnusedInstances
    
  5. Configure Cloud Scheduler to push a message containing the target label every minute to the Pub/Sub topic unused-instances:

    gcloud scheduler jobs create pubsub clean-unused-instances-job --schedule="* * * * *" \
      --topic=unused-instances --message-body='{"label":"env=test"}'
    

    The schedule is specified in unix-cron format. A * in every field means that the job runs every minute, every hour, every day of the month, every month, and every day of the week. More simply put, the job runs once per minute.

    If you are scanning a large number of VMs, running the job less often (such as once per hour) is likely sufficient.

    You can verify that the job has been created with the following command:

    gcloud scheduler jobs list
    

    Jobs also appear on the Cloud Scheduler page in the Cloud Console. On that page, you can view execution logs for each job by clicking View in the Logs column.
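
To make the five-field schedule format from the scheduler step easier to read, the following sketch pairs each field of a unix-cron expression with its name. The helper `describeSchedule` is hypothetical and only splits the string; it does not validate field values. The expression `0 * * * *` is shown as one way to write the once-per-hour schedule mentioned above.

```javascript
// Names of the five unix-cron fields, in order.
const CRON_FIELDS = ['minute', 'hour', 'day of month', 'month', 'day of week'];

// Hypothetical helper: pairs each field of a five-field cron expression
// with its name. It does not check that the field values are valid.
function describeSchedule(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 5 fields, got ${parts.length}`);
  }
  const described = {};
  CRON_FIELDS.forEach((name, i) => {
    described[name] = parts[i];
  });
  return described;
}

// Every field is "*": the job runs once per minute.
console.log(describeSchedule('* * * * *'));
// Minute 0 of every hour: the job runs once per hour.
console.log(describeSchedule('0 * * * *'));
```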

Test the automated cleanup

  1. Create a test instance labeled env=test with a two-minute TTL:

    gcloud compute instances create cleanup-test --zone=us-central1-a --machine-type=f1-micro \
      --labels=env=test,ttl=2
    
  2. Check that the new instance has started successfully:

    gcloud compute instances list
    
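  3. Wait about three minutes (the two-minute TTL plus up to one minute until the next scheduled run), and then run the same command again:

    gcloud compute instances list
    

    The instance should have been automatically deleted. If it is still listed, the delete operation may still be in progress; run the command again after a short wait.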

You can also see the Cloud Function execution results, including the name of the deleted instance, by viewing the Cloud Function logs from the Cloud Functions page in the Cloud Console.
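
The timing of the test depends on both the TTL and the schedule, because an instance is deleted only by the first scheduled run at which its age exceeds the TTL. The following sketch works through that arithmetic. It assumes that runs happen at exact multiples of the interval, and it ignores trigger latency and the duration of the delete operation itself; the function name `firstDeletingRun` is hypothetical.

```javascript
// Hypothetical helper: returns the time (in minutes) of the first scheduled
// run that would delete an instance. Runs are assumed to happen at exact
// multiples of `interval`; `createdAt` is measured on the same clock.
function firstDeletingRun(createdAt, ttl, interval) {
  // The instance qualifies once its age is strictly greater than the TTL,
  // so find the first run time strictly after createdAt + ttl.
  const threshold = createdAt + ttl;
  return (Math.floor(threshold / interval) + 1) * interval;
}

// Created 30 seconds after a run, TTL of 2 minutes, job every minute.
console.log(firstDeletingRun(0.5, 2, 1)); // 3
// Created exactly at a run: at minute 2 the age equals the TTL, so wait for minute 3.
console.log(firstDeletingRun(0, 2, 1)); // 3
// Same TTL with an hourly job: the instance survives until the next hourly run.
console.log(firstDeletingRun(10, 2, 60)); // 60
```

With an hourly schedule, a short TTL acts as a lower bound on an instance's lifetime rather than a precise deadline.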

Shut down resources used in the tutorial

Now that you have tested the automated cleanup of VM instances, you can either delete the entire project or delete the individual resources that you created to prevent further billing for them on your account.

  • You can delete the Cloud Scheduler job on the Cloud Scheduler page in the Cloud Console.

  • You can delete the Cloud Function on the Cloud Functions page in the Cloud Console.

  • You can delete the Pub/Sub topic and associated subscriptions on the Pub/Sub page in the Cloud Console.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see our Site Policies. Java is a registered trademark of Oracle and/or its affiliates.