Automating cost optimizations with Cloud Functions, Cloud Scheduler, and Cloud Monitoring


This document shows you how to use Cloud Functions to identify and clean up wasted cloud resources, schedule functions to run with Cloud Scheduler, and trigger them from Cloud Monitoring alerting policies based on observed usage. This document is intended for developers, SREs, cloud architects, and cloud infrastructure admins who are looking for a systematic and automated approach to identify and reduce wasteful cloud spending.

This document assumes that you're familiar with Compute Engine, Cloud Storage, Cloud Functions, Cloud Scheduler, and Cloud Monitoring.

Objectives

  • Delete unused IP addresses: On Google Cloud, static IP addresses are a free resource when they're attached to a load balancer or virtual machine (VM) instance. When a static IP address is reserved, but not used, it accumulates an hourly charge. In apps that heavily depend on static IP addresses and large-scale dynamic provisioning, this waste can become significant over time.
  • Delete orphaned or unused persistent disks: Persistent disks are unused or orphaned if they're created without ever being attached to a VM, or if a machine has multiple disks and one or more disks are detached.
  • Migrate storage to less expensive storage classes: Google Cloud offers multiple classes of object storage. Use the class that best fits your needs.

Architecture

The following diagram describes the first part of the deployment, where you schedule a Cloud Function to identify and clean up unused IP addresses.

Architecture of a Cloud Function that identifies and cleans up unused IP addresses.

The first example covers the following:

  • Creating a Compute Engine VM with a static external IP address and a separate unused static external IP address.
  • Deploying a Cloud Function to identify unused addresses.
  • Creating a Cloud Scheduler job to schedule the function to run by using an HTTP trigger.

In the following diagram, you schedule a Cloud Function to identify and clean up unattached and orphaned persistent disks.

Architecture of a Cloud Function that identifies and cleans up unused persistent disks.

The second example covers the following:

  • Creating a Compute Engine VM with two persistent disks and a separate unattached persistent disk. One of the disks is orphaned by being detached from the VM.
  • Deploying a Cloud Function to identify unattached and orphaned persistent disks.
  • Creating a Cloud Scheduler job to schedule the execution of the Cloud Function by using an HTTP trigger.

In the following diagram, you use a Monitoring alerting policy to trigger a Cloud Function that migrates a storage bucket to a less expensive storage class.

Architecture of a Cloud Function that migrates a storage bucket.

The third example covers the following:

  • Creating two storage buckets, adding a file to the serving bucket, and generating traffic against it.
  • Creating a Monitoring dashboard to visualize bucket utilization.
  • Deploying a Cloud Function to migrate the idle bucket to a less expensive storage class.
  • Triggering the function by using a payload intended to simulate a notification received from a Monitoring alerting policy.

Costs

In this document, you use the following billable components of Google Cloud: Compute Engine, Cloud Functions, and Cloud Storage.

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Compute Engine, Cloud Functions, and Cloud Storage APIs.

    Enable the APIs

  5. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  6. You run all the commands in this document from Cloud Shell.

Setting up your environment

In this section, you configure the infrastructure and identities that are required for this architecture.

  1. In Cloud Shell, clone the repository and change to the gcf-automated-resource-cleanup directory:

    git clone https://github.com/GoogleCloudPlatform/gcf-automated-resource-cleanup.git && cd gcf-automated-resource-cleanup/
    
  2. Set the environment variables and make the repository folder your $WORKDIR folder, where you run all the commands:

    export PROJECT_ID=$(gcloud config list \
        --format 'value(core.project)' 2>/dev/null)
    WORKDIR=$(pwd)
    
  3. Install Apache Bench, an open source load-generation tool:

    sudo apt-get install apache2-utils
    

Cleaning up unused IP addresses

In this section, you complete the following steps:

  • Create two static IP addresses.
  • Create a VM that uses a static IP address.
  • Review the Cloud Functions code.
  • Deploy the Cloud Function.
  • Test the Cloud Function by using Cloud Scheduler jobs.

Create IP addresses

  1. In Cloud Shell, change to the unused-ip directory:

    cd $WORKDIR/unused-ip
    
  2. Export the names of the IP addresses as variables:

    export USED_IP=used-ip-address
    export UNUSED_IP=unused-ip-address
    
  3. Create two static IP addresses:

    gcloud compute addresses create $USED_IP \
        --project=$PROJECT_ID --region=us-central1
    gcloud compute addresses create $UNUSED_IP \
        --project=$PROJECT_ID --region=us-central1
    

    This example uses the us-central1 region, but you can choose a different region and refer to it consistently throughout the rest of this document.

  4. Confirm that two addresses were created:

    gcloud compute addresses list --filter="region:(us-central1)"
    

    In the output, a status of RESERVED means that the IP addresses aren't in use:

    NAME               ADDRESS/RANGE  TYPE      REGION       SUBNET  STATUS
    unused-ip-address  35.232.144.85  EXTERNAL  us-central1          RESERVED
    used-ip-address    104.197.56.87  EXTERNAL  us-central1          RESERVED
    
  5. Set the used IP address as an environment variable:

    export USED_IP_ADDRESS=$(gcloud compute addresses describe $USED_IP \
        --region=us-central1 --format=json | jq -r '.address')
    

Create a VM

  1. In Cloud Shell, create an instance:

    gcloud compute instances create static-ip-instance \
        --zone=us-central1-a \
        --machine-type=n1-standard-1 \
        --subnet=default \
        --address=$USED_IP_ADDRESS
    
  2. Confirm that one of the IP addresses is now in use:

    gcloud compute addresses list --filter="region:(us-central1)"
    

    The output is similar to the following:

    NAME               ADDRESS/RANGE  TYPE      REGION       SUBNET  STATUS
    unused-ip-address  35.232.144.85  EXTERNAL  us-central1          RESERVED
    used-ip-address    104.197.56.87  EXTERNAL  us-central1          IN_USE
    

Review the Cloud Function code

  • In Cloud Shell, output the main section of the code:

    cat $WORKDIR/unused-ip/function.js | grep "const compute" -A 31
    

    The output is as follows:

    const compute = new Compute();
    compute.getAddresses(function(err, addresses){ // gets all addresses across regions
         if(err){
             console.log("there was an error: " + err);
         }
         if (addresses == null) {
             console.log("no addresses found");
             return;
         }
         console.log("there are " + addresses.length + " addresses");
    
         // iterate through addresses
         for (let item of addresses){
    
              // get metadata for each address
              item.getMetadata(function(err, metadata, apiResponse) {
    
                  // if the address is not used AND if it's at least ageToDelete days old:
                  if ((metadata.status=='RESERVED') & (calculateAge(metadata.creationTimestamp) >= ageToDelete)){
                      // delete address
                      item.delete(function(err, operation, apiResponse2){
                          if (err) {
                              console.log("could not delete address: " + err);
                          }
                      })
                  }
              })
          }
           // return number of addresses evaluated
          res.send("there are " + addresses.length + " total addresses");
      });
    }
    

    In the preceding code sample, pay attention to the following:

    • compute.getAddresses(function(err, addresses){ // gets all addresses across regions
      

      Uses the getAddresses method to retrieve IP addresses across all regions in the project.

    • // get metadata for each address
      item.getMetadata(function(err, metadata, apiResponse) {
         // if the address is not used:
             if (metadata.status=='RESERVED'){
      

      Gets the metadata for each IP address and checks its STATUS field.

    • if ((metadata.status=='RESERVED') &
      (calculateAge(metadata.creationTimestamp) >= ageToDelete)){
      

      Checks whether the IP address is unused (its status is RESERVED), calculates its age by using the calculateAge helper function, and compares that age against a constant (set to 0 days for the purposes of the example).

    • // delete address
      item.delete(function(err, operation, apiResponse2){
      

      Deletes the IP address.

Deploy the Cloud Function

  1. In Cloud Shell, deploy the Cloud Function:

    gcloud functions deploy unused_ip_function --trigger-http --runtime=nodejs8
    
  2. Set the trigger URL as an environment variable:

    export FUNCTION_URL=$(gcloud functions describe unused_ip_function \
        --format=json | jq -r '.httpsTrigger.url')
    

Schedule and test the Cloud Function

  1. In Cloud Shell, create a Cloud Scheduler task to run the Cloud Function at 2 AM every day:

    gcloud scheduler jobs create http unused-ip-job \
        --schedule="0 2 * * *" \
        --uri=$FUNCTION_URL
    
  2. Test the job by manually triggering it:

    gcloud scheduler jobs run unused-ip-job
    
  3. Confirm that the unused IP address was deleted:

    gcloud compute addresses list --filter="region:(us-central1)"
    

    The output is similar to the following:

    NAME             ADDRESS/RANGE  TYPE      REGION       SUBNET  STATUS
    used-ip-address  104.197.56.87  EXTERNAL  us-central1          IN_USE
    

Cleaning up unused and orphaned persistent disks

In this section, you complete the following steps:

  • Create two persistent disks.
  • Create a VM that uses one of the disks.
  • Detach the disk from the VM.
  • Review the Cloud Function code.
  • Deploy the Cloud Function.
  • Test the Cloud Function by using Cloud Scheduler jobs.

Create persistent disks

  1. In Cloud Shell, change to the unattached-pd directory:

    cd $WORKDIR/unattached-pd
    
  2. Export the names of the disks as environment variables:

    export ORPHANED_DISK=orphaned-disk
    export UNUSED_DISK=unused-disk
    
  3. Create the two disks:

    gcloud beta compute disks create $ORPHANED_DISK \
       --project=$PROJECT_ID \
       --type=pd-standard \
       --size=500GB \
       --zone=us-central1-a
    gcloud beta compute disks create $UNUSED_DISK \
        --project=$PROJECT_ID \
        --type=pd-standard \
        --size=500GB \
        --zone=us-central1-a
    
  4. Confirm that the two disks were created:

    gcloud compute disks list
    

    The output is as follows:

    NAME                LOCATION       LOCATION_SCOPE SIZE_GB TYPE         STATUS
    orphaned-disk       us-central1-a  zone           500     pd-standard  READY
    static-ip-instance  us-central1-a  zone           10      pd-standard  READY
    unused-disk         us-central1-a  zone           500     pd-standard  READY
    

Create a VM and inspect the disks

  1. In Cloud Shell, create the instance:

    gcloud compute instances create disk-instance \
        --zone=us-central1-a \
        --machine-type=n1-standard-1 \
        --disk=name=$ORPHANED_DISK,device-name=$ORPHANED_DISK,mode=rw,boot=no
    
  2. Inspect the disk that was attached to the VM:

    gcloud compute disks describe $ORPHANED_DISK \
        --zone=us-central1-a \
        --format=json | jq
    

    The output is similar to the following:

    {
      "creationTimestamp": "2019-06-12T12:21:25.546-07:00",
      "id": "7617542552306904666",
      "kind": "compute#disk",
      "labelFingerprint": "42WmSpB8rSM=",
      "lastAttachTimestamp": "2019-06-12T12:24:53.989-07:00",
      "name": "orphaned-disk",
      "physicalBlockSizeBytes": "4096",
      "selfLink": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/disks/orphaned-disk",
      "sizeGb": "500",
      "status": "READY",
      "type": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/diskTypes/pd-standard",
      "users": [
        "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/instances/disk-instance"
      ],
      "zone": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a"
    }
    

    In the preceding code sample, pay attention to the following:

    • users identifies the VM that the disk is attached to.
    • lastAttachTimestamp identifies when the disk was last attached to a VM.
  3. Inspect the disk that hasn't been attached to a VM:

    gcloud compute disks describe $UNUSED_DISK \
        --zone=us-central1-a \
        --format=json | jq
    

    The output is similar to the following:

    {
      "creationTimestamp": "2019-06-12T12:21:30.905-07:00",
      "id": "1313096191791918677",
      "kind": "compute#disk",
      "labelFingerprint": "42WmSpB8rSM=",
      "name": "unused-disk",
      "physicalBlockSizeBytes": "4096",
      "selfLink": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/disks/unused-disk",
      "sizeGb": "500",
      "status": "READY",
      "type": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/diskTypes/pd-standard",
      "zone": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a"
    }
    

    In the preceding code sample, the following is important:

    • The disk doesn't have users listed because it's not currently in use by a VM.
    • The disk doesn't have a lastAttachTimestamp value because it's never been attached to a VM.
  4. Detach the orphaned persistent disk from the VM:

    gcloud compute instances detach-disk disk-instance \
        --device-name=$ORPHANED_DISK \
        --zone=us-central1-a
    
  5. Inspect the orphaned disk:

    gcloud compute disks describe $ORPHANED_DISK \
        --zone=us-central1-a \
        --format=json | jq
    

    The output is similar to the following:

    {
      "creationTimestamp": "2019-06-12T12:21:25.546-07:00",
      "id": "7617542552306904666",
      "kind": "compute#disk",
      "labelFingerprint": "42WmSpB8rSM=",
      "lastAttachTimestamp": "2019-06-12T12:24:53.989-07:00",
      "lastDetachTimestamp": "2019-06-12T12:34:56.040-07:00",
      "name": "orphaned-disk",
      "physicalBlockSizeBytes": "4096",
      "selfLink": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/disks/orphaned-disk",
      "sizeGb": "500",
      "status": "READY",
      "type": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a/diskTypes/pd-standard",
      "zone": "https://www.googleapis.com/compute/v1/projects/automating-cost-optimization/zones/us-central1-a"
    }
    

    In the preceding code sample, the following is important:

    • The disk doesn't have users listed, which indicates that it isn't currently in use.
    • There is now a lastDetachTimestamp entry, indicating when the disk was last detached from a VM and, therefore, when it was last in use.
    • The lastAttachTimestamp field is still present.

Review the Cloud Function code

  1. In Cloud Shell, output the section of the code that retrieves all persistent disks in the project:

    cat $WORKDIR/unattached-pd/main.py | grep "(request)" -A 12
    

    The output is as follows:

    def delete_unattached_pds(request):
        # get list of disks and iterate through it:
        disksRequest = compute.disks().aggregatedList(project=project)
        while disksRequest is not None:
            diskResponse = disksRequest.execute()
            for name, disks_scoped_list in diskResponse['items'].items():
                if disks_scoped_list.get('warning') is None:
                    # got disks
                    for disk in disks_scoped_list['disks']: # iterate through disks
                        diskName = disk['name']
                        diskZone = str((disk['zone'])).rsplit('/',1)[1]
                        print (diskName)
                        print (diskZone)
    

    The function uses the aggregatedList method to get all persistent disks in the Google Cloud project where it's running, and then iterates through each of the disks.

  2. Output the section of the code that checks the lastAttachTimestamp field and deletes the disk if that field isn't present:

    cat $WORKDIR/unattached-pd/main.py | grep "handle never" -A 11
    

    The output is as follows:

    # handle never attached disk - delete it
    # lastAttachTimestamp is not present
    if disk.get("lastAttachTimestamp") is None:
           print ("disk " + diskName + " was never attached - deleting")
           deleteRequest = compute.disks().delete(project=project,
                  zone=diskZone,
                  disk=diskName)
           deleteResponse = deleteRequest.execute()
           waitForZoneOperation(deleteResponse, project, diskZone)
           print ("disk " + diskName + " was deleted")
           continue
    

    This section deletes the disk if lastAttachTimestamp isn't present—meaning this disk was never in use.

  3. Output the section of the code that calculates the age of the disk if it's orphaned, creates a snapshot of it, and deletes it:

    cat $WORKDIR/unattached-pd/main.py | grep "handle detached" -A 32
    

    The output is as follows:

    # handle detached disk - snapshot and delete
    # lastAttachTimestamp is present AND users is not present AND it meets the age criterion
    if disk.get("users") is None \
        and disk.get("lastDetachTimestamp") is not None \
        and diskAge(disk['lastDetachTimestamp'])>=deleteAge:
    
        print ("disk " + diskName + " has no users and has been detached")
        print ("disk meets age criteria for deletion")
    
        # take a snapshot
        snapShotName = diskName + str(int(time.time()))
        print ("taking snapshot: " + snapShotName)
        snapshotBody = {
            "name": snapShotName
        }
        snapshotRequest = compute.disks().createSnapshot(project=project,
             zone=diskZone,
             disk=diskName,
             body=snapshotBody)
        snapshotResponse = snapshotRequest.execute()
        waitForZoneOperation(snapshotResponse, project, diskZone)
        print ("snapshot completed")
    
        # delete the disk
        print ("deleting disk " + diskName)
        deleteRequest = compute.disks().delete(project=project,
            zone=diskZone,
            disk=diskName)
        deleteResponse = deleteRequest.execute()
        waitForZoneOperation(deleteResponse, project, diskZone)
        print ("disk " + diskName + " was deleted")
        continue
    

    This section of the code runs when the disk doesn't have any users listed and lastDetachTimestamp is present, which means that the disk isn't currently in use but was used at some point. In this case, the Cloud Function creates a snapshot of the disk to retain the data and then deletes the disk.
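
The preceding excerpts reference two helpers that aren't shown: diskAge, which performs the age check, and waitForZoneOperation, which waits for the snapshot and delete operations to finish. The following is a minimal sketch of what such helpers might look like, assuming the googleapiclient Compute Engine client that main.py uses; it isn't the repository's exact code.

    import time
    from datetime import datetime, timezone

    import googleapiclient.discovery

    # Assumption: the Compute Engine API client, as built elsewhere in main.py.
    compute = googleapiclient.discovery.build('compute', 'v1')

    def diskAge(last_detach_timestamp):
        """Sketch: returns the age in days of an RFC 3339 timestamp such as
        lastDetachTimestamp."""
        detached = datetime.fromisoformat(last_detach_timestamp)
        return (datetime.now(timezone.utc) - detached).days

    def waitForZoneOperation(operation, project, zone):
        """Sketch: polls a zonal Compute Engine operation until it's DONE."""
        while True:
            result = compute.zoneOperations().get(
                project=project,
                zone=zone,
                operation=operation['name']).execute()
            if result['status'] == 'DONE':
                if 'error' in result:
                    raise Exception(result['error'])
                return result
            time.sleep(1)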

Deploy the Cloud Function

  1. In Cloud Shell, deploy the Cloud Function:

    gcloud functions deploy delete_unattached_pds \
        --trigger-http --runtime=python37
    
  2. Set the trigger URL of the Cloud Function as an environment variable:

    export FUNCTION_URL=$(gcloud functions describe delete_unattached_pds \
        --format=json | jq -r '.httpsTrigger.url')
    

Schedule and test the Cloud Function

  1. In Cloud Shell, create a Cloud Scheduler task to run the Cloud Function at 2 AM every day:

    gcloud scheduler jobs create http unattached-pd-job \
        --schedule="0 2 * * *" \
        --uri=$FUNCTION_URL
    
  2. Test the job:

    gcloud scheduler jobs run unattached-pd-job
    
  3. Confirm that a snapshot of the orphaned disk was created:

    gcloud compute snapshots list
    

    The output is similar to the following:

    NAME                     DISK_SIZE_GB  SRC_DISK                           STATUS
    orphaned-disk1560455894  500           us-central1-a/disks/orphaned-disk  READY
    
  4. Confirm that the unused disk and the orphaned disk were deleted:

    gcloud compute disks list
    

    The output is as follows:

    NAME                LOCATION       LOCATION_SCOPE SIZE_GB  TYPE         STATUS
    disk-instance       us-central1-a  zone           10       pd-standard  READY
    static-ip-instance  us-central1-a  zone           10       pd-standard  READY
    

Migrating storage buckets to less expensive storage classes

Google Cloud provides storage object lifecycle rules that you can use to automatically move objects to different storage classes based on a set of attributes, such as their creation date or live state. However, these rules don't know whether the objects have been accessed. Sometimes, you might want to move newer objects to Nearline Storage if they haven't been accessed for a certain amount of time.
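
For comparison, an age-based rule that doesn't consider access can be set directly on a bucket with the Cloud Storage client library. The following is a minimal sketch that assumes an illustrative bucket name and a 30-day threshold; the rest of this section uses Monitoring instead, precisely because such a rule can't tell whether objects are still being read.

    from google.cloud import storage

    # Sketch (assumption): an age-based lifecycle rule, shown for comparison.
    # BUCKET_NAME and the 30-day threshold are illustrative values.
    client = storage.Client()
    bucket = client.get_bucket("BUCKET_NAME")

    # Move objects to Nearline Storage 30 days after creation, regardless of
    # whether they're still being accessed.
    bucket.add_lifecycle_set_storage_class_rule("NEARLINE", age=30)
    bucket.patch()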

In this section, you complete the following steps:

  • Create two Cloud Storage buckets.
  • Add an object to one of the buckets.
  • Configure Monitoring to observe bucket object access.
  • Review the Cloud Function code that migrates objects from a Regional Storage bucket to a Nearline Storage bucket.
  • Deploy the Cloud Function.
  • Test the Cloud Function by using a Monitoring alert.

Create Cloud Storage buckets and add a file

  1. In Cloud Shell, change to the migrate-storage directory:

    cd $WORKDIR/migrate-storage
    
  2. Create the serving bucket, which is the Cloud Storage bucket that you serve a file from later in this section:

    export PROJECT_ID=$(gcloud config list \
        --format 'value(core.project)' 2>/dev/null)
    gsutil mb -c regional -l us-central1 gs://${PROJECT_ID}-serving-bucket
    
  3. Make the bucket public:

    gsutil acl ch -u allUsers:R gs://${PROJECT_ID}-serving-bucket
    
  4. Add a text file to the bucket:

    gsutil cp $WORKDIR/migrate-storage/testfile.txt  \
        gs://${PROJECT_ID}-serving-bucket
    
  5. Make the file public:

    gsutil acl ch -u allUsers:R gs://${PROJECT_ID}-serving-bucket/testfile.txt
    
  6. Confirm that you're able to access the file:

    curl http://storage.googleapis.com/${PROJECT_ID}-serving-bucket/testfile.txt
    

    The output is as follows:

    this is a test
    
  7. Create a second bucket called idle-bucket that doesn't serve any data:

    gsutil mb -c regional -l us-central1 gs://${PROJECT_ID}-idle-bucket
    

Set up a Cloud Monitoring workspace

In this section, you configure Cloud Monitoring to observe bucket usage to understand when bucket objects aren't being used. When the serving bucket isn't used, a Cloud Function migrates the bucket from the Regional Storage class to the Nearline Storage class.

  1. In the Google Cloud console, go to Monitoring.

    Go to Cloud Monitoring

  2. Click New Workspace, and then click Add.

    Wait for the initial configuration to complete.

Create a Cloud Monitoring dashboard

  1. In Monitoring, go to Dashboards, and then click Create Dashboard.

  2. Click Add Chart.

  3. In the Name field, enter Bucket Access.

  4. To find the request content metric for the Cloud Storage bucket, in the Find resource and metric field, enter request, and then select the Request count metric for the gcs_bucket resource.

  5. To group the metrics by bucket name, in the Group By drop-down list, click bucket_name.

  6. To filter by the method name, in the Filter field, enter ReadObject, and then click Apply.

  7. Click Save.

  8. To name the dashboard, in the name field, enter Bucket Usage.

  9. To confirm that the dashboard is accessible, hold the pointer over Dashboards and verify that Bucket Usage appears.

    You've configured Monitoring to observe object access in your buckets. The chart doesn't display any data because there is no traffic to the Cloud Storage buckets.
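
If you prefer to check bucket activity programmatically instead of in the dashboard, you can query the same request count metric with the Cloud Monitoring client library. The following is a minimal sketch that assumes the google-cloud-monitoring package, an illustrative project ID, and a one-hour lookback window.

    import time
    from google.cloud import monitoring_v3

    # Sketch (assumption): read the Cloud Storage request count metric that the
    # dashboard chart visualizes. PROJECT_ID and the one-hour window are
    # illustrative values.
    project_id = "PROJECT_ID"
    client = monitoring_v3.MetricServiceClient()

    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {"start_time": {"seconds": now - 3600}, "end_time": {"seconds": now}}
    )

    results = client.list_time_series(
        request={
            "name": f"projects/{project_id}",
            "filter": 'metric.type = "storage.googleapis.com/api/request_count"',
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )

    # Print request counts per bucket; a bucket with no series received no requests.
    for series in results:
        bucket_name = series.resource.labels["bucket_name"]
        total = sum(point.value.int64_value for point in series.points)
        print(bucket_name, total)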

Generate load on the serving bucket

Now that monitoring is configured, use Apache Bench to send traffic to the serving bucket.

  1. In Cloud Shell, send requests to the object in the serving bucket:

    ab -n 10000 \
        http://storage.googleapis.com/$PROJECT_ID-serving-bucket/testfile.txt
    
  2. In the Google Cloud console, go to Monitoring.

    Go to Cloud Monitoring

  3. To select your Bucket Usage dashboard, hold your pointer over Dashboards and select Bucket Usage. Confirm that there is traffic only to the serving bucket. The request_count metric time series is displayed only for the serving bucket, because the idle bucket doesn't have any traffic to it.

Review and deploy the Cloud Function

  1. In Cloud Shell, output the code that uses the Cloud Function to migrate a storage bucket to the Nearline Storage class:

    cat $WORKDIR/migrate-storage/main.py | grep "migrate_storage(" -A 15
    

    The output is as follows:

    def migrate_storage(request):
        # process incoming request to get the bucket to be migrated:
        request_json = request.get_json(force=True)
        # bucket names are globally unique
        bucket_name = request_json['incident']['resource_name']
    
        # create storage client
        storage_client = storage.Client()
    
        # get bucket
        bucket = storage_client.get_bucket(bucket_name)
    
        # update storage class
        bucket.storage_class = "NEARLINE"
        bucket.patch()
    

    The Cloud Function uses the bucket name passed in the request to change the bucket's storage class to Nearline Storage. For the shape of the request that it expects, see the sketch after these steps.

  2. Deploy the Cloud Function:

    gcloud functions deploy migrate_storage --trigger-http --runtime=python37
    
  3. Set the trigger URL as an environment variable that you use in the next section:

    export FUNCTION_URL=$(gcloud functions describe migrate_storage \
        --format=json | jq -r '.httpsTrigger.url')
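
Before you test the function in the next section, it can help to see the shape of the payload that it expects. Based on the request_json['incident']['resource_name'] field that migrate_storage reads, a minimal test payload presumably looks like the following sketch; the function URL and bucket name are placeholders, not the repository's incident.json contents.

    import json
    import urllib.request

    # Sketch (assumption): a minimal mock of the Monitoring alert payload that
    # migrate_storage expects. The URL and bucket name are placeholders.
    function_url = "https://REGION-PROJECT_ID.cloudfunctions.net/migrate_storage"
    payload = {
        "incident": {
            # migrate_storage reads this field as the name of the bucket to migrate.
            "resource_name": "PROJECT_ID-idle-bucket"
        }
    }

    request = urllib.request.Request(
        function_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        print(response.read().decode())  # "OK" on success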
    

Test and validate alerting automation

  1. Set the idle bucket name:

    export IDLE_BUCKET_NAME=$PROJECT_ID-idle-bucket
    
  2. Send a test notification to the Cloud Function you deployed by using the incident.json file:

    envsubst < $WORKDIR/migrate-storage/incident.json | curl -X POST \
        -H "Content-Type: application/json" $FUNCTION_URL -d @-
    

    The output is as follows:

    OK
    

    The output isn't terminated with a newline and therefore is immediately followed by the command prompt.

  3. Confirm that the idle bucket was migrated to Nearline Storage:

    gsutil defstorageclass get gs://$PROJECT_ID-idle-bucket
    

    The output is as follows:

    gs://automating-cost-optimization-idle-bucket: NEARLINE
    

Considerations for a production environment

When you automate cost optimizations in your own Google Cloud environment, consider the following:

  • General considerations: You should increase security for Cloud Functions that have the power to modify or delete Google Cloud resources.
  • Identifying waste: This document covers a few examples of wasted spending. There are many other examples that generally fall into one of three categories:
    • Overprovisioned resources: Resources that are provisioned to be larger than necessary for a given workload, such as VMs with more CPU power and memory than necessary.
    • Idle resources: Resources that are entirely unused.
    • Part-time idle resources: Resources that are only used during business hours.
  • Automating cleanup: In this document, snapshotting and then deleting the disk required a multi-step process with multiple asynchronous operations. Other Google Cloud resources, such as unused IP addresses, can be cleaned up with a single synchronous operation.
  • Deploying at scale: In this document, the Google Cloud project ID is defined in the Cloud Function code. To deploy such a solution at scale, consider using either the Cloud Billing or Cloud Resource Manager APIs to get the list of projects associated with a billing account or an organization, and then pass those Google Cloud project IDs as variables to the function (see the sketch after this list). In such a configuration, you need to add the service account of the Cloud Function to the projects where it can clean up or delete resources. We recommend using an automated deployment framework, such as Cloud Deployment Manager or Terraform.
  • Alerting automation: This document shows you how to use a mock payload from a Monitoring alert to trigger the storage class migration. Monitoring alerting policies can be evaluated over a maximum of 23 hours and 59 minutes. In a production environment, this restriction might not be long enough to consider a bucket idle before migrating its storage class. Consider enabling data access audit logs on the Cloud Storage bucket and creating a pipeline that consumes these audit logs to evaluate whether a bucket has been used for serving in the last 30 days. For more information, review understanding audit logs and consider creating an aggregated sink to send logs to Pub/Sub and a Dataflow pipeline to process them.
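
The following is a minimal sketch of how a cleanup function might enumerate the projects to operate on by using the Cloud Resource Manager v1 API; the ACTIVE lifecycle-state filter is an assumption, and the function's service account needs permissions on each project that it lists.

    from googleapiclient import discovery

    # Sketch (assumption): list ACTIVE project IDs with the Cloud Resource
    # Manager v1 API so that a cleanup function can iterate over them.
    crm = discovery.build('cloudresourcemanager', 'v1')

    project_ids = []
    request = crm.projects().list(filter='lifecycleState:ACTIVE')
    while request is not None:
        response = request.execute()
        project_ids.extend(p['projectId'] for p in response.get('projects', []))
        request = crm.projects().list_next(
            previous_request=request, previous_response=response)

    print(project_ids)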

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next