Programmatically Scaling Cloud Bigtable

In some cases, it can be useful to scale your Cloud Bigtable cluster programmatically based on metrics such as the cluster's CPU usage. For example, if your cluster is under heavy load, and its CPU usage is extremely high, you can add nodes to the cluster until its CPU usage drops. You can also save money by removing nodes from the cluster when it is not being used heavily.

This page explains how to scale your Cloud Bigtable cluster programmatically and provides a code sample that you can use as a starting point. It also describes some limitations that you should be aware of before you set up programmatic scaling.

How to scale Cloud Bigtable programmatically

Cloud Bigtable exposes a variety of metrics through the Stackdriver Monitoring API. You can programmatically monitor these metrics for your cluster, and then use one of the Cloud Bigtable client libraries or the gcloud command-line tool to add or remove nodes based on the current metrics. After you resize your cluster, you can monitor its performance through the Google Cloud Platform Console or programmatically.
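The monitor-then-resize loop described above boils down to a decision function that compares the current CPU load against an upscale threshold and a downscale threshold. The following sketch illustrates that logic; the threshold values, step size, and node-count limits are illustrative assumptions, not recommendations:

```python
def target_node_count(cpu_load, current_nodes,
                      upscale_threshold=0.7, downscale_threshold=0.5,
                      step=3, min_nodes=3, max_nodes=30):
    """Decide how many nodes the cluster should have.

    cpu_load is a fraction between 0.0 and 1.0, as reported by the
    bigtable.googleapis.com/cluster/cpu_load metric.
    """
    if cpu_load > upscale_threshold:
        # Cluster is under heavy load: add nodes, up to the maximum.
        return min(current_nodes + step, max_nodes)
    if cpu_load < downscale_threshold:
        # Cluster is lightly loaded: remove nodes, down to the minimum.
        return max(current_nodes - step, min_nodes)
    # Load is within the acceptable band: leave the cluster alone.
    return current_nodes
```

In a real tool, you would call a function like this periodically with the latest metric value, then apply the result with a client library or the gcloud tool.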

Monitoring API metrics

The Monitoring API provides a variety of metrics that you can use to monitor the current state of your cluster. Some of the most useful metrics for programmatic scaling include:

  • bigtable.googleapis.com/cluster/cpu_load: The cluster's CPU load.
  • bigtable.googleapis.com/cluster/node_count: The number of nodes in the cluster.
  • bigtable.googleapis.com/server/latencies: The distribution of server request latencies for a table.
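When you query the Monitoring API, you select one of these metrics with a filter string, as the Java sample later on this page does. A minimal helper that builds such a filter:

```python
def metric_filter(metric_type):
    """Build a Monitoring API filter string that selects one metric type."""
    return 'metric.type="{}"'.format(metric_type)

CPU_METRIC = 'bigtable.googleapis.com/cluster/cpu_load'
print(metric_filter(CPU_METRIC))
# metric.type="bigtable.googleapis.com/cluster/cpu_load"
```

You can narrow the filter further with resource labels to target a specific cluster; see the Monitoring API documentation for the available labels.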

Sample code

As a starting point for your own programmatic scaling tool, you can use the sample tools that are available for Java and Python.

The sample tools add nodes to a Cloud Bigtable cluster when its CPU load is above a specified threshold, and remove nodes when its CPU load is below a specified threshold. To run the sample tools, follow the instructions for each sample on GitHub.

The sample tools use the following code to gather information about the CPU load on the cluster:

Java

// Query the CPU load metric over the last 5 minutes.
Timestamp now = timeXMinutesAgo(0);
Timestamp fiveMinutesAgo = timeXMinutesAgo(5);
TimeInterval interval =
    TimeInterval.newBuilder().setStartTime(fiveMinutesAgo).setEndTime(now).build();
String filter = "metric.type=\"" + CPU_METRIC + "\"";
ListTimeSeriesPagedResponse response =
    metricServiceClient.listTimeSeries(projectName, filter, interval, TimeSeriesView.FULL);
// Return the most recent data point from the first matching time series.
return response.getPage().getValues().iterator().next().getPointsList().get(0);

Python

client = monitoring.Client()
# Query the CPU load metric over the last 5 minutes.
query = client.query('bigtable.googleapis.com/cluster/cpu_load', minutes=5)
time_series = list(query)
# Take the first time series and its most recent data point.
recent_time_series = time_series[0]
return recent_time_series.points[0].value

Based on the CPU load, the sample tools use the Cloud Bigtable client library to resize the cluster:

Java

double latestValue = getLatestValue().getValue().getDoubleValue();
if (latestValue < CPU_PERCENT_TO_DOWNSCALE) {
  int clusterSize = clusterUtility.getClusterNodeCount(clusterId, zoneId);
  if (clusterSize > MIN_NODE_COUNT) {
    clusterUtility.setClusterSize(clusterId, zoneId,
      Math.max(clusterSize - SIZE_CHANGE_STEP, MIN_NODE_COUNT));
  }
} else if (latestValue > CPU_PERCENT_TO_UPSCALE) {
  int clusterSize = clusterUtility.getClusterNodeCount(clusterId, zoneId);
  if (clusterSize <= MAX_NODE_COUNT) {
    clusterUtility.setClusterSize(clusterId, zoneId,
      Math.min(clusterSize + SIZE_CHANGE_STEP, MAX_NODE_COUNT));
  }
}

Python

bigtable_client = bigtable.Client(admin=True)
instance = bigtable_client.instance(bigtable_instance)
instance.reload()

cluster = instance.cluster(bigtable_cluster)
cluster.reload()

current_node_count = cluster.serve_nodes

if scale_up:
    if current_node_count < max_node_count:
        new_node_count = min(
            current_node_count + size_change_step, max_node_count)
        cluster.serve_nodes = new_node_count
        cluster.update()
        print('Scaled up from {} to {} nodes.'.format(
            current_node_count, new_node_count))
else:
    if current_node_count > min_node_count:
        new_node_count = max(
            current_node_count - size_change_step, min_node_count)
        cluster.serve_nodes = new_node_count
        cluster.update()
        print('Scaled down from {} to {} nodes.'.format(
            current_node_count, new_node_count))

After the cluster is resized, you can use the Google Cloud Platform Console to monitor how its performance changes over time.

Limitations

Before you set up programmatic scaling for your Cloud Bigtable cluster, be sure to consider the following limitations.

Delay in performance improvements

After you add nodes to a cluster, it can take up to 20 minutes under load before you see a significant improvement in the cluster's performance. As a result, if your workload involves short bursts of high activity, adding nodes to your cluster based on CPU load will not improve performance: by the time Cloud Bigtable rebalances your data, the short burst of activity will be over.

To address this issue, you can add nodes to your cluster, either programmatically or through the Google Cloud Platform Console, before you increase the load on the cluster. This approach gives Cloud Bigtable time to rebalance your data across the additional nodes.
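If you know when a burst of load will arrive, the pre-scaling approach described above can itself be scripted. The sketch below is a minimal illustration, assuming a single known peak window; the 20-minute lead time reflects the rebalancing delay described above, and the node counts are hypothetical:

```python
from datetime import datetime, timedelta

# Allow time for Cloud Bigtable to rebalance data across the new nodes.
REBALANCE_LEAD = timedelta(minutes=20)

def scheduled_node_count(now, peak_start, peak_end,
                         baseline_nodes=3, peak_nodes=10):
    """Return the desired node count for the given time, scaling up
    REBALANCE_LEAD before a known peak window begins."""
    if peak_start - REBALANCE_LEAD <= now <= peak_end:
        return peak_nodes
    return baseline_nodes
```

A scheduler would call a function like this periodically and apply the result, instead of waiting for the CPU load metric to react.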

Schema design issues

If there are problems with the schema design for your table, adding nodes to your Cloud Bigtable cluster may not improve performance. For example, if you have a large number of reads or writes to a single row in your table, all of the reads or writes will go to the same node in your cluster; as a result, adding nodes will not improve performance. In contrast, if reads and writes are evenly distributed across rows in your table, adding nodes will generally improve performance.
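For example, row keys that begin with a monotonically increasing value, such as a timestamp, concentrate sequential writes on one node, while keys that lead with a well-distributed field spread them across the cluster. A hypothetical illustration of the two key shapes:

```python
def hotspotting_key(timestamp, device_id):
    """Timestamp-first key: sequential writes all land on one node."""
    return '{}#{}'.format(timestamp, device_id)

def distributed_key(timestamp, device_id):
    """Device-first key: writes spread across the key space."""
    return '{}#{}'.format(device_id, timestamp)

print(hotspotting_key(1571231200, 'sensor-42'))  # 1571231200#sensor-42
print(distributed_key(1571231200, 'sensor-42'))  # sensor-42#1571231200
```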

See Designing Your Schema for details about how to design a schema that allows Cloud Bigtable to scale effectively.
