To use Cloud Bigtable, you create instances, which contain up to 4 clusters that your applications can connect to. Each cluster contains nodes, the compute units that manage your data and perform maintenance tasks.
This page provides more information about Bigtable instances, clusters, and nodes.
- To learn how to create an instance, see Creating an Instance.
- To learn how to manage an instance's clusters, see Adding and deleting clusters.
- To learn how to monitor an instance and its clusters, see Monitoring an instance.
- To learn how to update the number of nodes in a cluster, see Adding and removing nodes.
Before you read this page, you should be familiar with the overview of Bigtable.
A table belongs to an instance, not to a cluster or node. If you have an instance with more than one cluster, you are using replication. This means you can't assign a table to an individual cluster or create unique garbage collection policies for each cluster in an instance. You also can't make each cluster store a different set of data in the same table.
An instance has a few important properties that you need to know about:
- The storage type (SSD or HDD)
- The application profiles, which are primarily for instances that use replication
The following sections describe these properties.
When you create an instance, you must choose whether the instance's clusters will store data on solid-state drives (SSD) or hard disk drives (HDD). SSD is often, but not always, the most efficient and cost-effective choice.
The choice between SSD and HDD is permanent, and every cluster in your instance must use the same type of storage, so make sure you pick the right storage type for your use case. See Choosing between SSD and HDD storage for more information to help you decide.
After you create an instance, Bigtable uses the instance to store application profiles, or app profiles. For instances that use replication, app profiles control how your applications connect to the instance's clusters.
If your instance doesn't use replication, you can still use app profiles to provide separate identifiers for each of your applications, or each function within an application. You can then view separate charts for each app profile in the Cloud Console.
A cluster represents the Bigtable service in a specific location. Each cluster belongs to a single Bigtable instance, and an instance can have up to 4 clusters. When your application sends requests to a Bigtable instance, those requests are handled by one of the clusters in the instance.
Each cluster is located in a single zone.
An instance's clusters must each be in unique zones. You can create an additional cluster in any zone where Bigtable is available. For example, if the first cluster is in us-east1-b, you can choose a different zone in the same region, such as us-east1-c, or a zone in a separate region. For a list of zones and regions where Bigtable is available, see Bigtable locations.
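The placement rules above (at most 4 clusters per instance, each in its own zone) can be checked before you request any resources. The following is a minimal sketch in plain Python. `ClusterPlan` and `validate_cluster_zones` are hypothetical names used for illustration and are not part of any Bigtable client library.

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_CLUSTERS_PER_INSTANCE = 4  # limit stated on this page


@dataclass(frozen=True)
class ClusterPlan:
    """Hypothetical planning record; not a Bigtable client library type."""
    cluster_id: str
    zone: str


def validate_cluster_zones(clusters: List[ClusterPlan]) -> List[str]:
    """Return human-readable problems with a proposed cluster layout."""
    problems: List[str] = []
    if len(clusters) > MAX_CLUSTERS_PER_INSTANCE:
        problems.append(
            f"{len(clusters)} clusters requested; the limit is "
            f"{MAX_CLUSTERS_PER_INSTANCE}"
        )
    # Remember the first cluster seen in each zone so duplicates can be named.
    first_in_zone: Dict[str, str] = {}
    for cluster in clusters:
        if cluster.zone in first_in_zone:
            problems.append(
                f"{cluster.cluster_id} and {first_in_zone[cluster.zone]} "
                f"are both in zone {cluster.zone}"
            )
        else:
            first_in_zone[cluster.zone] = cluster.cluster_id
    return problems
```

For example, a plan with one cluster in us-east1-b and another in us-east1-c produces no problems, while two clusters in us-east1-b produce one.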
Bigtable instances with only 1 cluster do not use replication. If you add a second cluster to an instance, Bigtable automatically starts replicating your data by keeping separate copies of the data in each of the clusters' zones and synchronizing updates between the copies. You can choose which cluster your applications connect to, which makes it possible to isolate different types of traffic from one another. You can also let Bigtable balance traffic between clusters. If a cluster becomes unavailable, you can fail over from one cluster to another. To learn more about how replication works, see Overview of Replication.
Each cluster in an instance has 1 or more nodes, which are compute resources that Bigtable uses to manage your data.
Behind the scenes, Bigtable splits all of the data in a table into separate tablets. Tablets are stored on disk, separate from the nodes but in the same zone as the nodes. A tablet is associated with a single node.
Each node is responsible for:
- Keeping track of specific tablets on disk.
- Handling incoming reads and writes for its tablets.
- Performing maintenance tasks on its tablets, such as periodic compactions.
A cluster must have enough nodes to support its current workload and the amount of data it stores. Otherwise, the cluster might not be able to handle incoming requests, and latency could go up. Monitor your clusters' CPU and disk usage, and add nodes to a cluster when its metrics exceed the recommendations and limits listed below.
For more details about how Bigtable stores and manages data, see Bigtable architecture.
Bigtable reports the following metrics for CPU usage:
|Metric|Description|
|---|---|
|Average CPU utilization|The average CPU utilization across all nodes in the cluster. The recommended maximum values provide headroom for brief spikes in usage. If a cluster exceeds the recommended maximum value for your configuration for more than a few minutes, add nodes to the cluster.|
|CPU utilization of hottest node|CPU utilization for the busiest node in the cluster. If the hottest node is frequently above the recommended value, even when your average CPU utilization is reasonable, you might be accessing a small part of your data much more frequently than the rest of your data.|
|CPU utilization by app profile, method, and table|CPU utilization broken down by app profile, method, and table. If you observe higher than expected CPU usage for a cluster, use this metric to determine if the CPU usage of a particular app profile, API method, or table is driving the CPU load.|
The values for these metrics should not exceed the following:
|Configuration|Recommended maximum values|
|---|---|
|1 cluster|70% average CPU utilization|
|Any number of clusters with single-cluster routing|70% average CPU utilization|
|2 clusters with multi-cluster routing|35% average CPU utilization|
|3 or more clusters with multi-cluster routing|Depends on your configuration. See the examples of replication settings for common use cases.|
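As a rough illustration, the recommended maximums for average CPU utilization can be encoded as a small lookup. This is a sketch under stated assumptions, not an official tool: the function names are hypothetical, the utilization figure is assumed to come from your own monitoring, and a 1-cluster instance is assumed to follow the 70% recommendation regardless of routing policy.

```python
from typing import Optional


def recommended_max_avg_cpu(
    num_clusters: int, multi_cluster_routing: bool
) -> Optional[float]:
    """Return the recommended maximum average CPU utilization in percent.

    Returns None when this page gives no single number (3 or more
    clusters with multi-cluster routing).
    """
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Assumption: a 1-cluster instance is treated like single-cluster routing.
    if num_clusters == 1 or not multi_cluster_routing:
        return 70.0
    if num_clusters == 2:
        return 35.0
    return None  # depends on your configuration


def exceeds_recommended_cpu(
    avg_cpu_percent: float, num_clusters: int, multi_cluster_routing: bool
) -> Optional[bool]:
    """Compare one utilization sample against the recommendation.

    Returns None when no single recommended value applies.
    """
    limit = recommended_max_avg_cpu(num_clusters, multi_cluster_routing)
    if limit is None:
        return None
    return avg_cpu_percent > limit
```

This check looks at a single sample. The guidance is to add nodes only when a cluster stays above the recommended value for more than a few minutes, so a real check would evaluate a window of samples rather than one reading.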
Bigtable reports the following metrics for disk usage:
|Metric|Description|
|---|---|
|Storage utilization (bytes)|The amount of data stored in the cluster. This value affects your costs. Also, as described below, you might need to add nodes to each cluster as the amount of data increases.|
|Storage utilization (% max)|The percentage of the cluster's storage capacity that is being used. The capacity is based on the number of nodes in your cluster. In general, do not use more than 70% of the hard limit on total storage, so you have room to add more data. If you do not plan to add significant amounts of data to your instance, you can use up to 100% of the hard limit. If you are using more than the recommended percentage of the storage limit, add nodes to the cluster. You can also delete existing data, but deleted data takes up more space, not less, until a compaction occurs. For details about how this value is calculated, see Storage utilization per node.|
|Disk load|The percentage your cluster is using of the maximum possible bandwidth for HDD reads and writes. Available only for HDD clusters. If this value is frequently at 100%, you might experience increased latency. Add nodes to the cluster to reduce the disk load percentage.|
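The disk usage guidance can be summarized as two threshold checks. The sketch below is illustrative only: `disk_usage_advice` and its parameters are hypothetical names, and the input percentages are assumed to be read from your monitoring charts.

```python
from typing import List, Optional


def disk_usage_advice(
    storage_utilization_pct: float,
    disk_load_pct: Optional[float] = None,
    expect_data_growth: bool = True,
) -> List[str]:
    """Return suggested actions based on the disk usage guidance on this page.

    disk_load_pct applies only to HDD clusters; pass None for SSD clusters.
    """
    advice: List[str] = []
    # 70% leaves room to add data; 100% of the hard limit is acceptable
    # only when you do not plan to add significant amounts of data.
    storage_limit = 70.0 if expect_data_growth else 100.0
    if storage_utilization_pct > storage_limit:
        advice.append(
            f"storage utilization {storage_utilization_pct:.0f}% is above "
            f"{storage_limit:.0f}%: add nodes to the cluster"
        )
    if disk_load_pct is not None and disk_load_pct >= 100.0:
        advice.append(
            "disk load is at 100%: add nodes to the cluster if this "
            "happens frequently"
        )
    return advice
```

Deleting data is not modeled here because deleted data temporarily takes up more space until a compaction occurs, so it does not reduce utilization right away.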
Nodes for replicated clusters
In an instance that uses replication, make sure each cluster has enough nodes to support your use case:
- If you use replication to provide high availability, or if you use multi-cluster routing in any of your app profiles, each cluster should have the same number of nodes. Also, as shown above under CPU usage, the recommended CPU utilization is reduced by half. This configuration helps ensure that if an automatic failover is necessary, the responsive cluster has enough capacity to handle all of your traffic.
- If all of your app profiles use single-cluster routing, each cluster can have a different number of nodes. Resize each cluster as needed based on the cluster's workload. Because Bigtable stores a separate copy of your data with each cluster, each cluster must always have enough nodes to support your disk usage and to replicate writes between clusters. You can still fail over manually from one cluster to another if necessary. However, if one cluster has many more nodes than another, and you need to fail over to the cluster with fewer nodes, you might need to add nodes first. There is no guarantee that additional nodes will be available when you need to fail over. The only way to reserve nodes in advance is to add them to your cluster.
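The node-count guidance for replicated instances can be expressed as a simple review of per-cluster node counts. This is a minimal sketch with hypothetical names (`review_replicated_node_counts`, `needs_automatic_failover`). It only compares node counts and does not measure actual load.

```python
from typing import Dict, List


def review_replicated_node_counts(
    nodes_per_cluster: Dict[str, int],
    needs_automatic_failover: bool,
) -> List[str]:
    """Flag node-count imbalances in an instance that uses replication.

    Set needs_automatic_failover to True if you rely on replication for
    high availability or use multi-cluster routing in any app profile.
    """
    notes: List[str] = []
    if len(nodes_per_cluster) < 2:
        return notes  # a 1-cluster instance does not use replication
    smallest = min(nodes_per_cluster, key=nodes_per_cluster.get)
    largest = max(nodes_per_cluster, key=nodes_per_cluster.get)
    if nodes_per_cluster[smallest] == nodes_per_cluster[largest]:
        return notes  # all clusters already have the same number of nodes
    if needs_automatic_failover:
        notes.append(
            "each cluster should have the same number of nodes so that any "
            "cluster can absorb all traffic after an automatic failover"
        )
    else:
        notes.append(
            f"manual failover to {smallest} "
            f"({nodes_per_cluster[smallest]} nodes) might require adding "
            f"nodes first; {largest} has {nodes_per_cluster[largest]}"
        )
    return notes
```

With single-cluster routing, an imbalance is allowed, so the function only notes the manual failover risk instead of recommending equal node counts.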
- Create a Bigtable instance.
- Monitor an existing Bigtable instance.
- Add nodes to a cluster in a Bigtable instance.
- Find out how replication works.
- Enable replication by adding a cluster to a Bigtable instance.