Instances, clusters, and nodes

To use Bigtable, you create instances, which contain clusters that your applications can connect to. Each cluster contains nodes, the compute units that manage your data and perform maintenance tasks.

This page provides more information about Bigtable instances, clusters, and nodes.

Before you read this page, you should be familiar with the overview of Bigtable.

Instances

A Bigtable instance is a container for your data. Instances have one or more clusters, located in different zones. Each cluster has at least 1 node.

A table belongs to an instance, not to a cluster or node. If an instance has more than one cluster, it uses replication, and every cluster stores a copy of the same tables. As a result, you can't assign a table to an individual cluster or create unique garbage collection policies for each cluster in an instance. You also can't make each cluster store a different set of data in the same table.
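
The containment rules above can be sketched as a small data model. This is only an illustration, not the Bigtable client API: the class names `Cluster` and `Instance`, their fields, and the example IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    cluster_id: str
    zone: str
    nodes: int = 1  # each cluster has at least 1 node

    def __post_init__(self) -> None:
        if self.nodes < 1:
            raise ValueError("a cluster needs at least 1 node")


@dataclass
class Instance:
    instance_id: str
    clusters: list[Cluster] = field(default_factory=list)
    # Tables are attached to the instance, so every cluster serves the same tables.
    tables: set[str] = field(default_factory=set)

    @property
    def uses_replication(self) -> bool:
        # More than one cluster means the instance uses replication.
        return len(self.clusters) > 1


single = Instance("demo", [Cluster("demo-c1", "us-east1-b")], {"events"})
replicated = Instance(
    "demo-repl",
    [
        Cluster("demo-repl-c1", "us-east1-b", nodes=3),
        Cluster("demo-repl-c2", "europe-west2-a", nodes=3),
    ],
    {"events"},
)
print(single.uses_replication, replicated.uses_replication)  # False True
```

Because `tables` lives on `Instance` rather than on `Cluster`, the model has no way to express a table that exists in only one cluster, which mirrors the restriction described above.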

An instance has two important properties:

  • The storage type (SSD or HDD)
  • The application profiles, which are primarily for instances that use replication

The following sections describe these properties.

Storage types

When you create an instance, you must choose whether the instance's clusters will store data on solid-state drives (SSD) or hard disk drives (HDD). SSD is often, but not always, the most efficient and cost-effective choice.

The choice between SSD and HDD is permanent, and every cluster in your instance must use the same type of storage, so make sure you pick the right storage type for your use case. See Choosing between SSD and HDD storage for more information to help you decide.

Application profiles

After you create an instance, Bigtable uses the instance to store application profiles, or app profiles. For instances that use replication, app profiles control how your applications connect to the instance's clusters.

If your instance doesn't use replication, you can still use app profiles to provide separate identifiers for each of your applications, or each function within an application. You can then view separate charts for each app profile in the Google Cloud console.

To learn more about app profiles, see application profiles. To learn how to set up your instance's app profiles, see Configuring app profiles.

Clusters

A cluster represents the Bigtable service in a specific location. Each cluster belongs to a single Bigtable instance. When your application sends requests to a Bigtable instance, those requests are handled by one of the clusters in the instance.

Each cluster is located in a single zone. An instance can have clusters in up to 8 regions where Bigtable is available. Each zone in a region can contain only one cluster. For example, if an instance has a cluster in us-east1-b, you can add a cluster in a different zone in the same region, such as us-east1-c, or a zone in a separate region, such as europe-west2-a.

The number of clusters that you can create in an instance depends on the number of available zones in the regions that you choose. For example, if you create clusters in 8 regions that have 3 zones each, the maximum number of clusters that the instance can have is 24. For a list of zones and regions where Bigtable is available, see Bigtable locations.
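
The placement rules and the arithmetic in the example can be checked with a short sketch. The function names are hypothetical, and `region_of` assumes that a zone name is the region name plus a final `-<letter>` suffix, as in `us-east1-b`. The region names passed to `max_clusters` are placeholders.

```python
MAX_REGIONS = 8  # per-instance region limit stated in the text


def region_of(zone: str) -> str:
    """Derive the region from a zone name such as 'us-east1-b'."""
    return zone.rsplit("-", 1)[0]


def validate_cluster_zones(zones: list[str]) -> None:
    """Raise ValueError if the proposed cluster zones break the placement rules."""
    if len(set(zones)) != len(zones):
        raise ValueError("each zone can contain only one cluster")
    regions = {region_of(zone) for zone in zones}
    if len(regions) > MAX_REGIONS:
        raise ValueError(
            f"clusters span {len(regions)} regions; the limit is {MAX_REGIONS}"
        )


def max_clusters(zones_per_region: dict[str, int]) -> int:
    """Upper bound on clusters: one per available zone in the chosen regions."""
    if len(zones_per_region) > MAX_REGIONS:
        raise ValueError(f"at most {MAX_REGIONS} regions are allowed")
    return sum(zones_per_region.values())


# Two zones in one region plus a zone in another region is a valid layout.
validate_cluster_zones(["us-east1-b", "us-east1-c", "europe-west2-a"])

# 8 regions with 3 zones each allow at most 24 clusters.
print(max_clusters({f"region-{i}": 3 for i in range(8)}))  # 24
```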

Bigtable instances that have only 1 cluster don't use replication. If you add a second cluster to an instance, Bigtable automatically starts replicating your data by keeping separate copies of the data in each of the clusters' zones and synchronizing updates between the copies. You can choose which cluster your applications connect to, which makes it possible to isolate different types of traffic from one another. You can also let Bigtable balance traffic between clusters. If a cluster becomes unavailable, you can fail over from one cluster to another. To learn more about how replication works, see the replication overview.
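
As a rough mental model of these routing choices, the sketch below pins a request to one cluster or lets it go to any available cluster. It is a simplification under stated assumptions: `AppProfile` and `pick_cluster` are hypothetical names, and choosing the first available cluster stands in for whatever selection Bigtable actually performs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppProfile:
    name: str
    # A cluster ID pins traffic to that cluster.
    # None lets requests go to any available cluster.
    pinned_cluster: Optional[str] = None


def pick_cluster(
    profile: AppProfile, clusters: list[str], available: set[str]
) -> Optional[str]:
    """Return the cluster that would serve a request, or None if none can."""
    if profile.pinned_cluster is not None:
        # Pinned traffic does not move on its own; failing over is a manual step.
        if profile.pinned_cluster in available:
            return profile.pinned_cluster
        return None
    # Unpinned traffic falls through to another cluster automatically.
    for cluster in clusters:
        if cluster in available:
            return cluster
    return None


clusters = ["c1", "c2"]
batch = AppProfile("batch", pinned_cluster="c2")  # isolates batch traffic on c2
serving = AppProfile("serving")                   # can use any cluster

print(pick_cluster(batch, clusters, {"c1", "c2"}))  # c2
print(pick_cluster(serving, clusters, {"c2"}))      # c2 (c1 is unavailable)
print(pick_cluster(batch, clusters, {"c1"}))        # None
```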

In most cases, you should enable autoscaling for a cluster, so that Bigtable adds and removes nodes as needed to handle the cluster's workloads.

Nodes

Each cluster in an instance has 1 or more nodes, which are compute resources that Bigtable uses to manage your data.

Behind the scenes, Bigtable splits all of the data in a table into separate tablets. Tablets are stored on disk, separate from the nodes but in the same zone as the nodes. A tablet is associated with a single node.

Each node is responsible for:

  • Keeping track of specific tablets on disk.
  • Handling incoming reads and writes for its tablets.
  • Performing maintenance tasks on its tablets, such as periodic compactions.

A cluster must have enough nodes to support its current workload and the amount of data it stores. Otherwise, the cluster might not be able to handle incoming requests, and latency could go up. Monitor your clusters' CPU and disk usage, and add nodes to a cluster when its metrics exceed the recommendations at Plan your capacity.
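
A sizing check along these lines can be sketched as follows. The thresholds are parameters because the recommended values live in Plan your capacity, not on this page. The numbers in the example are placeholders, not recommendations. The sketch also assumes that CPU load spreads evenly across nodes, which is a simplification.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def nodes_needed(
    current_nodes: int,
    cpu_percent: int,          # observed average CPU utilization, 0-100
    storage_gb: int,           # data currently stored in the cluster
    target_cpu_percent: int,   # placeholder: take the value from your capacity plan
    storage_gb_per_node: int,  # placeholder: take the value from your capacity plan
) -> int:
    """Smallest node count that satisfies both the CPU and the storage limit."""
    # Total CPU work, in node-percent units, divided by the per-node target.
    for_cpu = ceil_div(current_nodes * cpu_percent, target_cpu_percent)
    # Stored data divided by what one node is allowed to hold.
    for_storage = ceil_div(storage_gb, storage_gb_per_node)
    return max(1, for_cpu, for_storage)


# Placeholder numbers: 4 nodes at 90% CPU with a 60% target are CPU-bound.
print(nodes_needed(4, 90, 10_000, target_cpu_percent=60, storage_gb_per_node=4_000))  # 6
```

Taking the maximum of the two estimates reflects the statement above that a cluster needs enough nodes for both its workload and the amount of data it stores.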

For more details about how Bigtable stores and manages data, see Bigtable architecture.

Nodes for replicated clusters

When your instance has more than one cluster, failover becomes a consideration when you configure the maximum number of nodes for autoscaling or manually allocate the nodes.

  • If you use multi-cluster routing in any of your app profiles, automatic failover can occur if one or more clusters are unavailable.

  • When you manually fail over from one cluster to another, or when automatic failover occurs, the receiving cluster should ideally have enough capacity to support the load. You can either always allocate enough nodes to support failover, which can be costly, or rely on autoscaling to add nodes when traffic fails over. If you rely on autoscaling, there might be a brief impact on performance while the cluster scales up.

  • If all of your app profiles use single-cluster routing, each cluster can have a different number of nodes. Resize each cluster as needed based on the cluster's workload.

    Because Bigtable stores a separate copy of your data with each cluster, each cluster must always have enough nodes to support your disk usage and to replicate writes between clusters.

    You can still fail over manually from one cluster to another if necessary. However, if one cluster has many more nodes than another, and you need to fail over to the cluster with fewer nodes, you might need to add nodes first. There is no guarantee that additional nodes will be available when you need to fail over. The only way to reserve nodes in advance is to add them to your cluster.
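
The failover considerations above can be turned into a rough headroom estimate. This is a deliberately simple model, not a Bigtable formula: it assumes that all of the failed cluster's CPU load moves to one receiving cluster and that load spreads evenly across nodes. Because replicated writes are already applied on every cluster, it likely overestimates the shifted load. The function name and the 60% target are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def extra_nodes_for_failover(
    failed_nodes: int,
    failed_cpu_percent: int,
    receiver_nodes: int,
    receiver_cpu_percent: int,
    target_cpu_percent: int,  # placeholder: take the value from your capacity plan
) -> int:
    """Nodes to add to the receiving cluster so it stays at or below the target."""
    # Combined load in node-percent units once traffic has failed over.
    load = failed_nodes * failed_cpu_percent + receiver_nodes * receiver_cpu_percent
    needed = ceil_div(load, target_cpu_percent)
    return max(0, needed - receiver_nodes)


# Failing over from a 6-node cluster at 40% CPU to a 3-node cluster at 30% CPU.
print(extra_nodes_for_failover(6, 40, 3, 30, target_cpu_percent=60))  # 3

# Failing over in the other direction needs no extra nodes in this model.
print(extra_nodes_for_failover(3, 30, 6, 40, target_cpu_percent=60))  # 0
```

A nonzero result is the case described above: the smaller cluster needs nodes added before or during failover, either through manual allocation or through autoscaling.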

What's next