Examples of replication settings

This page describes some common use cases for enabling Bigtable replication, then presents the settings you can use to support these use cases.

- Isolate batch analytics workloads from other applications
- Create high availability
- Provide near-real-time backup
- Maintain high availability and regional resilience
- Store data close to your users

This page also explains how to decide what settings to use if your use case isn't listed here.

Before you read this page, you should be familiar with the overview of Bigtable replication.

Before you add clusters to an instance, you should be aware of the restrictions that apply when you change garbage collection policies on replicated tables.

Regardless of your use case, always provision enough nodes in every cluster in an instance to ensure that each cluster can handle replication in addition to the load it receives from applications. If a cluster does not have enough nodes, replication delay can increase, the cluster can experience performance issues due to memory buildup, and writes to other clusters in the instance might be rejected.

Isolate batch analytics workloads from other applications

When you use a single cluster to run a batch analytics job that performs numerous large reads alongside an application that performs a mix of reads and writes, the large batch job can slow things down for the application's users. With replication, you can use app profiles with single-cluster routing to route batch analytics jobs and application traffic to different clusters, so that batch jobs don't affect your applications' users.

To isolate two workloads:

Create a new instance with 2 clusters, or add a second cluster to an existing instance.

Follow the standard CPU utilization recommendations for this configuration.
Create 2 app profiles, one called live-traffic and another called batch-analytics.

If your cluster IDs are cluster-a and cluster-b, the live-traffic app profile should route requests to cluster-a and the batch-analytics app profile should route requests to cluster-b. This configuration provides read-your-writes consistency for applications using the same app profile, but not for applications using different app profiles.

You can enable single-row transactions in the live-traffic app profile if necessary. There's no need to enable single-row transactions in the batch-analytics app profile, assuming that you will only use this app profile for reads.
Use the live-traffic app profile to run a live-traffic workload.
While the live-traffic workload is running, use the batch-analytics app profile to run a read-only batch workload.
Monitor the CPU utilization for the instance's clusters, and add nodes to the clusters if necessary.
Monitor client-side latency using a tool of your choice. If you use the HBase client for Java, you can monitor latency with its client-side metrics.
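
The routing plan in these steps can be written down as plain data before you create anything. The following is a minimal Python sketch, not Bigtable client code: `AppProfilePlan` and `workloads_are_isolated` are hypothetical names introduced here for illustration, and the check only confirms that no two planned profiles share a cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfilePlan:
    """Hypothetical planning record; not a type from any Bigtable client."""
    profile_id: str
    cluster_id: str  # target of single-cluster routing
    single_row_transactions: bool = False


def workloads_are_isolated(profiles):
    """Return True when every planned profile routes to a different cluster."""
    targets = [p.cluster_id for p in profiles]
    return len(targets) == len(set(targets))


plan = [
    # Live traffic may need single-row transactions, so they are enabled here.
    AppProfilePlan("live-traffic", "cluster-a", single_row_transactions=True),
    # The batch profile is used only for reads, so transactions stay disabled.
    AppProfilePlan("batch-analytics", "cluster-b"),
]

print(workloads_are_isolated(plan))
```

Running a check like this before creating the app profiles can catch a plan in which the batch job and the live application would end up on the same cluster.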

To isolate two smaller workloads from one larger workload:

Create a new instance with 3 clusters, or add clusters to an existing instance until it has 3 clusters.

Follow the standard CPU utilization recommendations for this configuration.

These steps assume that your clusters use the IDs cluster-a, cluster-b, and cluster-c.

Use the same number of nodes in cluster-a and cluster-b if they are serving the same application. Use a larger number of nodes in cluster-c to support the larger workload.
Create the following app profiles:
- live-traffic-app-a: Single-cluster routing from your application to cluster-a
- live-traffic-app-b: Single-cluster routing from your application to cluster-b
- batch-analytics: Single-cluster routing from the batch analytics job to cluster-c
Use the live-traffic app profiles to run live-traffic workloads.
While the live-traffic workloads are running, use the batch-analytics app profile to run a read-only batch workload.
Monitor the CPU utilization for the instance's clusters, and add nodes to the clusters if necessary.
Monitor client-side latency using a tool of your choice. If you use the HBase client for Java, you can monitor latency with its client-side metrics.

Create high availability (HA)

If an instance has only 1 cluster, your data's durability and availability are limited to the zone where that cluster is located. Replication can improve both durability and availability by storing separate copies of your data in multiple zones or regions and automatically failing over between clusters if needed.

To configure your instance for a high availability (HA) use case, create a new app profile that uses multi-cluster routing, or update the default app profile to use multi-cluster routing. This configuration provides eventual consistency. You won't be able to enable single-row transactions because single-row transactions can cause data conflicts when you use multi-cluster routing.
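
The constraint described above can be expressed as a small validation step. This is a sketch under stated assumptions: `validate_profile` and the labels `"single"` and `"multi"` are hypothetical shorthand introduced here, not part of any Bigtable API. The function rejects the one combination this page rules out and reports the consistency level the page associates with each routing policy.

```python
def validate_profile(routing, single_row_transactions):
    """Check a planned app profile and return its consistency level.

    `routing` is "single" or "multi"; both labels are hypothetical
    shorthand for single-cluster and multi-cluster routing.
    """
    if routing not in ("single", "multi"):
        raise ValueError(f"unknown routing policy: {routing!r}")
    if routing == "multi" and single_row_transactions:
        raise ValueError(
            "single-row transactions cannot be combined with multi-cluster routing"
        )
    # Multi-cluster routing is eventually consistent; single-cluster routing
    # gives read-your-writes consistency within one app profile.
    return "eventual" if routing == "multi" else "read-your-writes"


print(validate_profile("multi", single_row_transactions=False))
```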

To learn more about how Bigtable helps achieve high availability, see Building a Global Data Presence with Bigtable.

Configurations to improve availability include the following.

Clusters in 3 or more different regions (recommended configuration). The recommended configuration for HA is an instance that has N+2 clusters that are each in a different region. For example, if the minimum number of clusters that you need to serve your data is 2, then you need a 4-cluster instance to maintain HA. This configuration provides uptime even in the rare case that 2 regions become unavailable. We recommend that you spread the clusters across multiple continents.

Example configuration:
- cluster-a in zone us-central1-a in Iowa
- cluster-b in zone europe-west1-d in Belgium
- cluster-c in zone asia-east1-b in Taiwan
For this configuration, provision enough nodes to maintain a target of 23% CPU utilization for a 3-cluster, 3-region instance and 35% CPU utilization for a 4-cluster, 4-region instance. This ensures that even if 2 regions are unavailable, the remaining cluster or clusters can serve all the traffic.
Two clusters in the same region but different zones. This option provides high availability as long as the region itself is available, lets you fail over without generating cross-region replication costs, and adds no latency on failover. Your data in a replicated Bigtable instance is available as long as at least one of the zones it is replicated to is available.

Example configuration:
- cluster-a in zone australia-southeast1-a in Sydney
- cluster-b in zone australia-southeast1-b in Sydney
Follow the standard CPU utilization recommendations for this configuration.
Two clusters in different regions. This multi-region configuration provides high availability like the preceding multi-zone configuration, but your data is available even if you cannot connect to one of the regions.

You are charged for replicating writes between regions.

Example configuration:
- cluster-a in zone asia-northeast1-c in Tokyo
- cluster-b in zone asia-east2-b in Hong Kong
Follow the standard CPU utilization recommendations for this configuration.
Two clusters in region A and a third cluster in region B. This option makes your data available even if you cannot connect to one of the regions, and it provides additional capacity in region A.

You are charged for replicating writes between regions. If you write to region A, you are charged once because you have only 1 cluster in region B. If you write to region B, you are charged twice because you have 2 clusters in region A.

Example configuration:
- cluster-a in zone europe-west1-b in Belgium
- cluster-b in zone europe-west1-d in Belgium
- cluster-c in zone europe-north1-c in Finland
Start with a target of 35% CPU utilization in the region with 2 clusters and 70% in the other region. Monitor the instance's clusters and adjust the number of nodes as needed so that each cluster has enough resources to handle a failover.
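
The CPU targets quoted for the multi-region configuration, and the replication charges described for the configuration with 2 clusters in region A and 1 in region B, both follow from simple arithmetic. The sketch below is illustrative only. It assumes that load is spread evenly across clusters and that a cluster carrying all of the traffic alone should stay at or below 70% CPU; that ceiling is an assumption chosen because it reproduces the 23% and 35% figures. The function names are hypothetical.

```python
def cpu_target(total_clusters, clusters_lost, ceiling=70.0):
    """CPU target (percent) that lets the surviving clusters absorb all traffic.

    Assumes load is spread evenly. `ceiling` is an assumed maximum safe
    utilization for clusters that are carrying the full load after a failover.
    """
    surviving = total_clusters - clusters_lost
    if surviving < 1:
        raise ValueError("at least one cluster must survive")
    return ceiling * surviving / total_clusters


def cross_region_copies(write_region, cluster_regions):
    """Count the clusters outside `write_region` that receive a replicated write."""
    return sum(1 for region in cluster_regions.values() if region != write_region)


print(round(cpu_target(3, 2), 1))  # 23.3
print(cpu_target(4, 2))            # 35.0

regions = {
    "cluster-a": "europe-west1",
    "cluster-b": "europe-west1",
    "cluster-c": "europe-north1",
}
print(cross_region_copies("europe-west1", regions))   # 1
print(cross_region_copies("europe-north1", regions))  # 2
```

A write that lands in Belgium is copied across regions once, and a write that lands in Finland is copied across regions twice, which matches the charges described above.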

You can simulate failover for this use case to test your application:

Use an app profile with multi-cluster routing to run a test workload.
Use the Google Cloud console to monitor the instance's clusters and confirm that the clusters are handling incoming requests.
Delete one of the clusters to simulate an outage.

This change also deletes the copy of your data that is stored with the cluster.
Continue to monitor latency and error rates. If the remaining clusters have enough CPU resources, they should be able to keep up with incoming requests.
Add a cluster to the instance, and continue to monitor the instance. Data should start replicating to the new cluster.

Provide near-real-time backup

In some cases—for example, if you can't afford to read stale data—you'll always need to route requests to a single cluster. However, you can still use replication by handling requests with one cluster and keeping another cluster as a near-real-time backup. If the serving cluster becomes unavailable, you can minimize downtime by manually failing over to the backup cluster.

To configure your instance for this use case, create an app profile that uses single-cluster routing, or update the default app profile to use single-cluster routing. The cluster that you specified in your app profile handles incoming requests. The other cluster acts as a backup in case you need to fail over. This arrangement is sometimes known as an active-passive configuration, and it provides both strong consistency and read-your-writes consistency. You can enable single-row transactions in the app profile if necessary.

Follow the standard CPU utilization recommendations for this configuration.

To implement this configuration:

Use the app profile with single-cluster routing to run a workload.
Use the Google Cloud console to monitor the instance's clusters and confirm that only 1 cluster is handling incoming requests.

The other cluster will still use CPU resources to perform replication and other maintenance tasks.
Update the app profile so that it points to the second cluster in your instance.

You will receive a warning about losing read-your-writes consistency, which also means that you lose strong consistency.

If you enabled single-row transactions, you will also receive a warning about the potential for data loss. You will lose data if you send conflicting writes while the failover is occurring.
Continue to monitor your instance. You should see that the second cluster is handling incoming requests.
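
The manual failover step can be modeled as re-pointing a single-cluster profile at the backup cluster and collecting the warnings this page says to expect. This is a planning sketch, not Bigtable client code: `SingleClusterProfile` and `plan_manual_failover` are hypothetical names, and the warning strings paraphrase the text above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SingleClusterProfile:
    """Hypothetical record of an app profile with single-cluster routing."""
    profile_id: str
    cluster_id: str
    single_row_transactions: bool = False


def plan_manual_failover(profile, backup_cluster_id):
    """Return the re-pointed profile and the warnings to expect."""
    if backup_cluster_id == profile.cluster_id:
        raise ValueError("backup cluster must differ from the serving cluster")
    expected = ["loss of read-your-writes (and therefore strong) consistency"]
    if profile.single_row_transactions:
        expected.append("potential data loss from conflicting writes during failover")
    # `replace` builds a new frozen record with only the cluster changed.
    return replace(profile, cluster_id=backup_cluster_id), expected


serving = SingleClusterProfile("default", "cluster-a", single_row_transactions=True)
failed_over, expected_warnings = plan_manual_failover(serving, "cluster-b")
print(failed_over.cluster_id, len(expected_warnings))
```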

Maintain high availability and regional resilience

Let's say you have concentrations of customers in two distinct regions within a continent. You want to serve each concentration of customers with Bigtable clusters as close to the customers as possible. You want your data to be highly available within each region, and you might want a failover option if one or more of your clusters is not available.

For this use case, you can create an instance with 2 clusters in region A and 2 clusters in region B. This configuration provides high availability even if you cannot connect to a Google Cloud region. It also provides regional resilience because even if a zone becomes unavailable, the other cluster in that zone's region is still available.

You can choose to use multi-cluster routing or single-cluster routing for this use case, depending on your business needs.

To configure your instance for this use case:

Create a Bigtable instance with 4 clusters: 2 in region A and 2 in region B. Clusters in the same region must be in different zones.

Example configuration:
- cluster-a in zone asia-south1-a in Mumbai
- cluster-b in zone asia-south1-c in Mumbai
- cluster-c in zone asia-northeast1-a in Tokyo
- cluster-d in zone asia-northeast1-b in Tokyo
Place an application server near each region.

If you use multi-cluster routing, Bigtable handles failovers automatically. If you use single-cluster routing, you decide when to fail over to a different cluster.

Single-cluster routing option

You can use single-cluster routing for this use case if you don't want Bigtable to fail over automatically when a zone or region becomes unavailable. This option is a good choice if you want to manage the costs and latency that might occur if Bigtable starts routing traffic to and from a distant region, or if you prefer to make failover decisions based on your own judgment or business rules.

To implement this configuration, create at least one app profile that uses single-cluster routing for each application that sends requests to the instance. You can route the app profiles to any cluster in the Bigtable instance. For example, if you have three applications running in Mumbai and six in Tokyo, you can configure one app profile for the Mumbai applications to route to asia-south1-a and two that route to asia-south1-c. For the Tokyo applications, configure three app profiles that route to asia-northeast1-a and three that route to asia-northeast1-b.
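
The distribution in this example can be written as a mapping from app profile to cluster and then tallied. The profile IDs below are hypothetical; the cluster IDs follow the example configuration for this use case, where cluster-a and cluster-b are in Mumbai (asia-south1-a and asia-south1-c) and cluster-c and cluster-d are in Tokyo (asia-northeast1-a and asia-northeast1-b).

```python
from collections import Counter

# Hypothetical app profile IDs mapped to their single-cluster routing target.
routing = {
    "mumbai-app-1": "cluster-a",
    "mumbai-app-2": "cluster-b",
    "mumbai-app-3": "cluster-b",
    "tokyo-app-1": "cluster-c",
    "tokyo-app-2": "cluster-c",
    "tokyo-app-3": "cluster-c",
    "tokyo-app-4": "cluster-d",
    "tokyo-app-5": "cluster-d",
    "tokyo-app-6": "cluster-d",
}

# Tally how many app profiles route to each cluster.
profiles_per_cluster = Counter(routing.values())
for cluster_id, count in sorted(profiles_per_cluster.items()):
    print(cluster_id, count)
```

A tally like this makes it easier to see which clusters carry the most applications when you size node counts.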

Follow the standard CPU utilization recommendations for this configuration.

With this configuration, if one or more clusters become unavailable, you can perform a manual failover or choose to let your data be temporarily unavailable in that zone until the zone is available again.

Multi-cluster routing option

If you're implementing this use case and you want Bigtable to automatically fail over to one region if your application cannot reach the other region, use multi-cluster routing.

To implement this configuration, create a new app profile that uses multi-cluster routing for each application, or update the default app profile to use multi-cluster routing.

This configuration provides eventual consistency. If a region becomes unavailable, Bigtable requests are automatically sent to the other region. When this happens, you are charged for the network traffic to the other region, and your application might experience higher latency because of the greater distance.

When you initially set up your instance, do not exceed 35% CPU utilization for each cluster. This target ensures that each cluster can handle the traffic normally handled by the other cluster in its region if a failover occurs. You might need to adjust this target depending on your traffic and usage patterns.

You can simulate failover for this use case to test your application:

Run a test workload.
Use the Google Cloud console to monitor the instance's clusters and confirm that all 4 clusters are handling incoming requests.
Delete one of the clusters in region A to simulate a problem connecting to a zone.

This change also deletes the copy of your data that is stored with the cluster.
Continue to monitor latency and error rates for the remaining clusters.

If the clusters have enough CPU resources, they should be able to keep up with incoming requests.
Add a cluster to the instance in region A and continue to monitor the instance.

Data should start replicating to the new cluster.
Delete both clusters in region A to simulate a problem connecting to a region.

This change deletes the copies of your data that were in those clusters.
Continue to monitor latency and error rates for the remaining clusters.

If the clusters have enough CPU resources, they should be able to keep up with incoming requests that were previously handled by the other region. If the clusters don't have enough resources, you might need to adjust the number of nodes.
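
Before running this test against a real instance, you can estimate the expected CPU levels with a simplified model. The sketch below is an assumption-laden approximation, not a description of how Bigtable routes requests: it assumes a lost cluster's load moves to surviving clusters in the same region if any exist, and otherwise is split evenly across all surviving clusters. The function name and the region labels `A` and `B` are hypothetical.

```python
def simulate_cluster_loss(cpu_by_cluster, region_by_cluster, lost):
    """Redistribute the CPU load of `lost` clusters onto the survivors.

    Simplified model: load moves to same-region survivors when possible,
    otherwise it is split evenly across every surviving cluster.
    """
    survivors = {c: cpu for c, cpu in cpu_by_cluster.items() if c not in lost}
    if not survivors:
        raise ValueError("no clusters left to serve traffic")
    for cluster in lost:
        same_region = [
            c for c in survivors
            if region_by_cluster[c] == region_by_cluster[cluster]
        ]
        targets = same_region or list(survivors)
        share = cpu_by_cluster[cluster] / len(targets)
        for c in targets:
            survivors[c] += share
    return survivors


regions = {"cluster-a": "A", "cluster-b": "A", "cluster-c": "B", "cluster-d": "B"}
cpu = {cluster_id: 35.0 for cluster_id in regions}

# Losing one zone in region A doubles the load on the other region A cluster.
print(simulate_cluster_loss(cpu, regions, {"cluster-a"}))
# Losing all of region A doubles the load on both region B clusters.
print(simulate_cluster_loss(cpu, regions, {"cluster-a", "cluster-b"}))
```

Under this model, starting every cluster at 35% leaves each survivor at 70% in both scenarios, which is consistent with the initial target suggested for this configuration.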

Store data close to your users

If you have users around the globe, you can reduce latency by running your application near your users and putting your data as close to your application as possible. With Bigtable, you can create an instance that has clusters in several Google Cloud regions, and your data is automatically replicated in each region.

For this use case, use app profiles with single-cluster routing. Multi-cluster routing is undesirable for this use case because of the distance between clusters. If a cluster becomes unavailable and its multi-cluster app profile automatically reroutes traffic across a great distance, your application might experience unacceptable latency and incur unexpected, additional network costs.
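
One way to plan single-cluster routing for this use case is to map each application's region to the cluster in that region. The following sketch is illustrative: the helper names are hypothetical, and the cluster zones are borrowed from the three-region example earlier on this page.

```python
def region_of(zone):
    """Strip the zone suffix, for example "us-central1-a" -> "us-central1"."""
    return zone.rsplit("-", 1)[0]


def cluster_for_app(app_region, cluster_zones):
    """Pick the cluster whose zone lies in the application's region."""
    for cluster_id, zone in cluster_zones.items():
        if region_of(zone) == app_region:
            return cluster_id
    # No silent fallback: routing to a distant cluster should be a
    # deliberate decision, not an automatic one.
    raise LookupError(f"no cluster in region {app_region!r}")


cluster_zones = {
    "cluster-a": "us-central1-a",
    "cluster-b": "europe-west1-d",
    "cluster-c": "asia-east1-b",
}
print(cluster_for_app("europe-west1", cluster_zones))  # cluster-b
```

Raising an error instead of falling back to a distant cluster mirrors the reasoning above: rerouting across a great distance should not happen without an explicit choice.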