This page describes some common use cases for enabling Cloud Bigtable replication, then presents the settings you can use to support these use cases:
- Isolate serving applications from batch reads
- Create high availability
- Provide near-real-time backup
It also explains how to decide what settings to use if your use case isn't listed here.
Before you read this page, you should be familiar with the overview of Cloud Bigtable replication.
Isolate serving applications from batch reads
When you use a single cluster to run a batch analytics job that performs numerous large reads, as well as an application that performs a mix of reads and writes, a large batch job can slow things down for the applications' users. With replication, you can route batch analytics jobs to one cluster and application traffic to another cluster, so that batch jobs don't affect your applications' users.
To configure your instance for this use case, create 2 app profiles, one called live-traffic-app and another called batch-analytics-job. If your cluster IDs are cluster-a and cluster-b, the live-traffic-app app profile should route requests to cluster-a, and the batch-analytics-job app profile should route requests to cluster-b. This configuration provides read-your-writes consistency for all of the applications that use these app profiles.
You can enable single-row transactions in the live-traffic-app app profile if necessary. There's no need to enable single-row transactions in the batch-analytics-job app profile, assuming that you will only use this profile for reads.
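As a sketch of this setup with the gcloud CLI, assuming a hypothetical instance ID of my-instance and the cluster IDs cluster-a and cluster-b:

```shell
# Route live application traffic to cluster-a; enable single-row
# transactions in case the application needs them.
gcloud bigtable app-profiles create live-traffic-app \
    --instance=my-instance \
    --route-to=cluster-a \
    --transactional-writes \
    --description="Live serving traffic"

# Route read-only batch analytics traffic to cluster-b.
gcloud bigtable app-profiles create batch-analytics-job \
    --instance=my-instance \
    --route-to=cluster-b \
    --description="Batch analytics reads"
```

The --transactional-writes flag is omitted from the second profile because, as noted above, a read-only batch profile has no need for single-row transactions.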
To test this configuration:
- Use the live-traffic-app app profile to run a live-traffic workload.
- While the live-traffic workload is running, use the batch-analytics-job app profile to run a read-only batch workload.
- Monitor the CPU utilization for the instance's clusters, and add nodes to the clusters if necessary.
- Monitor client-side latency using a tool of your choice. If you use the HBase client for Java, you can monitor latency with its client-side metrics.
Create high availability
If an instance has only 1 cluster, your data's durability and availability are limited to the zone where that cluster is located. Replication can improve both durability and availability by storing separate copies of your data in 2 zones and automatically failing over between clusters if needed.
To configure your instance for this use case, create a new app profile that uses multi-cluster routing, or update the default app profile to use multi-cluster routing. This configuration provides eventual consistency. You won't be able to enable single-row transactions, which aren't supported when you use multi-cluster routing.
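One way to sketch this configuration with the gcloud CLI, assuming a hypothetical instance ID of my-instance and profile name high-availability:

```shell
# A single app profile with multi-cluster routing; Cloud Bigtable
# automatically routes each request to the nearest available cluster
# and fails over between clusters when necessary.
gcloud bigtable app-profiles create high-availability \
    --instance=my-instance \
    --route-any \
    --description="Multi-cluster routing for high availability"
```

Note that --route-any (multi-cluster routing) cannot be combined with --transactional-writes, which reflects the restriction described above: single-row transactions aren't supported with multi-cluster routing.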
To test this configuration:
- Use the app profile with multi-cluster routing to run a test workload.
- Use the Google Cloud Platform Console to monitor the instance's clusters and confirm that both clusters are handling incoming requests.
- Delete one of the clusters to simulate an outage. This change also deletes the copy of your data that is stored with the cluster.
- Continue to monitor latency and error rates for the remaining cluster. If the cluster has enough CPU resources, it should be able to keep up with incoming requests.
- Add a cluster to the instance, and continue to monitor the instance. Data should start replicating to the new cluster.
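The outage-simulation steps above can be sketched with the gcloud CLI, assuming hypothetical IDs my-instance and cluster-b, and an example zone:

```shell
# Simulate an outage by deleting one cluster. This also deletes that
# cluster's copy of the data.
gcloud bigtable clusters delete cluster-b --instance=my-instance

# Later, add a second cluster back to the instance; replication
# copies the data to the new cluster.
gcloud bigtable clusters create cluster-b \
    --instance=my-instance \
    --zone=us-east1-b \
    --num-nodes=3
```

The zone and node count here are placeholders; choose values that match your instance's actual configuration.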
Provide near-real-time backup
In some cases—for example, if you can't afford to read stale data—you'll always need to route requests to a single cluster. However, you can still use replication by handling requests with one cluster and keeping another cluster as a near-real-time backup. If the serving cluster becomes unavailable, you can minimize downtime by manually failing over to the backup cluster.
To configure your instance for this use case, create an app profile that uses single-cluster routing, or update the default app profile to use single-cluster routing. The cluster that you specified in your app profile will handle incoming requests. The other cluster acts as a backup in case you need to fail over. This arrangement is sometimes known as an active-passive configuration, and it provides both strong consistency and read-your-writes consistency. You can enable single-row transactions in the app profile if necessary.
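A sketch of the active-passive setup with the gcloud CLI, assuming hypothetical IDs my-instance and cluster-a and a profile named active-passive:

```shell
# All traffic goes to cluster-a; the other cluster stays in sync
# through replication and serves as the near-real-time backup.
gcloud bigtable app-profiles create active-passive \
    --instance=my-instance \
    --route-to=cluster-a \
    --transactional-writes \
    --description="cluster-a serves traffic; other cluster is backup"
```

Include --transactional-writes only if you actually need single-row transactions.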
To test this configuration:
- Use the app profile with single-cluster routing to run a test workload.
- Use the GCP Console to monitor the instance's clusters and confirm that only 1 cluster is handling incoming requests. The other cluster will still use CPU resources to perform replication and other maintenance tasks.
- Update the app profile so that it points to the second cluster in your instance. You will receive a warning about losing read-your-writes consistency, which also means that you lose strong consistency. If you enabled single-row transactions, you will also receive a warning about the potential for data loss. You will lose data if you send conflicting writes while the failover is occurring.
- Continue to monitor your instance. You should see that the second cluster is handling incoming requests.
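The manual failover step can be sketched with the gcloud CLI, assuming a hypothetical app profile named active-passive, an instance ID of my-instance, and a backup cluster ID of cluster-b:

```shell
# Manually fail over by repointing the app profile at the backup
# cluster. --force acknowledges the consistency and data-loss
# warnings that gcloud would otherwise raise.
gcloud bigtable app-profiles update active-passive \
    --instance=my-instance \
    --route-to=cluster-b \
    --force
```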
Other use cases
If you have a use case that isn't described on this page, use the following questions to help you decide how to configure your app profiles:
Do you need to perform single-row transactions, such as read-modify-write operations (including increments and appends) and check-and-mutate operations (also known as conditional mutations or conditional writes)?
If so, your app profiles must use single-cluster routing, because single-row transactions are not supported when you use multi-cluster routing.
Do you want Cloud Bigtable to handle failovers automatically?
If so, your app profiles must use multi-cluster routing. If a cluster can't process an incoming request, Cloud Bigtable automatically fails over to the other cluster. Learn more about automatic failovers.
Do you want to maintain a backup or spare cluster in case your primary cluster is not available?
If so, create an app profile that uses single-cluster routing and points to your primary cluster, then manually fail over to the other cluster if the primary cluster becomes unavailable. This configuration also makes it possible to use single-row transactions if necessary.
Do you want to send different kinds of traffic to different clusters?
If so, create multiple app profiles that use single-cluster routing, and route each kind of traffic to a different cluster. You can enable single-row transactions in your app profiles if necessary.