How to migrate from Apache HBase to Cloud Bigtable with Live Migrations

April 7, 2022
Sean Rhee

Product Management, Google Cloud

Cloud Bigtable is a natural destination for Apache HBase workloads, as it is a fully managed service that is compatible with the HBase API. As a result, many customers running business-critical applications with large-scale data and low-latency needs consider migrating to Bigtable.

However, migrating from HBase to Bigtable can still be challenging, since you typically have to pause your applications while the data is moved. In addition, some companies choose to write custom tools, which require extensive resources to build and test and can add months to the migration process.

Today, we’re announcing that Live Migrations from Apache HBase to Cloud Bigtable are now generally available. Live Migrations make moving from HBase to Bigtable faster and simpler: they help ensure accurate data migration, reduce migration effort, and provide a better overall developer experience.

HBase to Bigtable migrations just got easier 

Historically, you would need to manually create tables in Bigtable from your existing HBase tables and execute several steps to export and import data, define target tables, and validate data integrity. This process can be tedious, especially if the migration requires moving multiple tables or pre-splitting tables. 

At Google Cloud, we’re always trying to find ways to make migrations from HBase to Bigtable easier for our customers. Our latest Live Migration features aim to provide a more straightforward, more efficient, and proven way to migrate data from HBase to Bigtable with minimal downtime. Together, they provide the components needed to complete a live migration.

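We have built four new features: the Schema Translation Tool, Snapshot Import, Migration Validation, and the HBase Bigtable Replication Library.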

Now, you can automate the migration process and facilitate end-to-end data pipelines. The Schema Translation Tool fully automates table conversion by connecting to HBase, copying the table schema, and creating similar tables in Bigtable. You can also import HBase snapshots and validate data migration for a more seamless migration process with our Snapshot Import and Migration Validation tools. 
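To make the validation step concrete, here is a minimal Python sketch of one common approach: hash each row on both sides and compare the digests. It is not the Migration Validation tool itself. The `row_digest` and `mismatched_rows` helpers, and the in-memory dictionaries standing in for HBase and Bigtable tables, are assumptions made for illustration.

```python
import hashlib

def row_digest(cells):
    """Hash a row's cells in sorted column order so the digest is order-independent.

    `cells` is a hypothetical {column: value} mapping, not a real client type.
    """
    h = hashlib.sha256()
    for column in sorted(cells):
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        h.update(column.encode("utf-8") + b"\x00")
        h.update(cells[column].encode("utf-8") + b"\x00")
    return h.hexdigest()

def mismatched_rows(source, target):
    """Return the sorted row keys that are missing on one side or differ in content."""
    keys = set(source) | set(target)
    return sorted(
        key for key in keys
        if key not in source
        or key not in target
        or row_digest(source[key]) != row_digest(target[key])
    )

# Toy data: row2 differs in value, row3 was never copied.
source_rows = {
    "row1": {"cf:a": "1", "cf:b": "2"},
    "row2": {"cf:a": "3"},
    "row3": {"cf:a": "4"},
}
target_rows = {
    "row1": {"cf:b": "2", "cf:a": "1"},
    "row2": {"cf:a": "9"},
}

print(mismatched_rows(source_rows, target_rows))
```

Comparing digests rather than full rows keeps the amount of data exchanged between the two sides small, which matters when tables are large.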

The HBase Bigtable Replication Library, which becomes available today, removes the need for building custom migration tools. It allows you to use HBase replication to sequence bulk imports and live writes correctly, ensuring consistent performance during migration of large workloads.
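The following Python sketch illustrates why a bulk import and replicated live writes can be combined safely. It is a simplified model, not the Replication Library's implementation. It assumes every write carries a timestamp, as HBase cells do, and that the newest timestamp wins on read. It ignores deletes, and all names in it are hypothetical.

```python
# A cell write is modeled as (row key, column, timestamp in ms, value).

def apply_writes(table, writes):
    """Store each write under its own timestamp, so replaying a write is idempotent."""
    for row, column, timestamp, value in writes:
        table.setdefault((row, column), {})[timestamp] = value

def latest(table, row, column):
    """Return the value with the highest timestamp, or None if the cell is absent."""
    versions = table.get((row, column))
    if not versions:
        return None
    return versions[max(versions)]

# Cells captured in a snapshot, and writes that arrived after the snapshot was taken.
snapshot = [("user1", "cf:name", 100, "Ada"), ("user2", "cf:name", 110, "Bob")]
live_writes = [("user1", "cf:name", 200, "Ada L."), ("user3", "cf:name", 210, "Cy")]

# Order 1: import the snapshot, then drain the replicated writes.
snapshot_first = {}
apply_writes(snapshot_first, snapshot)
apply_writes(snapshot_first, live_writes)

# Order 2: replicated writes land before the snapshot import finishes.
live_first = {}
apply_writes(live_first, live_writes)
apply_writes(live_first, snapshot)

print(snapshot_first == live_first)
print(latest(live_first, "user1", "cf:name"))
```

In this model both orders converge to the same table, and the later write to `user1` is the one returned on read.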

How live migrations from HBase to Bigtable work

HBase provides asynchronous replication between clusters for various use cases like disaster recovery and data aggregation workloads. The HBase Bigtable Replication Library enables Bigtable to be added as an HBase cluster replication target. HBase to Bigtable replication enables customers to sync mutations happening on their HBase cluster to Bigtable, providing near-zero downtime migrations from HBase to Cloud Bigtable. 

The following diagram shows a live replication from HBase to Bigtable:

https://storage.googleapis.com/gweb-cloudblog-publish/images/HBase_to_Bigtable.max-900x900.jpg

The HBase cluster is the source database, which can be located in an on-premises network, on another cloud provider, or in a managed data service. Once enabled, live replication allows all the writes happening on the source cluster to be replicated to the target Bigtable instance.

Before enabling replication, you need to create every HBase table in Bigtable with the same column families. You can use the Schema Translation Tool to create target tables in Bigtable based on your existing HBase schema. To enable replication, the source cluster must be able to connect to the target Bigtable instance.
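As a rough illustration of this prerequisite, the Python sketch below checks that every source table exists on the target with at least the same column families. It does not call HBase or Bigtable. The `missing_schema` helper and the dictionaries describing each side are hypothetical stand-ins for schema information you would fetch with the respective admin clients.

```python
def missing_schema(source, target):
    """Return {table: [column families missing on the target]} for every gap found.

    Both arguments map a table name to a set of column family names.
    A table that is absent from the target reports all of its families as missing.
    """
    gaps = {}
    for table, families in source.items():
        missing = set(families) - set(target.get(table, ()))
        if missing:
            gaps[table] = sorted(missing)
    return gaps

# Toy schemas: "orders" lacks one family on the target, "users" was never created.
hbase_schema = {"orders": {"cf1", "cf2"}, "users": {"profile"}}
bigtable_schema = {"orders": {"cf1"}}

print(missing_schema(hbase_schema, bigtable_schema))
```

An empty result means the target covers every table and column family in the source. Anything else lists what still has to be created before replication is enabled.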

Get started with HBase to Bigtable live migrations

To learn more about HBase to Bigtable Live Migrations and how to get started, please visit our documentation page.
