Installation of the optional HBase component is limited to
Dataproc clusters created with image version 1.5 or 2.0.
While Google Cloud provides many services that let you deploy self-managed Apache
HBase, Bigtable is often the best option because it provides an open API with
HBase compatibility and workload portability.
HBase database tables can be migrated to Bigtable for management of the
underlying data, while applications that previously interoperated with HBase,
such as Spark, may remain on Dataproc and securely connect with Bigtable.
This guide provides the high-level steps for getting started with Bigtable
and references for migrating data to Bigtable from Dataproc HBase
deployments.
Get started with Bigtable
Cloud Bigtable is a highly scalable and performant NoSQL platform that provides
Apache HBase API client compatibility and portability for HBase workloads.
The Bigtable HBase client library is compatible with HBase API versions 1.x and
2.x, so existing HBase applications can add it to read and write data stored
in Bigtable.
See Bigtable and the HBase API for more information on configuring your HBase
application with Bigtable.
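As a rough illustration of what that configuration involves, the sketch below generates an hbase-site.xml that points an HBase client at Bigtable. The property names and connection class names are the ones used by the Bigtable HBase client to the best of our knowledge, but verify them against the client version you deploy. The function name `bigtable_hbase_site` and the project and instance IDs are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Connection classes believed to ship with the Bigtable HBase client
# (1.x and 2.x flavors); confirm against the artifact version you use.
CONNECTION_IMPL = {
    "1.x": "com.google.cloud.bigtable.hbase1_x.BigtableConnection",
    "2.x": "com.google.cloud.bigtable.hbase2_x.BigtableConnection",
}


def bigtable_hbase_site(project_id: str, instance_id: str, hbase_api: str = "2.x") -> str:
    """Return hbase-site.xml text that directs an HBase client to Bigtable.

    This is a hypothetical helper for illustration, not part of any library.
    """
    properties = {
        "hbase.client.connection.impl": CONNECTION_IMPL[hbase_api],
        "google.bigtable.project.id": project_id,
        "google.bigtable.instance.id": instance_id,
    }
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder IDs; substitute your own project and instance.
    print(bigtable_hbase_site("my-project", "my-instance", "2.x"))
```

With a file like this on the application's classpath, plus the Bigtable HBase client dependency, existing code that obtains its connection through the standard HBase connection factory would be expected to talk to Bigtable instead of an HBase cluster.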
Create a Bigtable cluster
To get started with Bigtable, create a cluster and tables to store the data
that was previously stored in HBase. Follow the steps in the Bigtable
documentation for creating an instance, a cluster, and tables with the same
schema as the HBase tables. For automated creation of tables from HBase
table DDLs, refer to the schema translator tool.
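For manual creation, the steps above can be sketched as a pair of command lines. The helper below only assembles the `gcloud` and `cbt` invocations as argument lists and does not run them. All IDs, the zone, the node count, and the column family names are hypothetical, and mapping each HBase column family's VERSIONS setting to a `maxversions` garbage-collection policy is an assumption about how you might mirror the schema, not what the schema translator tool does.

```python
import shlex


def instance_create_cmd(instance_id: str, cluster_id: str, zone: str, nodes: int) -> list:
    """Build a `gcloud bigtable instances create` command (not executed here)."""
    return [
        "gcloud", "bigtable", "instances", "create", instance_id,
        f"--display-name={instance_id}",
        f"--cluster-config=id={cluster_id},zone={zone},nodes={nodes}",
    ]


def create_table_cmd(project_id: str, instance_id: str, table_id: str, families: dict) -> list:
    """Build a `cbt createtable` command.

    `families` maps each HBase column family name to the number of cell
    versions to keep (an assumed stand-in for the HBase VERSIONS setting).
    """
    specs = [f"{name}:maxversions={versions}" for name, versions in families.items()]
    return [
        "cbt", "-project", project_id, "-instance", instance_id,
        "createtable", table_id, "families=" + ",".join(specs),
    ]


if __name__ == "__main__":
    # Placeholder values; replace with your own before running the commands.
    print(shlex.join(instance_create_cmd("my-instance", "my-cluster", "us-central1-b", 3)))
    print(shlex.join(create_table_cmd("my-project", "my-instance", "orders", {"cf1": 1, "meta": 3})))
```

Printing the commands rather than executing them makes it easy to review the column families against the HBase table descriptors before anything is created.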
Open the Bigtable instance in the Google Cloud console to view the table and
the server-side monitoring charts, including rows per second, latency, and
throughput, which help you manage the newly provisioned table. For additional
information, see Monitoring.
Migrate data from Dataproc to Bigtable
After you create the tables in Bigtable, you can import and validate
your data by following the guidance at Migrate HBase on Google Cloud to Bigtable.
After you migrate the data, you can update applications to send reads and writes
to Bigtable.
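To make the idea of validation concrete, here is a toy sketch that compares per-row digests between a source and a destination table held in memory. It is purely illustrative and is not the tooling described in the migration guide. The in-memory dictionaries, the `row_digest` and `diff_tables` names, and the sample rows are all hypothetical.

```python
import hashlib


def row_digest(row_key: bytes, cells: dict) -> str:
    """Hash a row key and its cells (column -> value) in a stable order."""
    h = hashlib.sha256()
    for part in [row_key] + [b for col in sorted(cells) for b in (col, cells[col])]:
        # Length-prefix each part so adjacent fields cannot be confused.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def diff_tables(source: dict, destination: dict) -> dict:
    """Report rows that are missing, extra, or different in the destination."""
    src = {key: row_digest(key, cells) for key, cells in source.items()}
    dst = {key: row_digest(key, cells) for key, cells in destination.items()}
    return {
        "missing": sorted(k for k in src if k not in dst),
        "extra": sorted(k for k in dst if k not in src),
        "mismatched": sorted(k for k in src if k in dst and src[k] != dst[k]),
    }


if __name__ == "__main__":
    hbase_rows = {b"r1": {b"cf1:a": b"1"}, b"r2": {b"cf1:a": b"2"}}
    bigtable_rows = {b"r1": {b"cf1:a": b"1"}, b"r2": {b"cf1:a": b"9"}}
    print(diff_tables(hbase_rows, bigtable_rows))
```

A real validation would stream rows from both systems in key ranges instead of loading them into memory, but the comparison logic follows the same shape.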
What's next
- See Wordcount Spark examples (https://github.com/GoogleCloudPlatform/java-docs-samples/tree/main/bigtable/spark) for running Spark with Bigtable.
- Review online migration options with live replication from HBase to Bigtable.
- Watch How Box modernized their NoSQL databases (https://www.youtube.com/watch?v=DteQ09WFhaU) to understand other benefits.

Deprecated: Starting with Dataproc version 2.1, you can no longer use the
optional HBase component. Dataproc version 1.5 and Dataproc version 2.0 offer
a Beta version of HBase with no support. However, due to the ephemeral nature
of Dataproc clusters, using HBase is not recommended.

Last updated 2025-09-04 UTC.