Dataproc HBase Component

You can install additional components when you create a Dataproc cluster using the Optional Components feature. This page describes the HBase component.

Apache HBase is the Hadoop database: a distributed, scalable, big data store. The HBase server and Web UI are available on port 16010 on the Dataproc cluster's first master node. You can start the HBase command-line interface (CLI) by running the hbase shell command in a terminal window on the cluster's first master node.
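As a minimal sketch of using the HBase CLI, the script below writes a few HBase shell commands to a temporary file and runs them non-interactively when the hbase command is present, which is expected only on the cluster's master node. The table name demo and the column family cf are hypothetical names chosen for illustration.

```shell
# Sketch: prepare a few HBase shell commands and run them non-interactively.
# The table name 'demo' and column family 'cf' are hypothetical.
commands_file="$(mktemp)"

cat > "$commands_file" <<'EOF'
create 'demo', 'cf'
put 'demo', 'row1', 'cf:greeting', 'hello'
scan 'demo'
EOF

if command -v hbase >/dev/null 2>&1; then
  # -n starts the HBase shell in non-interactive mode; commands come from stdin.
  hbase shell -n < "$commands_file"
else
  echo "hbase CLI not found; run this on the cluster's first master node" >&2
fi
```

Running hbase shell without -n opens an interactive prompt where you can type the same commands one at a time.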

Installing the component

Install the component when you create a Dataproc cluster. The HBase component can be added to clusters created with Dataproc image version 1.5 and later. The HBase component requires the Zookeeper component to be installed as well, as shown in the gcloud command and console examples below.

See Supported Dataproc versions for the component version included in each Dataproc image release.

gcloud command

To create a Dataproc cluster that includes the HBase component, use the gcloud beta dataproc clusters create cluster-name command with the --region and --optional-components flags, using image version 1.5 or later.

gcloud beta dataproc clusters create cluster-name \
    --optional-components=HBASE,ZOOKEEPER \
    --region=region \
    --image-version=1.5 \
    --enable-component-gateway \
    ... other flags

REST API

The HBase and required Zookeeper components can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
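The following is a sketch of what such a clusters.create request body might look like, built in a shell variable so it can be inspected before sending. The project ID, region, cluster name, and API version are hypothetical placeholders, and the endpointConfig block is an assumption that mirrors the --enable-component-gateway flag; check the field names against the API reference for the version you use.

```shell
# Hypothetical placeholders: replace with your own values.
PROJECT_ID="my-project"
REGION="us-central1"
CLUSTER_NAME="my-cluster"
API_VERSION="v1"   # Assumption: use the API version that exposes HBASE for your project.

# Request body: optionalComponents takes SoftwareConfig.Component enum values.
request_body=$(cat <<EOF
{
  "projectId": "${PROJECT_ID}",
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "imageVersion": "1.5",
      "optionalComponents": ["HBASE", "ZOOKEEPER"]
    },
    "endpointConfig": {
      "enableHttpPortAccess": true
    }
  }
}
EOF
)

# Print the body for review.
echo "${request_body}"

# To send the request, uncomment the lines below (requires curl and gcloud):
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d "${request_body}" \
#   "https://dataproc.googleapis.com/${API_VERSION}/projects/${PROJECT_ID}/regions/${REGION}/clusters"
```

The curl call is left commented out so that the script only prints the body until you have confirmed the values.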

Console

  1. Enable the component and component gateway.
    • In the Cloud Console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
    • In the Components section:
      • Under Optional components, select HBase, Zookeeper, and other optional components to install on your cluster.
      • Under Component Gateway, select Enable component gateway.

Setting HBase config properties

The default Dataproc HBase configuration settings should be sufficient for most applications. If you need to change them, set cluster properties with the hbase: file prefix when you create the cluster.

gcloud command example to set hbase.rootdir in hbase-site.xml:

gcloud beta dataproc clusters create my-cluster \
    --optional-components=HBASE,ZOOKEEPER \
    --properties=hbase:hbase.rootdir=hdfs://... \
    ... other flags (see Installing the component)
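To set more than one HBase property, the --properties flag accepts a comma-separated list, with each entry carrying its own hbase: prefix. The sketch below assembles such a value in a shell variable. The two property names are standard HBase settings, but the values shown are hypothetical and are not tuning recommendations.

```shell
# Build a comma-separated --properties value.
# The values 60 and 500 are hypothetical, for illustration only.
props="hbase:hbase.regionserver.handler.count=60"
props="${props},hbase:hbase.client.scanner.caching=500"

# Print the flag as it would appear on the command line.
echo "--properties=${props}"

# Example use (not executed here):
# gcloud beta dataproc clusters create my-cluster \
#     --optional-components=HBASE,ZOOKEEPER \
#     --properties="${props}" \
#     ... other flags (see Installing the component)
```

Keeping the list in a variable makes it easier to review each property before the cluster is created.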