When you create a cluster, standard Apache Hadoop ecosystem components are automatically installed on the cluster (see the Cloud Dataproc Version List). You can install additional components, called "optional components," on the cluster when you create it. Adding optional components to a cluster is similar to adding components through initialization actions, but has the following advantages:
- Faster cluster startup times
- Tested compatibility with specific Dataproc versions
- Use of a cluster parameter instead of an initialization action script
- Optional components are integrated with other Dataproc components. For example, when Anaconda and Zeppelin are installed on a cluster, Zeppelin will make use of Anaconda's Python interpreter and libraries.
Optional components can be added to clusters created with Dataproc version 1.3 and later.
Available optional components
| Component | Name in gcloud commands and API requests | Image Version | Release Stage |
|---|---|---|---|
| Anaconda | ANACONDA | 1.3 and later | GA |
| Druid | DRUID | 1.3 and later | Alpha |
| Hive WebHCat | HIVE_WEBHCAT | 1.3 and later | GA |
| Jupyter Notebook | JUPYTER | 1.3 and later | GA |
| Presto | PRESTO | 1.3 and later | Beta |
| Zeppelin Notebook | ZEPPELIN | 1.3 and later | GA |
| Zookeeper | ZOOKEEPER | 1.0 and later | GA |
Adding optional components
gcloud command
To create a Dataproc cluster and install one or more optional components on the cluster, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag (using image version 1.3 or later).

```
gcloud dataproc clusters create cluster-name \
    --optional-components=COMPONENT-NAME(s) \
    --image-version=1.3 \
    ... other flags
```
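For example, the following command would install the Anaconda and Zeppelin components, using the component names from the table above (the cluster name my-cluster and the region are placeholder values):

```
gcloud dataproc clusters create my-cluster \
    --optional-components=ANACONDA,ZEPPELIN \
    --image-version=1.3 \
    --region=us-central1
```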
REST API
Optional components can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
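As a sketch of what such a request body looks like, the snippet below builds the JSON payload for a clusters.create call; the helper function name and the placeholder cluster name are illustrative, not part of the API:

```python
import json

def build_cluster_request(cluster_name, components, image_version="1.3"):
    """Build a clusters.create request body that selects optional components.

    `components` is a list of SoftwareConfig.Component enum names
    (e.g. "ANACONDA", "ZEPPELIN") from the table above.
    """
    return {
        "clusterName": cluster_name,
        "config": {
            "softwareConfig": {
                "imageVersion": image_version,
                "optionalComponents": components,
            }
        },
    }

# "my-cluster" is a placeholder cluster name.
body = build_cluster_request("my-cluster", ["ANACONDA", "ZEPPELIN"])
print(json.dumps(body, indent=2))
```

The resulting JSON would be POSTed to the regional clusters.create endpoint along with your project and region.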
Console
In the Cloud Console, open the Dataproc Create a cluster page. Click "Advanced options" at the bottom of the page to view the Optional components section. Click "Select component" to open the Optional components selection panel, then select one or more components to install on your cluster.