You can install additional components when you create a Dataproc cluster using the Optional Components feature. This page describes the Solr component.
The Apache Solr component is an open source enterprise search platform. The Solr server and Web UI are available on port 8983 on the cluster's master node(s).
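As a rough sketch of how the Solr endpoint might be addressed, the script below builds the base URL for port 8983 on the master node. It assumes a single-master cluster whose master VM is named after the cluster with an "-m" suffix; the cluster name and the solr_base_url helper are hypothetical placeholders, not part of the component.

```shell
#!/bin/sh
# Assumption: a single-master cluster whose master VM is "<cluster-name>-m".
# CLUSTER_NAME is a placeholder; replace it with your own cluster name.
CLUSTER_NAME="cluster-name"

# Hypothetical helper: print the base URL of the Solr server (port 8983)
# on the master node of the given cluster.
solr_base_url() {
  printf 'http://%s-m:8983/solr\n' "$1"
}

solr_base_url "$CLUSTER_NAME"

# From a machine that can reach the master node (for example, the master
# itself), Solr's system info endpoint could then be queried with:
#   curl "$(solr_base_url "$CLUSTER_NAME")/admin/info/system?wt=json"
```

The script only prints the URL; the curl call is left as a comment because it works only from a host with network access to the master node.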
Persisting Solr files: By default, Solr writes and reads the index and
transaction log files in
To persist Solr files, use a Cloud Storage path as the Solr home directory by setting the dataproc:solr.gcs.path cluster property when you install the component.
Install the component
Install the component when you create a Dataproc cluster. Components can be added to clusters created with Dataproc version 1.3 and later.
See Supported Cloud Dataproc versions for the component version included in each Dataproc image release.
To create a Dataproc cluster that includes the Solr component, use the gcloud beta dataproc clusters create cluster-name command with the --optional-components flag. The optional --properties flag, described after the sample command, sets a Cloud Storage path as the Solr home directory.
gcloud beta dataproc clusters create cluster-name \
    --optional-components=SOLR \
    --enable-component-gateway \
    ... other flags
Add the --properties="dataproc:solr.gcs.path=gs://bucket-name/" cluster property to the gcloud beta dataproc clusters create command to set a Cloud Storage bucket where Solr documents will be stored (the Solr home directory).
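The flags described above can be combined into one command. The sketch below assembles that command from placeholder values and prints it for review rather than running it. The cluster name, the bucket name, and the build_create_cmd helper are hypothetical, and the gs:// form of the Cloud Storage path is an assumption; other flags your project needs are not included.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own cluster and bucket.
CLUSTER_NAME="cluster-name"
BUCKET_NAME="bucket-name"

# Hypothetical helper: print a cluster create command that installs the
# Solr component and sets a Cloud Storage path as the Solr home directory.
# Arguments: $1 = cluster name, $2 = bucket name.
build_create_cmd() {
  printf '%s %s %s %s\n' \
    "gcloud beta dataproc clusters create $1" \
    "--optional-components=SOLR" \
    "--enable-component-gateway" \
    "--properties=dataproc:solr.gcs.path=gs://$2/"
}

# Print the command only; nothing is created at this point.
build_create_cmd "$CLUSTER_NAME" "$BUCKET_NAME"
```

To create the cluster, copy the printed line into a terminal and append any other flags you need. Printing first makes it easy to check the property value before any resources are created.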
Installing the Solr component from the Cloud Console is currently not supported.