Cloud Dataproc Anaconda Component

You can install additional components when you create a Cloud Dataproc cluster using the Optional Components feature. This page describes the Anaconda component.

The Anaconda component is a Python distribution and package manager that includes over 1,000 popular data science packages. The component is installed on all cluster nodes in /opt/conda/anaconda and becomes the default Python interpreter.
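One way to confirm that the Anaconda interpreter is the default on a node is to check where the running interpreter lives. The following is a minimal sketch, not an official check: the helper name `is_under_prefix` is hypothetical, and the prefix is the install path stated above.

```python
import posixpath
import sys

# Install path stated on this page; adjust if your image differs.
ANACONDA_PREFIX = "/opt/conda/anaconda"


def is_under_prefix(executable, prefix=ANACONDA_PREFIX):
    """Return True if the interpreter path is located inside `prefix`."""
    normalized = posixpath.normpath(executable)
    # Append "/" so that a sibling such as /opt/conda/anaconda2 does not match.
    return normalized.startswith(prefix.rstrip("/") + "/")


if __name__ == "__main__":
    current = sys.executable or ""
    print(current)
    print(is_under_prefix(current))
```

Running this script with the default `python` on a cluster node should report a path under /opt/conda/anaconda if the component is active.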

Install the component

Install the component when you create a Cloud Dataproc cluster. Components can be added to clusters created with Cloud Dataproc version 1.3 and later.
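If you script cluster creation, you may want to check the image version before requesting components. The sketch below encodes only the "1.3 and later" rule stated above. The function name is hypothetical, and the version-string shapes it accepts (for example "1.3" or "1.4-debian9") are assumptions.

```python
import re

# Minimum image version stated on this page.
MIN_VERSION = (1, 3)


def supports_optional_components(image_version):
    """Return True if the major.minor image version is at least 1.3."""
    match = re.match(r"(\d+)\.(\d+)", image_version)
    if match is None:
        raise ValueError("unrecognized image version: %r" % image_version)
    # Compare as integer tuples so that "1.10" sorts after "1.3".
    return (int(match.group(1)), int(match.group(2))) >= MIN_VERSION
```

Comparing integer tuples rather than strings avoids treating a version such as "1.10" as older than "1.3".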

See Supported Cloud Dataproc versions for the component version included in each Cloud Dataproc image release.

gcloud command

To create a Cloud Dataproc cluster that includes the Anaconda component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag (using image version 1.3 or later).

gcloud dataproc clusters create cluster-name \
    --optional-components=ANACONDA \
    --image-version=1.3 \
    ... other flags


REST API

The Anaconda component can be specified through the Cloud Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
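As an illustration, the sketch below builds a JSON body for a clusters.create request with the Anaconda component enabled. The helper name `build_cluster_body` is hypothetical, and the field names (`clusterName`, `config.softwareConfig.imageVersion`, `optionalComponents`) reflect an assumed reading of the Cluster and SoftwareConfig resources. Verify them against the API reference for the API version you call. The code only builds the body and does not send a request.

```python
import json


def build_cluster_body(cluster_name, image_version="1.3"):
    """Build a clusters.create request body that enables ANACONDA."""
    return {
        "clusterName": cluster_name,
        "config": {
            "softwareConfig": {
                # Image version 1.3 or later, per this page.
                "imageVersion": image_version,
                # Values come from the SoftwareConfig.Component enum.
                "optionalComponents": ["ANACONDA"],
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_cluster_body("cluster-name"), indent=2))
```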


Console

In the GCP Console, open the Cloud Dataproc Create a cluster page. Click "Advanced options" at the bottom of the page to view the Optional Components section.

Click "Select component" to open the Optional components selection panel. Select one or more components to install on your cluster.
