Dataproc optional Docker component

You can install additional components like Docker when you create a Dataproc cluster using the Optional components feature. This page describes the Docker component.

The Dataproc Docker component installs a Docker daemon on each cluster node, and creates a Linux user "docker" and a Linux group "docker" on each node to run the daemon. The component also creates a "docker" systemd service that runs dockerd. Use this systemd service to manage the lifecycle of the Docker service.
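As a minimal sketch of managing the service through systemd, the helper below wraps systemctl for the "docker" unit. The function name docker_service is hypothetical, and the sketch assumes you are logged in to a cluster node with sudo access.

```shell
# Hypothetical helper: runs a lifecycle action against the "docker"
# systemd unit that the component creates on each node.
docker_service() {
  case "$1" in
    status|start|stop|restart)
      # Assumes the current user can use sudo on the node.
      sudo systemctl "$1" docker
      ;;
    *)
      echo "usage: docker_service {status|start|stop|restart}" >&2
      return 2
      ;;
  esac
}
```

For example, running docker_service status on a node reports whether the Docker daemon is active.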

Install the component

Install the component when you create a Dataproc cluster. The Docker component can be installed on clusters created with Dataproc image version 1.5 or later.

See Supported Dataproc versions for the component version included in each Dataproc image release.

gcloud command

To create a Dataproc cluster that includes the Docker component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag.

gcloud dataproc clusters create cluster-name \
    --optional-components=DOCKER \
    --region=region \
    --image-version=1.5 \
    ... other flags

REST API

The Docker component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
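The following is a sketch of what such a request might look like when sent with curl. The project ID, region, and cluster name are placeholder values, the function name create_docker_cluster is hypothetical, and the sketch assumes that curl and an authenticated gcloud CLI are available.

```shell
# Placeholder values (assumptions for illustration); replace with your own.
PROJECT_ID="my-project"
REGION="us-central1"
CLUSTER_NAME="my-cluster"

# Request body: softwareConfig.optionalComponents lists the DOCKER component.
REQUEST_BODY=$(cat <<EOF
{
  "projectId": "${PROJECT_ID}",
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "imageVersion": "1.5",
      "optionalComponents": ["DOCKER"]
    }
  }
}
EOF
)

# Hypothetical helper: sends the clusters.create request.
create_docker_cluster() {
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "${REQUEST_BODY}" \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
}
```

Defining the body separately from the call makes it possible to review the JSON before running create_docker_cluster.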

Console

  1. Enable the component.
    • In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
    • In the Components section:
      • Under Optional components, select Docker and other optional components to install on your cluster.

Enable Docker on YARN

See Customize your Spark job runtime environment with Docker on YARN to use a customized Docker image with YARN.

Docker logging

By default, the Dataproc Docker component writes logs to Cloud Logging by setting the Docker logging driver to gcplogs. See Viewing your logs.
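To check which logging driver the daemon is using on a node, one option is to query docker info. The function name docker_log_driver is hypothetical, and the sketch assumes it runs on a cluster node as a user who is allowed to talk to the Docker daemon.

```shell
# Hypothetical helper: prints the Docker daemon's default logging driver.
# On a node that still has the component's default configuration, the
# expected value is gcplogs. You may need sudo depending on your user.
docker_log_driver() {
  docker info --format '{{.LoggingDriver}}'
}
```

If the command prints a different driver, the daemon configuration on that node has likely been changed from the default.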

Docker registry

The Dataproc Docker component configures Docker to use Container Registry in addition to the default Docker registries. Docker uses the Docker credential helper to authenticate with Container Registry.
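As an illustration, Container Registry images are addressed as gcr.io/PROJECT_ID/IMAGE:TAG. The sketch below builds such a reference and pulls it on a cluster node. Both function names are hypothetical, and the project, image, and tag are values you supply.

```shell
# Hypothetical helper: builds a Container Registry image reference
# from a project ID, an image name, and a tag.
gcr_image() {
  printf 'gcr.io/%s/%s:%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: pulls the image on a cluster node. Authentication
# is left to the credential helper that the component configures.
pull_gcr_image() {
  docker pull "$(gcr_image "$1" "$2" "$3")"
}
```

For example, pull_gcr_image my-project my-image 1.0 would attempt to pull gcr.io/my-project/my-image:1.0, assuming that image exists and the cluster's service account can read it.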

Using the Docker component on a Kerberos cluster

The Docker optional component can be installed on a cluster that is being created with Kerberos security enabled.