Dataproc lets you attach graphics processing units (GPUs) to the master and worker Compute Engine nodes in a Dataproc cluster. You can use these GPUs to accelerate specific workloads on your instances, such as machine learning and data processing.
For more information about what you can do with GPUs and what types of GPU hardware are available, read GPUs on Compute Engine.
Before you begin
- GPUs require special drivers and software. These items are not pre-installed on Dataproc clusters.
- Read about GPU pricing on Compute Engine to understand the cost to use GPUs in your instances.
- Read about restrictions for instances with GPUs to learn how these instances function differently from non-GPU instances.
- Check the quotas page for your project to ensure that sufficient GPU quota (NVIDIA_T4_GPUS, NVIDIA_P100_GPUS, or NVIDIA_V100_GPUS) is available in your project. If GPUs are not listed on the quotas page or you require additional GPU quota, request a quota increase. One way to check regional GPU quota from the command line is shown in the sketch after this list.
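For reference, a minimal sketch of checking per-region accelerator quota with the gcloud CLI (the region value is a placeholder; the quotas field in the response lists each metric with its limit and usage):

gcloud compute regions describe region

Look for the NVIDIA_T4_GPUS, NVIDIA_P100_GPUS, or NVIDIA_V100_GPUS entries under quotas in the output.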
Types of GPUs
Dataproc nodes support the following GPU types. You must specify the GPU type when attaching GPUs to your Dataproc cluster.
- nvidia-tesla-l4 - NVIDIA® Tesla® L4
- nvidia-tesla-a100 - NVIDIA® Tesla® A100
- nvidia-tesla-p100 - NVIDIA® Tesla® P100
- nvidia-tesla-v100 - NVIDIA® Tesla® V100
- nvidia-tesla-p4 - NVIDIA® Tesla® P4
- nvidia-tesla-t4 - NVIDIA® Tesla® T4
- nvidia-tesla-p100-vws - NVIDIA® Tesla® P100 Virtual Workstations
- nvidia-tesla-p4-vws - NVIDIA® Tesla® P4 Virtual Workstations
- nvidia-tesla-t4-vws - NVIDIA® Tesla® T4 Virtual Workstations
Attaching GPUs to clusters
gcloud
Attach GPUs to the master node and to primary and secondary worker nodes in a Dataproc cluster when creating the cluster by using the --master-accelerator, --worker-accelerator, and --secondary-worker-accelerator flags. These flags take the following two values:
- the type of GPU to attach to a node, and
- the number of GPUs to attach to the node.
The type of GPU is required, and the number of GPUs is optional (the default is 1 GPU).
Example:
gcloud dataproc clusters create cluster-name \
    --region=region \
    --master-accelerator type=nvidia-tesla-t4 \
    --worker-accelerator type=nvidia-tesla-t4,count=4 \
    --secondary-worker-accelerator type=nvidia-tesla-t4,count=4 \
    ... other flags
To use GPUs in your cluster, you must install GPU drivers.
REST API
Attach GPUs to the master node and to primary and secondary worker nodes in a Dataproc cluster by filling in the InstanceGroupConfig.AcceleratorConfig acceleratorTypeUri and acceleratorCount fields as part of the clusters.create API request.
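As an illustrative sketch (not a complete request), the cluster portion of a clusters.create request body that attaches one NVIDIA T4 GPU to the master node and four to each primary worker could look like the following; the cluster name is a placeholder, and the sketch assumes the field accepts the short accelerator type name as well as a full accelerator type resource URL:

{
  "clusterName": "cluster-name",
  "config": {
    "masterConfig": {
      "accelerators": [
        { "acceleratorTypeUri": "nvidia-tesla-t4", "acceleratorCount": 1 }
      ]
    },
    "workerConfig": {
      "accelerators": [
        { "acceleratorTypeUri": "nvidia-tesla-t4", "acceleratorCount": 4 }
      ]
    }
  }
}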
Console
Click CPU PLATFORM AND GPU → GPUs → ADD GPU in the master and worker nodes sections of the Configure nodes panel on the Create a cluster page in the Google Cloud console to specify the number of GPUs and GPU type for the nodes.
Installing GPU drivers
GPU drivers are required to use any GPUs attached to Dataproc nodes. You can install GPU drivers by following the instructions for this initialization action, for example as sketched below.
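A minimal sketch of passing the driver installation script with the --initialization-actions flag at cluster creation time; the script path assumes the public goog-dataproc-initialization-actions bucket naming convention for your region, which may change:

gcloud dataproc clusters create cluster-name \
    --region=region \
    --master-accelerator type=nvidia-tesla-t4 \
    --worker-accelerator type=nvidia-tesla-t4,count=4 \
    --initialization-actions=gs://goog-dataproc-initialization-actions-region/gpu/install_gpu_driver.sh \
    ... other flags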
Verifying GPU driver install
After you have finished installing the GPU driver on your Dataproc nodes, you can verify that the driver is functioning properly. SSH into the master node of your Dataproc cluster and run the following command:
nvidia-smi
If the driver is functioning properly, the output displays the driver version and GPU statistics.
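If you use the gcloud CLI to open the SSH session, a one-line sketch (assuming the default master node name, which appends -m to the cluster name, with a placeholder zone):

gcloud compute ssh cluster-name-m --zone=zone --command="nvidia-smi"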
Spark configuration
When you submit a Spark job, you can use the spark.executorEnv runtime environment property with the LD_PRELOAD environment variable to preload needed libraries.
Example:
gcloud dataproc jobs submit spark --cluster=CLUSTER_NAME \
    --region=REGION \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
    --properties=spark.executorEnv.LD_PRELOAD=libnvblas.so,spark.task.resource.gpu.amount=1,spark.executor.resource.gpu.amount=1,spark.executor.resource.gpu.discoveryScript=/usr/lib/spark/scripts/gpu/getGpusResources.sh
Example GPU job
You can test GPUs on Dataproc by running any of the following jobs, which benefit from GPU acceleration:
- Run one of the Spark ML examples.
- Run the following example with spark-shell to perform a matrix multiplication:
import org.apache.spark.mllib.linalg._
import org.apache.spark.mllib.linalg.distributed._
import java.util.Random

// Build a square BlockMatrix made of nBlocks x nBlocks random blocks,
// each block rowsPerBlock x rowsPerBlock.
def makeRandomSquareBlockMatrix(rowsPerBlock: Int, nBlocks: Int): BlockMatrix = {
  val range = sc.parallelize(1 to nBlocks)
  val indices = range.cartesian(range)
  new BlockMatrix(
    indices.map(ij => (ij, Matrices.rand(rowsPerBlock, rowsPerBlock, new Random()))),
    rowsPerBlock, rowsPerBlock, 0, 0)
}

val N = 1024 * 4
val n = 2
val mat1 = makeRandomSquareBlockMatrix(N, n)
val mat2 = makeRandomSquareBlockMatrix(N, n)

// Multiply the two distributed matrices, then force evaluation of the result.
val mat3 = mat1.multiply(mat2)
mat3.blocks.persist.count
println("Processing complete!")