Package types (5.8.0rc0)

API documentation for dataproc_v1.types package.

Classes

AcceleratorConfig

Specifies the type and number of accelerator cards attached to the instances of an instance group. See `GPUs on Compute Engine <https://cloud.google.com/compute/docs/gpus/>`__.

AutoscalingConfig

Autoscaling Policy config associated with the cluster.

AutoscalingPolicy

Describes an autoscaling policy for the Dataproc cluster autoscaler.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

AuxiliaryNodeGroup

Node group identification and configuration information.

AuxiliaryServicesConfig

Auxiliary services configuration for a Cluster.

BasicAutoscalingAlgorithm

Basic algorithm for autoscaling.

BasicYarnAutoscalingConfig

Basic autoscaling configurations for YARN.

Batch

A representation of a batch workload in the service.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
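
The oneof rule described above can be illustrated with a toy stand-in class. This is only a sketch of the semantics in plain Python, not the proto-plus implementation; the class ``OneofModel`` and the member names ``spark_batch`` and ``pyspark_batch`` are assumptions chosen for illustration.

```python
class OneofModel:
    """Toy stand-in for a message with one oneof group (not the real library class)."""

    # Hypothetical member names, used only for illustration.
    _ONEOF_MEMBERS = ("spark_batch", "pyspark_batch")

    def __init__(self):
        # Bypass our own __setattr__ while creating the backing dict.
        object.__setattr__(self, "_values", {})

    def __setattr__(self, name, value):
        if name not in self._ONEOF_MEMBERS:
            raise AttributeError(f"unknown field: {name}")
        # Setting one member clears every other member of the oneof.
        self._values.clear()
        self._values[name] = value

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for the oneof members.
        if name in type(self)._ONEOF_MEMBERS:
            return self._values.get(name)
        raise AttributeError(name)

    def which_oneof(self):
        """Return the name of the member currently set, or None."""
        return next(iter(self._values), None)


msg = OneofModel()
msg.spark_batch = "spark settings"
msg.pyspark_batch = "pyspark settings"  # clears spark_batch
```

After the second assignment, ``msg.which_oneof()`` names ``pyspark_batch`` and ``msg.spark_batch`` reads as unset, which mirrors the "at most one member" rule.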

BatchOperationMetadata

Metadata describing the Batch operation.

CancelJobRequest

A request to cancel a job.

Cluster

Describes the identifying information, config, and status of a Dataproc cluster.

ClusterConfig

The cluster config.

ClusterMetrics

Contains cluster daemon metrics, such as HDFS and YARN stats.

Beta Feature: This report is available for testing purposes only. It may be changed before final release.

ClusterOperation

The cluster operation triggered by a workflow.

ClusterOperationMetadata

Metadata describing the operation.

ClusterOperationStatus

The status of the operation.

ClusterSelector

A selector that chooses the target cluster for jobs based on metadata.

ClusterStatus

The status of a cluster and its instances.

Component

Cluster components that can be activated.

Values:

  • COMPONENT_UNSPECIFIED (0): Unspecified component. Specifying this will cause Cluster creation to fail.
  • ANACONDA (5): The Anaconda Python distribution. The Anaconda component is not supported in the Dataproc 2.0 image. The 2.0 image is pre-installed with Miniconda.
  • DOCKER (13): Docker.
  • DRUID (9): The Druid query engine. (alpha)
  • FLINK (14): Flink.
  • HBASE (11): HBase. (beta)
  • HIVE_WEBHCAT (3): The Hive Web HCatalog (the REST service for accessing HCatalog).
  • HUDI (18): Hudi.
  • JUPYTER (1): The Jupyter Notebook.
  • PRESTO (6): The Presto query engine.
  • TRINO (17): The Trino query engine.
  • RANGER (12): The Ranger service.
  • SOLR (10): The Solr service.
  • ZEPPELIN (4): The Zeppelin notebook.
  • ZOOKEEPER (8): The Zookeeper service.
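
As a rough illustration of how the values listed above might be used, the sketch below mirrors them in a standard-library ``IntEnum``. This is a stand-in, not the library's own enum type, and the helper ``check_components`` is hypothetical; it only encodes the note that ``COMPONENT_UNSPECIFIED`` causes cluster creation to fail.

```python
import enum


class Component(enum.IntEnum):
    """Stand-in mirror of the values listed above (not the library class)."""

    COMPONENT_UNSPECIFIED = 0
    JUPYTER = 1
    HIVE_WEBHCAT = 3
    ZEPPELIN = 4
    ANACONDA = 5
    PRESTO = 6
    ZOOKEEPER = 8
    DRUID = 9
    SOLR = 10
    HBASE = 11
    RANGER = 12
    DOCKER = 13
    FLINK = 14
    TRINO = 17
    HUDI = 18


def check_components(requested):
    """Hypothetical helper: convert raw numbers and reject the unspecified value."""
    components = [Component(value) for value in requested]
    if Component.COMPONENT_UNSPECIFIED in components:
        raise ValueError("COMPONENT_UNSPECIFIED would cause cluster creation to fail")
    return components


selected = check_components([1, 17])  # JUPYTER and TRINO
```

Checking a selection locally like this is just a convenience; the service remains the authority on which components a given image supports.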

ConfidentialInstanceConfig

Confidential Instance Config for clusters using `Confidential VMs <https://cloud.google.com/compute/confidential-vm/docs>`__.

CreateAutoscalingPolicyRequest

A request to create an autoscaling policy.

CreateBatchRequest

A request to create a batch workload.

CreateClusterRequest

A request to create a cluster.

CreateNodeGroupRequest

A request to create a node group.

CreateWorkflowTemplateRequest

A request to create a workflow template.

DataprocMetricConfig

Dataproc metric config.

DeleteAutoscalingPolicyRequest

A request to delete an autoscaling policy.

Autoscaling policies in use by one or more clusters will not be deleted.

DeleteBatchRequest

A request to delete a batch workload.

DeleteClusterRequest

A request to delete a cluster.

DeleteJobRequest

A request to delete a job.

DeleteWorkflowTemplateRequest

A request to delete a workflow template.

Currently started workflows will remain running.

DiagnoseClusterRequest

A request to collect cluster diagnostic information.

DiagnoseClusterResults

The location of diagnostic output.

DiskConfig

Specifies the config of disk options for a group of VM instances.

DriverSchedulingConfig

Driver scheduling configuration.

EncryptionConfig

Encryption settings for the cluster.

EndpointConfig

Endpoint config for this cluster.

EnvironmentConfig

Environment configuration for a workload.

ExecutionConfig

Execution configuration for a workload.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

FailureAction

Actions in response to failure of a resource associated with a cluster.

Values:

  • FAILURE_ACTION_UNSPECIFIED (0): When FailureAction is unspecified, failure action defaults to NO_ACTION.
  • NO_ACTION (1): Take no action on failure to create a cluster resource. NO_ACTION is the default.
  • DELETE (2): Delete the failed cluster resource.
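
The defaulting rule above (unspecified falls back to ``NO_ACTION``) can be sketched as follows. The enum is a plain-Python stand-in for the values listed, and ``effective_action`` is a hypothetical helper name, not part of the library.

```python
import enum


class FailureAction(enum.IntEnum):
    """Stand-in mirror of the values listed above (not the library class)."""

    FAILURE_ACTION_UNSPECIFIED = 0
    NO_ACTION = 1
    DELETE = 2


def effective_action(action):
    """Hypothetical helper: resolve the action that would actually apply."""
    action = FailureAction(action)
    if action is FailureAction.FAILURE_ACTION_UNSPECIFIED:
        # Per the description above, unspecified defaults to NO_ACTION.
        return FailureAction.NO_ACTION
    return action


resolved = effective_action(FailureAction.FAILURE_ACTION_UNSPECIFIED)
```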

GceClusterConfig

Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster.

GetAutoscalingPolicyRequest

A request to fetch an autoscaling policy.

GetBatchRequest

A request to get the resource representation for a batch workload.

GetClusterRequest

A request to get the resource representation for a cluster in a project.

GetJobRequest

A request to get the resource representation for a job in a project.

GetNodeGroupRequest

A request to get a node group.

GetWorkflowTemplateRequest

A request to fetch a workflow template.

GkeClusterConfig

The cluster's GKE config.

GkeNodePoolConfig

The configuration of a GKE node pool used by a `Dataproc-on-GKE cluster <https://cloud.google.com/dataproc/docs/concepts/jobs/dataproc-gke#create-a-dataproc-on-gke-cluster>`__.

GkeNodePoolTarget

GKE node pools that Dataproc workloads run on.

HadoopJob

A Dataproc job for running `Apache Hadoop MapReduce <https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html>`__ jobs on `Apache Hadoop YARN <https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html>`__.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

HiveJob

A Dataproc job for running `Apache Hive <https://hive.apache.org/>`__ queries on YARN.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

IdentityConfig

Identity-related configuration, including service-account-based secure multi-tenancy user mappings.

InstanceFlexibilityPolicy

Instance flexibility policy allowing a mixture of VM shapes and provisioning models.

InstanceGroupAutoscalingPolicyConfig

Configuration for the size bounds of an instance group, including its proportional size to other groups.

InstanceGroupConfig

The config settings for Compute Engine resources in an instance group, such as a master or worker group.

InstanceReference

A reference to a Compute Engine instance.

InstantiateInlineWorkflowTemplateRequest

A request to instantiate an inline workflow template.

InstantiateWorkflowTemplateRequest

A request to instantiate a workflow template.

Job

A Dataproc job resource.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

JobMetadata

Job Operation metadata.

JobPlacement

Dataproc job config.

JobReference

Encapsulates the full scoping used to reference a job.

JobScheduling

Job scheduling options.

JobStatus

Dataproc job status.

KerberosConfig

Specifies Kerberos-related configuration.

KubernetesClusterConfig

The configuration for running the Dataproc cluster on Kubernetes.

KubernetesSoftwareConfig

The software configuration for this Dataproc cluster running on Kubernetes.

LifecycleConfig

Specifies the cluster auto-delete schedule configuration.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

ListAutoscalingPoliciesRequest

A request to list autoscaling policies in a project.

ListAutoscalingPoliciesResponse

A response to a request to list autoscaling policies in a project.

ListBatchesRequest

A request to list batch workloads in a project.

ListBatchesResponse

A list of batch workloads.

ListClustersRequest

A request to list the clusters in a project.

ListClustersResponse

The list of all clusters in a project.

ListJobsRequest

A request to list jobs in a project.

ListJobsResponse

A list of jobs in a project.

ListWorkflowTemplatesRequest

A request to list workflow templates in a project.

ListWorkflowTemplatesResponse

A response to a request to list workflow templates in a project.

LoggingConfig

The runtime logging config of the job.

ManagedCluster

Cluster that is managed by the workflow.

ManagedGroupConfig

Specifies the resources used to actively manage an instance group.

MetastoreConfig

Specifies a Metastore configuration.

NodeGroup

Dataproc Node Group. The Dataproc NodeGroup resource is not related to the Dataproc NodeGroupAffinity resource.

NodeGroupAffinity

Node Group Affinity for clusters using sole-tenant node groups. The Dataproc NodeGroupAffinity resource is not related to the Dataproc NodeGroup resource.

NodeGroupOperationMetadata

Metadata describing the node group operation.

NodeInitializationAction

Specifies an executable to run on a fully configured node and a timeout period for executable completion.

OrderedJob

A job executed by the workflow.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

ParameterValidation

Configuration for parameter validation.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

PeripheralsConfig

Auxiliary services configuration for a workload.

PigJob

A Dataproc job for running `Apache Pig <https://pig.apache.org/>`__ queries on YARN.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

PrestoJob

A Dataproc job for running `Presto <https://prestosql.io/>`__ queries. IMPORTANT: The `Dataproc Presto Optional Component <https://cloud.google.com/dataproc/docs/concepts/components/presto>`__ must be enabled when the cluster is created to submit a Presto job to the cluster.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

PySparkBatch

A configuration for running an `Apache PySpark <https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html>`__ batch workload.

PySparkJob

A Dataproc job for running `Apache PySpark <https://spark.apache.org/docs/0.9.0/python-programming-guide.html>`__ applications on YARN.

QueryList

A list of queries to run on a cluster.

RegexValidation

Validation based on regular expressions.
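
A minimal sketch of regex-based validation, under two stated assumptions that the summary above does not confirm: that a value passes when it matches at least one pattern in its entirety, and that Python's ``re`` module is an acceptable stand-in for whatever regex engine the service uses. The function name ``matches_any`` and the sample pattern are hypothetical.

```python
import re


def matches_any(value, patterns):
    """Hypothetical helper: True if `value` fully matches at least one pattern.

    Assumption: whole-value matching (re.fullmatch), not substring search.
    """
    return any(re.fullmatch(pattern, value) is not None for pattern in patterns)


zone_patterns = [r"us-central1-[a-f]"]  # illustrative pattern only
ok = matches_any("us-central1-a", zone_patterns)
rejected = matches_any("us-central1-a-extra", zone_patterns)
```

Under these assumptions the first value is accepted and the second is rejected, because a trailing suffix breaks the whole-value match.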

ReservationAffinity

Reservation affinity for consuming a zonal reservation.

ResizeNodeGroupRequest

A request to resize a node group.

RuntimeConfig

Runtime configuration for a workload.

RuntimeInfo

Runtime information about workload execution.

SecurityConfig

Security-related configuration, including encryption and Kerberos.

ShieldedInstanceConfig

Shielded Instance Config for clusters using `Compute Engine Shielded VMs <https://cloud.google.com/security/shielded-cloud/shielded-vm>`__.

SoftwareConfig

Specifies the selection and config of software inside the cluster.

SparkBatch

A configuration for running an `Apache Spark <https://spark.apache.org/>`__ batch workload.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

SparkHistoryServerConfig

Spark History Server configuration for the workload.

SparkJob

A Dataproc job for running `Apache Spark <https://spark.apache.org/>`__ applications on YARN.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

SparkRBatch

A configuration for running an `Apache SparkR <https://spark.apache.org/docs/latest/sparkr.html>`__ batch workload.

SparkRJob

A Dataproc job for running `Apache SparkR <https://spark.apache.org/docs/latest/sparkr.html>`__ applications on YARN.

SparkSqlBatch

A configuration for running `Apache Spark SQL <https://spark.apache.org/sql/>`__ queries as a batch workload.

SparkSqlJob

A Dataproc job for running `Apache Spark SQL <https://spark.apache.org/sql/>`__ queries.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

StartClusterRequest

A request to start a cluster.

StartupConfig

Configuration to handle the startup of instances during the cluster create and update process.

StopClusterRequest

A request to stop a cluster.

SubmitJobRequest

A request to submit a job.

TemplateParameter

A configurable parameter that replaces one or more fields in the template. Parameterizable fields:

  • Labels
  • File URIs
  • Job properties
  • Job arguments
  • Script variables
  • Main class (in HadoopJob and SparkJob)
  • Zone (in ClusterSelector)

TrinoJob

A Dataproc job for running `Trino <https://trino.io/>`__ queries. IMPORTANT: The `Dataproc Trino Optional Component <https://cloud.google.com/dataproc/docs/concepts/components/trino>`__ must be enabled when the cluster is created to submit a Trino job to the cluster.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

UpdateAutoscalingPolicyRequest

A request to update an autoscaling policy.

UpdateClusterRequest

A request to update a cluster.

UpdateJobRequest

A request to update a job.

UpdateWorkflowTemplateRequest

A request to update a workflow template.

UsageMetrics

Usage metrics represent approximate total resources consumed by a workload.

UsageSnapshot

The usage snapshot represents the resources consumed by a workload at a specified time.

ValueValidation

Validation based on a list of allowed values.

VirtualClusterConfig

The Dataproc cluster config for a cluster that does not directly control the underlying compute resources, such as a `Dataproc-on-GKE cluster <https://cloud.google.com/dataproc/docs/guides/dpgke/dataproc-gke-overview>`__.

WorkflowGraph

The workflow graph.

WorkflowMetadata

A Dataproc workflow template resource.

WorkflowNode

The workflow node.

WorkflowTemplate

A Dataproc workflow template resource.

WorkflowTemplatePlacement

Specifies workflow execution target.

Either managed_cluster or cluster_selector is required.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
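
The requirement that either ``managed_cluster`` or ``cluster_selector`` be set can be checked up front. The sketch below works on a plain dict rather than the library's message class, and ``validate_placement`` is a hypothetical helper; only the two field names come from the description above.

```python
def validate_placement(placement):
    """Hypothetical helper: return the name of the single placement field set.

    `placement` is a plain dict used as a stand-in for the message.
    """
    members = ("managed_cluster", "cluster_selector")
    present = [name for name in members if placement.get(name) is not None]
    if len(present) != 1:
        raise ValueError(
            "exactly one of managed_cluster or cluster_selector is required, "
            f"got: {present or 'none'}"
        )
    return present[0]


# The dict contents are illustrative placeholders, not real field layouts.
chosen = validate_placement({"cluster_selector": {"example": "value"}})
```

With a real message object the oneof semantics already prevent both fields from being set at once, so in practice the useful part of this check is catching the case where neither is set.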

YarnApplication

A YARN application created by a job. Application information is a subset of org.apache.hadoop.yarn.proto.YarnProtos.ApplicationReportProto.

Beta Feature: This report is available for testing purposes only. It may be changed before final release.