API documentation for dataproc_v1.types package.
Classes
AcceleratorConfig
Specifies the type and number of accelerator cards attached to the
instances of an instance group. See `GPUs on Compute
Engine <https://cloud.google.com/compute/docs/gpus/>`__.
AutoscalingConfig
Autoscaling Policy config associated with the cluster.
AutoscalingPolicy
Describes an autoscaling policy for Dataproc cluster autoscaler.
.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
AuxiliaryNodeGroup
Node group identification and configuration information.
AuxiliaryServicesConfig
Auxiliary services configuration for a Cluster.
BasicAutoscalingAlgorithm
Basic algorithm for autoscaling.
BasicYarnAutoscalingConfig
Basic autoscaling configurations for YARN.
Batch
A representation of a batch workload in the service.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
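The oneof rule described above (at most one member set, and setting one clears the others) can be illustrated with a small pure-Python stand-in. This is only a sketch of the semantics, not how proto-plus implements them; the ``OneofGroup`` class is hypothetical, and the member names ``pyspark_batch`` and ``spark_batch`` are assumptions modeled on the class names in this listing.

```python
class OneofGroup:
    """Toy model of a oneof: a single slot shared by several named members."""

    def __init__(self, *members):
        self._members = frozenset(members)
        self._active = None  # name of the member currently set, if any
        self._value = None

    def set(self, name, value):
        if name not in self._members:
            raise KeyError(f"{name!r} is not a member of this oneof")
        # There is only one slot, so storing a new member
        # implicitly clears whichever member was set before.
        self._active, self._value = name, value

    def get(self, name):
        if name not in self._members:
            raise KeyError(f"{name!r} is not a member of this oneof")
        return self._value if name == self._active else None

    def which(self):
        """Return the name of the member currently set, or None."""
        return self._active


# Assumed member names, for illustration only.
batch_kind = OneofGroup("pyspark_batch", "spark_batch")
batch_kind.set("pyspark_batch", "placeholder A")
batch_kind.set("spark_batch", "placeholder B")  # clears pyspark_batch
```

After the second call, only ``spark_batch`` holds a value; reading ``pyspark_batch`` yields ``None``.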
BatchOperationMetadata
Metadata describing the Batch operation.
CancelJobRequest
A request to cancel a job.
Cluster
Describes the identifying information, config, and status of a Dataproc cluster.
ClusterConfig
The cluster config.
ClusterMetrics
Contains cluster daemon metrics, such as HDFS and YARN stats.
Beta Feature: This report is available for testing purposes only. It may be changed before final release.
ClusterOperation
The cluster operation triggered by a workflow.
ClusterOperationMetadata
Metadata describing the operation.
ClusterOperationStatus
The status of the operation.
ClusterSelector
A selector that chooses target cluster for jobs based on metadata.
ClusterStatus
The status of a cluster and its instances.
Component
Cluster components that can be activated.
Values:
- COMPONENT_UNSPECIFIED (0): Unspecified component. Specifying this will cause Cluster creation to fail.
- ANACONDA (5): The Anaconda python distribution. The Anaconda component is not supported in the Dataproc 2.0 image. The 2.0 image is pre-installed with Miniconda.
- DOCKER (13): Docker
- DRUID (9): The Druid query engine. (alpha)
- FLINK (14): Flink
- HBASE (11): HBase. (beta)
- HIVE_WEBHCAT (3): The Hive Web HCatalog (the REST service for accessing HCatalog).
- HUDI (18): Hudi.
- JUPYTER (1): The Jupyter Notebook.
- PRESTO (6): The Presto query engine.
- TRINO (17): The Trino query engine.
- RANGER (12): The Ranger service.
- SOLR (10): The Solr service.
- ZEPPELIN (4): The Zeppelin notebook.
- ZOOKEEPER (8): The Zookeeper service.
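The value list above maps names to integers, and notes that COMPONENT_UNSPECIFIED makes cluster creation fail. A standalone mirror of that mapping, written with the standard ``enum`` module, might look like the sketch below. It is not the library's own ``Component`` class, and the ``validate_components`` helper is hypothetical, added only to show a client-side check before a request is built.

```python
import enum


class Component(enum.IntEnum):
    """Local mirror of the values listed above (not the library class)."""

    COMPONENT_UNSPECIFIED = 0
    JUPYTER = 1
    HIVE_WEBHCAT = 3
    ZEPPELIN = 4
    ANACONDA = 5
    PRESTO = 6
    ZOOKEEPER = 8
    DRUID = 9
    SOLR = 10
    HBASE = 11
    RANGER = 12
    DOCKER = 13
    FLINK = 14
    TRINO = 17
    HUDI = 18


def validate_components(names):
    """Hypothetical helper: map names to members, rejecting bad input early."""
    selected = []
    for name in names:
        try:
            member = Component[name]
        except KeyError:
            raise ValueError(f"unknown component: {name}") from None
        if member is Component.COMPONENT_UNSPECIFIED:
            # Per the listing, specifying this value makes cluster creation fail.
            raise ValueError("COMPONENT_UNSPECIFIED is not a usable component")
        selected.append(member)
    return selected


chosen = validate_components(["JUPYTER", "TRINO"])
```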
ConfidentialInstanceConfig
Confidential Instance Config for clusters using `Confidential
VMs <https://cloud.google.com/compute/confidential-vm/docs>`__.
CreateAutoscalingPolicyRequest
A request to create an autoscaling policy.
CreateBatchRequest
A request to create a batch workload.
CreateClusterRequest
A request to create a cluster.
CreateNodeGroupRequest
A request to create a node group.
CreateWorkflowTemplateRequest
A request to create a workflow template.
DataprocMetricConfig
Dataproc metric config.
DeleteAutoscalingPolicyRequest
A request to delete an autoscaling policy. Autoscaling policies in use by one or more clusters will not be deleted.
DeleteBatchRequest
A request to delete a batch workload.
DeleteClusterRequest
A request to delete a cluster.
DeleteJobRequest
A request to delete a job.
DeleteWorkflowTemplateRequest
A request to delete a workflow template. Currently started workflows will remain running.
DiagnoseClusterRequest
A request to collect cluster diagnostic information.
DiagnoseClusterResults
The location of diagnostic output.
DiskConfig
Specifies the config of disk options for a group of VM instances.
DriverSchedulingConfig
Driver scheduling configuration.
EncryptionConfig
Encryption settings for the cluster.
EndpointConfig
Endpoint config for this cluster.
EnvironmentConfig
Environment configuration for a workload.
ExecutionConfig
Execution configuration for a workload.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
FailureAction
Actions in response to failure of a resource associated with a cluster.
Values:
- FAILURE_ACTION_UNSPECIFIED (0): When FailureAction is unspecified, failure action defaults to NO_ACTION.
- NO_ACTION (1): Take no action on failure to create a cluster resource. NO_ACTION is the default.
- DELETE (2): Delete the failed cluster resource.
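The defaulting rule above (an unspecified failure action behaves as NO_ACTION) can be written out as a short sketch. The enum below is a local mirror of the listed values rather than the library class, and ``effective_failure_action`` is a hypothetical helper; the actual defaulting is performed by the service.

```python
import enum


class FailureAction(enum.IntEnum):
    """Local mirror of the values listed above (not the library class)."""

    FAILURE_ACTION_UNSPECIFIED = 0
    NO_ACTION = 1
    DELETE = 2


def effective_failure_action(action=FailureAction.FAILURE_ACTION_UNSPECIFIED):
    """Hypothetical helper: resolve the action that applies in practice."""
    if action == FailureAction.FAILURE_ACTION_UNSPECIFIED:
        # Unspecified falls back to the documented default.
        return FailureAction.NO_ACTION
    return FailureAction(action)
```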
GceClusterConfig
Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster.
GetAutoscalingPolicyRequest
A request to fetch an autoscaling policy.
GetBatchRequest
A request to get the resource representation for a batch workload.
GetClusterRequest
Request to get the resource representation for a cluster in a project.
GetJobRequest
A request to get the resource representation for a job in a project.
GetNodeGroupRequest
A request to get a node group.
GetWorkflowTemplateRequest
A request to fetch a workflow template.
GkeClusterConfig
The cluster's GKE config.
GkeNodePoolConfig
The configuration of a GKE node pool used by a `Dataproc-on-GKE
cluster <https://cloud.google.com/dataproc/docs/concepts/jobs/dataproc-gke#create-a-dataproc-on-gke-cluster>`__.
GkeNodePoolTarget
GKE node pools that Dataproc workloads run on.
HadoopJob
A Dataproc job for running `Apache Hadoop
MapReduce <https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html>`__
jobs on `Apache Hadoop
YARN <https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html>`__.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
HiveJob
A Dataproc job for running `Apache
Hive <https://hive.apache.org/>`__ queries on YARN.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
IdentityConfig
Identity related configuration, including service account based secure multi-tenancy user mappings.
InstanceFlexibilityPolicy
Instance flexibility Policy allowing a mixture of VM shapes and provisioning models.
InstanceGroupAutoscalingPolicyConfig
Configuration for the size bounds of an instance group, including its proportional size to other groups.
InstanceGroupConfig
The config settings for Compute Engine resources in an instance group, such as a master or worker group.
InstanceReference
A reference to a Compute Engine instance.
InstantiateInlineWorkflowTemplateRequest
A request to instantiate an inline workflow template.
InstantiateWorkflowTemplateRequest
A request to instantiate a workflow template.
Job
A Dataproc job resource.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
JobMetadata
Job Operation metadata.
JobPlacement
Dataproc job config.
JobReference
Encapsulates the full scoping used to reference a job.
JobScheduling
Job scheduling options.
JobStatus
Dataproc job status.
KerberosConfig
Specifies Kerberos related configuration.
KubernetesClusterConfig
The configuration for running the Dataproc cluster on Kubernetes.
KubernetesSoftwareConfig
The software configuration for this Dataproc cluster running on Kubernetes.
LifecycleConfig
Specifies the cluster auto-delete schedule configuration.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
ListAutoscalingPoliciesRequest
A request to list autoscaling policies in a project.
ListAutoscalingPoliciesResponse
A response to a request to list autoscaling policies in a project.
ListBatchesRequest
A request to list batch workloads in a project.
ListBatchesResponse
A list of batch workloads.
ListClustersRequest
A request to list the clusters in a project.
ListClustersResponse
The list of all clusters in a project.
ListJobsRequest
A request to list jobs in a project.
ListJobsResponse
A list of jobs in a project.
ListWorkflowTemplatesRequest
A request to list workflow templates in a project.
ListWorkflowTemplatesResponse
A response to a request to list workflow templates in a project.
LoggingConfig
The runtime logging config of the job.
ManagedCluster
Cluster that is managed by the workflow.
ManagedGroupConfig
Specifies the resources used to actively manage an instance group.
MetastoreConfig
Specifies a Metastore configuration.
NodeGroup
Dataproc Node Group. The Dataproc ``NodeGroup`` resource is not
related to the Dataproc ``NodeGroupAffinity`` resource.
NodeGroupAffinity
Node Group Affinity for clusters using sole-tenant node groups.
The Dataproc ``NodeGroupAffinity`` resource is not related to the
Dataproc ``NodeGroup`` resource.
NodeGroupOperationMetadata
Metadata describing the node group operation.
NodeInitializationAction
Specifies an executable to run on a fully configured node and a timeout period for executable completion.
OrderedJob
A job executed by the workflow.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
ParameterValidation
Configuration for parameter validation.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
PeripheralsConfig
Auxiliary services configuration for a workload.
PigJob
A Dataproc job for running `Apache Pig <https://pig.apache.org/>`__
queries on YARN.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
PrestoJob
A Dataproc job for running `Presto <https://prestosql.io/>`__
queries. IMPORTANT: The `Dataproc Presto Optional
Component <https://cloud.google.com/dataproc/docs/concepts/components/presto>`__
must be enabled when the cluster is created to submit a Presto job
to the cluster.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
PySparkBatch
A configuration for running an `Apache
PySpark <https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html>`__
batch workload.
PySparkJob
A Dataproc job for running `Apache
PySpark <https://spark.apache.org/docs/0.9.0/python-programming-guide.html>`__
applications on YARN.
QueryList
A list of queries to run on a cluster.
RegexValidation
Validation based on regular expressions.
ReservationAffinity
Reservation Affinity for consuming Zonal reservation.
ResizeNodeGroupRequest
A request to resize a node group.
RuntimeConfig
Runtime configuration for a workload.
RuntimeInfo
Runtime information about workload execution.
SecurityConfig
Security related configuration, including encryption, Kerberos, etc.
ShieldedInstanceConfig
Shielded Instance Config for clusters using `Compute Engine Shielded
VMs <https://cloud.google.com/security/shielded-cloud/shielded-vm>`__.
SoftwareConfig
Specifies the selection and config of software inside the cluster.
SparkBatch
A configuration for running an `Apache
Spark <https://spark.apache.org/>`__ batch workload.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
SparkHistoryServerConfig
Spark History Server configuration for the workload.
SparkJob
A Dataproc job for running `Apache
Spark <https://spark.apache.org/>`__ applications on YARN.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
SparkRBatch
A configuration for running an `Apache
SparkR <https://spark.apache.org/docs/latest/sparkr.html>`__ batch
workload.
SparkRJob
A Dataproc job for running `Apache
SparkR <https://spark.apache.org/docs/latest/sparkr.html>`__
applications on YARN.
SparkSqlBatch
A configuration for running `Apache Spark
SQL <https://spark.apache.org/sql/>`__ queries as a batch workload.
SparkSqlJob
A Dataproc job for running `Apache Spark
SQL <https://spark.apache.org/sql/>`__ queries.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
StartClusterRequest
A request to start a cluster.
StopClusterRequest
A request to stop a cluster.
SubmitJobRequest
A request to submit a job.
TemplateParameter
A configurable parameter that replaces one or more fields in the template. Parameterizable fields:
- Labels
- File URIs
- Job properties
- Job arguments
- Script variables
- Main class (in HadoopJob and SparkJob)
- Zone (in ClusterSelector)
TrinoJob
A Dataproc job for running `Trino <https://trino.io/>`__
queries.
IMPORTANT: The `Dataproc Trino Optional
Component <https://cloud.google.com/dataproc/docs/concepts/components/trino>`__
must be enabled when the cluster is created to submit a Trino job to
the cluster.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
UpdateAutoscalingPolicyRequest
A request to update an autoscaling policy.
UpdateClusterRequest
A request to update a cluster.
UpdateJobRequest
A request to update a job.
UpdateWorkflowTemplateRequest
A request to update a workflow template.
UsageMetrics
Usage metrics represent approximate total resources consumed by a workload.
UsageSnapshot
The usage snapshot represents the resources consumed by a workload at a specified time.
ValueValidation
Validation based on a list of allowed values.
VirtualClusterConfig
The Dataproc cluster config for a cluster that does not directly
control the underlying compute resources, such as a `Dataproc-on-GKE
cluster <https://cloud.google.com/dataproc/docs/guides/dpgke/dataproc-gke-overview>`__.
WorkflowGraph
The workflow graph.
WorkflowMetadata
A Dataproc workflow template resource.
WorkflowNode
The workflow node.
WorkflowTemplate
A Dataproc workflow template resource.
WorkflowTemplatePlacement
Specifies workflow execution target.
Either ``managed_cluster`` or ``cluster_selector`` is required.
This message has `oneof`_ fields (mutually exclusive fields).
For each oneof, at most one member field can be set at the same time.
Setting any member of the oneof automatically clears all other
members.
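The requirement above (exactly one of ``managed_cluster`` or ``cluster_selector``) can be checked on the client side before a template is sent. The sketch below uses a plain dict as a stand-in for the message; ``check_placement`` is a hypothetical helper and the dict contents are placeholders, not real field values.

```python
def check_placement(placement):
    """Hypothetical helper: return the name of the one placement field set.

    ``placement`` is a plain dict standing in for a
    WorkflowTemplatePlacement message.
    """
    allowed = ("managed_cluster", "cluster_selector")
    present = [key for key in allowed if placement.get(key) is not None]
    if len(present) != 1:
        raise ValueError(
            "exactly one of managed_cluster or cluster_selector is required, "
            f"got {present or 'none'}"
        )
    return present[0]


# Placeholder contents, for illustration only.
target = check_placement({"cluster_selector": {"note": "placeholder"}})
```

With the real message type, the oneof itself rules out both fields being set at once, so a check like this mainly guards against neither being set.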
YarnApplication
A YARN application created by a job. Application information is a subset of org.apache.hadoop.yarn.proto.YarnProtos.ApplicationReportProto.
Beta Feature: This report is available for testing purposes only. It may be changed before final release.