Google Cloud Dataproc V1 Client - Class ClusterConfig (2.2.1)

Reference documentation and code samples for the Google Cloud Dataproc V1 Client class ClusterConfig.

The cluster config.

Generated from protobuf message google.cloud.dataproc.v1.ClusterConfig

Namespace

Google \ Cloud \ Dataproc \ V1

Methods

__construct

Constructor.

Parameters
Name Description
data array

Optional. Data for populating the Message object.

↳ config_bucket string

Optional. A Cloud Storage bucket used to stage job dependencies, config files, and job driver console output. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Dataproc staging and temp buckets). This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.

↳ temp_bucket string

Optional. A Cloud Storage bucket used to store ephemeral cluster and job data, such as Spark and MapReduce history files. If you do not specify a temp bucket, Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's temp bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket. The default bucket has a TTL of 90 days, but you can use any TTL (or none) if you specify a bucket (see Dataproc staging and temp buckets). This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.

↳ gce_cluster_config GceClusterConfig

Optional. The shared Compute Engine config settings for all instances in a cluster.

↳ master_config InstanceGroupConfig

Optional. The Compute Engine config settings for the cluster's master instance.

↳ worker_config InstanceGroupConfig

Optional. The Compute Engine config settings for the cluster's worker instances.

↳ secondary_worker_config InstanceGroupConfig

Optional. The Compute Engine config settings for a cluster's secondary worker instances.

↳ software_config SoftwareConfig

Optional. The config settings for cluster software.

↳ initialization_actions array<NodeInitializationAction>

Optional. Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes. You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget):

    ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
    if [[ "${ROLE}" == 'Master' ]]; then
      ... master specific actions ...
    else
      ... worker specific actions ...
    fi

↳ encryption_config EncryptionConfig

Optional. Encryption settings for the cluster.

↳ autoscaling_config AutoscalingConfig

Optional. Autoscaling config for the policy associated with the cluster. The cluster does not autoscale if this field is unset.

↳ security_config SecurityConfig

Optional. Security settings for the cluster.

↳ lifecycle_config LifecycleConfig

Optional. Lifecycle setting for the cluster.

↳ endpoint_config EndpointConfig

Optional. Port/endpoint configuration for this cluster.

↳ metastore_config MetastoreConfig

Optional. Metastore configuration.

↳ dataproc_metric_config DataprocMetricConfig

Optional. The config for Dataproc metrics.

↳ auxiliary_node_groups array<AuxiliaryNodeGroup>

Optional. The node group settings.
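The constructor keys above map one-to-one onto the message fields, so a config can be populated in a single call. The following is a minimal sketch, assuming the `google/cloud-dataproc` package is installed through Composer; the bucket name, machine types, and image version are placeholder values, not recommendations.

```php
<?php
// Assumption: google/cloud-dataproc is installed and Composer's autoloader
// lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Dataproc\V1\ClusterConfig;
use Google\Cloud\Dataproc\V1\InstanceGroupConfig;
use Google\Cloud\Dataproc\V1\SoftwareConfig;

// All literal values below are placeholders chosen for illustration.
$config = new ClusterConfig([
    // A bare bucket name, not a gs://... URI.
    'config_bucket' => 'example-staging-bucket',
    'master_config' => new InstanceGroupConfig([
        'num_instances' => 1,
        'machine_type_uri' => 'n1-standard-4',
    ]),
    'worker_config' => new InstanceGroupConfig([
        'num_instances' => 2,
        'machine_type_uri' => 'n1-standard-4',
    ]),
    'software_config' => new SoftwareConfig([
        'image_version' => '2.1-debian11',
    ]),
]);

// Values passed to the constructor are readable through the getters.
echo $config->getConfigBucket(), PHP_EOL;
echo $config->getWorkerConfig()->getNumInstances(), PHP_EOL;
```

Every key is optional, so any subset of the fields can be supplied and the rest filled in later with the setters documented below.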

getConfigBucket

Optional. A Cloud Storage bucket used to stage job dependencies, config files, and job driver console output.

If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Dataproc staging and temp buckets). This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.

Returns
Type Description
string

setConfigBucket

Optional. A Cloud Storage bucket used to stage job dependencies, config files, and job driver console output.

If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Dataproc staging and temp buckets). This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.

Parameter
Name Description
var string
Returns
Type Description
$this

getTempBucket

Optional. A Cloud Storage bucket used to store ephemeral cluster and job data, such as Spark and MapReduce history files. If you do not specify a temp bucket, Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's temp bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket. The default bucket has a TTL of 90 days, but you can use any TTL (or none) if you specify a bucket (see Dataproc staging and temp buckets).

This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.

Returns
Type Description
string

setTempBucket

Optional. A Cloud Storage bucket used to store ephemeral cluster and job data, such as Spark and MapReduce history files. If you do not specify a temp bucket, Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's temp bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket. The default bucket has a TTL of 90 days, but you can use any TTL (or none) if you specify a bucket (see Dataproc staging and temp buckets).

This field requires a Cloud Storage bucket name, not a gs://... URI to a Cloud Storage bucket.

Parameter
Name Description
var string
Returns
Type Description
$this
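Both bucket fields expect a bare bucket name rather than a gs://... URI. One way to guard against passing a URI by mistake is sketched below. `bucketNameFromUri` is a hypothetical helper written for this example and is not part of the client library; the bucket names are placeholders, and the `google/cloud-dataproc` package is assumed to be installed through Composer.

```php
<?php
// Assumption: google/cloud-dataproc is installed and Composer's autoloader
// lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Dataproc\V1\ClusterConfig;

/**
 * Hypothetical helper (not part of the client library): reduces a
 * "gs://bucket/optional/path" URI to the bare bucket name. A value that
 * is already a bare bucket name is returned unchanged.
 */
function bucketNameFromUri(string $value): string
{
    // Drop an optional gs:// scheme, then keep the text before the first slash.
    $withoutScheme = preg_replace('#^gs://#', '', $value);
    return explode('/', $withoutScheme, 2)[0];
}

// Each setter returns $this, so the calls can be chained.
$config = (new ClusterConfig())
    ->setConfigBucket(bucketNameFromUri('gs://example-staging-bucket/some/prefix'))
    ->setTempBucket(bucketNameFromUri('example-temp-bucket'));

echo $config->getConfigBucket(), PHP_EOL;
echo $config->getTempBucket(), PHP_EOL;
```

Normalizing on the client side is a design choice for this sketch only; the source says just that the fields require a bucket name.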

getGceClusterConfig

Optional. The shared Compute Engine config settings for all instances in a cluster.

Returns
Type Description
GceClusterConfig|null

hasGceClusterConfig

clearGceClusterConfig

setGceClusterConfig

Optional. The shared Compute Engine config settings for all instances in a cluster.

Parameter
Name Description
var GceClusterConfig
Returns
Type Description
$this
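The has and clear methods are listed here without descriptions. In generated protobuf PHP classes they normally report whether a message-typed field is set and unset it again, which matches the `GceClusterConfig|null` return type of the getter. The sketch below illustrates that pattern under the assumption that the `google/cloud-dataproc` package is installed through Composer; the zone is a placeholder.

```php
<?php
// Assumption: google/cloud-dataproc is installed and Composer's autoloader
// lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Dataproc\V1\ClusterConfig;
use Google\Cloud\Dataproc\V1\GceClusterConfig;

$config = new ClusterConfig();

// Nothing has been set yet: has...() is false and the getter returns null.
$initiallySet = $config->hasGceClusterConfig();
$initialValue = $config->getGceClusterConfig();

// 'us-central1-a' is a placeholder zone.
$config->setGceClusterConfig(new GceClusterConfig(['zone_uri' => 'us-central1-a']));
$setAfterSetter = $config->hasGceClusterConfig();

// Guard reads with has...() before dereferencing the getter result.
if ($config->hasGceClusterConfig()) {
    echo $config->getGceClusterConfig()->getZoneUri(), PHP_EOL;
}

// clear...() returns the field to its unset state.
$config->clearGceClusterConfig();
$setAfterClear = $config->hasGceClusterConfig();
```

The same get, has, clear, set quartet is repeated for the other message-typed fields of this class.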

getMasterConfig

Optional. The Compute Engine config settings for the cluster's master instance.

Returns
Type Description
InstanceGroupConfig|null

hasMasterConfig

clearMasterConfig

setMasterConfig

Optional. The Compute Engine config settings for the cluster's master instance.

Parameter
Name Description
var InstanceGroupConfig
Returns
Type Description
$this

getWorkerConfig

Optional. The Compute Engine config settings for the cluster's worker instances.

Returns
Type Description
InstanceGroupConfig|null

hasWorkerConfig

clearWorkerConfig

setWorkerConfig

Optional. The Compute Engine config settings for the cluster's worker instances.

Parameter
Name Description
var InstanceGroupConfig
Returns
Type Description
$this

getSecondaryWorkerConfig

Optional. The Compute Engine config settings for a cluster's secondary worker instances.

Returns
Type Description
InstanceGroupConfig|null

hasSecondaryWorkerConfig

clearSecondaryWorkerConfig

setSecondaryWorkerConfig

Optional. The Compute Engine config settings for a cluster's secondary worker instances.

Parameter
Name Description
var InstanceGroupConfig
Returns
Type Description
$this
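The master, worker, and secondary worker fields all take an `InstanceGroupConfig`, and each setter returns `$this`. A fluent configuration of the three groups might look like the sketch below, which assumes the `google/cloud-dataproc` package is installed through Composer. The instance counts and machine type are placeholders.

```php
<?php
// Assumption: google/cloud-dataproc is installed and Composer's autoloader
// lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Dataproc\V1\ClusterConfig;
use Google\Cloud\Dataproc\V1\InstanceGroupConfig;

// Instance counts and machine type below are placeholder values.
$master = (new InstanceGroupConfig())
    ->setNumInstances(1)
    ->setMachineTypeUri('n1-standard-4');

$workers = (new InstanceGroupConfig())
    ->setNumInstances(2)
    ->setMachineTypeUri('n1-standard-4');

$secondaryWorkers = (new InstanceGroupConfig())
    ->setNumInstances(2);

// Chain the three setters on a fresh ClusterConfig.
$config = (new ClusterConfig())
    ->setMasterConfig($master)
    ->setWorkerConfig($workers)
    ->setSecondaryWorkerConfig($secondaryWorkers);

echo $config->getMasterConfig()->getNumInstances(), PHP_EOL;
echo $config->getSecondaryWorkerConfig()->getNumInstances(), PHP_EOL;
```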

getSoftwareConfig

Optional. The config settings for cluster software.

Returns
Type Description
SoftwareConfig|null

hasSoftwareConfig

clearSoftwareConfig

setSoftwareConfig

Optional. The config settings for cluster software.

Parameter
Name Description
var SoftwareConfig
Returns
Type Description
$this

getInitializationActions

Optional. Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes.

You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget):

    ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
    if [[ "${ROLE}" == 'Master' ]]; then
      ... master specific actions ...
    else
      ... worker specific actions ...
    fi

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setInitializationActions

Optional. Commands to execute on each node after config is completed. By default, executables are run on master and all worker nodes.

You can test a node's role metadata to run an executable on a master or worker node, as shown below using curl (you can also use wget):

    ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
    if [[ "${ROLE}" == 'Master' ]]; then
      ... master specific actions ...
    else
      ... worker specific actions ...
    fi

Parameter
Name Description
var array<NodeInitializationAction>
Returns
Type Description
$this
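The setter accepts a plain PHP array, while the getter returns a `RepeatedField` that can be counted and iterated. The following sketch shows one way to register a single initialization action, assuming the `google/cloud-dataproc` package is installed through Composer. The script URI and the ten-minute timeout are placeholders.

```php
<?php
// Assumption: google/cloud-dataproc is installed and Composer's autoloader
// lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Dataproc\V1\ClusterConfig;
use Google\Cloud\Dataproc\V1\NodeInitializationAction;
use Google\Protobuf\Duration;

// Placeholder script location and timeout.
$action = new NodeInitializationAction([
    'executable_file' => 'gs://example-bucket/init/setup.sh',
    'execution_timeout' => new Duration(['seconds' => 600]),
]);

// Pass a PHP array to the setter.
$config = (new ClusterConfig())->setInitializationActions([$action]);

// The getter returns a RepeatedField, which supports count() and foreach.
$actions = $config->getInitializationActions();
echo count($actions), PHP_EOL;
foreach ($actions as $item) {
    echo $item->getExecutableFile(), PHP_EOL;
}
```

The script referenced by `executable_file` is where a role test such as the curl check described above would live.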

getEncryptionConfig

Optional. Encryption settings for the cluster.

Returns
Type Description
EncryptionConfig|null

hasEncryptionConfig

clearEncryptionConfig

setEncryptionConfig

Optional. Encryption settings for the cluster.

Parameter
Name Description
var EncryptionConfig
Returns
Type Description
$this

getAutoscalingConfig

Optional. Autoscaling config for the policy associated with the cluster.

The cluster does not autoscale if this field is unset.

Returns
Type Description
AutoscalingConfig|null

hasAutoscalingConfig

clearAutoscalingConfig

setAutoscalingConfig

Optional. Autoscaling config for the policy associated with the cluster.

The cluster does not autoscale if this field is unset.

Parameter
Name Description
var AutoscalingConfig
Returns
Type Description
$this
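Because an unset field means no autoscaling, the presence check doubles as an "is autoscaling configured" test. The sketch below attaches a policy by URI, assuming the `google/cloud-dataproc` package is installed through Composer. The project, region, and policy ID in the URI are placeholders.

```php
<?php
// Assumption: google/cloud-dataproc is installed and Composer's autoloader
// lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Dataproc\V1\AutoscalingConfig;
use Google\Cloud\Dataproc\V1\ClusterConfig;

$config = new ClusterConfig();

// Unset by default, so a cluster created from this config would not autoscale.
$autoscalesByDefault = $config->hasAutoscalingConfig();

// Placeholder policy resource name.
$policyUri = 'projects/example-project/locations/us-central1/autoscalingPolicies/example-policy';
$config->setAutoscalingConfig(new AutoscalingConfig(['policy_uri' => $policyUri]));

if ($config->hasAutoscalingConfig()) {
    echo $config->getAutoscalingConfig()->getPolicyUri(), PHP_EOL;
}
```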

getSecurityConfig

Optional. Security settings for the cluster.

Returns
Type Description
SecurityConfig|null

hasSecurityConfig

clearSecurityConfig

setSecurityConfig

Optional. Security settings for the cluster.

Parameter
Name Description
var SecurityConfig
Returns
Type Description
$this

getLifecycleConfig

Optional. Lifecycle setting for the cluster.

Returns
Type Description
LifecycleConfig|null

hasLifecycleConfig

clearLifecycleConfig

setLifecycleConfig

Optional. Lifecycle setting for the cluster.

Parameter
Name Description
var LifecycleConfig
Returns
Type Description
$this

getEndpointConfig

Optional. Port/endpoint configuration for this cluster.

Returns
Type Description
EndpointConfig|null

hasEndpointConfig

clearEndpointConfig

setEndpointConfig

Optional. Port/endpoint configuration for this cluster.

Parameter
Name Description
var EndpointConfig
Returns
Type Description
$this

getMetastoreConfig

Optional. Metastore configuration.

Returns
Type Description
MetastoreConfig|null

hasMetastoreConfig

clearMetastoreConfig

setMetastoreConfig

Optional. Metastore configuration.

Parameter
Name Description
var MetastoreConfig
Returns
Type Description
$this

getDataprocMetricConfig

Optional. The config for Dataproc metrics.

Returns
Type Description
DataprocMetricConfig|null

hasDataprocMetricConfig

clearDataprocMetricConfig

setDataprocMetricConfig

Optional. The config for Dataproc metrics.

Parameter
Name Description
var DataprocMetricConfig
Returns
Type Description
$this

getAuxiliaryNodeGroups

Optional. The node group settings.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setAuxiliaryNodeGroups

Optional. The node group settings.

Parameter
Name Description
var array<AuxiliaryNodeGroup>
Returns
Type Description
$this