DataprocAutoscalingPolicy


| Property | Value |
| --- | --- |
| Google Cloud Service Name | Dataproc |
| Google Cloud Service Documentation | /dataproc/docs/ |
| Google Cloud REST Resource Name | v1.projects.locations.autoscalingPolicies |
| Google Cloud REST Resource Documentation | /dataproc/docs/reference/rest/v1/projects.locations.autoscalingPolicies |
| Config Connector Resource Short Names | gcpdataprocautoscalingpolicy, gcpdataprocautoscalingpolicies, dataprocautoscalingpolicy |
| Config Connector Service Name | dataproc.googleapis.com |
| Config Connector Resource Fully Qualified Name | dataprocautoscalingpolicies.dataproc.cnrm.cloud.google.com |
| Can Be Referenced by IAMPolicy/IAMPolicyMember | No |
| Config Connector Default Average Reconcile Interval In Seconds | 600 |

Custom Resource Definition Properties

Spec

Schema

basicAlgorithm:
  cooldownPeriod: string
  yarnConfig:
    gracefulDecommissionTimeout: string
    scaleDownFactor: float
    scaleDownMinWorkerFraction: float
    scaleUpFactor: float
    scaleUpMinWorkerFraction: float
location: string
projectRef:
  external: string
  name: string
  namespace: string
resourceID: string
secondaryWorkerConfig:
  maxInstances: integer
  minInstances: integer
  weight: integer
workerConfig:
  maxInstances: integer
  minInstances: integer
  weight: integer

Fields

basicAlgorithm

Required

object

basicAlgorithm.cooldownPeriod

Optional

string

Optional. Duration between scaling events. A scaling period starts after the update operation from the previous event has completed. Bounds: [2m, 1d]. Default: 2m.

basicAlgorithm.yarnConfig

Required

object

Required. YARN autoscaling configuration.

basicAlgorithm.yarnConfig.gracefulDecommissionTimeout

Required

string

Required. Timeout for YARN graceful decommissioning of Node Managers. Specifies the duration to wait for jobs to complete before forcefully removing workers (and potentially interrupting jobs). Only applicable to downscaling operations.
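As a minimal sketch of how the two duration fields fit together, the fragment below sets both inside `spec.basicAlgorithm`. The values are illustrative assumptions, not recommendations, and the seconds-suffixed string format is assumed from the `"60s"` value used in the sample at the end of this page.

```yaml
# Fragment of spec; all values are illustrative assumptions.
basicAlgorithm:
  # Wait 4 minutes after one scaling update completes before the next event.
  cooldownPeriod: "240s"
  yarnConfig:
    # Give running jobs up to 1 hour to finish before workers are
    # forcefully removed during a scale-down.
    gracefulDecommissionTimeout: "3600s"
    scaleDownFactor: 0.5
    scaleUpFactor: 0.5
```

A longer timeout reduces the chance of interrupting jobs, at the cost of slower scale-down.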

basicAlgorithm.yarnConfig.scaleDownFactor

Required

float

Required. Fraction of average YARN pending memory in the last cooldown period for which to remove workers. A scale-down factor of 1 will result in scaling down so that there is no available memory remaining after the update (more aggressive scaling). A scale-down factor of 0 disables removing workers, which can be beneficial for autoscaling a single job. See the Dataproc autoscaling documentation for details.

basicAlgorithm.yarnConfig.scaleDownMinWorkerFraction

Optional

float

Optional. Minimum scale-down threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2-worker scale-down for the cluster to scale. A threshold of 0 means the autoscaler will scale down on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

basicAlgorithm.yarnConfig.scaleUpFactor

Required

float

Required. Fraction of average YARN pending memory in the last cooldown period for which to add workers. A scale-up factor of 1.0 will result in scaling up so that there is no pending memory remaining after the update (more aggressive scaling). A scale-up factor closer to 0 will result in a smaller magnitude of scaling up (less aggressive scaling). See the Dataproc autoscaling documentation for details.

basicAlgorithm.yarnConfig.scaleUpMinWorkerFraction

Optional

float

Optional. Minimum scale-up threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2-worker scale-up for the cluster to scale. A threshold of 0 means the autoscaler will scale up on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.
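The interaction between the scaling factors and the minimum-fraction thresholds can be sketched as follows. The comments reuse the 20-worker example from the field descriptions; the specific values are assumptions chosen for illustration.

```yaml
# Fragment of spec.basicAlgorithm; values are illustrative assumptions.
yarnConfig:
  gracefulDecommissionTimeout: "60s"
  # 1.0 scales up until no pending YARN memory remains (most aggressive).
  scaleUpFactor: 1.0
  # A value below 1.0 scales down less aggressively than 1.0 would.
  scaleDownFactor: 0.5
  # On a 20-worker cluster: 0.1 * 20 = 2, so the autoscaler must recommend
  # adding at least 2 workers before the cluster scales up.
  scaleUpMinWorkerFraction: 0.1
  # 0.0 means any recommended scale-down is applied.
  scaleDownMinWorkerFraction: 0.0
```

Raising a threshold filters out small recommendations, which can reduce churn on larger clusters.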

location

Required

string

Immutable. The location for the resource.

projectRef

Optional

object

Immutable. The Project that this resource belongs to.

projectRef.external

Optional

string

The project for the resource. Allowed value: The Google Cloud resource name of a `Project` resource (format: `projects/{{name}}`).

projectRef.name

Optional

string

Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

projectRef.namespace

Optional

string

Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
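A brief sketch of the two ways to populate `projectRef`. The project ID `my-project-id`, the resource name `my-project`, and the namespace `my-namespace` are hypothetical. The assumption that only one of `external` or `name` should be set follows the usual Config Connector reference convention and is not stated on this page.

```yaml
# Option 1 (fragment of spec): point at a project by its Google Cloud
# resource name, using the projects/{{name}} format.
projectRef:
  external: "projects/my-project-id"  # hypothetical project ID
---
# Option 2 (fragment of spec): point at a Project resource that is
# managed in the Kubernetes cluster.
projectRef:
  name: my-project         # hypothetical Project resource name
  namespace: my-namespace  # hypothetical; optional
```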

resourceID

Optional

string

Immutable. Optional. The name of the resource. Used for creation and acquisition. When unset, the value of `metadata.name` is used as the default.
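To illustrate the difference between `metadata.name` and `resourceID`, the sketch below gives the Kubernetes object one name and the Google Cloud policy another. Both names are hypothetical; the remaining fields are copied from the sample at the end of this page.

```yaml
apiVersion: dataproc.cnrm.cloud.google.com/v1beta1
kind: DataprocAutoscalingPolicy
metadata:
  name: team-autoscaling-policy     # Kubernetes object name (hypothetical)
spec:
  resourceID: existing-policy-id    # Google Cloud policy name (hypothetical)
  location: "us-central1"
  workerConfig:
    maxInstances: 2
  basicAlgorithm:
    yarnConfig:
      gracefulDecommissionTimeout: "60s"
      scaleDownFactor: 0.5
      scaleUpFactor: 0.5
```

If `resourceID` were omitted here, the policy name would default to `team-autoscaling-policy`.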

secondaryWorkerConfig

Optional

object

Optional. Describes how the autoscaler will operate for secondary workers.

secondaryWorkerConfig.maxInstances

Optional

integer

Optional. Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Primary workers - Bounds: [min_instances, ). Secondary workers - Bounds: [min_instances, ). Default: 0.

secondaryWorkerConfig.minInstances

Optional

integer

Optional. Minimum number of instances for this group. Secondary workers - Bounds: [0, max_instances]. Default: 0.

secondaryWorkerConfig.weight

Optional

integer

Optional. Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2 and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if `max_instances` for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example, if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.

workerConfig

Required

object

Required. Describes how the autoscaler will operate for primary workers.

workerConfig.maxInstances

Required

integer

Required. Maximum number of instances for this group. Required for primary workers. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Primary workers - Bounds: [min_instances, ). Secondary workers - Bounds: [min_instances, ). Default: 0.

workerConfig.minInstances

Optional

integer

Optional. Minimum number of instances for this group. Primary workers - Bounds: [2, max_instances]. Default: 2.

workerConfig.weight

Optional

integer

Optional. Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2 and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if `max_instances` for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example, if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.
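The 2:1 weighting example from the `weight` descriptions might be written as the fragment below. The instance counts are illustrative assumptions, chosen so that the size bounds do not prevent the cluster from approaching the requested ratio.

```yaml
# Fragment of spec; instance counts are illustrative assumptions.
workerConfig:
  minInstances: 2
  maxInstances: 20
  weight: 2   # roughly 2 primary workers ...
secondaryWorkerConfig:
  minInstances: 0
  maxInstances: 10
  weight: 1   # ... for each secondary worker
```

Setting `weight` on only one of the two groups would give the other group zero weight, as described above, so both are set explicitly here.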

Status

Schema

conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
observedGeneration: integer

Fields

conditions

list (object)

Conditions represent the latest available observation of the resource's current state.

conditions[]

object

conditions[].lastTransitionTime

string

Last time the condition transitioned from one status to another.

conditions[].message

string

Human-readable message indicating details about last transition.

conditions[].reason

string

Unique, one-word, CamelCase reason for the condition's last transition.

conditions[].status

string

Status is the status of the condition. Can be True, False, Unknown.

conditions[].type

string

Type is the type of the condition.

observedGeneration

integer

ObservedGeneration is the generation of the resource that was most recently observed by the Config Connector controller. If this equals metadata.generation, the reported status reflects the most recent desired state of the resource.
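For orientation, a populated status block might look roughly like the sketch below. Every value is made up for illustration; in particular, the `Ready` type, the `UpToDate` reason, and the message text are assumptions about what the controller typically reports, not values taken from this page.

```yaml
# Illustrative status only; all values are hypothetical.
status:
  conditions:
  - lastTransitionTime: "2024-01-01T00:00:00Z"
    message: The resource is up to date
    reason: UpToDate
    status: "True"
    type: Ready
  # Matches metadata.generation once the latest spec has been reconciled.
  observedGeneration: 1
```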

Sample YAML(s)

Typical Use Case

# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: dataproc.cnrm.cloud.google.com/v1beta1
kind: DataprocAutoscalingPolicy
metadata:
  name: dataprocautoscalingpolicy-sample
spec:
  location: "us-central1"
  workerConfig:
    maxInstances: 2
  secondaryWorkerConfig:
    maxInstances: 2
  basicAlgorithm:
    yarnConfig:
      gracefulDecommissionTimeout: "60s"
      scaleDownFactor: 0.5
      scaleUpFactor: 0.5
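
For comparison with the minimal sample above, the following sketch also sets the optional fields described in the Spec section. The resource name and all numeric and duration values are assumptions chosen for illustration, not recommended settings.

```yaml
apiVersion: dataproc.cnrm.cloud.google.com/v1beta1
kind: DataprocAutoscalingPolicy
metadata:
  name: dataprocautoscalingpolicy-full   # hypothetical name
spec:
  location: "us-central1"
  workerConfig:
    minInstances: 2
    maxInstances: 10
    weight: 1
  secondaryWorkerConfig:
    minInstances: 0
    maxInstances: 10
    weight: 1
  basicAlgorithm:
    cooldownPeriod: "240s"
    yarnConfig:
      gracefulDecommissionTimeout: "3600s"
      scaleDownFactor: 0.5
      scaleDownMinWorkerFraction: 0.0
      scaleUpFactor: 0.5
      scaleUpMinWorkerFraction: 0.0
```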