DataflowJob

Google Cloud Service Name: Cloud Dataflow
Google Cloud Service Documentation: /dataflow/docs/
Google Cloud REST Resource Name: v1b3.projects.jobs
Google Cloud REST Resource Documentation: /dataflow/docs/reference/rest/v1b3/projects.jobs
Config Connector Resource Short Names: gcpdataflowjob, gcpdataflowjobs, dataflowjob
Config Connector Service Name: dataflow.googleapis.com
Config Connector Resource Fully Qualified Name: dataflowjobs.dataflow.cnrm.cloud.google.com
Can Be Referenced by IAMPolicy/IAMPolicyMember: No

Custom Resource Definition Properties

Annotations

Fields
cnrm.cloud.google.com/on-delete
cnrm.cloud.google.com/project-id
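
Both annotations are set under metadata.annotations. The on-delete annotation controls what happens to the underlying Dataflow job when the Kubernetes resource is deleted (the samples below use "cancel"), and project-id overrides the project in which the job is created. A minimal sketch; the resource name, bucket, and project ID are placeholders:

apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  annotations:
    # Cancel the running Dataflow job when this resource is deleted.
    cnrm.cloud.google.com/on-delete: "cancel"
    # Create the job in this project rather than the namespace's default
    # project. "my-project-id" is a placeholder.
    cnrm.cloud.google.com/project-id: "my-project-id"
  name: dataflowjob-annotated
spec:
  tempGcsLocation: gs://my-bucket/tmp  # placeholder bucket
  templateGcsPath: gs://dataflow-templates/2020-02-03-01_RC00/Word_Count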

Spec

Schema

additionalExperiments:
- string
ipConfiguration: string
machineType: string
maxWorkers: integer
networkRef:
  external: string
  name: string
  namespace: string
parameters: {}
region: string
serviceAccountRef:
  external: string
  name: string
  namespace: string
subnetworkRef:
  external: string
  name: string
  namespace: string
tempGcsLocation: string
templateGcsPath: string
zone: string

Fields

additionalExperiments
    Optional. list (string).

additionalExperiments.[]
    Optional. string.

ipConfiguration
    Optional. string.

machineType
    Optional. string.

maxWorkers
    Optional. integer.

networkRef
    Optional. object.

networkRef.external
    Optional. string. The selfLink of a ComputeNetwork.

networkRef.name
    Optional. string. Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

networkRef.namespace
    Optional. string. Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/

parameters
    Optional. object.

region
    Optional. string.

serviceAccountRef
    Optional. object.

serviceAccountRef.external
    Optional. string. The email of an IAMServiceAccount.

serviceAccountRef.name
    Optional. string. Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

serviceAccountRef.namespace
    Optional. string. Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/

subnetworkRef
    Optional. object.

subnetworkRef.external
    Optional. string. The selfLink of a ComputeSubnetwork.

subnetworkRef.name
    Optional. string. Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

subnetworkRef.namespace
    Optional. string. Namespace of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/

tempGcsLocation
    Required. string.

templateGcsPath
    Required. string.

zone
    Optional. string.
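
The samples at the end of this page use only inline fields. Each of the three reference fields accepts either the name (and, if different, the namespace) of the corresponding Config Connector resource, or an external value: a ComputeNetwork or ComputeSubnetwork selfLink, or an IAMServiceAccount email. A sketch of a spec wired to referenced resources; every resource name below is hypothetical:

apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  name: dataflowjob-sample-refs
spec:
  tempGcsLocation: gs://my-bucket/tmp  # placeholder bucket
  templateGcsPath: gs://dataflow-templates/2020-02-03-01_RC00/Word_Count
  region: us-central1
  networkRef:
    name: my-network            # a ComputeNetwork in this namespace
  subnetworkRef:
    name: my-subnetwork         # a ComputeSubnetwork in this namespace
    # Alternatively, reference an unmanaged subnetwork by selfLink:
    # external: https://www.googleapis.com/compute/v1/projects/my-project-id/regions/us-central1/subnetworks/my-subnetwork
  serviceAccountRef:
    name: my-service-account    # an IAMServiceAccount in this namespace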

Status

Schema

conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
jobId: string
state: string
type: string

Fields

conditions
    list (object).

conditions.[]
    object.

conditions.[].lastTransitionTime
    string. Last time the condition transitioned from one status to another.

conditions.[].message
    string. Human-readable message indicating details about the last transition.

conditions.[].reason
    string. Unique, one-word, CamelCase reason for the condition's last transition.

conditions.[].status
    string. Status of the condition. Can be True, False, or Unknown.

conditions.[].type
    string. Type of the condition.

jobId
    string.

state
    string.

type
    string.
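
Once Config Connector has created the job, it fills in the status block. An illustrative example for a healthy batch job; the exact values (the timestamp, the Dataflow-assigned job ID, and the Dataflow API's JOB_STATE_* and JOB_TYPE_* values) will vary:

conditions:
- lastTransitionTime: "2020-01-01T00:00:00Z"
  message: The resource is up to date
  reason: UpToDate
  status: "True"
  type: Ready
jobId: 2020-01-01_00_00_00-123456789012345678
state: JOB_STATE_RUNNING
type: JOB_TYPE_BATCH

You can read the live status with kubectl get dataflowjob NAME -o yaml.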

Sample YAML(s)

Batch Dataflow Job

# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  annotations:
    cnrm.cloud.google.com/on-delete: "cancel"
  labels:
    label-one: "value-one"
  name: dataflowjob-sample-batch
spec:
  tempGcsLocation: gs://${PROJECT_ID?}-dataflowjob-dep-batch/tmp
  # This is a public, Google-maintained Dataflow job template for a batch job
  templateGcsPath: gs://dataflow-templates/2020-02-03-01_RC00/Word_Count
  parameters:
    # This is a public, Google-maintained text file
    inputFile: gs://dataflow-samples/shakespeare/various.txt
    output: gs://${PROJECT_ID?}-dataflowjob-dep-batch/output
  zone: us-central1-a
  machineType: "n1-standard-1"
  maxWorkers: 3
  ipConfiguration: "WORKER_IP_PUBLIC"
---
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  annotations:
    cnrm.cloud.google.com/force-destroy: "true"
  # StorageBucket names must be globally unique. Replace ${PROJECT_ID?} with your project ID.
  name: ${PROJECT_ID?}-dataflowjob-dep-batch
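
Before applying the batch sample, replace each ${PROJECT_ID?} placeholder with your project ID. The job reads a public input file maintained by Google and writes its output and temporary files to the StorageBucket defined alongside it; the force-destroy annotation allows that bucket to be deleted even while it still contains the job's files.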

Streaming Dataflow Job

# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
kind: DataflowJob
metadata:
  annotations:
    cnrm.cloud.google.com/on-delete: "cancel"
  labels:
    label-one: "value-one"
  name: dataflowjob-sample-streaming
spec:
  tempGcsLocation: gs://${PROJECT_ID?}-dataflowjob-dep-streaming/tmp
  # This is a public, Google-maintained Dataflow job template for a streaming job
  templateGcsPath: gs://dataflow-templates/2020-02-03-01_RC00/PubSub_to_BigQuery
  parameters:
    # Replace ${PROJECT_ID?} with your project ID.
    inputTopic: projects/${PROJECT_ID?}/topics/dataflowjob-dep-streaming
    outputTableSpec: ${PROJECT_ID?}:dataflowjobdepstreaming.dataflowjobdepstreaming
  zone: us-central1-a
  machineType: "n1-standard-1"
  maxWorkers: 3
  ipConfiguration: "WORKER_IP_PUBLIC"
---
apiVersion: bigquery.cnrm.cloud.google.com/v1beta1
kind: BigQueryDataset
metadata:
  name: dataflowjobdepstreaming
---
apiVersion: bigquery.cnrm.cloud.google.com/v1beta1
kind: BigQueryTable
metadata:
  name: dataflowjobdepstreaming
spec:
  datasetRef:
    name: dataflowjobdepstreaming
---
apiVersion: pubsub.cnrm.cloud.google.com/v1beta1
kind: PubSubTopic
metadata:
  name: dataflowjob-dep-streaming
---
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  annotations:
    cnrm.cloud.google.com/force-destroy: "true"
  # StorageBucket names must be globally unique. Replace ${PROJECT_ID?} with your project ID.
  name: ${PROJECT_ID?}-dataflowjob-dep-streaming
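
As with the batch sample, replace each ${PROJECT_ID?} placeholder before applying. The job's inputTopic parameter points at the PubSubTopic defined above, and its outputTableSpec points at the BigQueryTable, so the streaming job begins reading from the topic and writing rows to the table once all of the resources are ready.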