Method: projects.locations.flexTemplates.launch

Launch a job with a FlexTemplate.

HTTP request

POST https://dataflow.googleapis.com/v1b3/projects/{projectId}/locations/{location}/flexTemplates:launch

The URL uses gRPC Transcoding syntax.

Path parameters

Parameters
projectId

string

Required. The ID of the Cloud Platform project that the job belongs to.

location

string

Required. The regional endpoint to which to direct the request. E.g., us-central1, us-west1.

Request body

The request body contains data with the following structure:

JSON representation
{
  "launchParameter": {
    object (LaunchFlexTemplateParameter)
  },
  "validateOnly": boolean
}
Fields
launchParameter

object (LaunchFlexTemplateParameter)

Required. Parameter to launch a job from a Flex Template.

validateOnly

boolean

If true, the request is validated but not actually executed. Defaults to false.
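
For illustration, a minimal request body could look like the sketch below, written as a Python dict so it can be reused in the REST call sketched after the authorization scopes. The bucket path, job name, and parameter value are placeholder assumptions, not values defined by this reference.

# Hypothetical request body for flexTemplates:launch. The gs:// path and
# names are placeholders; validateOnly=True turns the call into a dry run.
request_body = {
    "launchParameter": {
        "jobName": "example-flex-job",
        "containerSpecGcsPath": "gs://example-bucket/templates/spec.json",
        "parameters": {"numWorkers": "5"},
    },
    "validateOnly": True,
}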

Response body

Response to the request to launch a job from a Flex Template.

If successful, the response body contains data with the following structure:

JSON representation
{
  "job": {
    object (Job)
  }
}
Fields
job

object (Job)

The job that was launched, if the request was not a dry run and the job was successfully launched.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/compute
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
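
As a minimal sketch, the request can be sent with the google-auth Python library using the cloud-platform scope listed above. The project ID and region are placeholder assumptions, and request_body is the dict sketched under the request body section.

# Minimal sketch: authenticate with Application Default Credentials and POST
# the launch request. The project ID and location below are placeholders.
import google.auth
from google.auth.transport.requests import AuthorizedSession

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
session = AuthorizedSession(credentials)

project_id = "my-project"   # placeholder project ID
location = "us-central1"    # placeholder regional endpoint
url = (
    "https://dataflow.googleapis.com/v1b3/"
    f"projects/{project_id}/locations/{location}/flexTemplates:launch"
)

response = session.post(url, json=request_body)
response.raise_for_status()
job = response.json().get("job")  # absent when validateOnly was true

If the call succeeds and validateOnly was false, the job field of the response carries the launched Job object described under the response body.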

LaunchFlexTemplateParameter

Launch FlexTemplate Parameter.

JSON representation
{
  "jobName": string,
  "parameters": {
    string: string,
    ...
  },
  "launchOptions": {
    string: string,
    ...
  },
  "environment": {
    object (FlexTemplateRuntimeEnvironment)
  },
  "update": boolean,
  "transformNameMappings": {
    string: string,
    ...
  },

  // Union field template can be only one of the following:
  "containerSpecGcsPath": string
  // End of list of possible types for union field template.
}
Fields
jobName

string

Required. The job name to use for the created job. For an update job request, the job name should be the same as the existing running job.

parameters

map (key: string, value: string)

The parameters for FlexTemplate. Ex. {"numWorkers":"5"}

launchOptions

map (key: string, value: string)

Launch options for this flex template job. This is a common set of options across languages and templates. This should not be used to pass job parameters.

environment

object (FlexTemplateRuntimeEnvironment)

The runtime environment for the Flex Template job.

update

boolean

Set this to true if you are sending a request to update a running streaming job. When set, the job name should be the same as the running job.

transformNameMappings

map (key: string, value: string)

Use this to pass transformNameMappings for streaming update jobs. Example: {"oldTransformName":"newTransformName",...}

Union field template. Launch Mechanism. template can be only one of the following:
containerSpecGcsPath

string

Cloud Storage path to a file containing a JSON-serialized ContainerSpec.
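
As a hedged illustration of how the update fields combine, a launch parameter for updating a running streaming job might be shaped as follows; the job name, template path, and transform names are placeholders.

# Hypothetical LaunchFlexTemplateParameter for a streaming update request.
# jobName must match the running job; all values here are placeholders.
update_launch_parameter = {
    "jobName": "example-streaming-job",  # same name as the running job
    "containerSpecGcsPath": "gs://example-bucket/templates/spec.json",
    "update": True,
    "transformNameMappings": {"oldTransformName": "newTransformName"},
}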

FlexTemplateRuntimeEnvironment

The environment values to be set at runtime for a Flex Template.

JSON representation
{
  "numWorkers": integer,
  "maxWorkers": integer,
  "zone": string,
  "serviceAccountEmail": string,
  "tempLocation": string,
  "machineType": string,
  "additionalExperiments": [
    string
  ],
  "network": string,
  "subnetwork": string,
  "additionalUserLabels": {
    string: string,
    ...
  },
  "kmsKeyName": string,
  "ipConfiguration": enum (WorkerIPAddressConfiguration),
  "workerRegion": string,
  "workerZone": string,
  "enableStreamingEngine": boolean,
  "flexrsGoal": enum (FlexResourceSchedulingGoal),
  "stagingLocation": string,
  "sdkContainerImage": string,
  "diskSizeGb": integer,
  "autoscalingAlgorithm": enum (AutoscalingAlgorithm),
  "dumpHeapOnOom": boolean,
  "saveHeapDumpsToGcsPath": string,
  "launcherMachineType": string,
  "enableLauncherVmSerialPortLogging": boolean,
  "streamingMode": enum (StreamingMode)
}
Fields
numWorkers

integer

The initial number of Google Compute Engine instances for the job.

maxWorkers

integer

The maximum number of Google Compute Engine instances to be made available to your pipeline during execution, from 1 to 1000.

zone

string

The Compute Engine availability zone for launching worker instances to run your pipeline. In the future, workerZone will take precedence.

serviceAccountEmail

string

The email address of the service account to run the job as.

tempLocation

string

The Cloud Storage path to use for temporary files. Must be a valid Cloud Storage URL, beginning with gs://.

machineType

string

The machine type to use for the job. Defaults to the value from the template if not specified.

additionalExperiments[]

string

Additional experiment flags for the job.

network

string

Network to which VMs will be assigned. If empty or unspecified, the service will use the network "default".

subnetwork

string

Subnetwork to which VMs will be assigned, if desired. You can specify a subnetwork using either a complete URL or an abbreviated path. Expected to be of the form "https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK" or "regions/REGION/subnetworks/SUBNETWORK". If the subnetwork is located in a Shared VPC network, you must use the complete URL.

additionalUserLabels

map (key: string, value: string)

Additional user labels to be specified for the job. Keys and values must follow the restrictions specified in the labeling restrictions page. An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1kg", "count": "3" }.

kmsKeyName

string

Name for the Cloud KMS key for the job. Key format is: projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY

ipConfiguration

enum (WorkerIPAddressConfiguration)

Configuration for VM IPs.

workerRegion

string

The Compute Engine region (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1". Mutually exclusive with workerZone. If neither workerRegion nor workerZone is specified, default to the control plane's region.

workerZone

string

The Compute Engine zone (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1-a". Mutually exclusive with workerRegion. If neither workerRegion nor workerZone is specified, a zone in the control plane's region is chosen based on available capacity. If both workerZone and zone are set, workerZone takes precedence.

enableStreamingEngine

boolean

Whether to enable Streaming Engine for the job.

flexrsGoal

enum (FlexResourceSchedulingGoal)

Set FlexRS goal for the job. https://cloud.google.com/dataflow/docs/guides/flexrs

stagingLocation

string

The Cloud Storage path for staging local files. Must be a valid Cloud Storage URL, beginning with gs://.

sdkContainerImage

string

Docker registry location of the container image to use for the worker harness. Default is the container for the version of the SDK. Note that this field is only valid for portable pipelines.

diskSizeGb

integer

Worker disk size, in gigabytes.

autoscalingAlgorithm

enum (AutoscalingAlgorithm)

The algorithm to use for autoscaling.

dumpHeapOnOom

boolean

If true, when processing time is spent almost entirely on garbage collection (GC), saves a heap dump before ending the thread or process. If false, ends the thread or process without saving a heap dump. Does not save a heap dump when the Java Virtual Machine (JVM) has an out of memory error during processing. The location of the heap file is either echoed back to the user, or the user is given the opportunity to download the heap file.

saveHeapDumpsToGcsPath

string

Cloud Storage bucket (directory) to upload heap dumps to. Enabling this field implies that dumpHeapOnOom is set to true.

launcherMachineType

string

The machine type to use for launching the job. The default is n1-standard-1.

enableLauncherVmSerialPortLogging

boolean

If true, serial port logging will be enabled for the launcher VM.

streamingMode

enum (StreamingMode)

Optional. Specifies the Streaming Engine message processing guarantees. Reduces cost and latency but might result in duplicate messages committed to storage. Designed to run simple mapping streaming ETL jobs at the lowest cost. For example, Change Data Capture (CDC) to BigQuery is a canonical use case. For more information, see Set the pipeline streaming mode.
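
To show how these fields fit together, a runtime environment object might look like the sketch below; the bucket paths, machine type, worker counts, and labels are placeholder assumptions. The object would be passed as the environment field of LaunchFlexTemplateParameter.

# Hypothetical FlexTemplateRuntimeEnvironment; every value is a placeholder.
environment = {
    "numWorkers": 2,
    "maxWorkers": 10,
    "machineType": "n1-standard-2",
    "tempLocation": "gs://example-bucket/temp",
    "stagingLocation": "gs://example-bucket/staging",
    "workerRegion": "us-central1",
    "enableStreamingEngine": True,
    "additionalUserLabels": {"team": "data-eng"},
}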