RuntimeEnvironment

The environment values to set at runtime.

JSON representation
{
  "numWorkers": integer,
  "maxWorkers": integer,
  "zone": string,
  "serviceAccountEmail": string,
  "tempLocation": string,
  "bypassTempDirValidation": boolean,
  "machineType": string,
  "additionalExperiments": [
    string
  ],
  "network": string,
  "subnetwork": string,
  "additionalUserLabels": {
    string: string,
    ...
  },
  "kmsKeyName": string,
  "ipConfiguration": enum (WorkerIPAddressConfiguration),
  "workerRegion": string,
  "workerZone": string,
  "enableStreamingEngine": boolean,
  "diskSizeGb": integer,
  "streamingMode": enum (StreamingMode)
}
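As a rough illustration of the shape above, a RuntimeEnvironment object can be assembled as a plain dictionary and serialized to JSON. This is a minimal Python sketch: the bucket, experiment, and label values are hypothetical placeholders, and `to_json` is a helper introduced here for illustration, not part of any client library.

```python
import json

# All values below are hypothetical placeholders, not service defaults.
runtime_environment = {
    "numWorkers": 2,
    "maxWorkers": 10,
    "tempLocation": "gs://example-bucket/tmp",
    "machineType": "n1-standard-2",
    "additionalExperiments": ["example_experiment"],
    "additionalUserLabels": {"team": "data-eng"},
    "workerRegion": "us-west1",
    "enableStreamingEngine": False,
}


def to_json(environment: dict) -> str:
    """Serialize the environment, dropping optional fields left as None."""
    present = {key: value for key, value in environment.items() if value is not None}
    return json.dumps(present, indent=2)


print(to_json(runtime_environment))
```

Omitting unset optional fields, rather than sending them as null, leaves the choice of default to the service.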
Fields
numWorkers

integer

Optional. The initial number of Google Compute Engine instances for the job. The default value is 11.

maxWorkers

integer

Optional. The maximum number of Google Compute Engine instances to be made available to your pipeline during execution, from 1 to 1000. The default value is 1.
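The documented range for maxWorkers can be checked on the client before a request is sent. The following is a minimal sketch; `check_max_workers` is a hypothetical helper, and the service may apply additional limits (such as quota) that are not modeled here.

```python
MIN_WORKERS = 1
MAX_WORKERS = 1000


def check_max_workers(max_workers: int) -> int:
    """Return max_workers unchanged if it lies in the documented 1..1000 range."""
    if not MIN_WORKERS <= max_workers <= MAX_WORKERS:
        raise ValueError(
            f"maxWorkers must be between {MIN_WORKERS} and {MAX_WORKERS}, got {max_workers}"
        )
    return max_workers
```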

zone

string

Optional. The Compute Engine availability zone for launching worker instances to run your pipeline. In the future, workerZone will take precedence.

serviceAccountEmail

string

Optional. The email address of the service account to run the job as.

tempLocation

string

Required. The Cloud Storage path to use for temporary files. Must be a valid Cloud Storage URL, beginning with gs://.
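A simple client-side check for the gs:// requirement might look like the sketch below. It is a hypothetical helper that only inspects the string: it does not confirm that the bucket exists or is writable, and the service's own validation may be stricter.

```python
def is_valid_temp_location(path: str) -> bool:
    """Return True if path starts with gs:// and names a non-empty bucket."""
    prefix = "gs://"
    if not path.startswith(prefix):
        return False
    # The bucket name is everything between the prefix and the next slash.
    bucket = path[len(prefix):].split("/", 1)[0]
    return bool(bucket)
```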

bypassTempDirValidation

boolean

Optional. Whether to bypass the safety checks for the job's temporary directory. Use with caution.

machineType

string

Optional. The machine type to use for the job. Defaults to the value from the template if not specified.

additionalExperiments[]

string

Optional. Additional experiment flags for the job, specified with the --experiments option.

network

string

Optional. Network to which VMs will be assigned. If empty or unspecified, the service will use the network "default".

subnetwork

string

Optional. Subnetwork to which VMs will be assigned, if desired. You can specify a subnetwork using either a complete URL or an abbreviated path. Expected to be of the form "https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK" or "regions/REGION/subnetworks/SUBNETWORK". If the subnetwork is located in a Shared VPC network, you must use the complete URL.
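The two accepted forms can be produced from their parts. In this sketch, `subnetwork_reference` is a hypothetical helper: it returns the abbreviated path when no host project is given and the complete URL otherwise, which is the form needed for a Shared VPC network.

```python
from typing import Optional

COMPUTE_API_PREFIX = "https://www.googleapis.com/compute/v1"


def subnetwork_reference(
    region: str, subnetwork: str, host_project_id: Optional[str] = None
) -> str:
    """Build an abbreviated subnetwork path, or a complete URL if a host project is given."""
    short_path = f"regions/{region}/subnetworks/{subnetwork}"
    if host_project_id is None:
        return short_path
    return f"{COMPUTE_API_PREFIX}/projects/{host_project_id}/{short_path}"
```

For example, `subnetwork_reference("us-west1", "my-subnet")` yields `regions/us-west1/subnetworks/my-subnet`.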

additionalUserLabels

map (key: string, value: string)

Optional. Additional user labels to be specified for the job. Keys and values should follow the restrictions specified in the labeling restrictions page. An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1kg", "count": "3" }.

kmsKeyName

string

Optional. Name for the Cloud KMS key for the job. Key format is: projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
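The key name can be assembled from its components. This sketch assumes the standard Cloud KMS resource layout of project, location, key ring, and key; `kms_key_name` and the argument values in the usage line are hypothetical.

```python
def kms_key_name(project: str, location: str, key_ring: str, key: str) -> str:
    """Join the four components into a Cloud KMS crypto key resource name."""
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


print(kms_key_name("example-project", "us-west1", "example-ring", "example-key"))
```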

ipConfiguration

enum (WorkerIPAddressConfiguration)

Optional. Configuration for VM IPs.

workerRegion

string

Required. The Compute Engine region (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1". Mutually exclusive with workerZone. If neither workerRegion nor workerZone is specified, the service defaults to the control plane's region.

workerZone

string

Optional. The Compute Engine zone (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1-a". Mutually exclusive with workerRegion. If neither workerRegion nor workerZone is specified, a zone in the control plane's region is chosen based on available capacity. If both workerZone and zone are set, workerZone takes precedence.
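The rules for workerRegion, workerZone, and zone can be expressed as a small client-side check. This is a sketch only: both helpers are hypothetical, they operate on a plain dictionary, and they do not model the capacity-based zone selection that the service performs when neither field is set.

```python
from typing import Optional


def check_worker_location(environment: dict) -> None:
    """Raise ValueError if both workerRegion and workerZone are set."""
    if environment.get("workerRegion") and environment.get("workerZone"):
        raise ValueError("workerRegion and workerZone are mutually exclusive")


def effective_zone(environment: dict) -> Optional[str]:
    """Return the zone to use; workerZone takes precedence over zone."""
    return environment.get("workerZone") or environment.get("zone")
```

A result of `None` from `effective_zone` means no zone was requested explicitly, so the choice is left to the service.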

enableStreamingEngine

boolean

Optional. Whether to enable Streaming Engine for the job.

diskSizeGb

integer

Optional. The disk size, in gigabytes, to use on each remote Compute Engine worker instance.

streamingMode

enum (StreamingMode)

Optional. Specifies the Streaming Engine message processing guarantees. At-least-once mode reduces cost and latency but might result in duplicate messages committed to storage. It is designed to run simple mapping streaming ETL jobs at the lowest cost; for example, Change Data Capture (CDC) to BigQuery is a canonical use case. For more information, see "Set the pipeline streaming mode".