DataflowPipelineWorkerPoolOptions (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

com.google.cloud.dataflow.sdk.options

Interface DataflowPipelineWorkerPoolOptions

    • Method Detail

      • getNumWorkers

        int getNumWorkers()
Number of workers to use when executing the Dataflow job. Note that selecting an autoscaling algorithm other than NONE will affect the size of the worker pool. If left unspecified, the Dataflow service will determine the number of workers.
      • setNumWorkers

        void setNumWorkers(int value)
      • getAutoscalingAlgorithm

        DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType getAutoscalingAlgorithm()
The autoscaling algorithm to use for the worker pool.
        • NONE: does not change the size of the worker pool.
        • BASIC: autoscale the worker pool size up to maxNumWorkers until the job completes.
        • THROUGHPUT_BASED: autoscale the worker pool based on throughput (up to maxNumWorkers).
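As a sketch of how these sizing options fit together, the snippet below builds the command-line flags that PipelineOptionsFactory would parse into this interface. The flag names follow from the getter names (getNumWorkers → --numWorkers, and so on); the helper class and its validation are illustrative, not part of the SDK.

```java
// Illustrative only: builds the worker-pool flags that
// PipelineOptionsFactory.fromArgs(args) would parse into
// DataflowPipelineWorkerPoolOptions. Flag names are derived
// from the getter names (getNumWorkers -> --numWorkers).
public class WorkerPoolFlags {
    static String[] workerPoolArgs(int numWorkers, int maxNumWorkers, String algorithm) {
        // maxNumWorkers caps the pool for the lifetime of the job,
        // so a starting size above it would be inconsistent.
        if (maxNumWorkers < numWorkers) {
            throw new IllegalArgumentException("maxNumWorkers must be >= numWorkers");
        }
        return new String[] {
            "--numWorkers=" + numWorkers,
            "--maxNumWorkers=" + maxNumWorkers,
            "--autoscalingAlgorithm=" + algorithm  // NONE, BASIC, or THROUGHPUT_BASED
        };
    }

    public static void main(String[] args) {
        for (String flag : workerPoolArgs(3, 10, "THROUGHPUT_BASED")) {
            System.out.println(flag);
        }
    }
}
```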
      • getMaxNumWorkers

        int getMaxNumWorkers()
The maximum number of workers to use for the worker pool. This option limits the size of the worker pool for the lifetime of the job, including pipeline updates. If left unspecified, the Dataflow service will compute a ceiling.
      • setMaxNumWorkers

        void setMaxNumWorkers(int value)
      • getDiskSizeGb

        int getDiskSizeGb()
        Remote worker disk size, in gigabytes, or 0 to use the default size.
      • setDiskSizeGb

        void setDiskSizeGb(int value)
      • setWorkerHarnessContainerImage

        void setWorkerHarnessContainerImage(String value)
      • getNetwork

        String getNetwork()
        GCE network for launching workers.

        Default is up to the Dataflow service.

      • setNetwork

        void setNetwork(String value)
      • getSubnetwork

        String getSubnetwork()
        GCE subnetwork for launching workers.

        Default is up to the Dataflow service. Expected format is regions/REGION/subnetworks/SUBNETWORK.

        You may also need to specify the network option.
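A small sketch of building the subnetwork string in the regions/REGION/subnetworks/SUBNETWORK shape this option expects. The helper method is an assumption for illustration, not SDK code.

```java
// Illustrative helper: formats the value passed to setSubnetwork(String)
// in the documented "regions/REGION/subnetworks/SUBNETWORK" shape.
public class SubnetworkPath {
    static String subnetworkPath(String region, String subnetwork) {
        return String.format("regions/%s/subnetworks/%s", region, subnetwork);
    }

    public static void main(String[] args) {
        // Region and subnetwork names here are placeholders.
        System.out.println(subnetworkPath("us-central1", "my-subnet"));
    }
}
```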

      • setSubnetwork

        void setSubnetwork(String value)
      • setZone

        void setZone(String value)
      • getWorkerMachineType

        String getWorkerMachineType()
Machine type to use for Dataflow worker VMs.

        See GCE machine types for a list of valid options.

        If unset, the Dataflow service will choose a reasonable default.

      • setWorkerMachineType

        void setWorkerMachineType(String value)
      • getFilesToStage

        List<String> getFilesToStage()
        List of local files to make available to workers.

        Files are placed on the worker's classpath.

        The default value is the list of jars from the main program's classpath.
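To mirror the documented default (the jars on the main program's classpath), a staging list could be derived from the JVM classpath as below. The helper itself is an assumption for illustration, not SDK code.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch: derives a filesToStage-style list from the JVM classpath,
// keeping only entries that exist on the local filesystem. The result
// is the kind of value one might pass to setFilesToStage(List<String>).
public class StagedFiles {
    static List<String> classpathFiles() {
        List<String> files = new ArrayList<>();
        for (String entry : System.getProperty("java.class.path").split(File.pathSeparator)) {
            if (new File(entry).exists()) {
                files.add(entry);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(classpathFiles());
    }
}
```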

      • setFilesToStage

        void setFilesToStage(List<String> value)
      • getWorkerDiskType

        String getWorkerDiskType()
        Specifies what type of persistent disk should be used. The value should be a full or partial URL of a disk type resource, e.g., zones/us-central1-f/disks/pd-standard. For more information, see the API reference documentation for DiskTypes.
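A minimal sketch of assembling the partial disk-type URL in the shape the description shows (e.g., zones/us-central1-f/disks/pd-standard). The helper name and method are assumptions; consult the DiskTypes API reference for the authoritative resource format.

```java
// Illustrative: builds a partial disk-type URL for setWorkerDiskType(String),
// following the example shown in the option's description.
public class DiskTypePath {
    static String diskTypePath(String zone, String diskType) {
        return String.format("zones/%s/disks/%s", zone, diskType);
    }

    public static void main(String[] args) {
        System.out.println(diskTypePath("us-central1-f", "pd-standard"));
    }
}
```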
      • setWorkerDiskType

        void setWorkerDiskType(String value)
      • getUsePublicIps

        @Experimental
        @Nullable
        Boolean getUsePublicIps()
        Specifies whether worker pools should be started with public IP addresses.

        WARNING: This feature is experimental. You must be whitelisted to use it.

