DataflowPipelineRunner (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

Class DataflowPipelineRunner

  • public class DataflowPipelineRunner
    extends PipelineRunner<DataflowPipelineJob>
    A PipelineRunner that executes the operations in the pipeline by first translating them to the Dataflow representation using the DataflowPipelineTranslator and then submitting them to a Dataflow service for execution.


    When reading from a Dataflow source or writing to a Dataflow sink using DataflowPipelineRunner, the Google cloudservices account and the Google compute engine service account of the GCP project running the Dataflow Job will need access to the corresponding source/sink.

    Please see Google Cloud Dataflow Security and Permissions for more details.

    • Field Detail


        public static final String BATCH_WORKER_HARNESS_CONTAINER_IMAGE
        See Also:
        Constant Field Values

        public static final String STREAMING_WORKER_HARNESS_CONTAINER_IMAGE
        See Also:
        Constant Field Values

        public static final String PROJECT_ID_REGEXP
        Project IDs must contain lowercase letters, digits, or dashes. IDs must start with a letter and may not end with a dash. This regex isn't exact - this allows for patterns that would be rejected by the service, but this is sufficient for basic validation of project IDs.
        See Also:
        Constant Field Values
    • Method Detail

      • fromOptions

        public static DataflowPipelineRunner fromOptions(PipelineOptions options)
        Construct a runner from the provided options.
        options - Properties that configure the runner.
        The newly created runner.
      • apply

        public <OutputT extends POutput,InputT extends PInput> OutputT apply(PTransform<InputT,OutputT> transform,
                                                                             InputT input)
        Applies the given transform to the input. For transforms with customized definitions for the Dataflow pipeline runner, the application is intercepted and modified here.
        apply in class PipelineRunner<DataflowPipelineJob>
      • getTranslator

        public DataflowPipelineTranslator getTranslator()
        Returns the DataflowPipelineTranslator associated with this object.
      • detectClassPathResourcesToStage

        protected static List<String> detectClassPathResourcesToStage(ClassLoader classLoader)
        Attempts to detect all the resources the class loader has access to. This does not recurse to class loader parents stopping it from pulling in resources from the system class loader.
        classLoader - The URLClassLoader to use to detect resources to stage.
        A list of absolute paths to the resources the class loader uses.
        IllegalArgumentException - If either the class loader is not a URLClassLoader or one of the resources the class loader exposes is not a file resource.