Google Cloud Dataflow SDK for Java, version 1.9.1
Package com.google.cloud.dataflow.sdk.runners
DirectPipelineRunner
and
DataflowPipelineRunner
.See: Description
-
Interface Summary Interface Description DataflowPipelineTranslator.TransformTranslator<TransformT extends PTransform> ADataflowPipelineTranslator.TransformTranslator
knows how to translate a particular subclass ofPTransform
for the Cloud Dataflow service.DataflowPipelineTranslator.TranslationContext The interface provided to registered callbacks for interacting with theDataflowPipelineRunner
, including reading and writing the values ofPCollection
s and side inputs (PCollectionView
s).DirectPipelineRunner.EvaluationContext The interface provided to registered callbacks for interacting with theDirectPipelineRunner
, including reading and writing the values ofPCollection
s andPCollectionView
s.DirectPipelineRunner.EvaluationResults The interface provided to registered callbacks for interacting with theDirectPipelineRunner
, including reading and writing the values ofPCollection
s andPCollectionView
s.DirectPipelineRunner.TransformEvaluator<TransformT extends PTransform> An evaluator of a PTransform.PipelineRunnerRegistrar PipelineRunner
creators have the ability to automatically have theirPipelineRunner
registered with this SDK by creating aServiceLoader
entry and a concrete implementation of this interface. -
Class Summary Class Description AggregatorPipelineExtractor RetrievesAggregators
at eachParDo
and returns aMap
ofAggregator
to thePTransforms
in which it is present.AggregatorValues<T> A collection of values associated with anAggregator
.BlockingDataflowPipelineRunner APipelineRunner
that's likeDataflowPipelineRunner
but that waits for the launched job to finish.DataflowPipeline DataflowPipelineJob A DataflowPipelineJob represents a job submitted to Dataflow usingDataflowPipelineRunner
.DataflowPipelineRegistrar DataflowPipelineRegistrar.Options Register theDataflowPipelineOptions
andBlockingDataflowPipelineOptions
.DataflowPipelineRegistrar.Runner Register theDataflowPipelineRunner
andBlockingDataflowPipelineRunner
.DataflowPipelineRunner APipelineRunner
that executes the operations in the pipeline by first translating them to the Dataflow representation using theDataflowPipelineTranslator
and then submitting them to a Dataflow service for execution.DataflowPipelineRunner.BatchBigQueryIONativeReadTranslator Implements BigQueryIO Read translation for the Dataflow backend.DataflowPipelineRunnerHooks An instance of this class can be passed to theDataflowPipelineRunner
to add user defined hooks to be invoked at various times during pipeline execution.DataflowPipelineTranslator DataflowPipelineTranslator
knows how to translatePipeline
objects into Cloud Dataflow Service APIJob
s.DataflowPipelineTranslator.JobSpecification The result of a job translation.DirectPipeline ADirectPipeline
is aPipeline
that returnsDirectPipelineRunner.EvaluationResults
when it isPipeline.run()
.DirectPipelineRegistrar DirectPipelineRegistrar.Options Register theDirectPipelineOptions
.DirectPipelineRegistrar.Runner Register theDirectPipelineRunner
.DirectPipelineRunner Executes the operations in the pipeline directly, in this process, without any optimization.DirectPipelineRunner.TestCombineDoFn<K,InputT,AccumT,OutputT> The implementation may split theCombine.KeyedCombineFn
into ADD, MERGE and EXTRACT phases ( seecom.google.cloud.dataflow.sdk.runners.worker.CombineValuesFn
).DirectPipelineRunner.ValueWithMetadata<V> An immutable (value, timestamp) pair, along with other metadata necessary for the implementation ofDirectPipelineRunner
.PipelineRunner<ResultT extends PipelineResult> APipelineRunner
can execute, translate, or otherwise process aPipeline
.RecordingPipelineVisitor Provides a simplePipeline.PipelineVisitor
that records the transformation tree.TemplatingDataflowPipelineRunner APipelineRunner
that's likeDataflowPipelineRunner
but only stores a template of a job.TemplatingDataflowPipelineRunner.Runner Register theTemplatingDataflowPipelineRunner
.TransformHierarchy Captures information about a collection of transformations and their associatedPValue
s.TransformTreeNode Provides internal tracking of transform relationships with helper methods for initialization and ordered visitation. -
Exception Summary Exception Description AggregatorRetrievalException Signals that an exception has occurred while retrievingAggregator
s.DataflowJobAlreadyExistsException An exception that is thrown if the unique job name constraint of the Dataflow service is broken because an existing job with the same job name is currently active.DataflowJobAlreadyUpdatedException An exception that is thrown if the existing job has already been updated within the Dataflow service and is no longer able to be updated.DataflowJobCancelledException Signals that a job run by aBlockingDataflowPipelineRunner
was updated during execution.DataflowJobException ARuntimeException
that contains information about aDataflowPipelineJob
.DataflowJobExecutionException Signals that a job run by aBlockingDataflowPipelineRunner
fails during execution, and provides access to the failed job.DataflowJobUpdatedException Signals that a job run by aBlockingDataflowPipelineRunner
was updated during execution.DataflowServiceException Signals there was an error retrieving information about a job from the Cloud Dataflow Service.
Package com.google.cloud.dataflow.sdk.runners Description
DirectPipelineRunner
and
DataflowPipelineRunner
.
DirectPipelineRunner
executes a Pipeline
locally, without contacting the Dataflow service.
DataflowPipelineRunner
submits a
Pipeline
to the Dataflow service, which executes it on Dataflow-managed Compute Engine
instances. DataflowPipelineRunner
returns
as soon as the Pipeline
has been submitted. Use
BlockingDataflowPipelineRunner
to have execution
updates printed to the console.
The runner is specified as part PipelineOptions
.