Google Cloud Dataflow SDK for Java, version 1.9.1
Interface PipelineOptions
-
- All Superinterfaces:
- HasDisplayData
- All Known Subinterfaces:
- ApplicationNameOptions, BigQueryOptions, BlockingDataflowPipelineOptions, DataflowPipelineDebugOptions, DataflowPipelineOptions, DataflowPipelineWorkerPoolOptions, DataflowWorkerHarnessOptions, DataflowWorkerLoggingOptions, DirectPipelineOptions, GcpOptions, GcsOptions, GoogleApiDebugOptions, InProcessPipelineOptions, StreamingOptions, TestDataflowPipelineOptions
@ThreadSafe public interface PipelineOptions extends HasDisplayData
PipelineOptions are used to configure Pipelines. You can extendPipelineOptions
to create custom configuration options specific to yourPipeline
, for both local execution and execution via aPipelineRunner
.PipelineOptions
and their subinterfaces represent a collection of properties which can be manipulated in a type safe manner.PipelineOptions
is backed by a dynamicProxy
which allows for type safe manipulation of properties in an extensible fashion through plain old Java interfaces.PipelineOptions
can be created withPipelineOptionsFactory.create()
andPipelineOptionsFactory.as(Class)
. They can be created from command-line arguments withPipelineOptionsFactory.fromArgs(String[])
. They can be converted to another type by invokingas(Class)
and can be accessed from within aDoFn
by invokingDoFn.Context.getPipelineOptions()
.For example:
// The most common way to construct PipelineOptions is via command-line argument parsing: public static void main(String[] args) { // Will parse the arguments passed into the application and construct a PipelineOptions // Note that --help will print registered options, and --help=PipelineOptionsClassName // will print out usage for the specific class. PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create(); Pipeline p = Pipeline.create(options); ... p.run(); } // To create options for the DirectPipeline: DirectPipelineOptions directPipelineOptions = PipelineOptionsFactory.as(DirectPipelineOptions.class); directPipelineOptions.setStreaming(true); // To cast from one type to another using the as(Class) method: DataflowPipelineOptions dataflowPipelineOptions = directPipelineOptions.as(DataflowPipelineOptions.class); // Options for the same property are shared between types // The statement below will print out "true" System.out.println(dataflowPipelineOptions.isStreaming()); // Prints out registered options. PipelineOptionsFactory.printHelp(System.out); // Prints out options which are available to be set on DataflowPipelineOptions PipelineOptionsFactory.printHelp(System.out, DataflowPipelineOptions.class);
Defining Your Own PipelineOptions
Defining your own
PipelineOptions
is the way for you to make configuration options available for both local execution and execution via aPipelineRunner
. By having PipelineOptionsFactory as your command-line interpreter, you will provide a standardized way for users to interact with your application via the command-line.To define your own
PipelineOptions
, you create an interface which extendsPipelineOptions
and define getter/setter pairs. These getter/setter pairs define a collection of JavaBean properties.For example:
// Creates a user defined property called "myProperty" public interface MyOptions extends PipelineOptions { String getMyProperty(); void setMyProperty(String value); }
Note: Please see the section on Registration below when using custom property types.
Restrictions
Since PipelineOptions can be "cast" to multiple types dynamically using
as(Class)
, a property must conform to the following set of restrictions:- Any property with the same name must have the same return type for all derived
interfaces of
PipelineOptions
. - Every bean property of any interface derived from
PipelineOptions
must have a getter and setter method. - Every method must conform to being a getter or setter for a JavaBean.
- The derived interface of
PipelineOptions
must be composable with every interface part registered with the PipelineOptionsFactory. - Only getters may be annotated with
@JsonIgnore
. - If any getter is annotated with
@JsonIgnore
, then all getters for this property must be annotated with@JsonIgnore
.
Annotations For PipelineOptions
@Description
can be used to annotate an interface or a getter with useful information which is output when--help
is invoked viaPipelineOptionsFactory.fromArgs(String[])
.@Default
represents a set of annotations that can be used to annotate getter properties onPipelineOptions
with information representing the default value to be returned if no value is specified. Any default implementation (using thedefault
keyword) is ignored.@Hidden
hides an option from being listed when--help
is invoked viaPipelineOptionsFactory.fromArgs(String[])
.@Validation
represents a set of annotations that can be used to annotate getter properties onPipelineOptions
with information representing the validation criteria to be used when validating with thePipelineOptionsValidator
. Validation will be performed if during construction of thePipelineOptions
,PipelineOptionsFactory.withValidation()
is invoked.@JsonIgnore
is used to prevent a property from being serialized and available during execution ofDoFn
. See the Serialization section below for more details.Registration Of PipelineOptions
Registration of
PipelineOptions
by an application guarantees that thePipelineOptions
is composable during execution of theirPipeline
and meets the restrictions listed above or will fail during registration. Registration also lists the registeredPipelineOptions
when--help
is invoked viaPipelineOptionsFactory.fromArgs(String[])
.Registration can be performed by invoking
PipelineOptionsFactory.register(java.lang.Class<? extends com.google.cloud.dataflow.sdk.options.PipelineOptions>)
within a users application or via automatic registration by creating aServiceLoader
entry and a concrete implementation of thePipelineOptionsRegistrar
interface.It is optional but recommended to use one of the many build time tools such as
AutoService
to generate the necessary META-INF files automatically.A list of registered options can be fetched from
PipelineOptionsFactory.getRegisteredOptions()
.Serialization Of PipelineOptions
PipelineRunner
s require support for options to be serialized. Each property withinPipelineOptions
must be able to be serialized using Jackson'sObjectMapper
or the getter method for the property annotated with@JsonIgnore
.Jackson supports serialization of many types and supports a useful set of annotations to aid in serialization of custom types. We point you to the public Jackson documentation when attempting to add serialization support for your custom types. See
GoogleApiDebugOptions.GoogleApiTracer
for an example using the Jackson annotations to serialize and deserialize a custom type.Note: It is an error to have the same property available in multiple interfaces with only some of them being annotated with
@JsonIgnore
. It is also an error to mark a setter for a property with@JsonIgnore
.
-
-
Nested Class Summary
Nested Classes Modifier and Type Interface and Description static class
PipelineOptions.AtomicLongFactory
DefaultValueFactory
which supplies an ID that is guaranteed to be unique within the given process.static class
PipelineOptions.CheckEnabled
Enumeration of the possible states for a given check.
-
Method Summary
All Methods Instance Methods Abstract Methods Modifier and Type Method and Description <T extends PipelineOptions>
Tas(Class<T> kls)
Transforms this object into an object of type<T>
saving each property that has been manipulated.<T extends PipelineOptions>
TcloneAs(Class<T> kls)
Makes a deep clone of this object, and transforms the cloned object into the specified typekls
.Long
getOptionsId()
Provides a unique ID for thisPipelineOptions
object, assigned at graph construction time.Class<? extends PipelineRunner<?>>
getRunner()
The pipeline runner that will be used to execute the pipeline.PipelineOptions.CheckEnabled
getStableUniqueNames()
Whether to check for stable unique names on each transform.String
getTempLocation()
A pipeline level default location for storing temporary files.Map<String,Map<String,Object>>
outputRuntimeOptions()
Returns a map of properties which correspond toRuntimeValueProvider
, keyed by the property name.void
setOptionsId(Long id)
void
setRunner(Class<? extends PipelineRunner<?>> kls)
void
setStableUniqueNames(PipelineOptions.CheckEnabled enabled)
void
setTempLocation(String value)
-
Methods inherited from interface com.google.cloud.dataflow.sdk.transforms.display.HasDisplayData
populateDisplayData
-
-
-
-
Method Detail
-
as
<T extends PipelineOptions> T as(Class<T> kls)
Transforms this object into an object of type<T>
saving each property that has been manipulated.<T>
must extendPipelineOptions
.If
<T>
is not registered with thePipelineOptionsFactory
, then we attempt to verify that<T>
is composable with every interface that this instance of thePipelineOptions
has seen.- Parameters:
kls
- The class of the type to transform to.- Returns:
- An object of type kls.
-
cloneAs
<T extends PipelineOptions> T cloneAs(Class<T> kls)
Makes a deep clone of this object, and transforms the cloned object into the specified typekls
. Seeas(java.lang.Class<T>)
for more information about the conversion.Properties that are marked with
@JsonIgnore
will not be cloned.
-
getRunner
@Validation.Required @Default.Class(value=DirectPipelineRunner.class) Class<? extends PipelineRunner<?>> getRunner()
The pipeline runner that will be used to execute the pipeline. For registered runners, the class name can be specified, otherwise the fully qualified name needs to be specified.
-
setRunner
void setRunner(Class<? extends PipelineRunner<?>> kls)
-
getStableUniqueNames
@Validation.Required @Default.Enum(value="WARNING") PipelineOptions.CheckEnabled getStableUniqueNames()
Whether to check for stable unique names on each transform. This is necessary to support updating of pipelines.
-
setStableUniqueNames
void setStableUniqueNames(PipelineOptions.CheckEnabled enabled)
-
getTempLocation
String getTempLocation()
A pipeline level default location for storing temporary files.This can be a path of any file system.
getTempLocation()
can be used as a default location in otherPipelineOptions
.If it is unset,
PipelineRunner
can override it.
-
setTempLocation
void setTempLocation(String value)
-
outputRuntimeOptions
Map<String,Map<String,Object>> outputRuntimeOptions()
Returns a map of properties which correspond toRuntimeValueProvider
, keyed by the property name. The value is a map containing type and default information.
-
getOptionsId
@Hidden @Default.InstanceFactory(value=PipelineOptions.AtomicLongFactory.class) Long getOptionsId()
Provides a unique ID for thisPipelineOptions
object, assigned at graph construction time.
-
setOptionsId
void setOptionsId(Long id)
-
-