Google Cloud Dataflow SDK for Java, version 1.9.1
Class AvroIO.Write.Bound<T>
- java.lang.Object
  - com.google.cloud.dataflow.sdk.transforms.PTransform<PCollection<T>,PDone>
    - com.google.cloud.dataflow.sdk.io.AvroIO.Write.Bound<T>
- Type Parameters:
  T - the type of each of the elements of the input PCollection
- All Implemented Interfaces:
  HasDisplayData, Serializable
- Enclosing class:
  AvroIO.Write
public static class AvroIO.Write.Bound<T> extends PTransform<PCollection<T>,PDone>
A PTransform that writes a bounded PCollection to an Avro file (or multiple Avro files matching a sharding pattern).
- See Also:
  Serialized Form
Field Summary
- Fields inherited from class com.google.cloud.dataflow.sdk.transforms.PTransform
  name
Method Summary
Modifier and Type, Method and Description
- PDone apply(PCollection<T> input)
  Applies this PTransform on the given InputT, and returns its Output.
- protected Coder<Void> getDefaultOutputCoder()
  Returns the default Coder to use for the output of this single-output PTransform.
- String getFilenamePrefix()
- String getFilenameSuffix()
- int getNumShards()
- Schema getSchema()
- String getShardNameTemplate()
  Returns the current shard name template string.
- String getShardTemplate()
- Class<T> getType()
- AvroIO.Write.Bound<T> named(String name)
  Returns a new PTransform that's like this one but with the given step name.
- boolean needsValidation()
- void populateDisplayData(DisplayData.Builder builder)
  Register display data for the given transform or component.
- AvroIO.Write.Bound<T> to(String filenamePrefix)
  Returns a new PTransform that's like this one but that writes to the file(s) with the given filename prefix.
- AvroIO.Write.Bound<T> withNumShards(int numShards)
  Returns a new PTransform that's like this one but that uses the provided shard count.
- AvroIO.Write.Bound<T> withoutSharding()
  Returns a new PTransform that's like this one but that forces a single file as output.
- AvroIO.Write.Bound<T> withoutValidation()
  Returns a new PTransform that's like this one but that has GCS output path validation on pipeline creation disabled.
- <X> AvroIO.Write.Bound<X> withSchema(Class<X> type)
  Returns a new PTransform that's like this one but that writes to Avro file(s) containing records whose type is the specified Avro-generated class.
- AvroIO.Write.Bound<GenericRecord> withSchema(Schema schema)
  Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema.
- AvroIO.Write.Bound<GenericRecord> withSchema(String schema)
  Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema in a JSON-encoded string form.
- AvroIO.Write.Bound<T> withShardNameTemplate(String shardTemplate)
  Returns a new PTransform that's like this one but that uses the given shard name template.
- AvroIO.Write.Bound<T> withSuffix(String filenameSuffix)
  Returns a new PTransform that's like this one but that writes to the file(s) with the given filename suffix.
- Methods inherited from class com.google.cloud.dataflow.sdk.transforms.PTransform
  getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
Method Detail
-
named
public AvroIO.Write.Bound<T> named(String name)
Returns a new PTransform that's like this one but with the given step name.
Does not modify this object.
-
to
public AvroIO.Write.Bound<T> to(String filenamePrefix)
Returns a new PTransform that's like this one but that writes to the file(s) with the given filename prefix.
See AvroIO.Write.to(String) for more information about filenames.
Does not modify this object.
-
withSuffix
public AvroIO.Write.Bound<T> withSuffix(String filenameSuffix)
Returns a new PTransform that's like this one but that writes to the file(s) with the given filename suffix.
See ShardNameTemplate for a description of shard templates.
Does not modify this object.
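The "Does not modify this object" contract on named, to, and withSuffix means each call returns a fresh transform, so a partially configured write can be reused as a base for several outputs. The following is a minimal sketch of that copy-on-write style in plain Java. WriteConfigSketch, its fields, and the example values are hypothetical stand-ins for illustration only; this is not the SDK's implementation.

```java
/** Hypothetical stand-in for an immutable, fluent write configuration (not SDK code). */
final class WriteConfigSketch {
    private final String name;
    private final String filenamePrefix;
    private final String filenameSuffix;

    /** Starts with no step name, no prefix, and an empty suffix (assumed defaults). */
    WriteConfigSketch() {
        this(null, null, "");
    }

    private WriteConfigSketch(String name, String filenamePrefix, String filenameSuffix) {
        this.name = name;
        this.filenamePrefix = filenamePrefix;
        this.filenameSuffix = filenameSuffix;
    }

    /** Returns a copy with the given step name; this object is left unchanged. */
    WriteConfigSketch named(String newName) {
        return new WriteConfigSketch(newName, filenamePrefix, filenameSuffix);
    }

    /** Returns a copy with the given filename prefix; this object is left unchanged. */
    WriteConfigSketch to(String newPrefix) {
        return new WriteConfigSketch(name, newPrefix, filenameSuffix);
    }

    /** Returns a copy with the given filename suffix; this object is left unchanged. */
    WriteConfigSketch withSuffix(String newSuffix) {
        return new WriteConfigSketch(name, filenamePrefix, newSuffix);
    }

    String getName() { return name; }
    String getFilenamePrefix() { return filenamePrefix; }
    String getFilenameSuffix() { return filenameSuffix; }

    public static void main(String[] args) {
        WriteConfigSketch base = new WriteConfigSketch().named("WriteAvro");
        WriteConfigSketch daily = base.to("output/daily").withSuffix(".avro");
        // base was not modified by the calls above: its prefix is still null.
        System.out.println(base.getFilenamePrefix());
        System.out.println(daily.getFilenamePrefix() + daily.getFilenameSuffix());
    }
}
```

Because every method returns a new object, the result of a call must be captured; calling to(...) and discarding the return value has no effect on the receiver.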
-
withNumShards
public AvroIO.Write.Bound<T> withNumShards(int numShards)
Returns a new PTransform that's like this one but that uses the provided shard count.
Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
Does not modify this object.
- Parameters:
  numShards - the number of shards to use, or 0 to let the system decide.
- See Also:
  ShardNameTemplate
-
withShardNameTemplate
public AvroIO.Write.Bound<T> withShardNameTemplate(String shardTemplate)
Returns a new PTransform that's like this one but that uses the given shard name template.
Does not modify this object.
- See Also:
  ShardNameTemplate
-
withoutSharding
public AvroIO.Write.Bound<T> withoutSharding()
Returns a new PTransform that's like this one but that forces a single file as output.
This is a shortcut for .withNumShards(1).withShardNameTemplate("")
Does not modify this object.
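To make the interaction of filename prefix, shard name template, suffix, and shard count concrete, here is a minimal sketch of how an output filename could be assembled. It assumes the convention described under ShardNameTemplate, in which a run of S characters stands for the zero-padded shard index and a run of N characters for the zero-padded shard count. ShardNameSketch and the example paths are hypothetical, and the code is an illustration in plain Java rather than the SDK's own logic.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that expands a shard name template (not SDK code). */
final class ShardNameSketch {
    // Matches a run of 'S' (shard index) or a run of 'N' (shard count).
    private static final Pattern RUN = Pattern.compile("S+|N+");

    /** Builds prefix + expanded template + suffix for one shard. */
    static String expand(String prefix, String template, String suffix,
                         int shard, int numShards) {
        StringBuffer out = new StringBuffer(prefix);
        Matcher m = RUN.matcher(template);
        while (m.find()) {
            String run = m.group();
            int value = run.charAt(0) == 'S' ? shard : numShards;
            // Zero-pad the number to the length of the run it replaces.
            String padded = String.format(Locale.ROOT, "%0" + run.length() + "d", value);
            m.appendReplacement(out, padded);
        }
        m.appendTail(out);
        return out.append(suffix).toString();
    }

    public static void main(String[] args) {
        // A sharded output file, using an assumed "-SSSSS-of-NNNNN" template.
        System.out.println(expand("gs://my-bucket/out", "-SSSSS-of-NNNNN", ".avro", 3, 10));
        // The withoutSharding() case: one shard and an empty template.
        System.out.println(expand("gs://my-bucket/out", "", ".avro", 0, 1));
    }
}
```

With an empty template and a single shard, the name reduces to the prefix followed by the suffix, which is why withoutSharding() is described as a shortcut for .withNumShards(1).withShardNameTemplate("").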
-
withSchema
public <X> AvroIO.Write.Bound<X> withSchema(Class<X> type)
Returns a new PTransform that's like this one but that writes to Avro file(s) containing records whose type is the specified Avro-generated class.
Does not modify this object.
- Type Parameters:
  X - the type of the elements of the input PCollection
-
withSchema
public AvroIO.Write.Bound<GenericRecord> withSchema(Schema schema)
Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema.
Does not modify this object.
-
withSchema
public AvroIO.Write.Bound<GenericRecord> withSchema(String schema)
Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema in a JSON-encoded string form.
Does not modify this object.
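For illustration, a JSON-encoded Avro record schema of the kind this overload accepts might look like the string below. The record name, namespace, and fields are hypothetical; the sketch only builds and prints the string and does not call the SDK or the Avro library.

```java
/** Holds a hypothetical JSON-encoded Avro record schema as a plain Java string. */
final class UserSchemaJson {
    // An Avro record named com.example.User with two fields (illustrative only).
    static final String SCHEMA_JSON =
        "{"
        + "\"type\": \"record\","
        + "\"name\": \"User\","
        + "\"namespace\": \"com.example\","
        + "\"fields\": ["
        + "{\"name\": \"name\", \"type\": \"string\"},"
        + "{\"name\": \"age\", \"type\": \"int\"}"
        + "]"
        + "}";

    public static void main(String[] args) {
        System.out.println(SCHEMA_JSON);
    }
}
```

A string like this would be passed to withSchema(String). Parsing the same text with Avro's Schema.Parser yields a Schema object, which is the form accepted by withSchema(Schema).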
-
withoutValidation
public AvroIO.Write.Bound<T> withoutValidation()
Returns a new PTransform that's like this one but that has GCS output path validation on pipeline creation disabled.
Does not modify this object.
This can be useful when the GCS output location does not exist at pipeline creation time but is expected to be available at execution time.
-
apply
public PDone apply(PCollection<T> input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must either implement apply, or else each runner must supply a custom implementation via PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<InputT, OutputT>, InputT).
- Overrides:
  apply in class PTransform<PCollection<T>,PDone>
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
  populateDisplayData in interface HasDisplayData
- Overrides:
  populateDisplayData in class PTransform<PCollection<T>,PDone>
- Parameters:
  builder - The builder to populate with display data.
- See Also:
  HasDisplayData
-
getShardNameTemplate
public String getShardNameTemplate()
Returns the current shard name template string.
-
getDefaultOutputCoder
protected Coder<Void> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this single-output PTransform.
By default, always throws an exception.
- Overrides:
  getDefaultOutputCoder in class PTransform<PCollection<T>,PDone>
-
getFilenamePrefix
public String getFilenamePrefix()
-
getShardTemplate
public String getShardTemplate()
-
getNumShards
public int getNumShards()
-
getFilenameSuffix
public String getFilenameSuffix()
-
getSchema
public Schema getSchema()
-
needsValidation
public boolean needsValidation()