Google Cloud Dataflow SDK for Java, version 1.9.1
Class AvroIO.Write
- java.lang.Object
  - com.google.cloud.dataflow.sdk.io.AvroIO.Write
- Enclosing class: AvroIO
public static class AvroIO.Write extends Object
A root PTransform that writes a PCollection to an Avro file (or multiple Avro files matching a sharding pattern).
-
-
Nested Class Summary
Modifier and Type, Class, and Description

static class AvroIO.Write.Bound<T>
    A PTransform that writes a bounded PCollection to an Avro file (or multiple Avro files matching a sharding pattern).
-
Method Summary
Modifier and Type, Method, and Description

static AvroIO.Write.Bound<GenericRecord> named(String name)
    Returns a PTransform with the given step name.

static AvroIO.Write.Bound<GenericRecord> to(String prefix)
    Returns a PTransform that writes to the file(s) with the given prefix.

static AvroIO.Write.Bound<GenericRecord> withNumShards(int numShards)
    Returns a PTransform that uses the provided shard count.

static AvroIO.Write.Bound<GenericRecord> withoutSharding()
    Returns a PTransform that forces a single file as output.

static AvroIO.Write.Bound<GenericRecord> withoutValidation()
    Returns a PTransform that writes Avro file(s) with GCS path validation on pipeline creation disabled.

static <T> AvroIO.Write.Bound<T> withSchema(Class<T> type)
    Returns a PTransform that writes Avro file(s) containing records whose type is the specified Avro-generated class.

static AvroIO.Write.Bound<GenericRecord> withSchema(Schema schema)
    Returns a PTransform that writes Avro file(s) containing records of the specified schema.

static AvroIO.Write.Bound<GenericRecord> withSchema(String schema)
    Returns a PTransform that writes Avro file(s) containing records of the specified schema in a JSON-encoded string form.

static AvroIO.Write.Bound<GenericRecord> withShardNameTemplate(String shardTemplate)
    Returns a PTransform that uses the given shard name template.

static AvroIO.Write.Bound<GenericRecord> withSuffix(String filenameSuffix)
    Returns a PTransform that writes to the file(s) with the given filename suffix.
-
-
-
Method Detail
-
named
public static AvroIO.Write.Bound<GenericRecord> named(String name)
Returns a PTransform with the given step name.
-
to
public static AvroIO.Write.Bound<GenericRecord> to(String prefix)
Returns a PTransform that writes to the file(s) with the given prefix. This can be a local filename (if running locally), or a Google Cloud Storage filename of the form "gs://<bucket>/<filepath>" (if running locally or via the Google Cloud Dataflow service).

The files written will begin with this prefix, followed by a shard identifier (see AvroIO.Write.Bound.withNumShards(int)), and end in a common extension, if given by AvroIO.Write.Bound.withSuffix(java.lang.String).
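The naming scheme described above (prefix, then shard identifier, then optional suffix) can be illustrated with a small standalone sketch. It does not call the SDK. The class `ShardFileNames`, the bucket and path, and the `-SSSSS-of-NNNNN` layout used for the shard identifier are assumptions made for illustration; the actual identifier is governed by the shard name template in effect.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; this is not part of the Dataflow SDK. */
public class ShardFileNames {

    /**
     * Builds one name per shard as prefix + shard identifier + suffix.
     * The "-%05d-of-%05d" identifier layout is an assumption for this sketch.
     */
    public static List<String> fileNames(String prefix, String suffix, int numShards) {
        List<String> names = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            names.add(String.format("%s-%05d-of-%05d%s", prefix, shard, numShards, suffix));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical bucket and path; prints three names ending in ".avro".
        for (String name : fileNames("gs://my-bucket/output/records", ".avro", 3)) {
            System.out.println(name);
        }
    }
}
```

Under these assumptions, a prefix of "gs://my-bucket/output/records" with suffix ".avro" and three shards gives names such as "gs://my-bucket/output/records-00000-of-00003.avro".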
-
withSuffix
public static AvroIO.Write.Bound<GenericRecord> withSuffix(String filenameSuffix)
Returns a PTransform that writes to the file(s) with the given filename suffix.
-
withNumShards
public static AvroIO.Write.Bound<GenericRecord> withNumShards(int numShards)
Returns a PTransform that uses the provided shard count.

Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
- Parameters:
numShards - the number of shards to use, or 0 to let the system decide.
-
withShardNameTemplate
public static AvroIO.Write.Bound<GenericRecord> withShardNameTemplate(String shardTemplate)
Returns a PTransform that uses the given shard name template.

See ShardNameTemplate for a description of shard templates.
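As a rough illustration of how a shard name template might be expanded, the standalone sketch below replaces each run of 'S' with the zero-padded shard index and each run of 'N' with the zero-padded shard count. These substitution rules are an assumption based on the conventional template form; ShardNameTemplate remains the authoritative description. The class `ShardTemplateSketch` and its `expand` method are hypothetical and not part of the SDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative model only; this is not part of the Dataflow SDK. */
public class ShardTemplateSketch {

    // A run of 'S' stands for the shard index, a run of 'N' for the shard count (assumed rules).
    private static final Pattern RUN = Pattern.compile("S+|N+");

    /** Expands the template, zero-padding each number to the length of its run. */
    public static String expand(String template, int shard, int numShards) {
        Matcher m = RUN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int value = m.group().charAt(0) == 'S' ? shard : numShards;
            String padded = String.format("%0" + m.group().length() + "d", value);
            m.appendReplacement(out, padded);
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("-SSSSS-of-NNNNN", 2, 10)); // prints "-00002-of-00010"
        System.out.println(expand("/part-SSS", 7, 10));       // prints "/part-007"
    }
}
```

Note that this sketch treats every uppercase 'S' or 'N' in the template as a placeholder, so literal text in a template would need to avoid those letters.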
-
withoutSharding
public static AvroIO.Write.Bound<GenericRecord> withoutSharding()
Returns a PTransform that forces a single file as output.

Constraining the number of shards is likely to reduce the performance of a pipeline. Using this option is not recommended unless you require a single output file.
-
withSchema
public static <T> AvroIO.Write.Bound<T> withSchema(Class<T> type)
Returns a PTransform that writes Avro file(s) containing records whose type is the specified Avro-generated class.
- Type Parameters:
T - the type of the elements of the input PCollection
-
withSchema
public static AvroIO.Write.Bound<GenericRecord> withSchema(Schema schema)
Returns a PTransform that writes Avro file(s) containing records of the specified schema.
-
withSchema
public static AvroIO.Write.Bound<GenericRecord> withSchema(String schema)
Returns a PTransform that writes Avro file(s) containing records of the specified schema in a JSON-encoded string form.
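For orientation, a JSON-encoded Avro schema is an ordinary string holding a record declaration. The sketch below shows one such string in plain Java. The record name `User`, the namespace `com.example`, and both fields are hypothetical; the holder class `SchemaJsonExample` exists only for illustration.

```java
/** Illustrative holder for a JSON-encoded Avro schema string; all names are hypothetical. */
public class SchemaJsonExample {

    // An Avro record schema with two fields, written as a JSON string.
    public static final String USER_SCHEMA_JSON =
        "{"
        + "\"type\": \"record\", "
        + "\"name\": \"User\", "
        + "\"namespace\": \"com.example\", "
        + "\"fields\": ["
        + "{\"name\": \"name\", \"type\": \"string\"}, "
        + "{\"name\": \"age\", \"type\": \"int\"}"
        + "]"
        + "}";

    public static void main(String[] args) {
        // Prints the schema JSON on one line.
        System.out.println(USER_SCHEMA_JSON);
    }
}
```

A string of this shape is the kind of value one would expect to pass as the `schema` argument of withSchema(String).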
-
withoutValidation
public static AvroIO.Write.Bound<GenericRecord> withoutValidation()
Returns a PTransform that writes Avro file(s) with GCS path validation on pipeline creation disabled.

This can be useful when the GCS output location does not exist at pipeline creation time but is expected to be available at execution time.
-
-