AvroIO.Write (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

com.google.cloud.dataflow.sdk.io

Class AvroIO.Write

  • Enclosing class:
    AvroIO


    public static class AvroIO.Write
    extends Object
    A root PTransform that writes a PCollection to an Avro file (or multiple Avro files matching a sharding pattern).
    • Method Detail

      • withNumShards

        public static AvroIO.Write.Bound<GenericRecord> withNumShards(int numShards)
        Returns a PTransform that uses the provided shard count.

        Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.

        Parameters:
        numShards - the number of shards to use, or 0 to let the system decide.
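        As an illustration of what a fixed shard count means for the output, the sketch below computes the file names that a given numShards value would produce. It is plain Java and does not call the SDK. It assumes the default shard name template is "-SSSSS-of-NNNNN" (zero-padded shard index and shard count), which this page does not state. The class ShardNames, the method forShards, and the bucket path are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not part of the SDK) that models sharded output file names. */
public class ShardNames {

    /**
     * Returns the file names expected for a fixed shard count, assuming a
     * "-SSSSS-of-NNNNN" shard name template (an assumption, not taken from this page).
     * A numShards of 0 means the system decides, so no names can be predicted
     * and an empty list is returned.
     */
    public static List<String> forShards(String prefix, int numShards, String suffix) {
        if (numShards < 0) {
            throw new IllegalArgumentException("numShards must be >= 0");
        }
        List<String> names = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            // Zero-pad both the shard index and the total to five digits.
            names.add(String.format(Locale.ROOT, "%s-%05d-of-%05d%s",
                    prefix, shard, numShards, suffix));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical output prefix; three shards give three files.
        for (String name : forShards("gs://my-bucket/output", 3, ".avro")) {
            System.out.println(name);
        }
    }
}
```

        With numShards set to 3, every run produces exactly three files, which is the case the note above describes: the fixed count is useful when downstream consumers expect a known set of files, at the cost of limiting how the service can parallelize the write.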
      • withoutSharding

        public static AvroIO.Write.Bound<GenericRecord> withoutSharding()
        Returns a PTransform that forces a single file as output.

        Constraining the number of shards is likely to reduce the performance of a pipeline. Forcing a single output file is not recommended unless you require exactly one output file.

      • withSchema

        public static <T> AvroIO.Write.Bound<T> withSchema(Class<T> type)
        Returns a PTransform that writes Avro file(s) containing records whose type is the specified Avro-generated class.
        Type Parameters:
        T - the type of the elements of the input PCollection
      • withoutValidation

        public static AvroIO.Write.Bound<GenericRecord> withoutValidation()
        Returns a PTransform that writes Avro file(s) with GCS path validation at pipeline creation disabled.

        This can be useful when the GCS output location does not exist at pipeline creation time but is expected to be available at execution time.

