Google Cloud Dataflow SDK for Java, version 1.9.1
Class AvroCoder<T>
- java.lang.Object
-
- com.google.cloud.dataflow.sdk.coders.StandardCoder<T>
-
- com.google.cloud.dataflow.sdk.coders.AvroCoder<T>
-
- Type Parameters:
T
- the type of elements handled by this coder
- All Implemented Interfaces:
- Coder<T>, Serializable
public class AvroCoder<T> extends StandardCoder<T>
A Coder using Avro binary format.
Each instance of AvroCoder<T> encapsulates an Avro schema for objects of type T.
The Avro schema may be provided explicitly via of(Class, Schema) or omitted via of(Class), in which case it will be inferred using Avro's ReflectData.
For complete details about schema generation and how it can be controlled, please see the org.apache.avro.reflect package. Only concrete classes with a no-argument constructor can be mapped to Avro records. All inherited fields that are not static or transient are included. Fields are not permitted to be null unless annotated by Nullable or a Union schema containing "null".
To use, specify the Coder type on a PCollection:
  PCollection<MyCustomElement> records =
      input.apply(...)
           .setCoder(AvroCoder.of(MyCustomElement.class));
or annotate the element class using @DefaultCoder:
  @DefaultCoder(AvroCoder.class)
  public class MyCustomElement { ... }
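The mapping rules above (a concrete class with a no-argument constructor, and every inherited field that is not static or transient) can be illustrated with plain Java reflection. The following is a minimal sketch that uses only the JDK; it is not how Avro's ReflectData is implemented, and the class names FieldInclusionSketch, BaseElement, and MyCustomElement, along with their fields, are hypothetical. Listing superclass fields first is a choice made for this sketch only.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class FieldInclusionSketch {
    // Hypothetical classes, defined only for this illustration.
    static class BaseElement {
        String id;                     // inherited instance field: listed
        static int instanceCount;      // static: skipped
    }

    static class MyCustomElement extends BaseElement {
        long timestamp;                // instance field: listed
        transient String cachedLabel;  // transient: skipped

        MyCustomElement() {}           // the no-argument constructor the rule asks for
    }

    /** Returns names of non-static, non-transient fields, superclass fields first. */
    static List<String> includedFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        if (type == null || type == Object.class) {
            return names;
        }
        names.addAll(includedFields(type.getSuperclass()));
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (!Modifier.isStatic(modifiers) && !Modifier.isTransient(modifiers)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    /** Returns true if the type is a concrete class declaring a no-argument constructor. */
    static boolean isConcreteWithNoArgConstructor(Class<?> type) {
        if (type.isInterface() || Modifier.isAbstract(type.getModifiers())) {
            return false;
        }
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(includedFields(MyCustomElement.class));
        System.out.println(isConcreteWithNoArgConstructor(MyCustomElement.class));
    }
}
```

A check like this can help spot a transient or static field that was expected to be serialized before a class is handed to AvroCoder.of(Class); the authoritative rules remain those of the org.apache.avro.reflect package.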
The implementation attempts to determine if the Avro encoding of the given type will satisfy the criteria of Coder.verifyDeterministic() by inspecting both the type and the Schema provided or generated by Avro. Only coders that are deterministic can be used in GroupByKey operations.
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface com.google.cloud.dataflow.sdk.coders.Coder
Coder.Context, Coder.NonDeterministicException
-
-
Field Summary
Fields (modifier and type, then field):
- static CoderProvider PROVIDER
-
Constructor Summary
Constructors (modifier, then constructor):
- protected AvroCoder(Class<T> type, Schema schema)
-
Method Summary
Methods (modifier and type, then method and description):
- com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
  Returns the CloudObject that represents this Coder.
- DatumReader<T> createDatumReader()
  Deprecated. For AvroCoder internal use only.
- DatumWriter<T> createDatumWriter()
  Deprecated. For AvroCoder internal use only.
- T decode(InputStream inStream, Coder.Context context)
  Decodes a value of type T from the given input stream in the given context.
- void encode(T value, OutputStream outStream, Coder.Context context)
  Encodes the given value of type T onto the given output stream in the given context.
- List<? extends Coder<?>> getCoderArguments()
  If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters, or returns null if this cannot be done or this is not a parameterized type.
- String getEncodingId()
  The encoding identifier is designed to support evolution as per the design of Avro. In order to use this class effectively, carefully read the Avro documentation at Schema Resolution to ensure that the old and new schema match.
- Schema getSchema()
  Returns the schema used by this coder.
- Class<T> getType()
  Returns the type this coder encodes/decodes.
- static <T> AvroCoder<T> of(Class<T> clazz)
  Returns an AvroCoder instance for the provided element class.
- static <T> AvroCoder<T> of(Class<T> type, Schema schema)
  Returns an AvroCoder instance for the provided element type using the provided Avro schema.
- static AvroCoder<GenericRecord> of(Schema schema)
  Returns an AvroCoder instance for the Avro schema.
- static AvroCoder<?> of(String classType, String schema)
- static <T> AvroCoder<T> of(TypeDescriptor<T> type)
  Returns an AvroCoder instance for the provided element type.
- void verifyDeterministic()
  Throws Coder.NonDeterministicException if the coding is not deterministic.
Methods inherited from class com.google.cloud.dataflow.sdk.coders.StandardCoder
consistentWithEquals, equals, getAllowedEncodings, getComponents, getEncodedElementByteSize, hashCode, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, toString, verifyDeterministic, verifyDeterministic
-
-
-
-
Field Detail
-
PROVIDER
public static final CoderProvider PROVIDER
-
-
Method Detail
-
of
public static <T> AvroCoder<T> of(TypeDescriptor<T> type)
Returns an AvroCoder instance for the provided element type.
- Type Parameters:
T
- the element type
-
of
public static <T> AvroCoder<T> of(Class<T> clazz)
Returns an AvroCoder instance for the provided element class.
- Type Parameters:
T
- the element type
-
of
public static AvroCoder<GenericRecord> of(Schema schema)
Returns an AvroCoder instance for the Avro schema. The implicit type is GenericRecord.
-
of
public static <T> AvroCoder<T> of(Class<T> type, Schema schema)
Returns an AvroCoder instance for the provided element type using the provided Avro schema.
If the type argument is GenericRecord, the schema may be arbitrary. Otherwise, the schema must correspond to the type provided.
- Type Parameters:
T
- the element type
-
of
public static AvroCoder<?> of(String classType, String schema) throws ClassNotFoundException
- Throws:
ClassNotFoundException
-
getEncodingId
public String getEncodingId()
The encoding identifier is designed to support evolution as per the design of Avro. In order to use this class effectively, carefully read the Avro documentation at Schema Resolution to ensure that the old and new schema match.
In particular, this encoding identifier is guaranteed to be the same for AvroCoder instances of the same principal class, and otherwise distinct. The schema is not included in the identifier.
When modifying a class to be encoded as Avro, here are some guidelines; see the Schema Resolution documentation referenced above for greater detail.
- Avoid changing field names.
- Never remove a required field.
- Only add optional fields, with sensible defaults.
- When changing the type of a field, consult the Avro documentation to ensure the new and old types are interchangeable.
Code consuming this message class should be prepared to support all versions of the class until it is certain that no remaining serialized instances exist.
If backwards incompatible changes must be made, the best recourse is to change the name of your class.
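The guidelines on required and optional fields follow from how the Avro specification resolves a writer's record against a reader's record: a writer field the reader does not know is ignored, a reader field missing from the written data takes its default, and a missing reader field without a default is an error. The sketch below mirrors those three rules with plain JDK maps. It is an illustration under stated assumptions, not Avro's resolver: OptionalFieldDefaultsSketch, resolve, and the field names are hypothetical, and records are modeled as Map<String, Object> for brevity.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OptionalFieldDefaultsSketch {
    /**
     * Resolves a written record against the reader's view of the class.
     * requiredFields have no default; optionalDefaults maps each optional
     * field to the default used when the written data lacks it.
     */
    static Map<String, Object> resolve(
            Map<String, Object> written,
            List<String> requiredFields,
            Map<String, Object> optionalDefaults) {
        Map<String, Object> resolved = new LinkedHashMap<>();
        for (String name : requiredFields) {
            if (!written.containsKey(name)) {
                // A required field that was removed or renamed cannot be resolved.
                throw new IllegalStateException("missing required field: " + name);
            }
            resolved.put(name, written.get(name));
        }
        for (Map.Entry<String, Object> optional : optionalDefaults.entrySet()) {
            String name = optional.getKey();
            // Older data lacks newly added optional fields, so the default fills in.
            resolved.put(name, written.containsKey(name) ? written.get(name) : optional.getValue());
        }
        // Written fields unknown to the reader are dropped by never copying them.
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, Object> oldRecord = new HashMap<>();
        oldRecord.put("name", "job-1");        // written by an older class version
        oldRecord.put("legacyFlag", true);     // field the newer class no longer has
        Map<String, Object> defaults =
                Collections.<String, Object>singletonMap("retries", 3);
        System.out.println(resolve(oldRecord, Arrays.asList("name"), defaults));
    }
}
```

Under this model, adding "retries" with a default keeps old data readable, while removing or renaming "name" makes every previously written record fail, which is the situation the guidelines are meant to avoid.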
- Specified by:
getEncodingId in interface Coder<T>
- Overrides:
getEncodingId in class StandardCoder<T>
-
encode
public void encode(T value, OutputStream outStream, Coder.Context context) throws IOException
Description copied from interface: Coder
Encodes the given value of type T onto the given output stream in the given context.
- Throws:
IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason
-
decode
public T decode(InputStream inStream, Coder.Context context) throws IOException
Description copied from interface: Coder
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
- Throws:
IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason
-
getCoderArguments
public List<? extends Coder<?>> getCoderArguments()
Description copied from interface: Coder
If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters, or returns null if this cannot be done or this is not a parameterized type.
-
asCloudObject
public com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
Description copied from interface: Coder
Returns the CloudObject that represents this Coder.
- Specified by:
asCloudObject in interface Coder<T>
- Overrides:
asCloudObject in class StandardCoder<T>
-
verifyDeterministic
public void verifyDeterministic() throws Coder.NonDeterministicException
Description copied from interface: Coder
Throws Coder.NonDeterministicException if the coding is not deterministic.
In order for a Coder to be considered deterministic, the following must be true:
- two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
- the Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
- Throws:
NonDeterministicException - when the type may not be deterministically encoded using the given Schema, the directBinaryEncoder, and the ReflectDatumWriter or GenericDatumWriter.
Coder.NonDeterministicException - if this coder is not deterministic.
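The first criterion above (equal values must have the same encoding) is easy to violate when a value contains an unordered or insertion-ordered collection. The sketch below shows the failure mode and one way to restore a canonical encoding, using only the JDK. It says nothing about what AvroCoder itself inspects; CanonicalEncodingSketch and its byte layout are hypothetical, chosen only to make the point concrete.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class CanonicalEncodingSketch {
    /** Writes entries in whatever order the map iterates; not canonical in general. */
    static byte[] encodeInIterationOrder(Map<String, Integer> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(map.size());
            for (Map.Entry<String, Integer> entry : map.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Sorts entries by key first, so equal maps always produce the same bytes. */
    static byte[] encodeSorted(Map<String, Integer> map) {
        return encodeInIterationOrder(new TreeMap<>(map));
    }

    /** Builds a two-entry map that remembers the given insertion order. */
    static Map<String, Integer> orderedMap(String k1, int v1, String k2, int v2) {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put(k1, v1);
        map.put(k2, v2);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> first = orderedMap("a", 1, "b", 2);
        Map<String, Integer> second = orderedMap("b", 2, "a", 1);
        // The maps are equal, yet the order-dependent encodings differ.
        System.out.println(first.equals(second)); // true
        System.out.println(Arrays.equals(
                encodeInIterationOrder(first), encodeInIterationOrder(second))); // false
        System.out.println(Arrays.equals(encodeSorted(first), encodeSorted(second))); // true
    }
}
```

If such an encoding were used for keys in a grouping operation, the two equal maps would produce different bytes, which is why only deterministic coders are accepted there.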
-
createDatumReader
@Deprecated public DatumReader<T> createDatumReader()
Deprecated. For AvroCoder internal use only.
Returns a new DatumReader that can be used to read from an Avro file directly. Assumes the schema used to read is the same as the schema that was used when writing.
-
createDatumWriter
@Deprecated public DatumWriter<T> createDatumWriter()
Deprecated. For AvroCoder internal use only.
Returns a new DatumWriter that can be used to write to an Avro file directly.
-
getSchema
public Schema getSchema()
Returns the schema used by this coder.
-
-