Google Cloud Dataflow SDK for Java, version 1.9.1
Interface Coder<T>
-
- Type Parameters:
T - the type of the values being transcoded
- All Superinterfaces:
- Serializable
- All Known Implementing Classes:
- AtomicCoder, AvroCoder, BigEndianIntegerCoder, BigEndianLongCoder, ByteArrayCoder, ByteCoder, ByteStringCoder, CoGbkResult.CoGbkResultCoder, CollectionCoder, CustomCoder, DelegateCoder, DeterministicStandardCoder, DoubleCoder, DurationCoder, EntityCoder, GlobalWindow.Coder, InstantCoder, IterableCoder, IterableLikeCoder, JAXBCoder, KvCoder, KvCoderBase, ListCoder, MapCoder, MapCoderBase, NullableCoder, PaneInfo.PaneInfoCoder, Proto2Coder, ProtoCoder, SerializableCoder, SetCoder, StandardCoder, StringDelegateCoder, StringUtf8Coder, TableRowJsonCoder, TextualIntegerCoder, TimestampedValue.TimestampedValueCoder, UnionCoder, VarIntCoder, VarLongCoder, VoidCoder
public interface Coder<T> extends Serializable
A Coder<T> defines how to encode and decode values of type T into byte streams.
Coder instances are serialized during job creation and deserialized before use, via JSON serialization. See SerializableCoder for an example of a Coder that adds a custom field to the Coder serialization. It provides a constructor annotated with JsonCreator, which is a factory method used when deserializing a Coder instance.
Coder classes for compound types are often composed from coder classes for the types contained therein. The composition of Coder instances into a coder for the compound class is the subject of the CoderFactory type, which enables automatic generic composition of Coder classes within the CoderRegistry. With particular static methods on a compound Coder class, a CoderFactory can be automatically inferred. See KvCoder for an example of a simple compound Coder that supports automatic composition in the CoderRegistry.
The binary format of a Coder is identified by getEncodingId(); be sure to understand the requirements for evolving coder formats.
All methods of a Coder are required to be thread safe.
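The composition idea can be sketched without the SDK on the classpath. The following is a minimal illustration only: MiniCoder, MiniStringCoder, MiniIntCoder, MiniPairCoder, and MiniCoderDemo are hypothetical stand-ins invented for this sketch, the Coder.Context parameter is omitted, and the byte layout is not the one KvCoder uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

/** Hypothetical stand-in for the encode/decode half of Coder<T>; not the SDK interface. */
interface MiniCoder<T> {
  void encode(T value, OutputStream out) throws IOException;
  T decode(InputStream in) throws IOException;
}

/** Atomic coder: a length-prefixed string (DataOutputStream's modified UTF-8). */
class MiniStringCoder implements MiniCoder<String> {
  public void encode(String value, OutputStream out) throws IOException {
    new DataOutputStream(out).writeUTF(value);
  }
  public String decode(InputStream in) throws IOException {
    return new DataInputStream(in).readUTF();
  }
}

/** Atomic coder: a 4-byte big-endian int. */
class MiniIntCoder implements MiniCoder<Integer> {
  public void encode(Integer value, OutputStream out) throws IOException {
    new DataOutputStream(out).writeInt(value);
  }
  public Integer decode(InputStream in) throws IOException {
    return new DataInputStream(in).readInt();
  }
}

/** Compound coder assembled from the coders of its two components. */
class MiniPairCoder<K, V> implements MiniCoder<Map.Entry<K, V>> {
  private final MiniCoder<K> keyCoder;
  private final MiniCoder<V> valueCoder;

  MiniPairCoder(MiniCoder<K> keyCoder, MiniCoder<V> valueCoder) {
    this.keyCoder = keyCoder;
    this.valueCoder = valueCoder;
  }

  public void encode(Map.Entry<K, V> value, OutputStream out) throws IOException {
    // Delegate to the component coders, key first.
    keyCoder.encode(value.getKey(), out);
    valueCoder.encode(value.getValue(), out);
  }

  public Map.Entry<K, V> decode(InputStream in) throws IOException {
    // Read the components back in the order they were written.
    K key = keyCoder.decode(in);
    V value = valueCoder.decode(in);
    return new SimpleEntry<K, V>(key, value);
  }
}

class MiniCoderDemo {
  /** Encodes a value to bytes and decodes it again with the same coder. */
  static <T> T roundTrip(MiniCoder<T> coder, T value) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    coder.encode(value, out);
    return coder.decode(new ByteArrayInputStream(out.toByteArray()));
  }
}
```

Because both component coders here write self-delimiting values, the pair coder can simply concatenate them. The stand-in classes hold no mutable state, which is one straightforward way to meet the thread-safety requirement.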
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Interface, and Description:
static class Coder.Context
The context in which encoding or decoding is being done.
static class Coder.NonDeterministicException
Exception thrown by verifyDeterministic() if the encoding is not deterministic, including details of why the encoding is not deterministic.
-
Method Summary
Modifier and Type, Method, and Description:
com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
Returns the CloudObject that represents this Coder.
boolean consistentWithEquals()
Returns true if this Coder is injective with respect to Object.equals(java.lang.Object).
T decode(InputStream inStream, Coder.Context context)
Decodes a value of type T from the given input stream in the given context.
void encode(T value, OutputStream outStream, Coder.Context context)
Encodes the given value of type T onto the given output stream in the given context.
Collection<String> getAllowedEncodings()
A collection of encodings supported by decode(java.io.InputStream, com.google.cloud.dataflow.sdk.coders.Coder.Context) in addition to the encoding from getEncodingId() (which is assumed supported).
List<? extends Coder<?>> getCoderArguments()
If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters, or returns null if this cannot be done or this is not a parameterized type.
String getEncodingId()
An identifier for the binary format written by encode(T, java.io.OutputStream, com.google.cloud.dataflow.sdk.coders.Coder.Context).
boolean isRegisterByteSizeObserverCheap(T value, Coder.Context context)
Returns whether registerByteSizeObserver(T, com.google.cloud.dataflow.sdk.util.common.ElementByteSizeObserver, com.google.cloud.dataflow.sdk.coders.Coder.Context) is cheap enough to call for every element, that is, whether this Coder can calculate the byte size of the element to be coded in roughly constant time (or lazily).
void registerByteSizeObserver(T value, com.google.cloud.dataflow.sdk.util.common.ElementByteSizeObserver observer, Coder.Context context)
Notifies the ElementByteSizeObserver about the byte size of the encoded value using this Coder.
Object structuralValue(T value)
Returns an object with an Object.equals() method that represents structural equality on the argument.
void verifyDeterministic()
Throws Coder.NonDeterministicException if the coding is not deterministic.
-
-
-
Method Detail
-
encode
void encode(T value, OutputStream outStream, Coder.Context context) throws CoderException, IOException
Encodes the given value of type T onto the given output stream in the given context.
- Throws:
IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason
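One way a context can affect encode is sketched below. It assumes the context distinguishes a value that occupies the whole stream from one nested inside a larger encoding (Coder.Context.OUTER and Coder.Context.NESTED in the SDK). ContextSketch is a hypothetical class, the boolean parameter stands in for Coder.Context, a plain IOException stands in for CoderException, and the 4-byte length prefix is chosen for simplicity rather than taken from any SDK coder.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a context-sensitive encode; the boolean stands in for Coder.Context. */
class ContextSketch {
  /** Writes a string; when nested, a 4-byte length prefix marks where the value ends. */
  static void encode(String value, OutputStream out, boolean wholeStream) throws IOException {
    if (value == null) {
      // Stand-in for the "value could not be encoded" case (CoderException in the SDK).
      throw new IOException("cannot encode a null String");
    }
    byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
    if (!wholeStream) {
      // Other values may follow, so the reader must know how many bytes belong to this one.
      new DataOutputStream(out).writeInt(bytes.length);
    }
    out.write(bytes);
  }

  /** Convenience wrapper that encodes into a fresh byte array. */
  static byte[] encodeToBytes(String value, boolean wholeStream) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    encode(value, out, wholeStream);
    return out.toByteArray();
  }
}
```

With this sketch, encodeToBytes("hi", true) yields 2 bytes and encodeToBytes("hi", false) yields 6, the extra 4 being the length prefix.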
-
decode
T decode(InputStream inStream, Coder.Context context) throws CoderException, IOException
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
- Throws:
IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason
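The two failure modes can be separated as in the following sketch: a stream that fails propagates its IOException unchanged, while bytes that do not form a complete value are reported as a coding error. SketchCoderException and DecodeSketch are hypothetical names used only here; the real CoderException is not on the classpath of this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for CoderException, used only in this sketch. */
class SketchCoderException extends IOException {
  SketchCoderException(String message, Throwable cause) {
    super(message, cause);
  }
}

class DecodeSketch {
  /** Reads a 4-byte big-endian int; a truncated value is reported as a coding error. */
  static int decodeInt(InputStream in) throws IOException {
    try {
      return new DataInputStream(in).readInt();
    } catch (EOFException e) {
      // The stream did not fail; the available bytes simply do not form a value.
      throw new SketchCoderException("truncated int", e);
    }
  }
}
```

Decoding the bytes 0, 0, 1, 2 returns 258, while a two-byte input raises SketchCoderException.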
-
getCoderArguments
List<? extends Coder<?>> getCoderArguments()
If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters, or returns null if this cannot be done or this is not a parameterized type.
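As an illustration of the contract, the sketch below walks a coder tree through getCoderArguments(), treating null as "no component coders". ArgCoder, StringArgCoder, ListArgCoder, and ArgCoderDemo are hypothetical stand-ins; only the method name and its return type shape are taken from the interface above.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in exposing only getCoderArguments(); not the SDK interface. */
interface ArgCoder<T> {
  List<? extends ArgCoder<?>> getCoderArguments();
}

/** Coder for a non-parameterized type: there are no component coders to report. */
class StringArgCoder implements ArgCoder<String> {
  public List<? extends ArgCoder<?>> getCoderArguments() {
    return null;
  }
}

/** Coder for List<E>: reports the single coder used for its type parameter. */
class ListArgCoder<E> implements ArgCoder<List<E>> {
  private final ArgCoder<E> elementCoder;

  ListArgCoder(ArgCoder<E> elementCoder) {
    this.elementCoder = elementCoder;
  }

  public List<? extends ArgCoder<?>> getCoderArguments() {
    return Collections.singletonList(elementCoder);
  }
}

class ArgCoderDemo {
  /** Renders the coder tree recursively, for example when debugging a composed coder. */
  static String describe(ArgCoder<?> coder) {
    String name = coder.getClass().getSimpleName();
    List<? extends ArgCoder<?>> args = coder.getCoderArguments();
    if (args == null) {
      return name; // callers must handle null, not only an empty list
    }
    StringBuilder sb = new StringBuilder(name).append('<');
    for (int i = 0; i < args.size(); i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(describe(args.get(i)));
    }
    return sb.append('>').toString();
  }
}
```

For a ListArgCoder wrapping a StringArgCoder, describe returns "ListArgCoder<StringArgCoder>".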
-
asCloudObject
com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
Returns the CloudObject that represents this Coder.
-
verifyDeterministic
void verifyDeterministic() throws Coder.NonDeterministicException
Throws Coder.NonDeterministicException if the coding is not deterministic.
In order for a Coder to be considered deterministic, the following must be true:
- two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
- the Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
- Throws:
Coder.NonDeterministicException - if this coder is not deterministic.
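The first condition is easy to violate with collections whose iteration order is not part of their equality. The sketch below, using the hypothetical class DeterminismSketch and an ad hoc byte layout, contrasts an encoding that follows iteration order with one that sorts keys first.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch contrasting an order-dependent encoding with a canonical one. */
class DeterminismSketch {
  /** Encodes entries in the map's own iteration order; equal maps may yield different bytes. */
  static byte[] encodeInIterationOrder(Map<String, Integer> map) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(map.size());
    for (Map.Entry<String, Integer> entry : map.entrySet()) {
      out.writeUTF(entry.getKey());
      out.writeInt(entry.getValue());
    }
    return bytes.toByteArray();
  }

  /** Sorts the keys first, so maps that are equal always produce the same bytes. */
  static byte[] encodeCanonically(Map<String, Integer> map) throws IOException {
    return encodeInIterationOrder(new TreeMap<String, Integer>(map));
  }
}
```

Two LinkedHashMap instances holding the same entries in different insertion order are equal according to equals(), yet encodeInIterationOrder produces different bytes for them. encodeCanonically produces identical bytes, which is the property a deterministic coder needs.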
-
consistentWithEquals
boolean consistentWithEquals()
Returns true if this Coder is injective with respect to Object.equals(java.lang.Object).
Whenever the encoded bytes of two values are equal, then the original values are equal according to Objects.equals(). Note that this is well-defined for null.
This condition is most notably false for arrays. More generally, this condition is false whenever equals() compares object identity, rather than performing a semantic/structural comparison.
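The array case can be reproduced in a few lines. In this sketch the hypothetical EqualsConsistencySketch "encodes" a byte array as a copy of itself, which is an assumption made for brevity and not a description of ByteArrayCoder.

```java
import java.util.Arrays;

/** Hypothetical sketch showing why a byte[] coder is not consistent with equals(). */
class EqualsConsistencySketch {
  /** Identity-style encoding: the encoded form is a copy of the array contents. */
  static byte[] encode(byte[] value) {
    return value.clone();
  }

  /** True when the two values have byte-for-byte identical encodings. */
  static boolean sameEncoding(byte[] x, byte[] y) {
    return Arrays.equals(encode(x), encode(y));
  }

  /** True when equal encodings also imply equals() for this particular pair of values. */
  static boolean consistentFor(byte[] x, byte[] y) {
    return !sameEncoding(x, y) || x.equals(y);
  }
}
```

For two distinct arrays that both contain 1, 2, sameEncoding returns true but x.equals(y) is false because arrays inherit identity-based equals(), so consistentFor returns false.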
-
structuralValue
Object structuralValue(T value) throws Exception
Returns an object with an Object.equals() method that represents structural equality on the argument.
For any two values x and y of type T, if their encoded bytes are the same, then it must be the case that structuralValue(x).equals(structuralValue(y)).
Most notably:
- The structural value for an array coder should perform a structural comparison of the contents of the arrays, rather than the default behavior of comparing according to object identity.
- The structural value for a coder accepting null should be a proper object with an equals() method, even if the input value is null.
See also consistentWithEquals().
- Throws:
Exception
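Both bullet points can be met by a small wrapper type, sketched below for byte arrays. ByteArrayStructuralValue is a hypothetical class written for this illustration; the SDK may represent structural values differently.

```java
import java.util.Arrays;

/** Hypothetical structural value for byte[]: compares contents and wraps null as well. */
final class ByteArrayStructuralValue {
  private final byte[] bytes; // may be null

  ByteArrayStructuralValue(byte[] bytes) {
    // Copy defensively so later changes to the caller's array do not alter equality.
    this.bytes = bytes == null ? null : bytes.clone();
  }

  @Override
  public boolean equals(Object other) {
    // Arrays.equals treats two null arrays as equal and compares contents otherwise.
    return other instanceof ByteArrayStructuralValue
        && Arrays.equals(bytes, ((ByteArrayStructuralValue) other).bytes);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(bytes);
  }
}
```

A coder for byte arrays could return new ByteArrayStructuralValue(value) from structuralValue(value). Two wrappers around arrays with the same contents are then equal, and a wrapper around null is still a proper object with a working equals().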
-
isRegisterByteSizeObserverCheap
boolean isRegisterByteSizeObserverCheap(T value, Coder.Context context)
Returns whether registerByteSizeObserver(T, com.google.cloud.dataflow.sdk.util.common.ElementByteSizeObserver, com.google.cloud.dataflow.sdk.coders.Coder.Context) is cheap enough to call for every element, that is, whether this Coder can calculate the byte size of the element to be coded in roughly constant time (or lazily).
Not intended to be called by user code, but instead by PipelineRunner implementations.
-
registerByteSizeObserver
void registerByteSizeObserver(T value, com.google.cloud.dataflow.sdk.util.common.ElementByteSizeObserver observer, Coder.Context context) throws Exception
Notifies the ElementByteSizeObserver about the byte size of the encoded value using this Coder.
Not intended to be called by user code, but instead by PipelineRunner implementations.
- Throws:
Exception
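For authors of custom coders, the distinction between a cheap and an expensive size computation might look like the sketch below. ByteSizeSketch is hypothetical, java.util.function.LongConsumer stands in for ElementByteSizeObserver, and the byte layout (4-byte count and 4-byte length prefixes) is assumed for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.LongConsumer;

/** Hypothetical sketch; LongConsumer stands in for ElementByteSizeObserver. */
class ByteSizeSketch {
  /** A fixed-width long always encodes to 8 bytes, so its size is known in constant time. */
  static boolean isCheapForLong() {
    return true;
  }

  /** Reports the encoded size of a long without encoding it. */
  static void observeLong(long value, LongConsumer observer) {
    observer.accept(Long.BYTES);
  }

  /** Sizing a list of strings requires visiting every element, so it is not cheap. */
  static boolean isCheapForStringList() {
    return false;
  }

  /** Reports the encoded size of a list laid out as: count, then (length, bytes) per string. */
  static void observeStringList(List<String> values, LongConsumer observer) {
    long size = Integer.BYTES; // element count
    for (String s : values) {
      size += Integer.BYTES + s.getBytes(StandardCharsets.UTF_8).length;
    }
    observer.accept(size);
  }
}
```

Under the assumed layout, the list ["a", "bc"] is reported as 15 bytes: 4 for the count, 5 for "a", and 6 for "bc".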
-
getEncodingId
@Experimental(value=CODER_ENCODING_ID) String getEncodingId()
An identifier for the binary format written by encode(T, java.io.OutputStream, com.google.cloud.dataflow.sdk.coders.Coder.Context).
This value, along with the fully qualified class name, forms an identifier for the binary format of this coder. Whenever this value changes, the new encoding is considered incompatible with the prior format: it is presumed that the prior version of the coder will be unable to correctly read the new format and the new version of the coder will be unable to correctly read the old format.
If the format is changed in a backwards-compatible way (the Coder can still accept data from the prior format), such as by adding optional fields to a Protocol Buffer or Avro definition, and you want Dataflow to understand that the new coder is compatible with the prior coder, this value must remain unchanged. It is then the responsibility of decode(java.io.InputStream, com.google.cloud.dataflow.sdk.coders.Coder.Context) to correctly read data from the prior format.
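A backwards-compatible change of this kind can be sketched with a hand-rolled format instead of Protocol Buffers or Avro. EvolvingFormatSketch, its ENCODING_ID value, and the name/age record are all hypothetical; the sketch also assumes each encoded value occupies the whole input, since it detects the optional field by checking for remaining bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical record coder whose format gained an optional trailing field. */
class EvolvingFormatSketch {
  /** Kept unchanged across the format change because decode still reads the prior format. */
  static final String ENCODING_ID = "v1";
  static final int UNKNOWN_AGE = -1;

  /** Prior format: the name only. */
  static byte[] encodeOld(String name) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    new DataOutputStream(bytes).writeUTF(name);
    return bytes.toByteArray();
  }

  /** Current format: the name followed by an age. */
  static byte[] encodeNew(String name, int age) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeUTF(name);
    out.writeInt(age);
    return bytes.toByteArray();
  }

  /** Reads either format and returns "name:age", using UNKNOWN_AGE when the field is absent. */
  static String decode(byte[] encoded) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
    String name = in.readUTF();
    // Data written by the prior format ends here; the new field is present only if bytes remain.
    int age = in.available() > 0 ? in.readInt() : UNKNOWN_AGE;
    return name + ":" + age;
  }
}
```

Here decode(encodeOld("ann")) returns "ann:-1" and decode(encodeNew("ann", 30)) returns "ann:30". Had the change made old data unreadable, the encoding id would need to change instead.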
-
getAllowedEncodings
@Experimental(value=CODER_ENCODING_ID) Collection<String> getAllowedEncodings()
A collection of encodings supported by decode(java.io.InputStream, com.google.cloud.dataflow.sdk.coders.Coder.Context) in addition to the encoding from getEncodingId() (which is assumed supported).
This information is not currently used for any purpose. It is descriptive only, and this method is subject to change.
- See Also:
getEncodingId()
-
-