Google Cloud Dataflow SDK for Java, version 1.9.1
Class ProtoCoder<T extends com.google.protobuf.Message>
- java.lang.Object
  - com.google.cloud.dataflow.sdk.coders.StandardCoder<T>
    - com.google.cloud.dataflow.sdk.coders.DeterministicStandardCoder<T>
      - com.google.cloud.dataflow.sdk.coders.AtomicCoder<T>
        - com.google.cloud.dataflow.sdk.coders.protobuf.ProtoCoder<T>
- Type Parameters:
T - the Protocol Buffers Message handled by this Coder.
- All Implemented Interfaces:
- Coder<T>, Serializable
public class ProtoCoder<T extends com.google.protobuf.Message> extends AtomicCoder<T>
A Coder using Google Protocol Buffers binary format. ProtoCoder supports both Protocol Buffers syntax versions 2 and 3.
To learn more about Protocol Buffers, visit: https://developers.google.com/protocol-buffers
ProtoCoder is registered in the global CoderRegistry as the default Coder for any Message object. Custom message extensions are also supported, but these extensions must be registered for a particular ProtoCoder instance and that instance must be registered on the PCollection that needs the extensions:
    import MyProtoFile;
    import MyProtoFile.MyMessage;

    Coder<MyMessage> coder = ProtoCoder.of(MyMessage.class).withExtensionsFrom(MyProtoFile.class);
    PCollection<MyMessage> records = input.apply(...).setCoder(coder);
Versioning
ProtoCoder supports both versions 2 and 3 of the Protocol Buffers syntax. However, the Java runtime version of the com.google.protobuf library must exactly match the version of protoc that was used to produce the JAR files containing the compiled .proto messages.
For more information, see the Protocol Buffers documentation.
ProtoCoder and Determinism
In general, Protocol Buffers messages can be encoded deterministically within a single pipeline as long as:
- The encoded messages (and any transitively linked messages) do not use map fields.
- Every Java VM that encodes or decodes the messages uses the same runtime version of the Protocol Buffers library and the same compiled .proto file JAR.
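The restriction on map fields follows from the fact that Protocol Buffers does not define the order in which map entries are written, so two equal messages may serialize to different bytes. The sketch below illustrates that effect with plain Java collections only. The class MapOrderDemo and its toy "key=value;" encoding are hypothetical and are not the Protocol Buffers wire format or part of the SDK.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: this toy encoder is NOT the Protocol Buffers wire format.
final class MapOrderDemo {

    // Writes each entry as "key=value;" in the map's own iteration order.
    static byte[] encodeInIterationOrder(Map<String, Integer> map) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            byte[] entry = (e.getKey() + "=" + e.getValue() + ";").getBytes(StandardCharsets.UTF_8);
            out.write(entry, 0, entry.length);
        }
        return out.toByteArray();
    }

    // Sorting the keys first yields one canonical byte sequence per logical map.
    static byte[] encodeCanonically(Map<String, Integer> map) {
        return encodeInIterationOrder(new TreeMap<>(map));
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new LinkedHashMap<>();
        a.put("x", 1);
        a.put("y", 2);
        Map<String, Integer> b = new LinkedHashMap<>();
        b.put("y", 2);
        b.put("x", 1);

        // The maps are equal as values...
        System.out.println(a.equals(b)); // true
        // ...but an order-dependent encoding produces different bytes.
        System.out.println(Arrays.equals(encodeInIterationOrder(a), encodeInIterationOrder(b))); // false
        // A canonical (sorted) encoding produces identical bytes.
        System.out.println(Arrays.equals(encodeCanonically(a), encodeCanonically(b))); // true
    }
}
```

An encoding that depends on entry order breaks the requirement that equal values have the same encoding, which is why messages containing map fields are excluded from the deterministic case.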
ProtoCoder and Encoding Stability
When changing Protocol Buffers messages, follow the rules in the Protocol Buffers language guides for proto2 and proto3 syntaxes, depending on your message type. Following these guidelines will ensure that the old encoded data can be read by new versions of the code.
Generally, any change to the message type, registered extensions, runtime library, or compiled proto JARs may change the encoding. Thus even if both the original and updated messages can be encoded deterministically within a single job, these deterministic encodings may not be the same across jobs.
- See Also:
- Serialized Form
Nested Class Summary
Nested classes/interfaces inherited from interface com.google.cloud.dataflow.sdk.coders.Coder
Coder.Context, Coder.NonDeterministicException
Method Summary
Modifier and Type, Method, and Description
com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
    Returns the CloudObject that represents this Coder.
static CoderProvider coderProvider()
T decode(InputStream inStream, Coder.Context context)
    Decodes a value of type T from the given input stream in the given context.
void encode(T value, OutputStream outStream, Coder.Context context)
    Encodes the given value of type T onto the given output stream in the given context.
boolean equals(Object other)
String getEncodingId()
    The encoding identifier is designed to support evolution as per the design of Protocol Buffers.
com.google.protobuf.ExtensionRegistry getExtensionRegistry()
    Returns the ExtensionRegistry listing all known Protocol Buffers extension messages to T registered with this ProtoCoder.
Class<T> getMessageType()
    Returns the Protocol Buffers Message type this ProtoCoder supports.
int hashCode()
static <T extends com.google.protobuf.Message> ProtoCoder<T> of(Class<T> protoMessageClass)
    Returns a ProtoCoder for the given Protocol Buffers Message.
static <T extends com.google.protobuf.Message> ProtoCoder<T> of(String protoMessageClassName, List<String> extensionHostClassNames)
    Deprecated. For JSON deserialization only.
static <T extends com.google.protobuf.Message> ProtoCoder<T> of(TypeDescriptor<T> protoMessageType)
void verifyDeterministic()
    Throws Coder.NonDeterministicException if the coding is not deterministic.
ProtoCoder<T> withExtensionsFrom(Class<?>... moreExtensionHosts)
ProtoCoder<T> withExtensionsFrom(Iterable<Class<?>> moreExtensionHosts)
    Returns a ProtoCoder like this one, but with the extensions from the given classes registered.
Methods inherited from class com.google.cloud.dataflow.sdk.coders.AtomicCoder
getCoderArguments, getInstanceComponents
Methods inherited from class com.google.cloud.dataflow.sdk.coders.StandardCoder
consistentWithEquals, getAllowedEncodings, getComponents, getEncodedElementByteSize, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, toString, verifyDeterministic, verifyDeterministic
Method Detail
-
coderProvider
public static CoderProvider coderProvider()
-
of
public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(Class<T> protoMessageClass)
Returns a ProtoCoder for the given Protocol Buffers Message.
-
of
public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(TypeDescriptor<T> protoMessageType)
-
withExtensionsFrom
public ProtoCoder<T> withExtensionsFrom(Iterable<Class<?>> moreExtensionHosts)
Returns a ProtoCoder like this one, but with the extensions from the given classes registered.
Each of the extension host classes must be a class automatically generated by the Protocol Buffers compiler, protoc, that contains messages.
Does not modify this object.
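Because the method does not modify the receiver, a caller who discards the return value keeps a coder without the new extensions. The following minimal sketch shows that "return a modified copy" pattern using plain Java only. SketchCoder, create(), and getExtensionHosts() are hypothetical names for illustration, and host classes are represented by strings; this is not the SDK's implementation.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a coder; not the SDK's ProtoCoder.
final class SketchCoder {
    private final Set<String> extensionHosts;

    private SketchCoder(Set<String> extensionHosts) {
        this.extensionHosts = Collections.unmodifiableSet(extensionHosts);
    }

    static SketchCoder create() {
        return new SketchCoder(new TreeSet<String>());
    }

    // Returns a new instance with the extra hosts; the receiver is left unchanged.
    SketchCoder withExtensionsFrom(String... moreHosts) {
        Set<String> merged = new TreeSet<>(extensionHosts);
        Collections.addAll(merged, moreHosts);
        return new SketchCoder(merged);
    }

    Set<String> getExtensionHosts() {
        return extensionHosts;
    }

    public static void main(String[] args) {
        SketchCoder base = SketchCoder.create();
        SketchCoder extended = base.withExtensionsFrom("MyProtoFile");
        System.out.println(base.getExtensionHosts());     // []
        System.out.println(extended.getExtensionHosts()); // [MyProtoFile]
    }
}
```

In the same spirit, the result of ProtoCoder.of(MyMessage.class).withExtensionsFrom(MyProtoFile.class) should be the instance passed to setCoder, not the coder it was derived from.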
-
withExtensionsFrom
public ProtoCoder<T> withExtensionsFrom(Class<?>... moreExtensionHosts)
See withExtensionsFrom(Iterable).
Does not modify this object.
-
encode
public void encode(T value, OutputStream outStream, Coder.Context context) throws IOException
Description copied from interface: Coder
Encodes the given value of type T onto the given output stream in the given context.
- Throws:
IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason
-
decode
public T decode(InputStream inStream, Coder.Context context) throws IOException
Description copied from interface: Coder
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
- Throws:
IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason
-
equals
public boolean equals(Object other)
Description copied from class: StandardCoder
- Overrides:
equals in class StandardCoder<T extends com.google.protobuf.Message>
- Returns:
true if the two StandardCoder instances have the same class and equal components.
-
hashCode
public int hashCode()
- Overrides:
hashCode in class StandardCoder<T extends com.google.protobuf.Message>
-
getEncodingId
public String getEncodingId()
The encoding identifier is designed to support evolution as per the design of Protocol Buffers. In order to use this class effectively, carefully follow the advice in the Protocol Buffers documentation at Updating A Message Type.
In particular, the encoding identifier is guaranteed to be the same for ProtoCoder instances of the same principal message class, with the same registered extension host classes, and otherwise distinct. Note that the encoding ID does not encode any version of the message or extensions, nor does it include the message schema.
When modifying a message class, here are the broadest guidelines; see the above link for greater detail.
- Do not change the numeric tags for any fields.
- Never remove a required field.
- Only add optional or repeated fields, with sensible defaults.
- When changing the type of a field, consult the Protocol Buffers documentation to ensure the new and old types are interchangeable.
Code consuming this message class should be prepared to support all versions of the class until it is certain that no remaining serialized instances exist.
If backwards incompatible changes must be made, the best recourse is to change the name of your Protocol Buffers message class.
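The identity rule for the encoding identifier (same message class and same set of extension host classes give the same ID, with no version or schema information) can be pictured with a small sketch. EncodingIdSketch and its encodingId helper are hypothetical, and the string format shown is an assumption; the actual getEncodingId() may build its value differently.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch; the real getEncodingId() may build its string differently.
final class EncodingIdSketch {

    // Combines the message class name with the sorted, de-duplicated host names,
    // so the order in which hosts were registered does not affect the result.
    static String encodingId(String messageClassName, String... extensionHostClassNames) {
        SortedSet<String> hosts = new TreeSet<>(Arrays.asList(extensionHostClassNames));
        return messageClassName + hosts;
    }

    public static void main(String[] args) {
        // Same class and same hosts in a different order: identical IDs.
        System.out.println(encodingId("com.example.MyMessage", "HostB", "HostA"));
        System.out.println(encodingId("com.example.MyMessage", "HostA", "HostB"));
        // A different host set: a distinct ID.
        System.out.println(encodingId("com.example.MyMessage", "HostA"));
    }
}
```

Since nothing in such an identifier reflects the message schema, two incompatible revisions of the same class would share an ID, which is why renaming the message class is the suggested recourse for backwards incompatible changes.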
- Specified by:
getEncodingId in interface Coder<T extends com.google.protobuf.Message>
- Overrides:
getEncodingId in class StandardCoder<T extends com.google.protobuf.Message>
-
verifyDeterministic
public void verifyDeterministic() throws Coder.NonDeterministicException
Description copied from class: DeterministicStandardCoder
Throws Coder.NonDeterministicException if the coding is not deterministic.
In order for a Coder to be considered deterministic, the following must be true:
- two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
- the Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
- Specified by:
verifyDeterministic in interface Coder<T extends com.google.protobuf.Message>
- Overrides:
verifyDeterministic in class DeterministicStandardCoder<T extends com.google.protobuf.Message>
- Throws:
Coder.NonDeterministicException - if this coder is not deterministic.
-
getMessageType
public Class<T> getMessageType()
Returns the Protocol Buffers Message type this ProtoCoder supports.
-
getExtensionRegistry
public com.google.protobuf.ExtensionRegistry getExtensionRegistry()
Returns the ExtensionRegistry listing all known Protocol Buffers extension messages to T registered with this ProtoCoder.
-
of
@Deprecated public static <T extends com.google.protobuf.Message> ProtoCoder<T> of(String protoMessageClassName, @Nullable List<String> extensionHostClassNames)
Deprecated. For JSON deserialization only.
-
asCloudObject
public com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
Description copied from interface: Coder
Returns the CloudObject that represents this Coder.
- Specified by:
asCloudObject in interface Coder<T extends com.google.protobuf.Message>
- Overrides:
asCloudObject in class StandardCoder<T extends com.google.protobuf.Message>