com.google.cloud.dataflow.sdk.coders (Google Cloud Dataflow SDK 1.9.0 API)

Package com.google.cloud.dataflow.sdk.coders

Defines Coders to specify how data is encoded to and decoded from byte strings.


Package com.google.cloud.dataflow.sdk.coders Description


During execution of a Pipeline, elements in a PCollection may need to be encoded into byte strings. This happens at the beginning and end of a pipeline, when data is read from and written to persistent storage, and during execution, when elements are communicated between machines.

Exactly when PCollection elements are encoded depends on which PipelineRunner is used and how that runner chooses to execute the pipeline. Dataflow therefore requires every PCollection to have an appropriate Coder, in case encoding becomes necessary. In many cases, the Coder can be inferred from the available Java type information and the Pipeline's CoderRegistry. It can also be specified explicitly, either per PCollection via PCollection.setCoder(Coder) or per type using the DefaultCoder annotation.
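The two explicit mechanisms named above can be sketched as follows. This is a minimal illustration, assuming the SDK 1.x classes Pipeline, PipelineOptionsFactory, and Create are on the classpath; the class CoderSelectionSketch, its methods, and the Reading type are hypothetical names introduced only for this example.

```java
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.coders.AvroCoder;
import com.google.cloud.dataflow.sdk.coders.DefaultCoder;
import com.google.cloud.dataflow.sdk.coders.StringUtf8Coder;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.Create;
import com.google.cloud.dataflow.sdk.values.PCollection;

public class CoderSelectionSketch {

  // Per type: the annotation asks the CoderRegistry to use AvroCoder
  // for Reading. Reading is a hypothetical type for this sketch; the
  // no-arg constructor lets Avro instantiate it when decoding.
  @DefaultCoder(AvroCoder.class)
  public static class Reading {
    String sensorId;
    double value;

    public Reading() {}

    public Reading(String sensorId, double value) {
      this.sensorId = sensorId;
      this.value = value;
    }
  }

  // Per PCollection: set the Coder explicitly on this one collection.
  static PCollection<String> explicitCoder(Pipeline p) {
    return p.apply(Create.of("a", "b", "c")).setCoder(StringUtf8Coder.of());
  }

  // No explicit Coder here: it is expected to be resolved from the
  // DefaultCoder annotation on Reading.
  static PCollection<Reading> annotatedCoder(Pipeline p) {
    return p.apply(Create.of(new Reading("s1", 20.5), new Reading("s2", 21.0)));
  }

  public static void main(String[] args) {
    // The pipeline is only constructed, never run.
    Pipeline p = Pipeline.create(PipelineOptionsFactory.create());
    System.out.println(explicitCoder(p).getCoder());
    System.out.println(annotatedCoder(p).getCoder());
  }
}
```

Setting the Coder on the PCollection affects only that collection, while the annotation applies wherever the annotated type appears, so the annotation is usually the less repetitive choice for a custom type used throughout a pipeline.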

This package provides a number of coders for common types like Integer, String, and List, as well as coders like AvroCoder that can be used to encode many custom types.
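As a rough illustration of what the provided coders do, the sketch below round-trips a String and a List of Integers through byte arrays. It assumes the helper class CoderUtils from com.google.cloud.dataflow.sdk.util (outside this package) is available; CoderRoundTripSketch and its method names are hypothetical.

```java
import com.google.cloud.dataflow.sdk.coders.CoderException;
import com.google.cloud.dataflow.sdk.coders.ListCoder;
import com.google.cloud.dataflow.sdk.coders.StringUtf8Coder;
import com.google.cloud.dataflow.sdk.coders.VarIntCoder;
import com.google.cloud.dataflow.sdk.util.CoderUtils;
import java.util.Arrays;
import java.util.List;

public class CoderRoundTripSketch {

  // Encode a String to bytes with StringUtf8Coder, then decode it back.
  static String roundTripString(String value) throws CoderException {
    byte[] bytes = CoderUtils.encodeToByteArray(StringUtf8Coder.of(), value);
    return CoderUtils.decodeFromByteArray(StringUtf8Coder.of(), bytes);
  }

  // Coders compose: ListCoder wraps an element coder, here VarIntCoder.
  static List<Integer> roundTripList(List<Integer> values) throws CoderException {
    ListCoder<Integer> coder = ListCoder.of(VarIntCoder.of());
    byte[] bytes = CoderUtils.encodeToByteArray(coder, values);
    return CoderUtils.decodeFromByteArray(coder, bytes);
  }

  public static void main(String[] args) throws CoderException {
    System.out.println(roundTripString("hello"));
    System.out.println(roundTripList(Arrays.asList(1, 2, 3)));
  }
}
```

A round trip like this is also a convenient unit test for a custom Coder: the decoded value should equal the value that was encoded.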

