Overview (Google Cloud Dataflow SDK 1.9.1 API)

Google Cloud Dataflow SDK for Java, version 1.9.1

The Google Cloud Dataflow SDK for Java provides a simple and elegant programming model to express your data processing pipelines; see our product page for more information and getting started instructions.

Packages

  • com.google.cloud.dataflow.sdk: Provides a simple, powerful model for building both batch and streaming parallel data processing Pipelines.
  • com.google.cloud.dataflow.sdk.annotations: Defines annotations used across the SDK.
  • com.google.cloud.dataflow.sdk.coders: Defines Coders to specify how data is encoded to and decoded from byte strings.
  • com.google.cloud.dataflow.sdk.coders.protobuf: Defines a Coder for Protocol Buffers messages, ProtoCoder.
  • com.google.cloud.dataflow.sdk.io: Defines transforms for reading and writing common storage formats, including AvroIO, BigQueryIO, and TextIO.
  • com.google.cloud.dataflow.sdk.io.bigtable: Defines transforms for reading from and writing to Google Cloud Bigtable.
  • com.google.cloud.dataflow.sdk.io.datastore: Provides an API for reading from and writing to Google Cloud Datastore over different versions of the Cloud Datastore Client libraries.
  • com.google.cloud.dataflow.sdk.io.range: Provides thread-safe helpers for implementing dynamic work rebalancing in position-based bounded sources.
  • com.google.cloud.dataflow.sdk.options: Defines PipelineOptions for configuring pipeline execution.
  • com.google.cloud.dataflow.sdk.runners: Defines runners for executing Pipelines in different modes, including DirectPipelineRunner and DataflowPipelineRunner.
  • com.google.cloud.dataflow.sdk.runners.inprocess: Defines the InProcessPipelineRunner, which executes both Bounded and Unbounded Pipelines on the local machine.
  • com.google.cloud.dataflow.sdk.testing: Defines utilities for unit testing Dataflow pipelines.
  • com.google.cloud.dataflow.sdk.transforms: Defines PTransforms for transforming data in a pipeline.
  • com.google.cloud.dataflow.sdk.transforms.display: Defines HasDisplayData for annotating components which provide display data used within UIs and diagnostic tools.
  • com.google.cloud.dataflow.sdk.transforms.join: Defines the CoGroupByKey transform for joining multiple PCollections.
  • com.google.cloud.dataflow.sdk.transforms.windowing: Defines the Window transform for dividing the elements in a PCollection into windows, and the Trigger for controlling when those elements are output.
  • com.google.cloud.dataflow.sdk.values: Defines PCollection and other classes for representing data in a Pipeline.

The easiest way to use the Google Cloud Dataflow SDK for Java is via one of the released artifacts from the Maven Central Repository. See our release notes for more information about each released version.

Version numbers use the form major.minor.incremental and are incremented as follows:

  • major version for incompatible API changes
  • minor version for new functionality added in a backward-compatible manner
  • incremental version for forward-compatible bug fixes
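
The versioning rules above can be sketched in plain Java. This is only an illustration and not part of the SDK: the class SdkVersion and its method canUpgradeTo are hypothetical names, and the check assumes that staying within the same major version while not moving to an older minor version is enough for API compatibility. It does not account for APIs marked @Experimental.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an SDK class): models a major.minor.incremental version.
final class SdkVersion {
    private static final Pattern FORMAT = Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)");

    final int major;
    final int minor;
    final int incremental;

    private SdkVersion(int major, int minor, int incremental) {
        this.major = major;
        this.minor = minor;
        this.incremental = incremental;
    }

    // Parses strings such as "1.9.1"; rejects anything that is not three dot-separated numbers.
    static SdkVersion parse(String text) {
        Matcher m = FORMAT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected major.minor.incremental, got: " + text);
        }
        return new SdkVersion(
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)));
    }

    // Assumption: an upgrade keeps the API compatible when the major version is
    // unchanged and the minor version does not go backwards. The incremental
    // component carries bug fixes only, so it is ignored here.
    boolean canUpgradeTo(SdkVersion target) {
        return target.major == major && target.minor >= minor;
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + incremental;
    }

    public static void main(String[] args) {
        SdkVersion current = parse("1.9.1");
        System.out.println(current.canUpgradeTo(parse("1.10.0"))); // true: same major, newer minor
        System.out.println(current.canUpgradeTo(parse("2.0.0")));  // false: major version changed
    }
}
```

A build script or startup check could use a helper like this to warn before moving a pipeline to a release with a different major version.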

Please note that APIs marked @Experimental may change at any point and are not guaranteed to remain compatible across versions.