Google Cloud Dataflow SDK for Java, version 1.9.1
Class DatastoreIO.Source
- java.lang.Object
-
- com.google.cloud.dataflow.sdk.io.Source<T>
-
- com.google.cloud.dataflow.sdk.io.BoundedSource<DatastoreV1.Entity>
-
- com.google.cloud.dataflow.sdk.io.DatastoreIO.Source
-
- All Implemented Interfaces:
- HasDisplayData, Serializable
- Enclosing class:
- DatastoreIO
public static class DatastoreIO.Source extends BoundedSource<DatastoreV1.Entity>
ASource
that reads the result rows of a Datastore query asEntity
objects.- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.google.cloud.dataflow.sdk.io.BoundedSource
BoundedSource.BoundedReader<T>
-
Nested classes/interfaces inherited from class com.google.cloud.dataflow.sdk.io.Source
Source.Reader<T>
-
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method and Description BoundedSource.BoundedReader<DatastoreV1.Entity>
createReader(PipelineOptions pipelineOptions)
Returns a newBoundedSource.BoundedReader
that reads from this source.String
getDataset()
Coder<DatastoreV1.Entity>
getDefaultOutputCoder()
Returns the defaultCoder
to use for the data read from this source.long
getEstimatedSizeBytes(PipelineOptions options)
An estimate of the total size (in bytes) of the data that would be read from this source.String
getHost()
String
getNamespace()
DatastoreV1.Query
getQuery()
void
populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.boolean
producesSortedKeys(PipelineOptions options)
Whether this source is known to produce key/value pairs sorted by lexicographic order on the bytes of the encoded key.List<DatastoreIO.Source>
splitIntoBundles(long desiredBundleSizeBytes, PipelineOptions options)
Splits the source into bundles of approximatelydesiredBundleSizeBytes
.String
toString()
void
validate()
Checks that this source is valid, before it can be used in a pipeline.DatastoreIO.Source
withDataset(String datasetId)
DatastoreIO.Source
withHost(String host)
DatastoreIO.Source
withNamespace(String namespace)
DatastoreIO.Source
withQuery(DatastoreV1.Query query)
Returns a newSource
that reads the results of the specified query.
-
-
-
Method Detail
-
getHost
public String getHost()
-
getDataset
public String getDataset()
-
getQuery
public DatastoreV1.Query getQuery()
-
getNamespace
@Nullable public String getNamespace()
-
withDataset
public DatastoreIO.Source withDataset(String datasetId)
-
withQuery
public DatastoreIO.Source withQuery(DatastoreV1.Query query)
Returns a newSource
that reads the results of the specified query.Does not modify this object.
Note: Normally, a Cloud Dataflow job will read from Cloud Datastore in parallel across many workers. However, when the
DatastoreV1.Query
is configured with a limit usingDatastoreV1.Query.Builder.setLimit(int)
, then all returned results will be read by a single Dataflow worker in order to ensure correct data.
-
withHost
public DatastoreIO.Source withHost(String host)
-
withNamespace
public DatastoreIO.Source withNamespace(@Nullable String namespace)
-
getDefaultOutputCoder
public Coder<DatastoreV1.Entity> getDefaultOutputCoder()
Description copied from class:Source
Returns the defaultCoder
to use for the data read from this source.- Specified by:
getDefaultOutputCoder
in classSource<DatastoreV1.Entity>
-
producesSortedKeys
public boolean producesSortedKeys(PipelineOptions options)
Description copied from class:BoundedSource
Whether this source is known to produce key/value pairs sorted by lexicographic order on the bytes of the encoded key.- Specified by:
producesSortedKeys
in classBoundedSource<DatastoreV1.Entity>
-
splitIntoBundles
public List<DatastoreIO.Source> splitIntoBundles(long desiredBundleSizeBytes, PipelineOptions options) throws Exception
Description copied from class:BoundedSource
Splits the source into bundles of approximatelydesiredBundleSizeBytes
.- Specified by:
splitIntoBundles
in classBoundedSource<DatastoreV1.Entity>
- Throws:
Exception
-
createReader
public BoundedSource.BoundedReader<DatastoreV1.Entity> createReader(PipelineOptions pipelineOptions) throws IOException
Description copied from class:BoundedSource
Returns a newBoundedSource.BoundedReader
that reads from this source.- Specified by:
createReader
in classBoundedSource<DatastoreV1.Entity>
- Throws:
IOException
-
validate
public void validate()
Description copied from class:Source
Checks that this source is valid, before it can be used in a pipeline.It is recommended to use
Preconditions
for implementing this method.- Specified by:
validate
in classSource<DatastoreV1.Entity>
-
getEstimatedSizeBytes
public long getEstimatedSizeBytes(PipelineOptions options) throws Exception
Description copied from class:BoundedSource
An estimate of the total size (in bytes) of the data that would be read from this source. This estimate is in terms of external storage size, before any decompression or other processing done by the reader.- Specified by:
getEstimatedSizeBytes
in classBoundedSource<DatastoreV1.Entity>
- Throws:
Exception
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class:Source
Register display data for the given transform or component.populateDisplayData(DisplayData.Builder)
is invoked by Pipeline runners to collect display data viaDisplayData.from(HasDisplayData)
. Implementations may callsuper.populateDisplayData(builder)
in order to register display data in the current namespace, but should otherwise usesubcomponent.populateDisplayData(builder)
to use the namespace of the subcomponent.By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData
in interfaceHasDisplayData
- Overrides:
populateDisplayData
in classSource<DatastoreV1.Entity>
- Parameters:
builder
- The builder to populate with display data.- See Also:
HasDisplayData
-
-