Google Cloud Dataflow SDK for Java, version 1.9.1
Class DatastoreIO
- java.lang.Object
  - com.google.cloud.dataflow.sdk.io.DatastoreIO
Deprecated. Replaced by DatastoreIO.
@Deprecated @Experimental(value=SOURCE_SINK) public class DatastoreIO extends Object
DatastoreIO provides an API to Read and Write PCollections of Google Cloud Datastore DatastoreV1.Entity objects.

Google Cloud Datastore is a fully managed NoSQL data storage service. An Entity is an object in Datastore, analogous to a row in a traditional database table.

This API currently requires an authentication workaround. To use DatastoreIO, users must use the gcloud command line tool to get credentials for Datastore:

$ gcloud auth login
To read a PCollection from a query to Datastore, use source() and its methods DatastoreIO.Source.withDataset(java.lang.String) and DatastoreIO.Source.withQuery(com.google.api.services.datastore.DatastoreV1.Query) to specify the dataset to query and the query to read from. You can optionally provide a namespace to query within using DatastoreIO.Source.withNamespace(java.lang.String) or a Datastore host using DatastoreIO.Source.withHost(java.lang.String).

For example:
// Read a query from Datastore
PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
Query query = ...;
String datasetId = "...";
String host = "...";

Pipeline p = Pipeline.create(options);
PCollection<Entity> entities = p.apply(
    Read.from(DatastoreIO.source()
        .withDataset(datasetId)
        .withQuery(query)
        .withHost(host)));
or:
// Read a query from Datastore using the default namespace and host
PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
Query query = ...;
String datasetId = "...";

Pipeline p = Pipeline.create(options);
PCollection<Entity> entities = p.apply(DatastoreIO.readFrom(datasetId, query));
p.run();
Note: Normally, a Cloud Dataflow job will read from Cloud Datastore in parallel across many workers. However, when the DatastoreV1.Query is configured with a limit using DatastoreV1.Query.Builder.setLimit(int), all returned results will be read by a single Dataflow worker in order to ensure correct data.
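For example, the following minimal sketch produces a query whose results are read by a single worker. Query and KindExpression are assumed to be the com.google.api.services.datastore.DatastoreV1 messages; the kind "MyKind", the limit of 100, and the in-scope p and datasetId are illustrative:

// A minimal sketch; "MyKind" and the limit value are illustrative.
Query query = Query.newBuilder()
    .addKind(KindExpression.newBuilder().setName("MyKind"))
    .setLimit(100)  // a query with a limit is read by a single worker
    .build();
PCollection<Entity> entities = p.apply(
    Read.from(DatastoreIO.source()
        .withDataset(datasetId)
        .withQuery(query)));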
To write a PCollection to Datastore, use writeTo(java.lang.String), specifying the dataset to write to:

PCollection<Entity> entities = ...;
entities.apply(DatastoreIO.writeTo(dataset));
p.run();
To optionally change the host that is used to write to Datastore, use sink() to build a DatastoreIO.Sink and write to it using the Write transform:

PCollection<Entity> entities = ...;
entities.apply(Write.to(DatastoreIO.sink()
    .withDataset(dataset)
    .withHost(host)));
Entities in the PCollection to be written must have complete Keys. A complete Key specifies the name or id of the Entity; an incomplete Key does not. A namespace other than the project default may be written to by specifying it in the Entity's Keys:

Key.Builder keyBuilder = DatastoreHelper.makeKey(...);
keyBuilder.getPartitionIdBuilder().setNamespace(namespace);
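Putting this together, a minimal sketch of an Entity with a complete Key in a non-default namespace. The kind "Person" and name "alice" are illustrative, and DatastoreHelper is assumed to be com.google.api.services.datastore.client.DatastoreHelper:

// Illustrative sketch: a complete Key specifies the kind and the entity's name (or id).
Key.Builder keyBuilder = DatastoreHelper.makeKey("Person", "alice");
keyBuilder.getPartitionIdBuilder().setNamespace(namespace);  // optional non-default namespace
Entity entity = Entity.newBuilder()
    .setKey(keyBuilder)
    .build();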
Entities will be committed as upsert (update or insert) mutations. Please read Entities, Properties, and Keys for more information about Entity keys.

Permissions

Permission requirements depend on the PipelineRunner that is used to execute the Dataflow job. Please refer to the documentation of the corresponding PipelineRunners for more details.

Please see Cloud Datastore Sign Up for security and permission related information specific to Datastore.
- See Also:
  - PipelineRunner
Nested Class Summary

Nested Classes:
- static class DatastoreIO.DatastoreReader
  Deprecated. A Source.Reader over the records from a query of the datastore.
- static class DatastoreIO.Sink
  Deprecated.
- static class DatastoreIO.Source
  Deprecated. A Source that reads the result rows of a Datastore query as Entity objects.
-
Field Summary

Fields:
- static int DATASTORE_BATCH_UPDATE_LIMIT
  Deprecated. Datastore has a limit of 500 mutations per batch operation, so we flush changes to Datastore every 500 entities.
- static String DEFAULT_HOST
  Deprecated.
-
Constructor Summary
Constructors Constructor and Description DatastoreIO()
Deprecated.
-
Method Summary

Methods:
- static DatastoreIO.Source read()
  Deprecated. The name and return type do not match. Use source().
- static Read.Bounded<DatastoreV1.Entity> readFrom(String datasetId, DatastoreV1.Query query)
  Deprecated. Returns a PTransform that reads Datastore entities from the query against the given dataset.
- static Read.Bounded<DatastoreV1.Entity> readFrom(String host, String datasetId, DatastoreV1.Query query)
  Deprecated.
- static DatastoreIO.Sink sink()
  Deprecated. Returns a new DatastoreIO.Sink builder using the default host.
- static DatastoreIO.Source source()
  Deprecated. Returns an empty DatastoreIO.Source builder with the default host.
- static Write.Bound<DatastoreV1.Entity> writeTo(String datasetId)
  Deprecated.
-
Field Detail
-
DEFAULT_HOST
public static final String DEFAULT_HOST
Deprecated.
See Also:
- Constant Field Values
-
DATASTORE_BATCH_UPDATE_LIMIT
public static final int DATASTORE_BATCH_UPDATE_LIMIT
Deprecated. Datastore has a limit of 500 mutations per batch operation, so we flush changes to Datastore every 500 entities.
See Also:
- Constant Field Values
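For illustration only, a hypothetical sketch of the flush pattern this limit implies. This is not the Sink's actual implementation; commitBatch is a stand-in for issuing a single Datastore commit RPC:

// Hypothetical illustration of batched upserts; commitBatch is a stand-in helper.
List<Entity> buffer = new ArrayList<>();
for (Entity entity : entities) {
  buffer.add(entity);
  if (buffer.size() >= DatastoreIO.DATASTORE_BATCH_UPDATE_LIMIT) {
    commitBatch(buffer);  // flush once the 500-mutation limit is reached
    buffer.clear();
  }
}
if (!buffer.isEmpty()) {
  commitBatch(buffer);  // flush the final partial batch
}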
-
-
Method Detail
-
read
@Deprecated public static DatastoreIO.Source read()
Deprecated. The name and return type do not match. Use source().

Returns an empty DatastoreIO.Source builder with the default host. Configure the dataset, query, and namespace using DatastoreIO.Source.withDataset(java.lang.String), DatastoreIO.Source.withQuery(com.google.api.services.datastore.DatastoreV1.Query), and DatastoreIO.Source.withNamespace(java.lang.String).
-
source
public static DatastoreIO.Source source()
Deprecated.

Returns an empty DatastoreIO.Source builder with the default host. Configure the dataset, query, and namespace using DatastoreIO.Source.withDataset(java.lang.String), DatastoreIO.Source.withQuery(com.google.api.services.datastore.DatastoreV1.Query), and DatastoreIO.Source.withNamespace(java.lang.String).

The resulting DatastoreIO.Source object can be passed to Read to create a PTransform that will read from Datastore.
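A minimal usage sketch, mirroring the class-level example (datasetId and query are assumed to be in scope):

PCollection<Entity> entities = p.apply(
    Read.from(DatastoreIO.source()
        .withDataset(datasetId)
        .withQuery(query)));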
-
readFrom
public static Read.Bounded<DatastoreV1.Entity> readFrom(String datasetId, DatastoreV1.Query query)
Deprecated.

Returns a PTransform that reads Datastore entities from the query against the given dataset.
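A minimal usage sketch, equivalent to configuring source() with the same dataset and query (datasetId and query are assumed to be in scope):

PCollection<Entity> entities = p.apply(DatastoreIO.readFrom(datasetId, query));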
-
readFrom
@Deprecated public static Read.Bounded<DatastoreV1.Entity> readFrom(String host, String datasetId, DatastoreV1.Query query)
Deprecated. Prefer source() with DatastoreIO.Source.withHost(java.lang.String), DatastoreIO.Source.withDataset(java.lang.String), and DatastoreIO.Source.withQuery(com.google.api.services.datastore.DatastoreV1.Query).

Returns a PTransform that reads Datastore entities from the query against the given dataset and host.
-
sink
public static DatastoreIO.Sink sink()
Deprecated.

Returns a new DatastoreIO.Sink builder using the default host. You need to further configure it using DatastoreIO.Sink.withDataset(java.lang.String), and optionally DatastoreIO.Sink.withHost(java.lang.String), before using it in a Write transform.

For example:
p.apply(Write.to(DatastoreIO.sink().withDataset(dataset)));
-
writeTo
public static Write.Bound<DatastoreV1.Entity> writeTo(String datasetId)
Deprecated.
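A minimal usage sketch, mirroring the class-level write example (dataset is assumed to be in scope):

PCollection<Entity> entities = ...;
entities.apply(DatastoreIO.writeTo(dataset));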