Class CloudBigtableIO
java.lang.Object
    com.google.cloud.bigtable.beam.CloudBigtableIO
@Experimental public class CloudBigtableIO extends Object
Utilities to create PTransforms for reading and writing Google Cloud Bigtable entities in a Beam pipeline.

Google Cloud Bigtable offers you a fast, fully managed, massively scalable NoSQL database service that's ideal for web, mobile, and Internet of Things applications requiring terabytes to petabytes of data. Unlike comparable market offerings, Cloud Bigtable doesn't require you to sacrifice speed, scale, or cost efficiency when your applications grow. Cloud Bigtable has been battle-tested at Google for more than 10 years; it is the database driving major applications such as Google Analytics and Gmail.
To use CloudBigtableIO, users must use gcloud to get a credential for Cloud Bigtable:

    $ gcloud auth login
To read a PCollection from a table, with an optional Scan, use read(CloudBigtableScanConfiguration):

    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);
    // Each element of the resulting PCollection is one row read from the table.
    PCollection<Result> results = p.apply(
        Read.from(CloudBigtableIO.read(
            new CloudBigtableScanConfiguration.Builder()
                .withProjectId("project-id")
                .withInstanceId("instance-id")
                .withTableId("table-id")
                .build())));
To write a PCollection to a table, use writeToTable(CloudBigtableTableConfiguration):

    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);
    PCollection<Mutation> mutationCollection = ...;
    // Writing needs only a table configuration; no Scan is involved.
    mutationCollection.apply(
        CloudBigtableIO.writeToTable(
            new CloudBigtableTableConfiguration.Builder()
                .withProjectId("project-id")
                .withInstanceId("instance-id")
                .withTableId("table-id")
                .build()));

Nested Class Summary
Modifier and Type, Class and Description

static class CloudBigtableIO.CloudBigtableMultiTableWriteFn
    A DoFn that can write either a bounded or unbounded PCollection of KV of (String tableName, List of Mutations) to the specified table.

static class CloudBigtableIO.CloudBigtableSingleTableBufferedWriteFn
    A DoFn that can write either a bounded or unbounded PCollection of Mutations to a table specified via a CloudBigtableTableConfiguration using the BufferedMutator.

static class CloudBigtableIO.Source

protected static class CloudBigtableIO.SourceWithKeys
    A BoundedSource for a Cloud Bigtable Table with a start/stop key range, along with a potential filter via a Scan.

Constructor Summary
Constructor and Description

CloudBigtableIO()

Method Summary
Modifier and Type, Method and Description

static BoundedSource<Result> read(CloudBigtableScanConfiguration config)

static PTransform<PCollection<KV<String,Iterable<Mutation>>>,PDone> writeToMultipleTables(CloudBigtableConfiguration config)
    Creates a PTransform that can write either a bounded or unbounded PCollection of KV of (String tableName, List of Mutations) to the specified table.

static PTransform<PCollection<Mutation>,PDone> writeToTable(CloudBigtableTableConfiguration config)
    Creates a PTransform that can write either a bounded or unbounded PCollection of Mutations to a table specified via a CloudBigtableTableConfiguration.

Method Detail

writeToTable
public static PTransform<PCollection<Mutation>,PDone> writeToTable(CloudBigtableTableConfiguration config)

Creates a PTransform that can write either a bounded or unbounded PCollection of Mutations to a table specified via a CloudBigtableTableConfiguration.

NOTE: This PTransform will write Puts and Deletes, not Appends and Increments. This limitation exists because if the batch fails partway through, Appends/Increments might be re-run, causing the Mutation to be executed twice, which is never the user's intent. Re-running a Delete has no additional effect. Re-running a Put isn't normally a problem, but might cause problems in some cases when the number of versions supported by the column family is greater than one. In a case where multiple versions could be a problem, it's best to add a timestamp to the Put: with an explicit timestamp, a replayed Put overwrites the same cell version instead of creating a new one.
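
The retry behavior described in the NOTE can be illustrated with a toy in-memory model of a single multi-version cell. This is a minimal sketch under stated assumptions, not the Bigtable or HBase client: the class RetrySemanticsSketch, its methods, and the simulated server clock are all hypothetical names introduced only for illustration.

```java
import java.util.TreeMap;

/** Toy model of one cell that keeps several timestamped versions (hypothetical, not a real client). */
final class RetrySemanticsSketch {
    // timestamp -> value; the TreeMap keeps versions ordered by timestamp.
    private final TreeMap<Long, Long> versions = new TreeMap<>();
    // Simulated server clock, used when the caller supplies no timestamp.
    private long serverClock = 1000L;

    /** Put with a caller-supplied timestamp: a replay overwrites the same version. */
    void putWithTimestamp(long timestamp, long value) {
        versions.put(timestamp, value);
    }

    /** Put without a timestamp: the simulated server assigns a fresh timestamp on every call. */
    void putWithServerTimestamp(long value) {
        versions.put(serverClock++, value);
    }

    /** Increment: reads the latest value and stores latest + delta as a new version. */
    void increment(long delta) {
        long latest = versions.isEmpty() ? 0L : versions.lastEntry().getValue();
        versions.put(serverClock++, latest + delta);
    }

    int versionCount() {
        return versions.size();
    }

    long latestValue() {
        return versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        // Replaying a Put that carries an explicit timestamp leaves a single version.
        RetrySemanticsSketch explicit = new RetrySemanticsSketch();
        explicit.putWithTimestamp(42L, 7L);
        explicit.putWithTimestamp(42L, 7L); // simulated retry
        System.out.println("explicit timestamp versions: " + explicit.versionCount());

        // Replaying a Put without a timestamp creates a second version of the same value.
        RetrySemanticsSketch implicit = new RetrySemanticsSketch();
        implicit.putWithServerTimestamp(7L);
        implicit.putWithServerTimestamp(7L); // simulated retry
        System.out.println("server timestamp versions: " + implicit.versionCount());

        // Replaying an Increment changes the stored value itself.
        RetrySemanticsSketch counter = new RetrySemanticsSketch();
        counter.increment(5L);
        counter.increment(5L); // simulated retry
        System.out.println("counter after retried increment: " + counter.latestValue());
    }
}
```

In this model only the retried increment corrupts the stored value, and only the timestamp-less Put accumulates extra versions, which mirrors why the transform accepts Puts and Deletes but not Appends and Increments.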

writeToMultipleTables
public static PTransform<PCollection<KV<String,Iterable<Mutation>>>,PDone> writeToMultipleTables(CloudBigtableConfiguration config)

Creates a PTransform that can write either a bounded or unbounded PCollection of KV of (String tableName, Iterable of Mutations), writing each element's Mutations to the table named by its key.

NOTE: This PTransform will write Puts and Deletes, not Appends and Increments. This limitation exists because if the batch fails partway through, Appends/Increments might be re-run, causing the Mutation to be executed twice, which is never the user's intent. Re-running a Delete has no additional effect. Re-running a Put isn't normally a problem, but might cause problems in some cases when the number of versions supported by the column family is greater than one. In a case where multiple versions could be a problem, it's best to add a timestamp to the Put.
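
The input shape expected by writeToMultipleTables, one table name paired with the mutations destined for that table, can be sketched in plain Java. This is an illustration only: MultiTableGroupingSketch, PendingMutation, and groupByTable are hypothetical names, mutations are represented as plain strings, and a real pipeline would produce KV<String, Iterable<Mutation>> elements with Beam transforms rather than a Map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of grouping mutations by destination table (hypothetical helper). */
final class MultiTableGroupingSketch {

    /** Stand-in for one mutation plus the name of the table it targets. */
    static final class PendingMutation {
        final String tableName;
        final String description;

        PendingMutation(String tableName, String description) {
            this.tableName = tableName;
            this.description = description;
        }
    }

    /**
     * Groups mutations by table name, preserving first-seen table order.
     * Each map entry mirrors one (tableName, mutations) element.
     */
    static Map<String, List<String>> groupByTable(List<PendingMutation> pending) {
        Map<String, List<String>> byTable = new LinkedHashMap<>();
        for (PendingMutation m : pending) {
            byTable.computeIfAbsent(m.tableName, k -> new ArrayList<>()).add(m.description);
        }
        return byTable;
    }

    public static void main(String[] args) {
        List<PendingMutation> pending = Arrays.asList(
            new PendingMutation("users", "put row1"),
            new PendingMutation("events", "delete row9"),
            new PendingMutation("users", "put row2"));

        // Print one line per destination table with its grouped mutations.
        for (Map.Entry<String, List<String>> e : groupByTable(pending).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```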

read
public static BoundedSource<Result> read(CloudBigtableScanConfiguration config)