Create a change stream-enabled table and capture changes
Learn how to set up a Bigtable table with a change stream enabled, run a change stream pipeline, make changes to your table, and then see the changes streamed.
Before you begin
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Dataflow, Cloud Bigtable, and Cloud Bigtable Admin APIs.
- In the Google Cloud console, activate Cloud Shell.
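If you prefer the command line to the console for enabling the APIs, the following is a minimal sketch that you could run in Cloud Shell. The function name enable_quickstart_apis is hypothetical, and PROJECT_ID is assumed to be a shell variable that holds your project ID.

```shell
# Sketch only: wraps `gcloud services enable` for the three APIs that this
# quickstart needs. PROJECT_ID is an assumed shell variable, not a literal.
enable_quickstart_apis() {
  gcloud services enable \
    dataflow.googleapis.com \
    bigtable.googleapis.com \
    bigtableadmin.googleapis.com \
    --project="$PROJECT_ID"
}
```

To use it, set PROJECT_ID and then run enable_quickstart_apis.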
Create a table with a change stream enabled
In the Google Cloud console, go to the Bigtable Instances page.
Click the ID of the instance that you are using for this quickstart.
If you don't have an instance available, create an instance with the default configurations in a region near you.
In the left navigation pane, click Tables.
Click Create a table.
Name the table change-streams-quickstart.
Add a column family named cf.
Select Enable change stream.
Click Create.
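As an alternative to the console steps above, the table could probably also be created from Cloud Shell with the gcloud CLI. This is a sketch under assumptions: the function name create_quickstart_table is hypothetical, and the 1d retention period is an illustrative value, not necessarily what the console applies by default.

```shell
# Sketch only: creates the quickstart table with one column family and a
# change stream. The retention period of 1d is an assumed example value.
create_quickstart_table() {
  # $1: the ID of the Bigtable instance that will contain the table.
  gcloud bigtable instances tables create change-streams-quickstart \
    --instance="$1" \
    --column-families=cf \
    --change-stream-retention-period=1d
}
```

For example, run create_quickstart_table BIGTABLE_INSTANCE_ID after replacing the placeholder with your instance ID.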
Initialize a data pipeline to capture the change stream
In Cloud Shell, run the following commands to download the sample code and start the pipeline.
git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git
cd java-docs-samples/bigtable/beam/change-streams
mvn compile exec:java -Dexec.mainClass=ChangeStreamsHelloWorld \
"-Dexec.args=--project=PROJECT_ID --bigtableProjectId=PROJECT_ID \
--bigtableInstanceId=BIGTABLE_INSTANCE_ID --bigtableTableId=change-streams-quickstart \
--runner=dataflow --region=BIGTABLE_REGION --experiments=use_runner_v2"
Replace the following:
- PROJECT_ID: the ID of the project that you are using
- BIGTABLE_INSTANCE_ID: the ID of the instance that contains the table
- BIGTABLE_REGION: the region that your Bigtable instance is in, such as us-east5
In the Google Cloud console, go to the Dataflow page.
Click the job with a name that begins with changestreamquickstart.
At the bottom of the screen, click Show to open the logs panel.
Click Worker logs to monitor the output of the change stream.
In Cloud Shell, write some data to the table so that the change stream has changes to capture.
cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
import change-streams-quickstart quickstart-data.csv column-family=cf
In the Google Cloud console, make sure that Severity is set to at least Info.
The worker logs contain output similar to the following:
Change captured: user123#2023,USER,SetCell,cf,col1,abc
Change captured: user546#2023,USER,SetCell,cf,col1,def
Change captured: user789#2023,USER,SetCell,cf,col1,ghi
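To see how imported rows relate to the captured changes, the following sketch writes a small CSV file in the layout that cbt import accepts when column-family is given (one header row of column qualifiers, with the first column reserved for row keys) and prints the log line expected for each cell. The file name my-quickstart-data.csv and the function expected_changes are hypothetical, and the file contents are inferred from the log output above rather than copied from the repository's quickstart-data.csv. The field order (row key, USER, SetCell, column family, qualifier, value) is likewise read off the sample output.

```shell
# Hypothetical file; the repository's quickstart-data.csv may differ.
cat > my-quickstart-data.csv <<'EOF'
,col1
user123#2023,abc
user546#2023,def
user789#2023,ghi
EOF

# Print the log line expected for each non-empty cell, following the format
# of the worker log output shown in this quickstart.
expected_changes() {
  # $1: CSV file with one header row of qualifiers; $2: column family name.
  awk -F',' -v family="$2" '
    NR == 1 { for (i = 2; i <= NF; i++) qualifier[i] = $i; next }
    {
      for (i = 2; i <= NF; i++)
        if ($i != "")
          printf "Change captured: %s,USER,SetCell,%s,%s,%s\n", $1, family, qualifier[i], $i
    }
  ' "$1"
}

expected_changes my-quickstart-data.csv cf
```

With the file above, the function prints the same three lines as the worker log output. Values that contain commas are not handled by this sketch.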
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
Disable the change stream on the table:
gcloud bigtable instances tables update change-streams-quickstart --instance=BIGTABLE_INSTANCE_ID \
--clear-change-stream-retention-period
Delete the table change-streams-quickstart:
cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID deletetable change-streams-quickstart
Stop the change stream pipeline:
In the Google Cloud console, go to the Dataflow Jobs page.
Select your streaming job from the job list.
In the navigation, click Stop.
In the Stop job dialog, select Cancel, and then click Stop job.
Optional: Delete the instance if you created a new one for this quickstart:
cbt deleteinstance BIGTABLE_INSTANCE_ID
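The console steps for stopping the change stream pipeline can likely also be done from Cloud Shell. The following is a sketch, not the documented procedure for this quickstart: the function names are hypothetical, BIGTABLE_REGION is assumed to be a shell variable that holds the region passed to the pipeline, and the job ID is assumed to be copied from the list output.

```shell
# Sketch only: list running Dataflow jobs in the pipeline's region.
list_active_jobs() {
  gcloud dataflow jobs list --region="$BIGTABLE_REGION" --status=active
}

# Sketch only: cancel one Dataflow job by ID.
cancel_job() {
  # $1: the job ID, as shown in the output of list_active_jobs.
  gcloud dataflow jobs cancel "$1" --region="$BIGTABLE_REGION"
}
```

Run list_active_jobs, find the job whose name begins with changestreamquickstart, and then pass its ID to cancel_job.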