Create a change stream-enabled table and capture changes

Learn how to set up a Bigtable table with a change stream enabled, run a change stream pipeline, make changes to your table, and then see the changes streamed.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Make sure that billing is enabled for your Google Cloud project.

  3. Enable the Dataflow, Cloud Bigtable, and Cloud Bigtable Admin APIs.

    Enable the APIs

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell
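
You can also enable the required APIs from Cloud Shell instead of the console. The following is a minimal sketch, assuming gcloud is already authenticated and set to your project (for example, with gcloud config set project PROJECT_ID):

    # Enable the Dataflow, Bigtable, and Bigtable Admin APIs.
    gcloud services enable dataflow.googleapis.com \
        bigtable.googleapis.com bigtableadmin.googleapis.com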

Create a table with a change stream enabled

  1. In the Google Cloud console, go to the Bigtable Instances page.

    Go to Instances

  2. Click the ID of the instance that you are using for this quickstart.

    If you don't have an instance available, create an instance with the default configuration in a region near you.

  3. In the left navigation pane, click Tables.

  4. Click Create a table.

  5. Name the table change-streams-quickstart.

  6. Add a column family named cf.

  7. Select Enable change stream.

  8. Click Create.
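
If you prefer the command line, you can create the same table from Cloud Shell. This is a hedged sketch: the retention flag mirrors the --clear-change-stream-retention-period flag used in the cleanup section, but flag names can vary by gcloud version, so check gcloud bigtable instances tables create --help before running it.

    # Create the table with a cf column family and a change stream
    # retention period of 7 days (assumed flag names; verify with --help).
    gcloud bigtable instances tables create change-streams-quickstart \
        --instance=BIGTABLE_INSTANCE_ID \
        --column-families=cf \
        --change-stream-retention-period=7d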

Initialize a data pipeline to capture the change stream

  1. In Cloud Shell, run the following commands to download the sample code and start the change stream pipeline:

    git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git
    cd java-docs-samples/bigtable/beam/change-streams
    mvn compile exec:java -Dexec.mainClass=ChangeStreamsHelloWorld \
    "-Dexec.args=--project=PROJECT_ID --bigtableProjectId=PROJECT_ID \
    --bigtableInstanceId=BIGTABLE_INSTANCE_ID --bigtableTableId=change-streams-quickstart \
    --runner=dataflow --region=BIGTABLE_REGION --experiments=use_runner_v2"
    

    Replace the following:

    • PROJECT_ID: the ID of the project that you are using
    • BIGTABLE_INSTANCE_ID: the ID of the instance that contains the change-streams-quickstart table
    • BIGTABLE_REGION: the region that your Bigtable instance is in, such as us-east5
  2. In the Google Cloud console, go to the Dataflow page.

    Go to Dataflow

  3. Click the job with a name that begins with changestreamquickstart.

  4. At the bottom of the screen, click Show to open the logs panel.

  5. Click Worker logs to monitor the output of the change stream.

  6. In Cloud Shell, write some data to Bigtable so that the change stream has changes to capture (equivalent individual cbt set commands are shown after this procedure):

    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
    import change-streams-quickstart quickstart-data.csv column-family=cf
    
  7. In the logs panel, make sure that Severity is set to at least Info.

  8. The worker logs contain output similar to the following:

    Change captured: user123#2023,USER,SetCell,cf,col1,abc
    Change captured: user546#2023,USER,SetCell,cf,col1,def
    Change captured: user789#2023,USER,SetCell,cf,col1,ghi
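
Each Change captured line shows the row key, the mutation type, the kind of change (SetCell), the column family, the column qualifier, and the written value. As an alternative to the CSV import in step 6, the following cbt set commands are a sketch of individual writes that produce change records like the ones above; they assume the same project, instance, and table as the rest of this quickstart.

    # Write the same cells one at a time; each SetCell appears in the
    # worker logs as its own "Change captured" entry.
    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
        set change-streams-quickstart user123#2023 cf:col1=abc
    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
        set change-streams-quickstart user546#2023 cf:col1=def
    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
        set change-streams-quickstart user789#2023 cf:col1=ghi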
    

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

  1. Disable the change stream on the table:

    gcloud bigtable instances tables update change-streams-quickstart --instance=BIGTABLE_INSTANCE_ID \
    --clear-change-stream-retention-period
    
  2. Delete the table change-streams-quickstart:

    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID deletetable change-streams-quickstart
    
  3. Stop the change stream pipeline (a gcloud alternative is shown after these steps):

    1. In the Google Cloud console, go to the Dataflow Jobs page.

      Go to Jobs

    2. Select your streaming job from the job list.

    3. In the navigation bar, click Stop.

    4. In the Stop job dialog, select Cancel, and then click Stop job.

  4. Optional: Delete the instance if you created a new one for this quickstart:

    cbt deleteinstance BIGTABLE_INSTANCE_ID
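
If you prefer to stop the pipeline from the command line instead of using the console steps above, the following gcloud sketch lists the active Dataflow jobs and cancels one; JOB_ID is a placeholder for the job ID shown in the list output.

    # Find the running change stream job, then cancel it.
    gcloud dataflow jobs list --region=BIGTABLE_REGION --status=active
    gcloud dataflow jobs cancel JOB_ID --region=BIGTABLE_REGION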
    

What's next