Copy messages to and from Pub/Sub Lite
This page shows how to use the Pub/Sub Copy Pipeline to copy messages between Pub/Sub Lite and other messaging systems.
The Pub/Sub Copy Pipeline is a Dataflow Flex Template that copies all data from a Pub/Sub, Pub/Sub Lite, or Apache Kafka topic to another Pub/Sub, Pub/Sub Lite, or Apache Kafka topic, or to a BigQuery table.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
- Enable the Pub/Sub Lite, Pub/Sub, Dataflow, and Cloud Storage APIs. You can also enable them with the gcloud CLI, as shown after this list.
- Install and initialize the Google Cloud CLI.
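To enable the required APIs from the command line, you can use the following sketch, which assumes the gcloud CLI is already authenticated and set to your project:

gcloud services enable \
  pubsub.googleapis.com \
  pubsublite.googleapis.com \
  dataflow.googleapis.com \
  storage.googleapis.com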
Copying data from Pub/Sub to Pub/Sub Lite
Create a Lite topic
Create a Lite topic using the following steps:
In the Google Cloud console, go to the Lite Topics page.
Click Create Lite topic.
Select a region and a zone.
In the Name section, enter your-lite-topic as the Lite topic ID. The Lite topic name includes the Lite topic ID, the zone, and the project number.
Click Create.
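Alternatively, you can create the Lite topic with the gcloud CLI. The following sketch assumes the zone us-central1-a and minimal capacity settings (one partition with 30 GiB of storage); adjust these values for your workload:

gcloud pubsub lite-topics create your-lite-topic \
  --location=us-central1-a \
  --partitions=1 \
  --per-partition-bytes=30GiB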
Create a Lite subscription
Create a Lite subscription using the following steps:
In the Google Cloud console, go to the Lite Subscriptions page.
Click Create Lite subscription.
In the Lite subscription ID field, enter your-lite-subscription.
Select a Lite topic to receive messages from.
In the Delivery requirement section, select Deliver messages after stored.
Click Create.
The Lite subscription is in the same zone as the Lite topic.
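The Lite subscription can likewise be created from the command line; a sketch, again assuming the zone us-central1-a:

gcloud pubsub lite-subscriptions create your-lite-subscription \
  --location=us-central1-a \
  --topic=your-lite-topic \
  --delivery-requirement=deliver-after-stored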
Run the pipeline
Run the following gcloud command to copy taxi ride data from a public Pub/Sub topic to your new Lite topic. Replace REGION with a Dataflow region (for example, us-central1), PROJECT_ID with your project ID or project number, and ZONE with the zone of your Lite topic.
gcloud dataflow flex-template run "copy-taxirides-to-pubsub-lite-`date +%Y%m%d-%H%M%S`" \ --template-file-gcs-location "gs://pubsub-streaming-sql-copier/template/copier.json" \ --region "REGION" \ --parameters sourceType=pubsub \ --parameters sourceLocation="projects/pubsub-public-data/topics/taxirides-realtime" \ --parameters sinkType=pubsublite \ --parameters sinkLocation="projects/PROJECT_ID/locations/ZONE/topics/your-lite-topic"
Copying data from Pub/Sub Lite to Pub/Sub
In this section, you will run a pipeline to copy the data from your newly populated Pub/Sub Lite subscription to a new Pub/Sub topic.
Create a Pub/Sub topic
Go to the Pub/Sub topics page in the Google Cloud console.
Click Create a topic.
In the Topic ID field, provide a unique topic name, for example, your-pubsub-topic.
Click Save.
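Alternatively, create the topic with the gcloud CLI:

gcloud pubsub topics create your-pubsub-topic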
Run the pipeline
Run the following gcloud command to copy the data from your Pub/Sub Lite subscription (created in the section above) to the Pub/Sub topic you just created.
gcloud dataflow flex-template run "copy-pubsublite-to-pubsub-`date +%Y%m%d-%H%M%S`" \ --template-file-gcs-location "gs://pubsub-streaming-sql-copier/template/copier.json" \ --region "REGION" \ --parameters sourceType=pubsublite \ --parameters sourceLocation="projects/PROJECT_ID/locations/ZONE/subscriptions/your-lite-subscription" \ --parameters sinkType=pubsub \ --parameters sinkLocation="projects/PROJECT_ID/topics/your-pubsub-topic"
Other available data sources (optional)
The same pipeline can be used to copy to and from other data sources.
Copying data to/from Apache Kafka
Set the relevant sourceType or sinkType parameter to kafka, and set the sourceLocation or sinkLocation parameter to <host:port>/<topic name> (for example, 111.128.2.22:8000/my-topic), with the host and port of a broker to bootstrap with.
The broker IP must be accessible on the same network as the Dataflow pipeline. If the broker runs on Compute Engine, this is its internal IP address; if it runs elsewhere, you need to configure Cloud Interconnect to expose your broker IP addresses to virtual machines running within Google Cloud. If the workers must run on a VPC network other than the default network, you also need to set the network parameter when running the template.
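As an example, the following sketch copies from a Kafka broker into the Lite topic created earlier. The broker address 10.128.0.5:9092 and the topic my-kafka-topic are placeholders for your own values, and the --network flag is only needed when the workers must run on a non-default VPC network:

gcloud dataflow flex-template run "copy-kafka-to-pubsub-lite-`date +%Y%m%d-%H%M%S`" \
  --template-file-gcs-location "gs://pubsub-streaming-sql-copier/template/copier.json" \
  --region "REGION" \
  --network "NETWORK_NAME" \
  --parameters sourceType=kafka \
  --parameters sourceLocation="10.128.0.5:9092/my-kafka-topic" \
  --parameters sinkType=pubsublite \
  --parameters sinkLocation="projects/PROJECT_ID/locations/ZONE/topics/your-lite-topic"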
Copying data to BigQuery
The pipeline can also copy data to (but not from) a BigQuery table. To do this, set the sinkType parameter to bigquery, and set the sinkLocation parameter to your table identifier in bq command-line tool format (for example, PROJECT_ID:DATASET.TABLE). To be used as a sink, the BigQuery table must have the following schema:
CREATE TABLE <tablename> (
  message_key BYTES,
  event_timestamp TIMESTAMP,
  attributes ARRAY<STRUCT<key STRING, values ARRAY<BYTES>>>,
  payload BYTES
)
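For example, you could create a conforming table with the bq tool and then run the copier with a BigQuery sink. The dataset copier_dataset and table copied_messages below are illustrative placeholders, and the table is assumed to exist before the pipeline starts:

bq mk copier_dataset
bq query --use_legacy_sql=false '
CREATE TABLE copier_dataset.copied_messages (
  message_key BYTES,
  event_timestamp TIMESTAMP,
  attributes ARRAY<STRUCT<key STRING, values ARRAY<BYTES>>>,
  payload BYTES
)'
gcloud dataflow flex-template run "copy-pubsublite-to-bigquery-`date +%Y%m%d-%H%M%S`" \
  --template-file-gcs-location "gs://pubsub-streaming-sql-copier/template/copier.json" \
  --region "REGION" \
  --parameters sourceType=pubsublite \
  --parameters sourceLocation="projects/PROJECT_ID/locations/ZONE/subscriptions/your-lite-subscription" \
  --parameters sinkType=bigquery \
  --parameters sinkLocation="PROJECT_ID:copier_dataset.copied_messages"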
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
In the Google Cloud console, go to the Lite Topics page.
Click your-lite-topic.
In the Lite topic details page, click Delete.
In the field that appears, enter delete to confirm that you want to delete the Lite topic.
Click Delete.
Repeat these steps for your Pub/Sub topic.
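Note that the streaming Dataflow jobs started on this page keep running, and keep incurring charges, until you cancel them. You can cancel the jobs and delete the remaining resources from the command line; the sketch below assumes the zone us-central1-a and the resource names used on this page, with JOB_ID values taken from gcloud dataflow jobs list:

gcloud dataflow jobs cancel JOB_ID --region=REGION
gcloud pubsub lite-subscriptions delete your-lite-subscription --location=us-central1-a
gcloud pubsub lite-topics delete your-lite-topic --location=us-central1-a
gcloud pubsub topics delete your-pubsub-topic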
What's next
- Learn more about Lite topics and Lite subscriptions.
- Learn more about sending and receiving messages.
- Look through code samples for the client library.