Pub/Sub Lite is a zonal, real-time messaging service that decouples services that produce events from services that process events. You can manually configure Pub/Sub Lite system throughput and storage capacity.
The Pub/Sub Lite Spark Connector supports Pub/Sub Lite as an input source to Apache Spark Structured Streaming in the default micro-batch processing and experimental continuous processing modes.
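As a rough illustration of the two processing modes, the PySpark sketch below reads from a Pub/Sub Lite subscription and prints rows to the console. It rests on several assumptions: the source format name `pubsublite` and the option key `pubsublite.subscription` are taken from the connector's own documentation and should be checked against the version you use, and the helper names `subscription_path`, `trigger_kwargs`, and `start_console_query` are hypothetical. The connector JAR must already be on the Spark classpath.

```python
def subscription_path(project_number: str, location: str, subscription_id: str) -> str:
    """Build the full resource path of a Pub/Sub Lite subscription."""
    return (
        f"projects/{project_number}/locations/{location}"
        f"/subscriptions/{subscription_id}"
    )


def trigger_kwargs(mode: str, interval: str = "1 second") -> dict:
    """Map a processing mode to keyword arguments for DataStreamWriter.trigger()."""
    if mode == "micro-batch":
        return {"processingTime": interval}
    if mode == "continuous":
        # Continuous processing is experimental in Spark.
        return {"continuous": interval}
    raise ValueError(f"unknown mode: {mode!r}")


def start_console_query(spark, path: str, mode: str = "micro-batch"):
    """Stream rows from Pub/Sub Lite to the console.

    `spark` is an active SparkSession; `path` comes from subscription_path().
    """
    df = (
        spark.readStream
        .format("pubsublite")                     # assumed source name
        .option("pubsublite.subscription", path)  # assumed option key
        .load()
    )
    return (
        df.writeStream
        .format("console")
        .outputMode("append")
        .trigger(**trigger_kwargs(mode))
        .start()
    )
```

Calling `start_console_query(spark, subscription_path("123456", "us-central1-a", "my-sub"), mode="continuous")` would request the continuous trigger. The project number, zone, and subscription ID shown here are placeholders.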
Using Pub/Sub Lite with Dataproc
The samples directory in the java-pubsublite-spark repository on GitHub contains a Spark example in Java that uses Pub/Sub Lite with Dataproc. To run the example, follow the directions in the Spark example.
- To get started, clone the repository and change to its samples directory:
    git clone https://github.com/googleapis/java-pubsublite-spark
    cd java-pubsublite-spark/samples
Python / Scala
The connector is available from the Maven Central repository. You can have Spark download and provide it by passing the --packages option to the spark-submit command, or by setting the spark.jars.packages configuration property.
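The two ways of supplying the connector can be sketched as a small Python helper that assembles the spark-submit command line. The Maven coordinates `com.google.cloud:pubsublite-spark-sql-streaming` are an assumption to confirm on Maven Central, the version `1.0.0` and the script name `my_job.py` are placeholders, and the function names are hypothetical.

```python
import shlex

# Assumed Maven coordinates; confirm group and artifact on Maven Central.
CONNECTOR_GROUP = "com.google.cloud"
CONNECTOR_ARTIFACT = "pubsublite-spark-sql-streaming"


def connector_package(version: str) -> str:
    """Return the group:artifact:version string that Spark expects."""
    return f"{CONNECTOR_GROUP}:{CONNECTOR_ARTIFACT}:{version}"


def spark_submit_command(script: str, version: str, use_conf: bool = False) -> list:
    """Build a spark-submit argument list that pulls in the connector.

    use_conf=False uses the --packages option;
    use_conf=True sets the spark.jars.packages property instead.
    """
    package = connector_package(version)
    if use_conf:
        flags = ["--conf", f"spark.jars.packages={package}"]
    else:
        flags = ["--packages", package]
    return ["spark-submit", *flags, script]


if __name__ == "__main__":
    # Placeholder script name and version, for illustration only.
    print(shlex.join(spark_submit_command("my_job.py", "1.0.0")))
    print(shlex.join(spark_submit_command("my_job.py", "1.0.0", use_conf=True)))
```

Both forms ask Spark to resolve the same package. The property form is convenient when the setting lives in a cluster or job configuration rather than on the command line.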
For more information
- See the Pub/Sub Lite documentation.
- Select a version of the Pub/Sub Lite Spark Connector on Maven Central, then download its JAR from the linked page.