Stream analytics solution

Ingest, process, and analyze event streams in real time on fully managed infrastructure

Integrated and open stream analytics

Stream analytics has emerged as a simpler, faster alternative to batch ETL for getting maximum value from user-interaction events and from application and machine logs. Ingesting, processing, and analyzing these data streams quickly and efficiently is critical in fraud detection, clickstream analysis, online recommendations, and many other applications. For such use cases, Google Cloud offers an integrated and open stream analytics solution that is easy to adopt, scale, and manage.

Respond to events as they happen

Ingest millions of streaming events per second from anywhere in the world with Cloud Pub/Sub, powered by Google's unique, high-speed private network. Process the streams with Cloud Dataflow for reliable, exactly-once, low-latency data transformation. Stream the transformed data into BigQuery, the cloud-native data warehousing service, for immediate analysis via SQL or popular visualization tools. Finally, bring predictive analytics to fraud detection, real-time personalization, and similar use cases by integrating TensorFlow-based Cloud Machine Learning models and APIs into your streaming data pipelines.
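
The flow described above (ingest, transform, load for analysis) can be sketched in plain Python. This is a minimal in-memory illustration, not the Cloud Pub/Sub, Cloud Dataflow, or BigQuery client APIs: the names Event, transform, and run_pipeline and the sample events are hypothetical. It assumes the source may redeliver a message, so the pipeline drops duplicates by event ID to get an exactly-once effect on the output.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """Hypothetical raw event; not a real Pub/Sub message type."""
    event_id: str      # unique ID, assumed to be assigned at publish time
    user: str
    amount_cents: int


def transform(event: Event) -> dict:
    """Turn a raw event into a row ready to load into a warehouse table."""
    return {
        "event_id": event.event_id,
        "user": event.user.lower(),
        "amount_usd": event.amount_cents / 100,
    }


def run_pipeline(source: Iterable[Event]) -> Iterator[dict]:
    """Emit one row per distinct event, even if the source redelivers it."""
    seen = set()
    for event in source:
        if event.event_id in seen:
            continue  # duplicate delivery: skip it so each event counts once
        seen.add(event.event_id)
        yield transform(event)


# Hypothetical stream in which message "e1" is delivered twice.
stream = [
    Event("e1", "Alice", 1250),
    Event("e2", "Bob", 300),
    Event("e1", "Alice", 1250),
]
rows = list(run_pipeline(stream))
```

Here rows holds two entries, one per distinct event ID. The set of seen IDs is an unbounded in-memory structure chosen only to keep the sketch short; a production pipeline would need durable, bounded state for the same purpose.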

Accelerate development, with no compromises

Stream analytics on GCP simplifies ETL pipelines without compromising robustness, accuracy, or functionality. Cloud Dataflow supports fast pipeline development via expressive Java and Python APIs in the Apache Beam SDK, which provides a rich set of windowing and session analysis primitives as well as an ecosystem of source and sink connectors. Plus, Beam’s unique, unified development model lets you reuse more code across streaming and batch pipelines.
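
To make the windowing and session primitives mentioned above concrete, here is a minimal plain-Python sketch of the two ideas. It does not use the Apache Beam SDK: the function names fixed_windows and session_windows, the 60-second window size, the 30-second session gap, and the sample timestamps are all assumptions chosen for illustration.

```python
from collections import defaultdict


def fixed_windows(timestamps, size):
    """Group event timestamps (in seconds) into back-to-back windows of `size` seconds.

    Returns a dict mapping each window's start time to the events inside it.
    """
    windows = defaultdict(list)
    for ts in timestamps:
        start = (ts // size) * size  # start of the window that contains ts
        windows[start].append(ts)
    return dict(windows)


def session_windows(timestamps, gap):
    """Group timestamps into sessions; a silence of `gap` seconds or more starts a new one."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)   # close enough to the previous event
        else:
            sessions.append([ts])     # gap exceeded (or first event): new session
    return sessions


# Hypothetical click timestamps, in seconds since the start of the stream.
clicks = [5, 20, 58, 61, 130, 135, 200]
per_minute = fixed_windows(clicks, size=60)
sessions = session_windows(clicks, gap=30)
```

Fixed windows split the timeline at regular boundaries regardless of activity, while session windows follow bursts of activity per user or key. Because both functions accept any finite collection of timestamps, the same code can run over a stored batch or over events collected from a stream, which loosely mirrors the code reuse described above.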

Simplify operations and management

Once your streaming data processing pipelines are deployed, GCP’s serverless approach removes operational overhead: performance, scaling, availability, security, and compliance are handled automatically. Integration with Stackdriver, GCP’s unified logging and monitoring solution, lets you monitor and troubleshoot your pipelines while they are running. Rich visualization, logging, and advanced alerting help you identify and respond to potential issues.

Keep your favorite tools and systems

Stream analytics on GCP is open and interoperable by design. Cloud Pub/Sub’s open API and multiple clients enable multi-cloud and hybrid deployments. For Apache Kafka users, a Cloud Dataflow connector makes integration with GCP easy, and BigQuery works seamlessly with the ETL and BI tools you know and love via standard SQL. Data processing pipelines written with the Beam-based Cloud Dataflow SDK 2.x are portable across Cloud Dataflow, Apache Spark, and Apache Flink. Finally, Spark support is available via Cloud Dataproc for streaming and batch workloads.

SOLUTION COMPONENTS

Service: Use Case for Stream Analytics
Cloud Pub/Sub: For large-scale ingestion of streaming data originating anywhere in the world. (Open source alternative in this solution: Apache Kafka)
Cloud Dataflow: For transforming and enriching ingested data in streaming and batch modes with equal reliability and expressiveness. (Open source alternative in this solution: Spark on Cloud Dataproc)
BigQuery: Fully managed data warehouse service that supports 100,000 streaming row inserts per second and allows ad hoc analysis on real-time data with standard SQL.
Apache Beam: Unified development framework for programming streaming and batch pipelines. Shipped by Google as Cloud Dataflow SDK 2.x.
Cloud Machine Learning: Adds an extra layer of intelligence to your pipeline by running the event streams through custom (Cloud Machine Learning Engine) or pre-built (Cloud APIs) TensorFlow-based machine-learning models.
Cloud Bigtable: Low-latency, wide-column key-value store, ideal for high-volume time series and for applications that are sensitive to read latency.

Additional Resources

Exactly-once Processing

Learn the meaning of “exactly once” processing in Cloud Dataflow.

View Blog Post

Cloud Dataflow: Sample Pipelines

Understand how pipelines work through mobile gaming examples.

View Documentation

Codelab: NYC Taxi Tycoon

Step through a guided hands-on coding experience on how to process streaming data with Dataflow and Pub/Sub.

Explore Sample App

Financial Services Solution

Build a near real-time analytics system that can scale to thousands of simultaneous data streams.

Read Solution Paper

Architecture Diagram

Review the architecture for optimizing large-scale analytics ingestion on Google Cloud Platform.

Read Article

Streaming 101

Read Tyler Akidau’s seminal paper on the world beyond batch.

Read Paper
