This page provides links to public articles, videos, and podcasts related to Dataflow.
Announcements
To read about announcements and updates, see the following resources:
- Dataflow news: Google Cloud blog
- Dataflow updates: Dataflow release notes
- Apache Beam updates: Apache Beam SDK release notes
Dataflow ecosystem
- Dataflow, the backbone of data analytics
- Dataflow Under the Hood: the origin story: Part 1, Part 2, Part 3
- To learn more about the Apache Beam unified model, defining pipelines, and running pipelines on Dataflow or another distributed backend that Apache Beam supports, see the Apache Beam open source documentation.
Customer stories
Public articles
- Collection of Dataflow stories - Medium Publication
- Dataflow a Leader in 2021 Streaming Analytics - Forrester Wave
- Building a tool which provides real-time feedback on live audiences - ITV
- Monitoring your Dataflow pipelines - Medium Publication
- Execution Model for Highly Scalable, Low-Latency Data Processing - Medium Publication
- Accelerating Machine Learning Model Inference on Dataflow with GPUs - Nvidia
- Executing computation against large datasets - Pandora
- Optimized the Largest Dataflow Job Ever for Wrapped 2020 - Spotify
- Processing billions of events in real time using Dataflow - Twitter
- Dataflow In the Smart Home Data Pipeline - Nest
- Streaming JSON messages into BigQuery JSON-type column - Medium
Case studies
- AXA Switzerland: Using Google Cloud analytics solutions to boost internal processes and develop services
- Bayer Crop: Improving soil health and crop management with geospatial analytics on BigQuery and Dataflow
- Dow Jones: Built a knowledge graph of key events documented in over 30 years of news content
- HSBC: Embracing the cloud to lower risk exposure through rapid insight and analysis capabilities
- Nintendo: Using Dataflow and Pub/Sub to collect and analyze game usage logs in BigQuery
- Quantiphi: Built a serverless real-time credit card fraud detection solution
- SoFi Stadium: Building a fan-ready personal concierge app to tailor game day experiences for every user
- Spotify: Experiments with stream processing on Dataflow
- Subaru Corporation: Using Google Cloud AI and machine learning to accelerate development
- Telus: Accelerate modernization with data science
- Tokopedia: Created a Customer Data Platform on Google Cloud
- Tyson Foods: Reimagined their Data Platform by developing Ingestion as a Service
- Vodafone: Using Google Cloud to safely share mobile phone data
Technical guidance
Articles
- Building the data engineering driven organization
- Create templates from any Dataflow pipeline
- Dataflow Templates for Elastic Cloud
- Dataflow Pipelines: deploy and manage data pipelines at scale
- Dataflow Auto Sharding for BigQuery delivers 3x performance
- Export Google Cloud data into Elastic Stack with Dataflow templates
- Extend your Dataflow template with UDF
- Exactly once processing in Dataflow: Part 1, Part 2, Part 3
- Give your data processing a boost with Dataflow GPU
- Handling duplicate data in streaming pipeline using Dataflow & Pub/Sub
- Learn Apache Beam patterns with Clickstream processing of Google Tag Manager data
- Machine learning patterns with Apache Beam and the Dataflow Runner
- Streaming data into BigQuery using BigQuery Storage Write API
- Simplify and automate data processing with Dataflow Prime
- Three ways Dataflow delivers ROI to Customers
- Use real-time anomaly detection reference patterns to combat fraud
- Using TFX inference with Dataflow for large scale ML inference patterns
- Why you should be using Flex templates for your Dataflow deployments
- Writing Dataflow pipelines with scalability in mind
- Guide to common Dataflow use-case patterns: Part 1, Part 2
Podcasts
- Google Cloud Podcast Episode 81 - Dataflow with Frances Perry
- Software Engineering Daily Podcast - Dataflow with Eric Anderson
- Software Engineering Radio Podcast Episode 272: Apache Beam