Built with BigQuery: How Connected-Stories leverages Google Data Cloud and AI/ML for creating personalized Ad Experiences
Dr. Ali Arsanjani
Director AI/ML Partner Engineering, Google
Data Product Director, Connected-Stories
Editor's note: This post is part of a series highlighting our partners and their solutions that are Built with BigQuery.
When producing engaging video content such as ads, many marketers overlook how data can improve their creative work and meet consumers' need for personalized messages. The demand for creative technology that personalizes efficiently is real: marketers need personalized video ads to reach their audience with the right message at the right time. Data, insights, and technology are the main ingredients for delivering this value while meeting security and privacy requirements. The Connected-Stories team partnered with Google Cloud to build a platform for ad personalization. Google Data Cloud and BigQuery are central to the core features of the Connected-Stories NEXT platform: assimilating data, leveraging ML models, creating personalized ads, and capitalizing on real-time intelligence.
Connected-Stories NEXT is an end-to-end creative management platform to develop, serve, and optimize interactive video and display ads that scale across any channel. The platform ingests first-party data to create custom ML models, measures numerous third-party data points to help brands develop unique customer journeys, and creates videos driven by their data signals. An intelligent feedback loop passes real-time data back, enabling brands to build data-driven, actionable video ads that take their campaigns to the next level.
The core use case of the NEXT platform revolves around collecting users' interaction data and optimizing for precision and speed to create an actionable ad experience personalized for each user. The platform processes complex data points to create interactive data visualizations that allow for accurate analysis. It uses Vertex AI for managed tools, workflows, and infrastructure to build, deploy, and scale ML models, which has improved the accuracy of identifying segments for further analysis.
The platform ingests 200M data events with peaks and valleys of activity. These events are processed to generate dashboards that enable users to visualize metrics based on filters in real-time. These dashboards have high performance requirements in terms of a responsive user interface under constantly changing data dimensions.
Google Cloud's serverless stack, coupled with limitless data cloud infrastructure, has been core to the NEXT platform's data-driven innovation. The growing volume of data ingested, streamed, and processed was scaled uniformly across the compute, storage, and analytical layers of the solution. A lean development team at Connected-Stories was able to focus entirely on the solution, while the serverless stack scaled, reduced the attack surface in terms of security, and optimized the cost footprint through pay-as-you-go features.
BigQuery has been the backbone supporting vast amounts of data spread over multiple geos, with workloads running at petabyte scale. BigQuery's fully managed serverless architecture, real-time streaming, built-in machine learning, and rich business intelligence capabilities distinguish it from a conventional cloud data warehouse. It is the foundation needed to approach data and serve users in an unlimited number of ways. For an application with zero tolerance for failure, BigQuery's fully managed nature means it handles replication, recovery, data distribution optimization, and management.
The platform's requirements include low maintenance, constant ingestion and refreshing of data, and smart tuning of aggregated data. BigQuery's materialized views feature meets these requirements. Materialized views are precomputed views that periodically cache query results for better performance. They read only the delta changes from base tables to compute up-to-date aggregations. As a result, materialized views return results faster and consume fewer resources, which reduces the cost footprint.
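The delta-only refresh idea can be illustrated with a small, self-contained sketch. This is a toy model in plain Python, not BigQuery's actual refresh mechanism: the `IncrementalAggregate` class, the `campaign_id` field, and the sample rows are all hypothetical names chosen for illustration, and the base table is assumed to be append-only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class IncrementalAggregate:
    """Toy stand-in for a materialized view: event counts per campaign.

    Hypothetical illustration only; it assumes an append-only base table.
    """
    counts: dict = field(default_factory=lambda: defaultdict(int))
    rows_seen: int = 0  # offset of the first base-table row not yet aggregated

    def refresh(self, base_table: list) -> int:
        """Fold in only the rows appended since the last refresh.

        Returns the number of rows actually read.
        """
        delta = base_table[self.rows_seen:]
        for row in delta:
            self.counts[row["campaign_id"]] += 1
        self.rows_seen = len(base_table)
        return len(delta)


# Hypothetical base table of ad events.
base_table = [
    {"campaign_id": "c1", "event": "impression"},
    {"campaign_id": "c1", "event": "click"},
    {"campaign_id": "c2", "event": "impression"},
]

view = IncrementalAggregate()
first_read = view.refresh(base_table)   # first refresh reads every row

base_table.append({"campaign_id": "c2", "event": "click"})
second_read = view.refresh(base_table)  # later refresh reads only the new row

print(first_read, second_read, dict(view.counts))
```

The second refresh touches one row rather than four, which is the property that lets cached aggregations stay current without rescanning the whole base table.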
Some key considerations in using Google Cloud and focusing on the serverless stack include quick onboarding to development, prototyping in short sprints, and ease of preparing data in a rapidly changing environment. Typical considerations around low code / no code include data transformation, aggregation, and reduced deployment time. These considerations are fulfilled by serverless capabilities within Google Cloud such as Pub/Sub, Cloud Storage, Cloud Run, Cloud Composer, Dataflow, and BigQuery, as described in the Architecture diagram below. The use of each of these components and services is described below.
Input/Ingest: At a high-level, microservices hosted in Cloud Run collect and aggregate incoming Ads events.
Enrichment: The output of this stage is a Pub/Sub message enriched with more attributes based on a pre-configured campaign.
Store: A Cloud Dataflow streaming job creates text files in Cloud Storage buckets.
Trigger: Cloud Composer triggers Spark jobs that process and group the text files to produce the desired output: one record per impression, where an impression is a logical group of events.
Deploy: Cloud Build is then used to automate all deployments.
Thus far, all of these Google Cloud managed services work together to ingest, store, and trigger the orchestration, and all of them scale based on configuration, including autoscaling capabilities.
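The Enrichment and Trigger steps above can be sketched in a few lines of plain Python. This is only an illustration of the data shaping, not the platform's Spark job: the field names (`impression_id`, `campaign_id`, `type`, `brand`) and both helper functions are hypothetical.

```python
from collections import defaultdict


def enrich(event: dict, campaigns: dict) -> dict:
    """Attach pre-configured campaign attributes to a raw event."""
    attrs = campaigns.get(event["campaign_id"], {})
    return {**event, **attrs}


def group_by_impression(events: list) -> list:
    """Collapse events into one record per impression (a logical group of events)."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["impression_id"]].append(event)

    records = []
    for impression_id, evs in grouped.items():
        records.append({
            "impression_id": impression_id,
            "campaign_id": evs[0]["campaign_id"],
            "brand": evs[0].get("brand"),
            "event_types": sorted({e["type"] for e in evs}),
            "event_count": len(evs),
        })
    return records


# Hypothetical campaign configuration and raw events.
campaigns = {"c1": {"brand": "acme"}}
raw_events = [
    {"impression_id": "i1", "campaign_id": "c1", "type": "view"},
    {"impression_id": "i1", "campaign_id": "c1", "type": "click"},
    {"impression_id": "i2", "campaign_id": "c1", "type": "view"},
]

records = group_by_impression([enrich(e, campaigns) for e in raw_events])
for record in records:
    print(record)
```

In a distributed job the same grouping would typically be expressed as a group-by on the impression key; the in-memory dictionary here just makes the record shape easy to see.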
Visualization: A visualization tool reads data from BigQuery to compute pre-aggregations required for each dashboard.
Data Model Evolution considerations: The solution served its purpose of creating pre-aggregations, but whenever the data model evolved, for example by adding a column or creating a new table, the pre-aggregations had to be recreated and the data queried again. Creating aggregate tables as an extra output of the current ETLs seemed like a viable alternative, but it would increase the cost and complexity of the jobs. It would also require reprocessing or updating the aggregate tables whenever data is updated.
Precomputed views of data that is periodically cached are critical to reach the audience with the right message at the right time.
Performance: To increase the performance of the platform, we need regularly precomputed, cached views of the data.
Materialized Views: Consumers of these views needed faster response times, lower resource consumption, and output that reflects only the changes relative to a base table. BigQuery materialized views were used to meet this requirement. They have been used heavily to optimize the design, resulting in less maintenance and high-performance access to fresh data, with a relatively low technical investment in creating and maintaining SQL code.
Dashboards: Application dashboards pointing to the Materialized views are highly performant and provide a view into fresh data.
Custom Reports with Vertex AI Notebooks: Vertex AI notebooks read data directly from BigQuery to produce custom reports for a subset of customers. Vertex AI has been hugely beneficial to data analysts, because an environment with pre-installed libraries is ready to use with little setup. Vertex AI Workbench notebooks are used to share these reports within the team, allowing members to work entirely in the cloud without ever needing to download data. It also increases the velocity of developing and testing ML models.
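To give a sense of the SQL investment mentioned under Materialized Views, here is a minimal sketch that renders a `CREATE MATERIALIZED VIEW` statement. The project, dataset, table, and column names are hypothetical and do not come from the NEXT platform; `enable_refresh` and `refresh_interval_minutes` are standard BigQuery materialized view options, while the `materialized_view_ddl` helper itself is an assumption made for illustration.

```python
def materialized_view_ddl(view: str, base_table: str, dims: list,
                          refresh_minutes: int = 30) -> str:
    """Render BigQuery DDL for a materialized view that counts events per dimension.

    All identifiers passed in are illustrative; the function only builds a
    string and does not contact BigQuery.
    """
    cols = ", ".join(dims)
    return (
        f"CREATE MATERIALIZED VIEW `{view}`\n"
        f"OPTIONS (enable_refresh = true, "
        f"refresh_interval_minutes = {refresh_minutes})\n"
        f"AS SELECT {cols}, COUNT(*) AS event_count\n"
        f"FROM `{base_table}`\n"
        f"GROUP BY {cols}"
    )


# Hypothetical names for the view and its base table.
ddl = materialized_view_ddl(
    view="my_project.ads.events_by_campaign_mv",
    base_table="my_project.ads.events",
    dims=["campaign_id", "event_type"],
    refresh_minutes=60,
)
print(ddl)
```

The resulting statement could be run in the BigQuery console or submitted through a client library; a dashboard would then query the view rather than the base table.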
The NEXT platform has yielded several benefits. Customers can create unique consumer journeys powered by AI/ML personalization triggers, using first-party data and business intelligence tools to capitalize on real-time creative intelligence: a dashboard that measures campaign performance so that cross-functional teams can analyze the impact of ad content experience at a granular level. All of this happens while ensuring controlled access to data and enriching data without moving it across clouds. The NEXT platform can keep up with increased demands for agility, scalability, and reliability through its underlying use of Google Cloud.
Partnering with Google through the Built with BigQuery program has surfaced differentiated value in creating interactive, personalized ads from real-time data. In addition, sharing this data across organizations as assets has allowed ML models to fuel higher levels of innovation. Connected-Stories plans to adopt more of the services offered in the AI/ML area to enhance core functionality and provide newer capabilities on the platform.
Click here to learn more about Connected-Stories NEXT Platform capabilities.
The Built with BigQuery Advantage for ISVs
Through Built with BigQuery, launched in April ‘22 as part of Google Data Cloud Summit, Google is helping tech companies like Connected-Stories co-innovate in building applications that leverage Google’s data cloud with simplified access to technology, helpful and dedicated engineering support, and joint go-to-market programs. Participating companies can:
Get started fast with a Google-funded, pre-configured sandbox.
Accelerate product design and architecture through access to designated technical experts from the ISV Center of Excellence who can share insights from key use cases, architectural patterns, and best practices encountered in the field.
Amplify success with joint marketing programs to drive awareness, generate demand, and increase adoption.
The Google Data Cloud spectrum of products and specifically BigQuery give ISVs the advantage of a powerful, highly scalable data warehouse that’s integrated with Google Cloud’s open, secure, sustainable platform. And with a huge and expanding partner ecosystem and support for multi-cloud, open source tools and APIs, Google provides technology companies the portability and extensibility they need to avoid data lock-in and exercise choice.
We thank the Google Cloud and Connected-Stories team members who co-authored the blog. Connected-Stories: Luna Catini, Marketing Director. Google: Sujit Khasnis, Cloud Partner Engineering.