Cross-cloud materialized views in BigQuery Omni enable multi-cloud analytics at scale
Vidya Shanmugam
Group Product Manager
Justin Levandoski
Director, Engineering
As more organizations embrace multi-cloud data architectures, one of the most frequent requests we receive from customers is to make cross-cloud analytics simple and cost-effective in BigQuery. To help customers on their cross-cloud analytics journey, today we are thrilled to announce the public preview of BigQuery Omni cross-cloud materialized views (aka cross-cloud MVs). Cross-cloud MVs let customers create a summary materialized view on GCP from base data assets that live on another cloud. Cross-cloud MVs are automatically and incrementally maintained as base tables change, so only a minimal data transfer is needed to keep the materialized view on GCP in sync. The result is an industry-first, cost-effective, and scalable capability that empowers customers to perform frictionless, efficient, and economical cross-cloud analytics.
Why do organizations need cross-cloud materialized views?
The demand for cross-cloud MVs has been growing, driven by customers who want to do more with their data across cloud platforms while leaving large data assets intact in separate clouds. Today, analytics on data assets across clouds is cumbersome, as it usually involves copying or replicating large datasets across cloud providers. This process is not only burdensome to manage, but also incurs substantial data transfer costs. With cross-cloud MVs, customers are looking to streamline these processes and make their data operations both more efficient and more cost-effective.
Some of the key customer use cases where cross-cloud MVs can greatly simplify workflows while reducing costs include:
- Predictive analytics: Organizations are increasingly eager to harness Google Cloud’s AI/ML capabilities through Vertex AI integration. With cross-cloud MVs, customers can build ML models on GCP, including with Google’s large language foundation models such as PaLM 2 and Gemini, and discover new ways of interacting with their data. To make this possible, cross-cloud MVs ingest and combine data from across a customer’s multi-cloud environments.
- Cross-cloud or cross-region data summarization for compliance: There’s an emerging set of privacy use cases where raw data cannot leave the source region because it must adhere to stringent data sovereignty regulations. However, there is a viable workaround for cross-regional or cross-cloud data sharing and collaboration: aggregating, summarizing, and rolling up the data. This processed data, which complies with privacy standards, can be replicated across regions for sharing and consumption with other teams and partner organizations, and kept up to date incrementally through cross-cloud MVs.
- Marketing analytics: Organizations often find themselves combining data sources from various cloud platforms. A common scenario involves integrating CRM, user profile, or transaction data on one cloud with campaign management or ads-related data in Google Ads Data Hub. This integration is critical to ensure a privacy-safe method of segmenting customers, managing campaigns, and meeting other marketing analytics requirements. Some of the user profile and transaction data lives in another cloud, and often only a subset or summary of it needs to be brought over through cross-cloud MVs to join with ads or campaign data available on Google’s platform. Customers also want these integrations to meet their high standards of efficiency and to provide governance controls over their data.
- Near real-time business analytics: Real-time insights rely on powerful business intelligence (BI) dashboards and reporting tools. These analytical applications are crucial because they aggregate and integrate data from multiple sources. To reflect the most up-to-date business information, these dashboards require regular updates at intervals aligned with business needs, whether that’s hourly, daily, or weekly. Cross-cloud MVs enable consistent updates with the latest data regardless of where data assets live, ensuring that derived insights are relevant and timely. Combining these capabilities with GCP’s Looker platform and semantic models provides further value and up-to-date insights to end users.
Benefits
BigQuery Omni’s cross-cloud MV solution has a unique set of features and benefits:
- Ease of use: Cross-cloud MVs simplify the process of combining and analyzing data when the data assets live on different clouds. They reduce the complexity of running and managing analytics pipelines and avoid large-scale duplication of data, especially when dealing with frequently changing data.
- Significant cost reduction: Cross-cloud MVs significantly reduce the egress costs of bringing data across clouds by transferring only the incremental data when needed.
- Automatic refresh: Designed for convenience, cross-cloud MVs work out of the box, automatically refreshing and incrementally updating based on user specifications.
- Unified governance: BigQuery Omni provides secure and governed access to materialized views in both clouds. This feature is crucial for both local analytics and cross-cloud analytics needs.
- Single pane of glass: The solution provides seamless access through the familiar BigQuery user interface for defining, querying and managing cross-cloud MVs.
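To make the incremental-maintenance idea behind these benefits concrete, here is a minimal Python sketch. It is a toy model only and does not represent how BigQuery Omni is implemented: the `SummaryView` class, the `summarize` helper, and the sample rows are all hypothetical, and the sketch assumes an append-only base table with a simple per-key sum. The point it illustrates is that a summary can be kept in sync by shipping a small aggregated delta rather than recopying the base data.

```python
from collections import defaultdict


def summarize(rows):
    """Aggregate (region, amount) rows into per-region totals."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


class SummaryView:
    """Hypothetical stand-in for a summary view held in the destination cloud."""

    def __init__(self):
        self.totals = defaultdict(float)

    def apply_delta(self, delta_totals):
        # Merge an already-aggregated delta into the stored summary.
        for key, amount in delta_totals.items():
            self.totals[key] += amount


# Base data that stays in the source cloud (illustrative values).
base_rows = [("us", 10.0), ("eu", 5.0), ("us", 2.5)]
# Rows appended to the base table after the view was first built.
new_rows = [("eu", 1.0), ("apac", 4.0)]

view = SummaryView()
view.apply_delta(summarize(base_rows))  # initial build of the summary

delta = summarize(new_rows)  # aggregated next to the source data
view.apply_delta(delta)      # only this small delta needs to be transferred

# The incrementally maintained view matches a full recomputation.
full = summarize(base_rows + new_rows)
print(dict(view.totals) == full)
```

In this toy model the refresh moves one aggregated entry per changed key instead of every base row. Updates and deletes in the base table, or aggregates that cannot be merged by simple addition, would need more bookkeeping than this sketch shows.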
Industries & Customer scenarios
Cross-cloud MVs offer significant benefits across a variety of industries and customer scenarios as illustrated below:
- Healthcare: Data scientists in one department want to bring summaries of their data in regular intervals (daily or weekly) from AWS to Google Cloud (BigQuery) for aggregate analytics and model building.
- Media and entertainment: A marketing analyst wants to join, de-duplicate, and segment AdsWhiz data from AWS with listener and audience data on a weekly basis from Google Cloud to expand audience reach.
- Telecom: A data analyst seeks to periodically centralize log-level data from AWS and streaming data from the Ads server for revenue targeting.
- Education: Data analysts need to join product instrumentation data on AWS with enterprise-level data on Google Cloud. As new products are added to their platform, they want to simplify their company’s ETL pipelines and reduce costs by using cross-cloud MVs.
- Retail: A marketing analyst needs to join their user profile data from Azure with campaign data in Ads Data Hub in a privacy-safe manner. With new retail users coming into the system daily, they rely on cross-cloud MVs for regular combined analysis, ensuring up-to-date data processing.
Embrace the future of cross-cloud analytics
With cross-cloud MVs, we are empowering organizations to break down cloud silos and harness the power of their rich, changing data in near real-time in Google Cloud. This breakthrough capability is shaping the future not only of cross-cloud analytics but also of multi-cloud architectures, enabling customers to achieve new levels of flexibility, cost-effectiveness, and actionable insights. With the powerful combination of cross-cloud analytics in BigQuery Omni and agile semantics in Looker, we are able to bring rich and actionable insights to data consumers faster and more easily.
[Figure: Creating a cross-cloud MV in BigQuery using SQL]
[Figure: Performing effective and cost-efficient cross-cloud analytics with cross-cloud MVs]
Learn more
To learn more about cross-cloud MVs and how they can transform your organization’s cross-cloud analytics capabilities, watch the demo, explore the public documentation, and try the product in action now.