Break free from data silos with Dataplex, an intelligent data fabric that lets organizations centrally discover, manage, monitor, and govern their data across data lakes, data warehouses, and data marts. Consistent controls provide access to trusted data and power analytics at scale.
Freedom of choice
Get the freedom to store data where you want for the best price and performance while choosing the best analytics tools, open source or cloud native, to accelerate the entire analytics lifecycle.
Use built-in data intelligence, powered by Google’s AI/ML capabilities, to automate data discovery, metadata harvesting, data lifecycle management, data quality, and lineage, and to reduce management costs.
Enable standardization and unification of metadata, security policies, governance, and data classification for consistency across distributed data.
"We have PBs of data stored in Google Cloud, accessed by 1,000s of internal users daily. Dataplex enables us to deliver a business domain-specific, self-service data platform across distributed data, with decentralized data ownership but centralized governance and visibility. We are very excited to adopt Dataplex as a central component for building a unified data mesh across our analytics data."
Saral Jain, Director of Engineering, Snap Inc
How Dataplex works
As you identify new data sources, Dataplex harvests the metadata for both structured and unstructured data, using built-in data quality checks to enhance integrity.
Overview of Data Catalog
Find out how Data Catalog powers the efficient use of your data.
How to get started with Dataplex
Logically organize your stored data into lakes and zones, and automate data management and governance across that data to power analytics at scale.
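The lake and zone organization described above can be pictured as a small hierarchy: a lake contains zones, and each zone attaches assets such as Cloud Storage buckets or BigQuery datasets. The sketch below is a plain-Python illustration of that hierarchy only. It does not call the Dataplex API, and the class names, the `assets_by_type` helper, and the sample lake, zone, and asset names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    name: str
    resource_type: str  # illustrative values: "STORAGE_BUCKET" or "BIGQUERY_DATASET"


@dataclass
class Zone:
    name: str
    zone_type: str  # illustrative values: "RAW" or "CURATED"
    assets: List[Asset] = field(default_factory=list)


@dataclass
class Lake:
    name: str
    zones: List[Zone] = field(default_factory=list)

    def assets_by_type(self, resource_type: str) -> List[str]:
        """Return lake/zone/asset paths for every asset of the given type."""
        return [
            f"{self.name}/{zone.name}/{asset.name}"
            for zone in self.zones
            for asset in zone.assets
            if asset.resource_type == resource_type
        ]


# Hypothetical example: one business-domain lake with a raw and a curated zone.
sales = Lake(
    name="sales",
    zones=[
        Zone("landing", "RAW", [Asset("sales-events-bucket", "STORAGE_BUCKET")]),
        Zone("curated", "CURATED", [Asset("orders_dw", "BIGQUERY_DATASET")]),
    ],
)

bigquery_assets = sales.assets_by_type("BIGQUERY_DATASET")
print(bigquery_assets)
```

Grouping assets under a domain-level lake is what lets governance settings be applied once at the lake or zone level rather than per bucket or per dataset.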
How to search with Data Catalog
Use Data Catalog to perform a search of data assets, such as datasets, tables, views, and Pub/Sub topics in your Google Cloud projects.
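As a rough illustration of such a search, the sketch below assembles a query string from a few qualifiers. It assumes the Data Catalog search syntax accepts space-separated predicates such as `name:`, `type=`, and `system=`; check the search syntax reference before relying on them. The `build_search_query` helper is hypothetical and only builds the string. It does not contact Data Catalog.

```python
from typing import Optional


def build_search_query(
    keyword: Optional[str] = None,
    entry_type: Optional[str] = None,
    system: Optional[str] = None,
) -> str:
    """Join the supplied qualifiers into one space-separated query string."""
    parts = []
    if keyword:
        parts.append(f"name:{keyword}")      # assumed qualifier: match on asset name
    if entry_type:
        parts.append(f"type={entry_type}")   # assumed qualifier: e.g. table, dataset
    if system:
        parts.append(f"system={system}")     # assumed qualifier: e.g. bigquery
    return " ".join(parts)


# Hypothetical search: BigQuery tables whose name contains "orders".
query = build_search_query(keyword="orders", entry_type="table", system="bigquery")
print(query)
```

The resulting string is the kind of value that would be supplied as the query of a Data Catalog search request, scoped to the projects you want to search.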
Dataplex best practices
Follow these best practices to optimize your Dataplex experience.
Dataplex API
Use Dataplex APIs to centrally manage and govern distributed data.
Data Catalog API
Use Data Catalog APIs to centrally manage and enrich metadata for your distributed data.
Build a business domain-specific data mesh architecture across data in Cloud Storage and BigQuery using Dataplex. Enable decentralized ownership of data while still centrally managing, monitoring, and governing data across your enterprise and making this data securely accessible to a variety of analytics and data science tools.
Search and discover your data assets across data silos using a fully managed, serverless Data Catalog within Dataplex. Data Catalog provides built-in capabilities to automatically ingest technical metadata, enrich metadata with relevant business context, and empower every user in your organization to find and understand their data using a faceted search interface.