Dataplex Universal Catalog

Intelligent data to AI governance

Centrally discover, manage, monitor, and govern data and AI artifacts across your data platform, providing access to trusted data and powering analytics and AI at scale.

Features

Data to AI governance with Vertex AI integration

Instantly discover AI models, datasets, features, and related data artifacts you need in a single search experience, spanning projects and regions while adhering to IAM permissions. Use Dataplex Universal Catalog to enrich AI artifacts with critical business metadata for informed decision-making, such as ownership, key attributes, and relevant context.



Data governance in BigQuery

To support the end-to-end data lifecycle and meet you where you are, we are making it easier to manage, discover, and govern data in BigQuery. We’re bringing Dataplex Universal Catalog capabilities directly into BigQuery, including data quality, data lineage, and profiling, to provide contextual governance.

Gen AI powered insights and semantic search

Jumpstart your analytics with a curated list of questions that you can ask of your data. Using metadata and Gemini models, Data Insights generates tailored queries to uncover hidden patterns and valuable insights from your data. Semantic metadata search helps you discover data using the language of your choice: users can search for data assets with natural language queries, without having to recall search syntax and qualifiers.

Simplified data discovery with universal semantic search

Automate data discovery, classification, and metadata enrichment of structured, semi-structured, and unstructured data, stored in Google Cloud and beyond, with built-in data search capabilities. Manage technical, operational, and business metadata, for all your data, in a unified, flexible, and powerful catalog. Enrich metadata with relevant business context using a built-in business glossary. Easily search, find, and understand your data with a built-in global, faceted search using natural language.

End-to-end data lineage

Easily understand where your data comes from and the transformations it goes through with end-to-end data lineage. Lineage is captured automatically for Google Cloud data sources and is extensible to third-party data sources.
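To make the idea of tracing where data comes from concrete, here is a minimal sketch in plain Python. It does not call the Data Lineage API; the `upstream_sources` function, the edge list, and the table names are all hypothetical and only illustrate how lineage links can be walked backward from a target asset.

```python
from collections import deque


def upstream_sources(edges, target):
    """Return every asset that feeds, directly or indirectly, into `target`.

    `edges` is an iterable of (source, destination) pairs. This is an
    illustrative in-memory model, not the format used by any real API.
    """
    # Index the edges by destination so each node's direct parents are easy to find.
    parents = {}
    for source, destination in edges:
        parents.setdefault(destination, set()).add(source)

    # Breadth-first walk from the target back through its parents.
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical lineage links between tables.
example_edges = [
    ("raw.orders", "staging.orders"),
    ("staging.orders", "mart.revenue"),
    ("raw.customers", "mart.revenue"),
]
print(sorted(upstream_sources(example_edges, "mart.revenue")))
```

Walking the links in the opposite direction (indexing by source instead of destination) would give the downstream impact of a change, which is the other common lineage question.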

Automated data quality

Use automatically captured data lineage and built-in data profiling to better understand your data, trace dependencies, and effectively troubleshoot data issues. Automate data quality across distributed data and enable access to data you can trust.

How It Works

Dataplex Universal Catalog enables you to manage, monitor, and govern data and AI artifacts across data lakes, warehouses and databases. It helps users intelligently establish data profiles, assess data quality, determine data lineage, classify data, organize data into domains, and manage and govern the data to AI life cycle.

Common Uses

Data to AI governance

Data to AI governance with Dataplex and Vertex AI

In a single search experience, you can discover data and AI assets across your organization, including AI models, datasets, and related data artifacts, spanning projects and regions while adhering to IAM permissions. You can also enrich AI artifacts with business metadata, such as ownership, key attributes, and relevant context, for informed decision-making.

Build a data mesh

Use Dataplex Universal Catalog to build a data mesh

A data mesh is a strategy in which data ownership is decentralized and handled by domain data owners, and in which distributing datasets across locations can improve data accessibility and operational efficiency. Dataplex helps you logically organize your data and related artifacts into data domains, which let you unify distributed data and organize it based on business context.

Figure: data mesh architecture

Democratize data insights

Democratize data insights with Dataplex Universal Catalog

Search and discover your data and AI artifacts across silos using a fully managed, serverless Dataplex Universal Catalog. The catalog has built-in capabilities to automatically ingest technical metadata, enrich metadata with relevant business context, and empower every user in your organization to easily find and understand their data and AI artifacts with a powerful, faceted search interface.

Govern your open lakehouse

Unified governance for Apache Iceberg and open source engines

Dataplex Universal Catalog is deeply integrated with BigLake, Google Cloud's native Apache Iceberg storage engine. BigLake Metastore is natively supported in Dataplex Universal Catalog, ensuring that governance policies defined centrally are enforced across multiple engines. BigLake’s support within Dataplex Universal Catalog also enriches governance across the platform by supporting semantic search, data lineage, profiling, and quality checks, providing a managed foundation for your open data lakehouse.

Automate Apache Spark workflows

Schedule custom Spark and Spark SQL tasks

Dataplex Universal Catalog allows you to automate and manage your data lifecycle by scheduling custom Spark and Spark SQL tasks directly within your data mesh or lakehouse. This capability streamlines common data processing operations, such as data ingestion, complex transformations, and data quality checks, ensuring your data is always accurate, up-to-date, and ready for analytics and AI workloads.

Pricing

Dataplex Universal Catalog pricing is based on pay-as-you-go usage.

Dataplex Universal Catalog processing

Dataplex Universal Catalog standard and premium processing are metered by the Data Compute Unit (DCU). DCU-hour is an abstract billing unit for Dataplex and the actual metering depends on the individual features you use.

| Service and usage | Description | Price (USD) |
| --- | --- | --- |
| Free tier Dataplex Universal Catalog processing | First 100 DCU-hours per month for Dataplex Universal Catalog standard processing. | No charge |
| Standard Dataplex Universal Catalog processing | Dataplex Universal Catalog standard tier covers the data discovery functionality that automatically discovers table and fileset metadata from Cloud Storage. | Starting at $0.060 per DCU-hour |
| Premium Dataplex Universal Catalog processing | The Dataplex Universal Catalog premium processing tier covers the data exploration workbench, data lineage, data quality, and data profiling capabilities of Dataplex. | Starting at $0.089 per DCU-hour |
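As a rough illustration of how the processing rates combine, the sketch below estimates a monthly processing charge. It assumes the "starting at" rates quoted above (actual rates vary by region) and that the 100 DCU-hour free tier is deducted from standard processing only, as the free tier description states. The function name and constants are hypothetical, not part of any Google Cloud SDK.

```python
# "Starting at" list prices quoted above, in USD per DCU-hour.
# Assumption: real rates depend on region, so treat these as placeholders.
STANDARD_RATE = 0.060
PREMIUM_RATE = 0.089
FREE_STANDARD_DCU_HOURS = 100  # monthly free tier, standard processing only


def estimate_processing_cost(standard_dcu_hours, premium_dcu_hours):
    """Estimate the monthly processing charge in USD."""
    # Only standard usage beyond the free tier is billable.
    billable_standard = max(0.0, standard_dcu_hours - FREE_STANDARD_DCU_HOURS)
    cost = billable_standard * STANDARD_RATE + premium_dcu_hours * PREMIUM_RATE
    return round(cost, 2)


# 250 standard DCU-hours (150 billable) plus 40 premium DCU-hours.
print(estimate_processing_cost(250, 40))
```

Because the actual metering depends on the individual features you use, an estimate like this is only as good as the DCU-hour figures you feed into it.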

Dataplex Universal Catalog metadata and API pricing

Metadata storage pricing

Dataplex Universal Catalog measures the average amount of the stored metadata during a short time interval. For billing, these measurements are combined into a one-month average, which is multiplied by the monthly rate.

| Service and usage | Description | Price (USD) |
| --- | --- | --- |
| Dataplex Universal Catalog free tier | First 1 MiB monthly average storage. | No charge |
| Metadata storage | Over 1 MiB monthly average storage. | Starting at $2 per GiB per month |
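The averaging described for metadata storage can be sketched as follows. This is an illustration under stated assumptions: the measurements are taken at equal intervals, only the portion of the monthly average above the free 1 MiB is billed, and the "starting at" rate of $2 per GiB per month applies. The function name and sample values are hypothetical.

```python
from statistics import fmean

MIB = 1024 ** 2
GIB = 1024 ** 3
FREE_BYTES = 1 * MIB          # free tier: first 1 MiB of monthly average storage
RATE_PER_GIB_MONTH = 2.00     # "starting at" rate in USD; may differ by region


def estimate_metadata_storage_cost(samples_bytes):
    """Estimate the monthly metadata storage charge in USD.

    `samples_bytes` is a non-empty sequence of stored-metadata sizes in bytes,
    assumed to be measured at equal intervals over the month.
    """
    monthly_average = fmean(samples_bytes)
    # Assumption: only storage above the free tier is billable.
    billable_bytes = max(0.0, monthly_average - FREE_BYTES)
    return billable_bytes / GIB * RATE_PER_GIB_MONTH


# Storage grew from 2 GiB to 4 GiB during the month, so the average is 3 GiB.
print(estimate_metadata_storage_cost([2 * GIB, 4 * GIB]))
```

Billing on the average rather than the peak means a short-lived spike in stored metadata has a proportionally small effect on the monthly charge.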

API charges

Dataplex Universal Catalog charges for API calls made to the Data Catalog API and Data Lineage API.

| Service and usage | Description | Price (USD) |
| --- | --- | --- |
| API calls | First 1 million in a month. | No charge |
| API calls | Over 1 million in a month. | Starting at $10 per 100,000 API calls |
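The tiered API charge can be estimated in the same way. The sketch below assumes that calls beyond the free first million are billed proportionally at the "starting at" rate of $10 per 100,000 calls; whether billing is prorated or rounded to whole blocks is not stated here, so treat that as an assumption. The function name is hypothetical.

```python
FREE_CALLS_PER_MONTH = 1_000_000
RATE_PER_100K_CALLS = 10.00  # "starting at" rate in USD


def estimate_api_cost(calls_in_month):
    """Estimate the monthly API charge in USD for a given number of calls."""
    billable_calls = max(0, calls_in_month - FREE_CALLS_PER_MONTH)
    # Assumption: partial blocks of 100,000 calls are prorated.
    return billable_calls / 100_000 * RATE_PER_100K_CALLS


# 1.25 million calls leaves 250,000 billable calls.
print(estimate_api_cost(1_250_000))
```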

Dataplex Universal Catalog shuffle storage pricing

| Service and usage | Description | Price (USD) |
| --- | --- | --- |
| Shuffle storage | Shuffle storage pricing covers any disk storage specified in the environments configured for the data exploration workbench. | Starting at $0.040 per GB-month |

Other usage

Data organization features in Dataplex Universal Catalog (lake, zone, or asset setup) and security policy application and propagation are provided free of charge.

Some Dataplex Universal Catalog functionalities trigger job execution using Dataproc, BigQuery, and Dataflow. Usage of those services is charged according to their respective pricing models, and the charges appear under those services.

Explore pricing

Visit the Dataplex Universal Catalog pricing page to see pricing per region and more.

Custom Quote

Connect with our sales team to get a custom quote for your organization.

Start your proof of concept

New customers get $300 in free credits

Partners & Integration

Partnering with industry leaders
• Accenture
• Confluent
• Collibra
• HCL
• Informatica
• NVIDIA
• Starburst
• Tableau

Explore all partners in Google Cloud Partner Center.
