Dataplex

Intelligent data governance

Centrally discover, manage, monitor, and govern data and AI artifacts across your data platform, providing access to trusted data and powering analytics and AI at scale.

Features

Simplified data discovery with Data Catalog

Automate data discovery, classification, and metadata enrichment of structured, semi-structured, and unstructured data, stored in Google Cloud and beyond, with built-in data intelligence. Manage technical, operational, and business metadata, for all your data, in a unified, flexible, and powerful Data Catalog. Enrich metadata with relevant business context using a built-in business glossary. Easily search, find, and understand your data with a built-in global, faceted search.

End-to-end data lineage

Easily understand where your data comes from and the transformations it goes through with end-to-end data lineage. Lineage is captured automatically for Google Cloud data sources and can be extended to third-party data sources.

Automated data quality

Use automatically captured data lineage and built-in data profiling to better understand your data, trace dependencies, and effectively troubleshoot data issues. Automate data quality across distributed data and enable access to data you can trust.

Data governance in BigQuery

To support the end-to-end data life cycle and make it easier for customers to manage, discover, and govern data, we’re bringing Dataplex capabilities directly into BigQuery, including data quality, data lineage, and profiling. Now you can apply data governance directly to your data without leaving BigQuery.

Gen-AI powered insights and semantic search

Jumpstart your analytics with a curated list of questions that you can ask of your data. Using metadata and Gemini models, Data Insights generates tailored queries to uncover patterns and insights in your data. Semantic metadata search helps you discover data using the language of your choice: you can search for data assets with natural language queries, without having to recall search syntax and qualifiers.

Data to AI governance with Vertex AI and Dataplex

Instantly discover AI models, datasets, features, and related data artifacts you need in a single search experience, spanning projects and regions while adhering to IAM permissions. Use Dataplex to enrich AI artifacts with critical business metadata, such as ownership, key attributes, and relevant context, for informed decision-making.



How It Works

Dataplex enables you to manage, monitor, and govern data and AI artifacts across data lakes, warehouses, and databases. It helps users intelligently establish data profiles, assess data quality, determine data lineage, classify data, organize data into domains, and manage and govern the data life cycle.

Watch: Manage and govern distributed data with Dataplex

Common Uses

Data to AI governance

Data to AI governance with Dataplex and Vertex AI

In a single search experience, you can discover AI models, datasets, and related data artifacts across your organization, spanning projects and regions while adhering to IAM permissions. You can also enrich these assets with business metadata, such as ownership, key attributes, and relevant context, to support informed decision-making.

Use Data Catalog to search for Vertex AI model and dataset resources

      Build a data mesh

      Use Dataplex to build a data mesh

      A data mesh is a strategy in which data ownership is decentralized and handled by domain data owners, so that datasets distributed across locations can improve data accessibility and operational efficiency. Dataplex helps you logically organize your data and related artifacts into a Dataplex Lake, which represents a data domain, so that you can unify distributed data and organize it by business context.

      Read the guide on how to build a data mesh with Dataplex

      Democratize data insights

      Democratize data insights with Dataplex Data Catalog

      Search and discover your data and AI artifacts across silos using a fully managed, serverless Data Catalog within Dataplex. Data Catalog has built-in capabilities to automatically ingest technical metadata, enrich metadata with relevant business context, and empower every user in your organization to easily find and understand their data and AI artifacts with a powerful, faceted search interface.

      Read the guide on using Data Catalog for better data discovery, metadata management, and more

      Pricing

      Dataplex pricing is based on pay-as-you-go usage.

      Service and usage | Description | Price (USD)
      --- | --- | ---
      Dataplex processing | Dataplex standard and premium processing are metered by the Data Compute Unit (DCU). DCU-hour is an abstract billing unit for Dataplex and the actual metering depends on the individual features you use. |
      Free tier Dataplex processing | First 100 DCU-hours per month for Dataplex standard processing. | No charge
      Standard Dataplex processing | Dataplex standard tier covers the data discovery functionality that automatically discovers table and fileset metadata from Cloud Storage. | Starting at $0.060 per DCU-hour
      Premium Dataplex processing | The Dataplex premium processing tier covers the data exploration workbench, data lineage, data quality, and data profiling capabilities of Dataplex. | $0.089 per DCU-hour
      Data Catalog: metadata storage pricing | Data Catalog measures the average amount of the stored metadata during a short time interval. For billing, these measurements are combined into a one-month average, which is multiplied by the monthly rate. |
      Dataplex free tier | Up to 1 MiB monthly average storage. | No charge
      Metadata storage | Over 1 MiB monthly average storage. | Starting at $2 per GiB per month
      Data Catalog: API charges | Data Catalog charges for API calls made to the Data Catalog API and Data Lineage API. |
      API calls | First 1 million in a month. | No charge
      API calls | Over 1 million in a month. | Starting at $10 per 100,000 API calls
      Dataplex shuffle storage pricing | Shuffle storage pricing covers any disk storage specified in the environments configured for the data exploration workbench. | Starting at $0.040 per GB-month

      Other usage

      Data organization features in Dataplex (lake, zone, or asset setup) and security policy application and propagation are provided free of charge.

      Some Dataplex functionality triggers job execution via Dataproc, BigQuery, and Dataflow. That usage is charged according to each service's own pricing model, and the charges appear under the respective service.
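As a rough illustration of how the rates above combine, the following sketch estimates a monthly bill from usage figures. It uses only the "starting at" list prices quoted on this page and makes several simplifying assumptions that are not confirmed here: free tiers are treated as straight deductions, API calls above the free tier are billed pro rata rather than in whole 100,000-call blocks, and regional price differences and downstream Dataproc, BigQuery, or Dataflow charges are ignored. The names `MonthlyUsage` and `estimate_monthly_cost` are hypothetical and are not part of any Dataplex API.

```python
from dataclasses import dataclass

# "Starting at" list prices quoted above (USD). Regional prices may differ.
STANDARD_RATE_PER_DCU_HOUR = 0.060
PREMIUM_RATE_PER_DCU_HOUR = 0.089
FREE_STANDARD_DCU_HOURS = 100        # free tier applies to standard processing only
METADATA_RATE_PER_GIB_MONTH = 2.0
FREE_METADATA_GIB = 1 / 1024         # 1 MiB expressed in GiB
API_RATE_PER_100K_CALLS = 10.0
FREE_API_CALLS = 1_000_000
SHUFFLE_RATE_PER_GB_MONTH = 0.040


@dataclass
class MonthlyUsage:
    """Hypothetical container for one month of usage figures."""
    standard_dcu_hours: float = 0.0
    premium_dcu_hours: float = 0.0
    avg_metadata_gib: float = 0.0    # one-month average of stored metadata
    api_calls: int = 0
    shuffle_gb_months: float = 0.0


def estimate_monthly_cost(usage: MonthlyUsage) -> dict:
    """Return an itemized cost estimate in USD, rounded to cents."""
    # Assumption: free tiers are simple deductions from the metered amount.
    billable_standard = max(0.0, usage.standard_dcu_hours - FREE_STANDARD_DCU_HOURS)
    billable_metadata = max(0.0, usage.avg_metadata_gib - FREE_METADATA_GIB)
    billable_calls = max(0, usage.api_calls - FREE_API_CALLS)

    costs = {
        "standard_processing": billable_standard * STANDARD_RATE_PER_DCU_HOUR,
        "premium_processing": usage.premium_dcu_hours * PREMIUM_RATE_PER_DCU_HOUR,
        "metadata_storage": billable_metadata * METADATA_RATE_PER_GIB_MONTH,
        # Assumption: pro rata billing, not rounded up to 100,000-call blocks.
        "api_calls": billable_calls / 100_000 * API_RATE_PER_100K_CALLS,
        "shuffle_storage": usage.shuffle_gb_months * SHUFFLE_RATE_PER_GB_MONTH,
    }
    # Sum before rounding so the total is not distorted by per-item rounding.
    total = sum(costs.values())
    rounded = {name: round(value, 2) for name, value in costs.items()}
    rounded["total"] = round(total, 2)
    return rounded


if __name__ == "__main__":
    example = MonthlyUsage(
        standard_dcu_hours=300,
        premium_dcu_hours=50,
        avg_metadata_gib=2,
        api_calls=1_500_000,
        shuffle_gb_months=100,
    )
    for item, cost in estimate_monthly_cost(example).items():
        print(f"{item}: ${cost:.2f}")
```

Under these assumptions, the example usage comes to roughly $74.45 for the month, most of it from the 500,000 API calls above the free tier. Treat the result as an order-of-magnitude estimate only; actual billing depends on region and on how each feature meters DCU-hours.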

      Get pricing by region

      Estimate your monthly Dataplex costs, including region-specific pricing and fees.

      Custom Quote

      Connect with our sales team to get a custom quote for your organization.

      Start your proof of concept

      New customers get $300 in free credits

      Partners & Integration

      Partnering with industry leaders
      • Accenture
      • Confluent
      • Collibra
      • HCL
      • Informatica
      • NVIDIA
      • Starburst
      • Tableau

      Explore all partners in Google Cloud Partner Center.
