Knowledge Catalog (formerly Dataplex)

Always-on context and governance for your agents

Knowledge Catalog is a dynamic, always-on, universal context engine for agents. It unifies structured, unstructured, and SaaS data into a governed, agent-ready truth.

Features

Automated governance across your entire data ecosystem

Deliver scalable, automated governance by transforming raw metadata into a trusted foundation for AI. Enforce policy-based quality checks and anomaly detection across distributed data sources and multimodal data. By combining auto-captured lineage with automated metadata generation, Knowledge Catalog uses Gemini to translate all of these signals into clear business context. This unified control plane standardizes definitions and ensures every data asset is governed, traceable, and ready for agentic workflows.

Multimodal metadata extraction and context curation

Always-on cataloging and metadata harvesting run automatically across Google's Data Cloud, capturing technical metadata along with other information such as relationships and profiling. Knowledge Catalog also supports third-party databases and partners such as Atlan, Collibra, DataHub, Ab Initio, and Anomalo. It doesn't just list files; it actively reads unstructured data on Google Cloud Storage and can connect it to structured data, providing unified context across all of your multimodal data.

Defined business semantics

Sync business logic into a single, governed layer. Knowledge Catalog automatically synthesizes context from schemas, query logs, and Looker models to define the relationships and measures AI agents need to reason effectively. Whether you are vibe coding semantic models or extracting logic from a sheet, everything flows into a unified glossary. This ensures consistent, policy-based governance across your entire enterprise.

Context retrieval for your agents

Enable your agents to retrieve holistic context from your enterprise data. Through semantic search, Context APIs, and MCP tools, agents can instantly discover data assets and extract pre-generated, enriched metadata. By using pre-vetted "golden queries" and semantic insights, agents can execute complex tasks, retrieve accurate information, and navigate your data ecosystem with precision and at scale.

How It Works

Knowledge Catalog unifies fragmented data into agent-ready context. It provides always-on cataloging and metadata harvesting across Google's Data Cloud, third-party databases, and partners. This real-time enterprise context enables AI agents to move safely from insights to autonomous, governed action.

Common Uses

Data to AI governance

Build your data to AI governance foundation

In a single search experience, you can discover AI models, datasets, and related data artifacts across your organization, spanning projects and regions while adhering to IAM permissions. You can also enrich these assets with business metadata, such as ownership, key attributes, and relevant context, to support informed decision-making.

Multimodal data discovery

Make multimodal data insights easily discoverable by agents

Critical business knowledge often remains locked in unstructured sources like design docs, wikis, PDFs, and images. You can turn thousands of PDF contracts, design docs, and wikis into a structured knowledge graph. AI agents can then query this graph to answer complex questions, such as "What are the common liability clauses in our 2025 vendor agreements?", with grounded, traceable facts.

Automated data product creation

Automatically create governed data products for agents

Move beyond simple tables to create data products: self-contained units of intelligence that include built-in intent, SLAs, and governance constraints. By automatically inferring relationships across the data estate, Knowledge Catalog packages these assets into data products so they can be easily distributed and scaled across cross-functional AI teams and agents.

Shared semantics for humans and agents

Define business semantics across your organization

Data engineers define structure in technical schemas while analysts define meaning in BI tools, creating a gap for AI agents that can lead to untrusted results. Knowledge Catalog helps you build a shared semantic layer with strict enterprise governance. Whether a human asks a question or an agent performs a task via API, the answer is always based on the same trusted enterprise truth.

Govern your open lakehouse

Unified governance for your open lakehouse

Knowledge Catalog is deeply integrated with Google Cloud's Lakehouse and its catalog, supporting governance policies that are centrally defined and enforced across multiple engines, such as BigQuery and Google Cloud Managed Service for Apache Spark. Knowledge Catalog also enriches governance across Lakehouse with semantic search, data lineage, profiling, and quality checks, providing a managed foundation for your open data lakehouse.

Pricing

Knowledge Catalog pricing is based on pay-as-you-go usage.

Knowledge Catalog processing

Knowledge Catalog standard and premium processing are metered by the Data Compute Unit (DCU). The DCU-hour is an abstract billing unit, and the actual metering depends on the individual features you use.

| Service and usage | Description | Price (USD) |
|---|---|---|
| Free tier Knowledge Catalog processing | First 100 DCU-hours per month for Knowledge Catalog standard processing. | No charge |
| Standard Knowledge Catalog processing | The Knowledge Catalog standard tier covers the data discovery functionality that automatically discovers table and fileset metadata from Cloud Storage. | Starting at $0.060 per DCU-hour |
| Premium Knowledge Catalog processing | The Knowledge Catalog premium processing tier covers the data exploration workbench, data lineage, data quality, and data profiling capabilities of Knowledge Catalog. | Starting at $0.089 per DCU-hour |
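
As a rough illustration of how the processing tiers combine, the sketch below estimates a monthly processing charge from DCU-hour usage. It assumes the "Starting at" rates listed above (actual rates vary by region) and that the free tier is subtracted only from standard processing, as the free-tier description states. The function name `processing_cost` and the usage figures are hypothetical, not part of any Google Cloud API.

```python
# Illustrative rates copied from the "Starting at" prices above (USD per DCU-hour).
# Assumption: real rates differ by region, so treat these as placeholders.
STANDARD_RATE = 0.060
PREMIUM_RATE = 0.089
FREE_STANDARD_DCU_HOURS = 100  # monthly free tier, standard processing only


def processing_cost(standard_dcu_hours: float, premium_dcu_hours: float) -> float:
    """Estimate one month's processing charge in USD (hypothetical helper)."""
    # The free tier offsets standard usage only; it never goes negative.
    billable_standard = max(0.0, standard_dcu_hours - FREE_STANDARD_DCU_HOURS)
    return billable_standard * STANDARD_RATE + premium_dcu_hours * PREMIUM_RATE


if __name__ == "__main__":
    # 250 standard DCU-hours (150 billable) plus 40 premium DCU-hours.
    print(f"${processing_cost(250, 40):.2f}")  # prints $12.56
```

Because the DCU-hour is an abstract unit whose metering depends on the features used, an estimate like this is only as good as the DCU-hour figures you feed into it.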

Knowledge Catalog metadata and API pricing

Metadata storage pricing

Knowledge Catalog measures the average amount of stored metadata during a short time interval. For billing, these measurements are combined into a one-month average, which is multiplied by the monthly rate.

| Service and usage | Description | Price (USD) |
|---|---|---|
| Knowledge Catalog free tier | First 1 MiB monthly average storage. | No charge |
| Metadata storage | Over 1 MiB monthly average storage. | Starting at $2 per GiB per month |
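
The averaging rule for metadata storage can be sketched as follows. This is a minimal illustration under stated assumptions: the measurement intervals are equal in length (so a simple mean gives the monthly average), the first 1 MiB of the monthly average is deducted before billing, and the rate is the "Starting at" price of $2 per GiB per month. The helper name `metadata_storage_cost` and the sample values are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

FREE_BYTES = 1 * MIB          # free tier: first 1 MiB of monthly average storage
RATE_PER_GIB_MONTH = 2.0      # assumed "Starting at" rate in USD; varies by region


def metadata_storage_cost(samples_bytes: list[float]) -> float:
    """Estimate a monthly metadata storage charge in USD (hypothetical helper).

    samples_bytes holds the stored-metadata size measured in each short
    interval of the month; equal-length intervals are assumed.
    """
    if not samples_bytes:
        return 0.0
    monthly_average = sum(samples_bytes) / len(samples_bytes)
    # Assumption: the free 1 MiB is subtracted from the monthly average.
    billable_bytes = max(0.0, monthly_average - FREE_BYTES)
    return billable_bytes / GIB * RATE_PER_GIB_MONTH


if __name__ == "__main__":
    # Half the month at 2 GiB and half at 4 GiB averages to 3 GiB.
    print(round(metadata_storage_cost([2 * GIB, 4 * GIB]), 4))
```

Averaging before applying the rate means a short spike in stored metadata has a proportionally small effect on the monthly charge.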

API charges

Knowledge Catalog charges for API calls made to the Data Catalog API and Data Lineage API.

| Service and usage | Description | Price (USD) |
|---|---|---|
| API calls | First 1 million in a month. | No charge |
| API calls | Over 1 million in a month. | Starting at $10 per 100,000 API calls |
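
The tiered API charge works out as in the sketch below. It assumes the "Starting at" rate of $10 per 100,000 calls and, since the page does not say how partial blocks of 100,000 calls are billed, it assumes they are prorated rather than rounded up. The name `api_call_cost` is hypothetical.

```python
FREE_CALLS_PER_MONTH = 1_000_000
RATE_PER_100K_CALLS = 10.0  # assumed "Starting at" rate in USD


def api_call_cost(calls_in_month: int) -> float:
    """Estimate a monthly API charge in USD (hypothetical helper)."""
    billable_calls = max(0, calls_in_month - FREE_CALLS_PER_MONTH)
    # Assumption: partial blocks of 100,000 calls are billed pro rata.
    return billable_calls / 100_000 * RATE_PER_100K_CALLS


if __name__ == "__main__":
    # 1.25 million calls: 250,000 billable calls.
    print(api_call_cost(1_250_000))  # prints 25.0
```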

Knowledge Catalog shuffle storage pricing

Shuffle storage pricing covers any disk storage specified in the environments configured for the data exploration workbench. Price (USD): starting at $0.040 per GB-month.

Other usage

Data organization features in Knowledge Catalog (lake, zone, or asset setup) and security policy application and propagation are provided free of charge.

Some Knowledge Catalog functionality triggers job execution using Google Cloud Managed Service for Apache Spark, BigQuery, and Dataflow. Usage of those services is charged according to their respective pricing models, and the charges appear under those services.

Explore pricing

Visit the Knowledge Catalog pricing page to see pricing per region and more.

Custom Quote

Connect with our sales team to get a custom quote for your organization.

Start your proof of concept

New customers get $300 in free credits

What is data governance?

How Knowledge Catalog works

Knowledge Catalog best practices

Learn more about governance for your lakehouse

Partners & Integration

Partnering with industry leaders
• Ab Initio
• Accenture
• Anomalo
• Atlan
• Confluent
• Collibra
• DataHub
• HCL
• Informatica
• NVIDIA
• Starburst
• Tableau

Explore all partners in Google Cloud Partner Center.
