Data Catalog

A fully managed and highly scalable data discovery and metadata management service.

New customers get $300 in free credits to spend on Google Cloud during the Free Trial. All customers get up to 1 MiB of business or ingested metadata storage and 1 million API calls, free of charge.

  • Pinpoint your data with a simple but powerful faceted-search interface

  • Sync technical metadata automatically and create schematized tags for business metadata

  • Tag sensitive data automatically through Cloud Data Loss Prevention (DLP) integration

  • Get access immediately, then scale without infrastructure to set up or manage

Benefits

Simplifies data discovery at any scale

Empower any user on the team to find or tag data with a powerful UI, built with the same search technology as Gmail, or via API access. Data Catalog is fully managed, so you can start and scale effortlessly.

Offers a unified view of all datasets

Understand your data assets in Google Cloud and beyond. Integrations with BigQuery, Pub/Sub, Cloud Storage, and many connectors provide a unified view and tagging mechanism for technical and business metadata.

Provides a foundation for data governance

Enforce data security policies and maintain compliance through Cloud IAM and Cloud DLP integrations that help ensure the right people gain access to the right data and that sensitive data is protected.

Key features


Serverless

Fully managed and scalable metadata management service; requires no infrastructure to set up or manage, allowing you to focus on your business.

Metadata as a service

Metadata management service for cataloging data assets via custom APIs and the UI, thereby providing a unified view of data wherever it is.

Central catalog

A flexible and powerful cataloging system for capturing both technical metadata (automatically) and business metadata (tags) in a structured format.



Documentation


Quickstart
Quickstart for tagging datasets

Make a BigQuery dataset, create a tag template with a schema, look up the Data Catalog entry for your table, and attach the tag to your table.
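The tagging flow above can be modelled locally before touching any cloud resources. The sketch below is not the Data Catalog client library: `FieldSpec`, `TagTemplate`, `make_tag`, and `bigquery_linked_resource` are hypothetical names used only for illustration, and the linked-resource pattern for a BigQuery table is an assumption to confirm against the Data Catalog documentation. It shows how a template's schema can constrain the values a tag is allowed to carry.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """One field of a hypothetical tag template schema."""
    kind: str                      # "string", "bool", "double", "datetime" or "enum"
    allowed: Tuple[str, ...] = ()  # permitted values when kind == "enum"
    required: bool = False


@dataclass
class TagTemplate:
    """Hypothetical in-memory stand-in for a tag template."""
    template_id: str
    fields: Dict[str, FieldSpec] = field(default_factory=dict)


# Python types accepted for each non-enum field kind.
_PY_TYPES = {"string": str, "bool": bool, "double": float, "datetime": datetime}


def make_tag(template: TagTemplate, values: Dict[str, Any]) -> Dict[str, Any]:
    """Validate values against the template schema and return a tag-like dict."""
    unknown = set(values) - set(template.fields)
    if unknown:
        raise ValueError(f"fields not in template: {sorted(unknown)}")
    for name, spec in template.fields.items():
        if name not in values:
            if spec.required:
                raise ValueError(f"missing required field: {name}")
            continue
        value = values[name]
        if spec.kind == "enum":
            if value not in spec.allowed:
                raise ValueError(f"{name}: {value!r} is not one of {spec.allowed}")
        elif not isinstance(value, _PY_TYPES[spec.kind]):
            raise TypeError(f"{name}: expected a {spec.kind} value")
    return {"template": template.template_id, "fields": dict(values)}


def bigquery_linked_resource(project: str, dataset: str, table: str) -> str:
    """Build the resource name assumed to identify a BigQuery table on lookup."""
    return (
        f"//bigquery.googleapis.com/projects/{project}"
        f"/datasets/{dataset}/tables/{table}"
    )


# Example: a governance template with a string, a bool and an enum field.
template = TagTemplate(
    "data_governance",
    {
        "owner": FieldSpec("string", required=True),
        "has_pii": FieldSpec("bool"),
        "tier": FieldSpec("enum", allowed=("GOLD", "SILVER", "BRONZE")),
    },
)
tag = make_tag(template, {"owner": "analytics-team", "has_pii": True, "tier": "GOLD"})
```

In a real project the validated values would be sent through the Data Catalog API or UI; checking them locally first simply surfaces schema mistakes earlier.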

Tutorial
How to search with Data Catalog

Use Data Catalog to perform a search of data assets, such as datasets, tables, views, and Pub/Sub topics in your Google Cloud projects.
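A search is driven by a query string, so it can help to assemble that string in one place. The following is a minimal sketch, assuming that search terms are separated by spaces and that qualifiers such as `type=`, `system=`, `name:`, and `column:` are available; verify these qualifier names against the Data Catalog search syntax reference before relying on them. `build_query` is a hypothetical helper, not part of any Google library.

```python
from typing import Optional, Sequence


def build_query(
    keywords: Sequence[str] = (),
    *,
    asset_type: Optional[str] = None,
    system: Optional[str] = None,
    name_contains: Optional[str] = None,
    column_contains: Optional[str] = None,
) -> str:
    """Join free-text keywords and optional qualifiers into one query string."""
    parts = list(keywords)
    if asset_type:
        parts.append(f"type={asset_type}")        # assumed qualifier
    if system:
        parts.append(f"system={system}")          # assumed qualifier
    if name_contains:
        parts.append(f"name:{name_contains}")     # assumed qualifier
    if column_contains:
        parts.append(f"column:{column_contains}")  # assumed qualifier
    if not parts:
        raise ValueError("a search query needs at least one term")
    return " ".join(parts)


# Example: BigQuery tables mentioning "sales" that have a customer_id column.
query = build_query(
    ["sales"], asset_type="table", system="bigquery", column_contains="customer_id"
)
print(query)  # sales type=table system=bigquery column:customer_id
```

The resulting string is what would be typed into the search box or passed as the query in a search request.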

Google Cloud Basics
Restricting access with BigQuery column-level security

This page explains how to use BigQuery column-level security to restrict access to BigQuery data at the column level.

Tutorial
Access on-premises metadata connectors on GitHub

Common code used by the Data Catalog connectors, plus links to the connectors' sample code.

Use cases


Use case
Ingest metadata from on-premises RDBMS assets

While you can use the Data Catalog API to create your own connectors for ingesting metadata from a data source of your choice, we provide you with “ready to use” open-source connectors for ingesting metadata from a number of common data sources like MySQL, PostgreSQL, Hive, Teradata, Oracle, SQL Server, Redshift, and more. Once in Data Catalog, all assets can be searched for and tagged.
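For readers writing their own connector, the core step is reshaping a source system's column listing into one catalog entry per table. The sketch below illustrates only that reshaping and makes no API calls; `rows_to_entries` and the dictionary layout are hypothetical, and the `user_specified_system` and `user_specified_type` keys are an assumption about how custom entries are labelled. In practice the rows would come from the source database's own metadata tables rather than a hard-coded list.

```python
from typing import Dict, Iterable, List, Tuple

# (schema, table, column, data type), as a source database might report them.
ColumnRow = Tuple[str, str, str, str]


def rows_to_entries(system: str, rows: Iterable[ColumnRow]) -> List[dict]:
    """Group per-column rows into one entry-like dict per table."""
    entries: Dict[Tuple[str, str], dict] = {}
    for schema, table, column, data_type in rows:
        entry = entries.setdefault(
            (schema, table),
            {
                "entry_id": f"{schema}_{table}",
                "user_specified_system": system,  # assumed field name
                "user_specified_type": "table",   # assumed field name
                "columns": [],
            },
        )
        entry["columns"].append({"column": column, "type": data_type})
    # Dicts preserve insertion order, so entries follow first appearance.
    return list(entries.values())


# Example input, hard-coded purely for illustration.
rows = [
    ("sales", "orders", "id", "INT"),
    ("sales", "orders", "total", "DECIMAL"),
    ("hr", "staff", "name", "VARCHAR"),
]
entries = rows_to_entries("mysql", rows)
```

Each resulting dictionary holds what a connector would then submit to Data Catalog, after which the asset can be searched for and tagged like any other.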

Figure: Data Catalog integrations across Google Cloud and open-source connectors

Use case
Ingest metadata from BI systems

The Data Catalog API can be used to ingest metadata from any business intelligence asset. For Looker and Tableau, we have open-sourced ready-to-use connectors so that their assets are discoverable and can be tagged directly in Data Catalog.

All features


Serverless

Fully managed and scalable metadata management service; requires no infrastructure to set up or manage, allowing you to focus on your business.

Metadata as a service

Metadata management service for cataloging data assets via custom APIs and the UI, thereby providing a unified view of data wherever it is.

Central catalog

A flexible and powerful cataloging system for capturing both technical metadata (automatically) and business metadata (tags) in a structured format.

Search and discovery

An easy-to-use UI with powerful structured search for finding data assets quickly; powered by the same Google search technology that supports Gmail and Drive.

Schematized metadata

Supports schematized tags (e.g., Enum, Bool, DateTime), not just simple text tags, giving organizations rich and organized business metadata.

Cloud DLP integration

Discovers and classifies sensitive data, providing intelligence and helping to simplify the process of governing your data.

On-prem connectors

Ingest technical metadata from non-Google Cloud data assets into Data Catalog for a unified view of all your data assets.

Cloud IAM integration

Provides access-level controls and honors source ACLs for read, write, and search on data assets, giving you enterprise-grade access control.

Governance

Offers a strong security and compliance foundation with Cloud DLP and Cloud IAM integrations.

Pricing


Data Catalog pricing has two components, metadata storage and API calls, both billed on a consumption basis. Metadata storage covers any new metadata stored in Data Catalog, including:

• Business metadata, such as Data Catalog tag templates and tags

• Cloud Storage filesets

• Schemas attached to Pub/Sub topics

• Custom types metadata stored in Data Catalog

Metadata storage does not include the technical metadata stored by other Google Cloud services, such as dataset, table, and column names stored in BigQuery. Detailed pricing and examples for both metadata storage and API calls are available in the Data Catalog documentation.
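To see how consumption-based billing interacts with the free allowance mentioned at the top of this page (1 MiB of metadata storage and 1 million API calls), the arithmetic can be sketched as follows. This is an illustration only: the function names are hypothetical, the allowance is assumed to apply once per billing period, and the two rates are caller-supplied placeholders rather than actual Data Catalog prices, which are listed in the pricing documentation.

```python
from typing import Tuple

FREE_STORAGE_MIB = 1.0      # free metadata storage stated on this page
FREE_API_CALLS = 1_000_000  # free API calls stated on this page


def billable_usage(stored_mib: float, api_calls: int) -> Tuple[float, int]:
    """Return (MiB, calls) that exceed the free allowance."""
    if stored_mib < 0 or api_calls < 0:
        raise ValueError("usage cannot be negative")
    extra_mib = max(0.0, stored_mib - FREE_STORAGE_MIB)
    extra_calls = max(0, api_calls - FREE_API_CALLS)
    return extra_mib, extra_calls


def estimate_cost(
    stored_mib: float,
    api_calls: int,
    *,
    price_per_mib: float,
    price_per_million_calls: float,
) -> float:
    """Estimate the charge for one period from caller-supplied rates."""
    extra_mib, extra_calls = billable_usage(stored_mib, api_calls)
    return (
        extra_mib * price_per_mib
        + (extra_calls / 1_000_000) * price_per_million_calls
    )


# Example with placeholder rates (NOT real prices): 3.5 MiB and 2.5M calls.
print(billable_usage(3.5, 2_500_000))  # (2.5, 1500000)
print(estimate_cost(3.5, 2_500_000, price_per_mib=2.0, price_per_million_calls=4.0))  # 11.0
```

Usage that stays within the free allowance yields a cost of zero regardless of the rates passed in.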

Partners

Partners and integrations

Our strategic partnerships help build a strong ecosystem and allow customers to have a unified data discovery experience for hybrid cloud scenarios, using their platform of choice.