Manage entries and ingest custom sources

This document describes how to create and manage entry types, entry groups, and custom entries to enable integration of custom data sources into Dataplex.

An entry represents a resource that you capture metadata for. An entry group is a container for one or more entries, used to manage access control and regional location. An entry type defines the required metadata for entries. Entry types bring structure and rules to an otherwise free-form, loosely defined entry resource, while keeping entries extensible.

To integrate a custom data source into Dataplex, you create a custom entry by using a custom entry type that is under a custom entry group. Creating a custom entry involves the following high-level steps:

  1. Create an entry group.
  2. Create an entry type.
  3. Create a custom entry for the entry type within the entry group.

Entries

An entry represents a data asset that you capture metadata for. Every entry is an instance of an entry type. Each operation on the aspects of an entry must comply with the required aspects of its entry type. For example, when you create an entry, you must provide values for all the required aspect types defined by the entry type. You can't delete aspects from an entry if they are marked as required in the entry type.
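The two rules above can be sketched as simple set checks. This is a simplified model, not Dataplex code: the function names and the aspect type IDs are hypothetical, and aspect types are represented as plain strings.

```python
def missing_required_aspects(required_aspect_types, entry_aspect_types):
    """Return the required aspect types that the entry does not provide, sorted."""
    return sorted(set(required_aspect_types) - set(entry_aspect_types))


def can_delete_aspect(required_aspect_types, aspect_type):
    """An aspect can be deleted only if the entry type does not require it."""
    return aspect_type not in set(required_aspect_types)


# Hypothetical aspect type IDs, for illustration only.
required = ["schema", "ownership"]
provided = ["schema", "quality-score"]

print(missing_required_aspects(required, provided))  # ['ownership']
print(can_delete_aspect(required, "quality-score"))  # True
print(can_delete_aspect(required, "schema"))         # False
```

In this model, creating the entry would be rejected until an "ownership" aspect is supplied, and the "schema" aspect could never be removed.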

Categories of entries

  • System entries: Dataplex creates entries for Google Cloud resources, such as BigQuery datasets or tables. The entries that Dataplex creates are called system entries. Dataplex automatically keeps system entries up-to-date through continuous metadata synchronization from the supported Google Cloud systems.

    You can't modify the metadata that is populated automatically (called required aspects) for system entries. You can only add and modify additional metadata for system entries by using optional aspects. For more information, see Categories of aspects.

  • Custom entries: Entries that you create and manage for custom resources. Custom resources are resources in non-Google services that are hosted either in Google Cloud or externally (for example, on-premises).

Entry groups

An entry group is a container for one or more entries. You can use entry groups to manage access control and regional location for the entries. Every entry group belongs to a project.

Categories of entry groups

  • System entry groups: For Google Cloud resources, Dataplex automatically creates an entry group for each system in every project and location where the resources reside. For example, @bigquery is the system entry group for BigQuery.

  • Custom entry groups: Entry groups that you create for custom resources.

Entry types

Entry types define the required metadata for entries of this type, using a set of required aspect types.

You can specify the required aspect types only on entries, and not on the columns of an entry. When you create an entry of a specific entry type, you must provide values for all required aspect types that are specified by the entry type.

Required aspect types that are referenced within an entry type must belong to the same project as the entry type.

Categories of entry types

  • Custom entry types: Entry types that you create and manage. You can use these entry types to create custom entries.

  • System entry types: Dataplex provides these entry types by default. System entry types are further categorized into reusable and restricted.

    The following table describes the categories of system entry types, and the list of entry types that Dataplex provides for each of the categories:

    Category of system entry type | Description | Entry types that Dataplex provides
    Reusable system entry type | You can use this entry type to create custom entries. | generic
    Restricted system entry type | These are reserved for system use, such as creating entries for Google Cloud resources. You can't use these entry types to create entries, but you can edit entries of these entry types to add optional aspects. | bigquery-connection, bigquery-dataset, bigquery-model, bigquery-routine, bigquery-table, bigquery-view, cloudsql-database, cloudsql-instance, cloudsql-schema, cloudsql-table, cloudsql-view, dataform-repository, dataform-code-asset, sql-access, storage, storage-bucket, storage-folder

You can create a custom entry type in a specific regional location or as a global resource. System entry types are always global. The location of an entry type impacts the scope of its applicability. For more information, see Project and location constraints.

Before you begin

Before you manage entries and ingest custom data sources, ensure that you have completed the tasks described in this section.

Required roles

To get the permissions that you need to create and manage entries, ask your administrator to grant you the following IAM roles on the resource:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information, see Dataplex IAM roles.

Enable the API

Enable the Dataplex API in your Dataplex project.

Create an entry group

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry groups > Custom tab.

  3. Click Create entry group (Dataplex Catalog).

  4. In the Create entry group window, enter the following:

    1. Optional: In the Display name field, enter a display name for your entry group.
    2. Entry group ID: Enter a unique ID for your entry group.
    3. Optional: In the Description field, enter a description for your entry group.
    4. Location: Select a location. You can't modify the location after you create the entry group.
  5. Optional: In the Labels section, add arbitrary labels as key-value pairs to your resources:

    1. Click Add label.
    2. In the Key field, enter a key.
    3. In the Value field, enter a value for the key.
    4. To add more labels, click Add label and repeat the steps.
  6. Click Save.

gcloud

To create an entry group, use the gcloud dataplex entry-groups create command.

REST

To create an entry group, use the entryGroups.create method.
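As a rough illustration of the REST call, the following Python sketch builds the HTTP request without sending it. The helper name, project ID, location, and entry group ID are hypothetical. The URL layout and the entryGroupId query parameter are assumed from the usual shape of the Dataplex v1 REST API, so check them against the API reference. A real call also needs an OAuth 2.0 access token in an Authorization header.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the Dataplex REST API (v1).
API_ROOT = "https://dataplex.googleapis.com/v1"


def build_create_entry_group_request(project, location, entry_group_id,
                                     display_name="", description="", labels=None):
    """Build (but do not send) an HTTP request for entryGroups.create."""
    parent = f"projects/{project}/locations/{location}"
    query = urllib.parse.urlencode({"entryGroupId": entry_group_id})
    body = {
        "displayName": display_name,
        "description": description,
        "labels": labels or {},
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/{parent}/entryGroups?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_create_entry_group_request(
    "example-project", "us-central1", "example-entry-group",
    display_name="Example entry group",
    labels={"team": "data-platform"},
)
print(request.get_method(), request.full_url)
```

The location is part of the request path, which matches the rule that you can't modify the location after you create the entry group.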

Create an entry type

To ingest a new source, you must create an entry type.

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry types > Custom tab.

  3. Click Create.

  4. In the Create entry type window, enter the following:

    1. Optional: In the Display name field, enter a display name for your entry type.
    2. Entry type ID: Enter a unique ID for your entry type. You can't modify this after you create the entry type.
    3. Optional: In the Description field, enter a description for your entry type.
    4. Optional: In the System field, enter the source system.
    5. Optional: In the Platform field, enter the platform that entries of this type belong to. For example, Google Cloud.
    6. In the Location field, select a location. You can't modify the location after you create the entry type.
  5. Optional: In the Type aliases section, define the data type for your entry type. The data type can be used for querying entries.

    1. Click Add type alias.
    2. In the Type alias field, select a data type. You can add multiple type aliases.
  6. In the Required aspect types section, select the aspect types that are mandatory for this entry type. Each entry that is created based on this type will have these required aspect types assigned.

    1. Click Choose aspect type.
    2. In the Select aspect types window, select the aspect type.
    3. Click Select.

    You can't delete the required aspects from an entry.

  7. Optional: In the Labels section, add arbitrary labels as key-value pairs to your resources:

    1. Click Add label.
    2. In the Key field, enter a key.
    3. In the Value field, enter a value for the key.
    4. To add more labels, click Add label and repeat the steps.
  8. Click Save.

gcloud

To create an entry type, use the gcloud dataplex entry-types create command.

REST

To create an entry type, use the entryTypes.create method.
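The following Python sketch shows one way the request for this method might be assembled, without sending it. The helper name and all IDs are hypothetical. The field names (requiredAspects, typeAliases, and so on) and the entryTypeId query parameter are assumptions based on the v1 REST resource and should be verified against the API reference. The sketch places the required aspect types in the same project as the entry type, as this document requires. Placing them in the same location is a simplifying assumption.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the Dataplex REST API (v1).
API_ROOT = "https://dataplex.googleapis.com/v1"


def build_create_entry_type_request(project, location, entry_type_id,
                                    required_aspect_type_ids, **fields):
    """Build (but do not send) an HTTP request for entryTypes.create."""
    parent = f"projects/{project}/locations/{location}"
    # Optional fields such as displayName, description, system, or platform.
    body = dict(fields)
    # Reference each required aspect type by resource name, in the same project.
    body["requiredAspects"] = [
        {"type": f"{parent}/aspectTypes/{aspect_type_id}"}
        for aspect_type_id in required_aspect_type_ids
    ]
    query = urllib.parse.urlencode({"entryTypeId": entry_type_id})
    return urllib.request.Request(
        url=f"{API_ROOT}/{parent}/entryTypes?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_create_entry_type_request(
    "example-project", "us-central1", "example-entry-type",
    required_aspect_type_ids=["example-aspect-type"],
    displayName="Example entry type",
    typeAliases=["TABLE"],
)
print(request.full_url)
print(json.loads(request.data)["requiredAspects"])
```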

Create a custom entry

Before you create a custom entry, ensure that you have created an entry group and an entry type.

Console

Creating a custom entry using the Google Cloud console isn't supported. Instead, use the Google Cloud CLI or the API.

gcloud

To create a custom entry, use the gcloud dataplex entries create command.

REST

To create a custom entry, use the entries.create method.

After you create a custom entry, you can add aspects to the entry. For more information, see Add aspects to an entry.
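To show how the entry group, entry type, and entry fit together in one request, the following Python sketch builds an entries.create request without sending it. The helper name and all IDs are hypothetical. The entryId query parameter, the body field names, and the aspect key format (project, location, and aspect type ID joined by periods) are assumptions to verify against the API reference, and the aspect data shown is invented for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the Dataplex REST API (v1).
API_ROOT = "https://dataplex.googleapis.com/v1"


def build_create_entry_request(project, location, entry_group_id, entry_id,
                               entry_type_id, aspects):
    """Build (but do not send) an HTTP request for entries.create."""
    location_path = f"projects/{project}/locations/{location}"
    # The entry is created inside the entry group, so the group is the parent.
    parent = f"{location_path}/entryGroups/{entry_group_id}"
    body = {
        "entryType": f"{location_path}/entryTypes/{entry_type_id}",
        # Must include a value for every aspect type the entry type requires.
        "aspects": aspects,
    }
    query = urllib.parse.urlencode({"entryId": entry_id})
    return urllib.request.Request(
        url=f"{API_ROOT}/{parent}/entries?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values; the aspect key format and data fields are assumptions.
request = build_create_entry_request(
    "example-project", "us-central1", "example-entry-group", "example-entry",
    "example-entry-type",
    aspects={
        "example-project.us-central1.example-aspect-type": {
            "data": {"owner": "data-team"},
        },
    },
)
print(request.full_url)
```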

Manage entry groups

This section describes how to view the list of available entry groups, view details, update, and delete entry groups.

View the list of available entry groups

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry groups tab.

    This page lists all the available Dataplex Catalog and Data Catalog entry groups.

  3. To view custom entry groups, click the Custom tab. In the Custom tab, the Catalog source column displays where the resource resides: Dataplex Catalog or Data Catalog.

    To view system entry groups, click the System tab.

    For more information about custom and system entry groups, see the categories of entry groups section of this document.

  4. Optional: To view the list of entry groups in your selected project, click the Custom tab, and then click the Show from all projects toggle to the off position.

    The Show from all projects toggle is on by default, and the list includes Dataplex Catalog resources from your selected organization and Data Catalog resources from all the organizations that you can access.

gcloud

To view the list of available entry groups, use the gcloud dataplex entry-groups list command.

REST

To view the list of available entry groups, use the entryGroups.list method.
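List methods in Google Cloud REST APIs typically return results in pages. The following Python sketch follows the page tokens until the list is exhausted. It assumes the pageSize and pageToken query parameters and the entryGroups and nextPageToken response fields, which you should confirm in the API reference. The fetch_json argument is a hypothetical callable that performs an authenticated GET and returns the parsed JSON. Here it is replaced by a stand-in that returns canned pages.

```python
import urllib.parse

# Assumed base URL of the Dataplex REST API (v1).
API_ROOT = "https://dataplex.googleapis.com/v1"


def iter_entry_groups(fetch_json, project, location, page_size=100):
    """Yield entry groups page by page, following nextPageToken."""
    base = f"{API_ROOT}/projects/{project}/locations/{location}/entryGroups"
    page_token = ""
    while True:
        params = {"pageSize": page_size}
        if page_token:
            params["pageToken"] = page_token
        response = fetch_json(f"{base}?{urllib.parse.urlencode(params)}")
        yield from response.get("entryGroups", [])
        page_token = response.get("nextPageToken", "")
        if not page_token:
            break


def fake_fetch_json(url):
    """Stand-in for an authenticated GET; returns two canned pages."""
    prefix = "projects/example-project/locations/us-central1/entryGroups"
    if "pageToken=page-2" in url:
        return {"entryGroups": [{"name": f"{prefix}/group-c"}]}
    return {
        "entryGroups": [{"name": f"{prefix}/group-a"}, {"name": f"{prefix}/group-b"}],
        "nextPageToken": "page-2",
    }


names = [group["name"].rsplit("/", 1)[-1]
         for group in iter_entry_groups(fake_fetch_json, "example-project", "us-central1")]
print(names)  # ['group-a', 'group-b', 'group-c']
```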

View details of an entry group

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry groups tab.

  3. Click the entry group for which you want to view the details.

    The entry group details page opens. You can access information such as display name, entry group ID, description, project ID, location, labels, creation date, and last modified date of the selected entry group.

    For a Data Catalog entry group, you can view the details in both the Data Catalog and Dataplex Catalog web interfaces. To do this, on the entry group details page, click Data Catalog or Dataplex Catalog.

    The Sample entries section displays 10 related entries created recently in the selected entry group.

  4. Optional: To view all the entries related to an entry group, in the Sample entries section, click Show all related entries in search.

gcloud

To retrieve the details of an entry group, use the gcloud dataplex entry-groups describe command.

REST

To retrieve the details of an entry group, use the entryGroups.get method.

Update an entry group

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry groups > Custom tab.

  3. Click the entry group that you want to update.

  4. On the Entry group details page, click Edit.

  5. Edit the display name, description, and labels, as required.

  6. Click Save.

gcloud

To update an entry group, use the gcloud dataplex entry-groups update command.

REST

To update an entry group, use the entryGroups.patch method.
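Patch methods usually take an update mask that names the fields to change. The following Python sketch builds such a request without sending it and limits changes to the display name, description, and labels, the fields that the console steps list as editable. The helper name and IDs are hypothetical, and the updateMask query parameter is an assumption to confirm in the API reference.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the Dataplex REST API (v1).
API_ROOT = "https://dataplex.googleapis.com/v1"

# Fields that this document lists as editable for an entry group.
EDITABLE_FIELDS = {"displayName", "description", "labels"}


def build_patch_entry_group_request(project, location, entry_group_id, changes):
    """Build (but do not send) an HTTP request for entryGroups.patch."""
    unknown = set(changes) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"Fields are not editable: {sorted(unknown)}")
    name = f"projects/{project}/locations/{location}/entryGroups/{entry_group_id}"
    # The mask lists exactly the fields present in the request body.
    query = urllib.parse.urlencode({"updateMask": ",".join(sorted(changes))})
    return urllib.request.Request(
        url=f"{API_ROOT}/{name}?{query}",
        data=json.dumps(changes).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


# Hypothetical values, for illustration only.
request = build_patch_entry_group_request(
    "example-project", "us-central1", "example-entry-group",
    {"displayName": "Renamed group", "description": "Updated description"},
)
print(request.get_method(), request.full_url)
```

Rejecting other fields up front, such as the location, mirrors the rule that the location can't change after the entry group is created.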

Delete an entry group

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry groups > Custom tab.

  3. Click the entry group that you want to delete.

  4. On the Entry group details page, click Delete. Confirm when prompted.

gcloud

To delete an entry group, use the gcloud dataplex entry-groups delete command.

REST

To delete an entry group, use the entryGroups.delete method.
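A delete call addresses the entry group by its full resource name. The following minimal Python sketch builds the request without sending it. The helper name and IDs are hypothetical, and the URL layout is assumed from the usual shape of the Dataplex v1 REST API.

```python
import urllib.request

# Assumed base URL of the Dataplex REST API (v1).
API_ROOT = "https://dataplex.googleapis.com/v1"


def build_delete_entry_group_request(project, location, entry_group_id):
    """Build (but do not send) an HTTP request for entryGroups.delete."""
    name = f"projects/{project}/locations/{location}/entryGroups/{entry_group_id}"
    return urllib.request.Request(url=f"{API_ROOT}/{name}", method="DELETE")


# Hypothetical values, for illustration only.
request = build_delete_entry_group_request(
    "example-project", "us-central1", "example-entry-group")
print(request.get_method(), request.full_url)
```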

Manage entry types

This section describes how to view the list of available entry types, view details, update, and delete entry types.

View the list of available entry types

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry types tab.

  3. To view custom entry types, click the Custom tab. To view system entry types, click the System tab. For more information about custom and system entry types, see the categories of entry types section of this document.

  4. Optional: To view the list of entry types in your selected project, click the Custom tab, and then click the Show from all projects toggle to the off position.

    The Show from all projects toggle is on by default, and the list includes entry types across all projects.

gcloud

To view the list of available entry types, use the gcloud dataplex entry-types list command.

REST

To view the list of available entry types, use the entryTypes.list method.

View details of an entry type

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry types > Custom tab.

  3. Click the entry type for which you want to view the details.

    The entry type details page opens. You can access information such as display name, entry type ID, description, project ID, location, platform, system, type aliases, labels, creation date, and last modified date of the selected entry type.

  4. Optional: To view the list of 10 related entries created recently, click the Sample entries tab.

  5. Optional: To view all the entries related to an entry type, click the Sample entries tab and then click Show all related entries in search.

gcloud

To retrieve the details of an entry type, use the gcloud dataplex entry-types describe command.

REST

To retrieve the details of an entry type, use the entryTypes.get method.

Update an entry type

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry types > Custom tab.

  3. Click the entry type that you want to update.

  4. On the Entry type details page, click Edit.

  5. Edit the display name, description, system, platform, type aliases, and labels, as required.

  6. Click Save.

gcloud

To update an entry type, use the gcloud dataplex entry-types update command.

REST

To update an entry type, use the entryTypes.patch method.

Delete an entry type

Console

  1. In the Google Cloud console, go to the Dataplex Catalog page.

    Go to Catalog

  2. Click the Entry types > Custom tab.

  3. Click the entry type that you want to delete.

  4. On the Entry type details page, click Delete. Confirm when prompted.

gcloud

To delete an entry type, use the gcloud dataplex entry-types delete command.

REST

To delete an entry type, use the entryTypes.delete method.

What's next