Package google.cloud.dataplex.v1

Index

CatalogService

The primary resources offered by this service are EntryGroups, EntryTypes, AspectTypes, and Entries. They collectively let data administrators organize, manage, secure, and catalog data located across cloud projects in their organization in a variety of storage systems, including Cloud Storage and BigQuery.

CancelMetadataJob

rpc CancelMetadataJob(CancelMetadataJobRequest) returns (Empty)

Cancels a metadata job.

If you cancel a metadata import job that is in progress, the changes in the job might be partially applied. We recommend that you reset the state of the entry groups in your project by running another metadata job that reverts the changes from the canceled job.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
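As a rough illustration of where the scope fits, the sketch below builds (but does not send) a REST-style cancel request with Python's standard library. The endpoint host, the `:cancel` URL suffix, the resource-name layout, and the placeholder token are assumptions for illustration and are not stated in this reference. In practice the bearer token would be an OAuth 2.0 access token issued for the scope listed above.

```python
import urllib.request

# The OAuth scope listed above; the access token must be issued for it.
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_cancel_request(job_name: str, access_token: str) -> urllib.request.Request:
    """Build (without sending) a POST request that cancels a metadata job.

    Assumption: the REST mapping is POST /v1/{name}:cancel on
    dataplex.googleapis.com. Check the REST reference before relying on it.
    """
    url = f"https://dataplex.googleapis.com/v1/{job_name}:cancel"
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical resource name and token, for illustration only.
request = build_cancel_request(
    "projects/example-project/locations/us-central1/metadataJobs/example-job",
    "EXAMPLE_TOKEN",
)
print(request.get_method(), request.full_url)
```

Because a canceled import job may be partially applied, a caller would typically follow a call like this with a second metadata job that reverts the changes.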

CreateAspectType

rpc CreateAspectType(CreateAspectTypeRequest) returns (Operation)

Creates an AspectType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.aspectTypes.create

For more information, see the IAM documentation.

CreateEntry

rpc CreateEntry(CreateEntryRequest) returns (Entry)

Creates an Entry.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permissions on the parent resource:

  • dataplex.aspectTypes.use
  • dataplex.entries.create
  • dataplex.entryGroups.useContactsAspect
  • dataplex.entryGroups.useGenericAspect
  • dataplex.entryGroups.useGenericEntry
  • dataplex.entryGroups.useOverviewAspect
  • dataplex.entryGroups.useSchemaAspect
  • dataplex.entryTypes.use

For more information, see the IAM documentation.
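The permission list above can be used for a simple pre-flight check. The sketch below treats every listed permission as required and reports which ones a caller lacks. The `granted` set is hypothetical; how the caller's permissions are obtained is outside the scope of this example.

```python
from typing import FrozenSet, Iterable, List

# IAM permissions listed above for CreateEntry.
CREATE_ENTRY_PERMISSIONS: FrozenSet[str] = frozenset({
    "dataplex.aspectTypes.use",
    "dataplex.entries.create",
    "dataplex.entryGroups.useContactsAspect",
    "dataplex.entryGroups.useGenericAspect",
    "dataplex.entryGroups.useGenericEntry",
    "dataplex.entryGroups.useOverviewAspect",
    "dataplex.entryGroups.useSchemaAspect",
    "dataplex.entryTypes.use",
})


def missing_permissions(
    granted: Iterable[str],
    required: FrozenSet[str] = CREATE_ENTRY_PERMISSIONS,
) -> List[str]:
    """Return the required permissions absent from `granted`, sorted."""
    return sorted(required - set(granted))


# Hypothetical set of permissions held by a caller.
granted = {"dataplex.entries.create", "dataplex.entryTypes.use"}
for permission in missing_permissions(granted):
    print("missing:", permission)
```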

CreateEntryGroup

rpc CreateEntryGroup(CreateEntryGroupRequest) returns (Operation)

Creates an EntryGroup.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.entryGroups.create

For more information, see the IAM documentation.

CreateEntryType

rpc CreateEntryType(CreateEntryTypeRequest) returns (Operation)

Creates an EntryType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.entryTypes.create

For more information, see the IAM documentation.

CreateMetadataJob

rpc CreateMetadataJob(CreateMetadataJobRequest) returns (Operation)

Creates a metadata job. For example, use a metadata job to import Dataplex Catalog entries and aspects from a third-party system into Dataplex.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteAspectType

rpc DeleteAspectType(DeleteAspectTypeRequest) returns (Operation)

Deletes an AspectType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.aspectTypes.delete

For more information, see the IAM documentation.

DeleteEntry

rpc DeleteEntry(DeleteEntryRequest) returns (Entry)

Deletes an Entry.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entries.delete

For more information, see the IAM documentation.

DeleteEntryGroup

rpc DeleteEntryGroup(DeleteEntryGroupRequest) returns (Operation)

Deletes an EntryGroup.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entryGroups.delete

For more information, see the IAM documentation.

DeleteEntryType

rpc DeleteEntryType(DeleteEntryTypeRequest) returns (Operation)

Deletes an EntryType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entryTypes.delete

For more information, see the IAM documentation.

GetAspectType

rpc GetAspectType(GetAspectTypeRequest) returns (AspectType)

Gets an AspectType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.aspectTypes.get

For more information, see the IAM documentation.

GetEntry

rpc GetEntry(GetEntryRequest) returns (Entry)

Gets an Entry.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entries.get

For more information, see the IAM documentation.

GetEntryGroup

rpc GetEntryGroup(GetEntryGroupRequest) returns (EntryGroup)

Gets an EntryGroup.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entryGroups.get

For more information, see the IAM documentation.

GetEntryType

rpc GetEntryType(GetEntryTypeRequest) returns (EntryType)

Gets an EntryType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entryTypes.get

For more information, see the IAM documentation.

GetMetadataJob

rpc GetMetadataJob(GetMetadataJobRequest) returns (MetadataJob)

Gets a metadata job.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListAspectTypes

rpc ListAspectTypes(ListAspectTypesRequest) returns (ListAspectTypesResponse)

Lists AspectType resources in a project and location.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.aspectTypes.list

For more information, see the IAM documentation.

ListEntries

rpc ListEntries(ListEntriesRequest) returns (ListEntriesResponse)

Lists Entries within an EntryGroup.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.entries.list

For more information, see the IAM documentation.
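List RPCs such as ListEntries typically return results one page at a time. The sketch below drains all pages, assuming the common convention in which the request accepts a page token and the response carries a next-page token that is empty on the last page. This reference does not spell out those fields, and the `list_page` callable and the entry names are hypothetical.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page fetcher takes a page token (None for the first page) and returns
# (items, next_page_token); an empty token means there are no more pages.
PageFetcher = Callable[[Optional[str]], Tuple[List[str], str]]


def iterate_all(list_page: PageFetcher) -> Iterator[str]:
    """Yield every item across pages until the next-page token is empty."""
    token: Optional[str] = None
    while True:
        items, next_token = list_page(token)
        yield from items
        if not next_token:
            return
        token = next_token


# Fake backend with three pages of hypothetical entry names.
_PAGES = {
    None: (["entry-a", "entry-b"], "t1"),
    "t1": (["entry-c"], "t2"),
    "t2": (["entry-d"], ""),
}


def fake_list_entries(page_token: Optional[str]) -> Tuple[List[str], str]:
    return _PAGES[page_token]


entries = list(iterate_all(fake_list_entries))
print(entries)
```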

ListEntryGroups

rpc ListEntryGroups(ListEntryGroupsRequest) returns (ListEntryGroupsResponse)

Lists EntryGroup resources in a project and location.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.entryGroups.list

For more information, see the IAM documentation.

ListEntryTypes

rpc ListEntryTypes(ListEntryTypesRequest) returns (ListEntryTypesResponse)

Lists EntryType resources in a project and location.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • dataplex.entryTypes.list

For more information, see the IAM documentation.

ListMetadataJobs

rpc ListMetadataJobs(ListMetadataJobsRequest) returns (ListMetadataJobsResponse)

Lists metadata jobs.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

LookupEntry

rpc LookupEntry(LookupEntryRequest) returns (Entry)

Looks up a single Entry by name using the permission on the source system.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

SearchEntries

rpc SearchEntries(SearchEntriesRequest) returns (SearchEntriesResponse)

Searches for Entries matching the given query and scope.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.projects.search

For more information, see the IAM documentation.

UpdateAspectType

rpc UpdateAspectType(UpdateAspectTypeRequest) returns (Operation)

Updates an AspectType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.aspectTypes.update

For more information, see the IAM documentation.

UpdateEntry

rpc UpdateEntry(UpdateEntryRequest) returns (Entry)

Updates an Entry.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permissions on the name resource:

  • dataplex.aspectTypes.use
  • dataplex.entries.create
  • dataplex.entries.update
  • dataplex.entryGroups.useContactsAspect
  • dataplex.entryGroups.useGenericAspect
  • dataplex.entryGroups.useGenericEntry
  • dataplex.entryGroups.useOverviewAspect
  • dataplex.entryGroups.useSchemaAspect
  • dataplex.entryTypes.use

For more information, see the IAM documentation.

UpdateEntryGroup

rpc UpdateEntryGroup(UpdateEntryGroupRequest) returns (Operation)

Updates an EntryGroup.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entryGroups.update

For more information, see the IAM documentation.

UpdateEntryType

rpc UpdateEntryType(UpdateEntryTypeRequest) returns (Operation)

Updates an EntryType.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.entryTypes.update

For more information, see the IAM documentation.

ContentService

ContentService manages notebooks and SQL scripts for Dataplex.

CreateContent

rpc CreateContent(CreateContentRequest) returns (Content)

Creates a content resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteContent

rpc DeleteContent(DeleteContentRequest) returns (Empty)

Deletes a content resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetContent

rpc GetContent(GetContentRequest) returns (Content)

Gets a content resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetIamPolicy

rpc GetIamPolicy(GetIamPolicyRequest) returns (Policy)

Gets the access control policy for a contentitem resource. A NOT_FOUND error is returned if the resource does not exist. An empty policy is returned if the resource exists but does not have a policy set on it.

Caller must have Google IAM dataplex.content.getIamPolicy permission on the resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
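The three outcomes described for GetIamPolicy (resource missing, resource present without a policy, resource present with a policy) can be told apart as in the sketch below. It uses an in-memory stand-in rather than the real service, so `NotFoundError`, `get_iam_policy`, and the resource names are all hypothetical.

```python
from typing import Dict, List, Optional


class NotFoundError(Exception):
    """Hypothetical stand-in for a NOT_FOUND status."""


# In-memory stand-in: resource name -> policy bindings, or None when the
# resource exists but no policy has been set on it.
_RESOURCES: Dict[str, Optional[List[dict]]] = {
    "content/with-policy": [
        {"role": "roles/viewer", "members": ["user:a@example.com"]}
    ],
    "content/no-policy": None,
}


def get_iam_policy(resource: str) -> dict:
    """Mimic the documented behaviour of GetIamPolicy."""
    if resource not in _RESOURCES:
        raise NotFoundError(resource)  # the resource does not exist
    bindings = _RESOURCES[resource]
    return {"bindings": bindings or []}  # empty policy if none is set


def describe(resource: str) -> str:
    """Classify a resource by the outcome of the policy lookup."""
    try:
        policy = get_iam_policy(resource)
    except NotFoundError:
        return "missing resource"
    return "policy present" if policy["bindings"] else "no policy set"


for name in ("content/with-policy", "content/no-policy", "content/absent"):
    print(name, "->", describe(name))
```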

ListContent

rpc ListContent(ListContentRequest) returns (ListContentResponse)

Lists content resources.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

SetIamPolicy

rpc SetIamPolicy(SetIamPolicyRequest) returns (Policy)

Sets the access control policy on the specified contentitem resource. Replaces any existing policy.

Caller must have Google IAM dataplex.content.setIamPolicy permission on the resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

TestIamPermissions

rpc TestIamPermissions(TestIamPermissionsRequest) returns (TestIamPermissionsResponse)

Returns the caller's permissions on a resource. If the resource does not exist, an empty set of permissions is returned (a NOT_FOUND error is not returned).

A caller is not required to have Google IAM permission to make this request.

Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
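Since TestIamPermissions is intended for permission-aware interfaces rather than authorization checks, its result is best treated as a display hint. The sketch below shows that pattern under stated assumptions: the action names are hypothetical, and the two permission strings are the ones named in the GetIamPolicy and SetIamPolicy descriptions in this reference.

```python
from typing import Dict, Iterable

# Hypothetical mapping from UI actions to the permission each one needs.
ACTION_PERMISSIONS: Dict[str, str] = {
    "view_sharing": "dataplex.content.getIamPolicy",
    "edit_sharing": "dataplex.content.setIamPolicy",
}


def ui_hints(granted: Iterable[str]) -> Dict[str, bool]:
    """Decide which controls to enable. This is a display hint only.

    The service must still enforce access on every real call, because the
    permission test may "fail open".
    """
    granted_set = set(granted)
    return {
        action: permission in granted_set
        for action, permission in ACTION_PERMISSIONS.items()
    }


# Pretend the permission test returned only the read permission.
hints = ui_hints(["dataplex.content.getIamPolicy"])
print(hints)
```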

UpdateContent

rpc UpdateContent(UpdateContentRequest) returns (Content)

Updates a content resource. Only full resource updates are supported.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DataScanService

DataScanService manages DataScan resources, which can be configured to run various types of data scanning workloads and generate enriched metadata (for example, Data Profile and Data Quality) for the data source.

CreateDataScan

rpc CreateDataScan(CreateDataScanRequest) returns (Operation)

Creates a DataScan resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteDataScan

rpc DeleteDataScan(DeleteDataScanRequest) returns (Operation)

Deletes a DataScan resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GenerateDataQualityRules

rpc GenerateDataQualityRules(GenerateDataQualityRulesRequest) returns (GenerateDataQualityRulesResponse)

Generates recommended data quality rules based on the results of a data profiling scan.

Use the recommendations to build rules for a data quality scan.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataScan

rpc GetDataScan(GetDataScanRequest) returns (DataScan)

Gets a DataScan resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataScanJob

rpc GetDataScanJob(GetDataScanJobRequest) returns (DataScanJob)

Gets a DataScanJob resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDataScanJobs

rpc ListDataScanJobs(ListDataScanJobsRequest) returns (ListDataScanJobsResponse)

Lists DataScanJobs under the given DataScan.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDataScans

rpc ListDataScans(ListDataScansRequest) returns (ListDataScansResponse)

Lists DataScans.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

RunDataScan

rpc RunDataScan(RunDataScanRequest) returns (RunDataScanResponse)

Runs an on-demand execution of a DataScan.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateDataScan

rpc UpdateDataScan(UpdateDataScanRequest) returns (Operation)

Updates a DataScan resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DataTaxonomyService

DataTaxonomyService enables attribute-based governance. The resources currently offered include DataTaxonomy and DataAttribute.

CreateDataAttribute

rpc CreateDataAttribute(CreateDataAttributeRequest) returns (Operation)

Creates a DataAttribute resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateDataAttributeBinding

rpc CreateDataAttributeBinding(CreateDataAttributeBindingRequest) returns (Operation)

Creates a DataAttributeBinding resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateDataTaxonomy

rpc CreateDataTaxonomy(CreateDataTaxonomyRequest) returns (Operation)

Creates a DataTaxonomy resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteDataAttribute

rpc DeleteDataAttribute(DeleteDataAttributeRequest) returns (Operation)

Deletes a DataAttribute resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteDataAttributeBinding

rpc DeleteDataAttributeBinding(DeleteDataAttributeBindingRequest) returns (Operation)

Deletes a DataAttributeBinding resource. All attributes within the DataAttributeBinding must be deleted before the DataAttributeBinding can be deleted.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteDataTaxonomy

rpc DeleteDataTaxonomy(DeleteDataTaxonomyRequest) returns (Operation)

Deletes a DataTaxonomy resource. All attributes within the DataTaxonomy must be deleted before the DataTaxonomy can be deleted.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataAttribute

rpc GetDataAttribute(GetDataAttributeRequest) returns (DataAttribute)

Retrieves a DataAttribute resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataAttributeBinding

rpc GetDataAttributeBinding(GetDataAttributeBindingRequest) returns (DataAttributeBinding)

Retrieves a DataAttributeBinding resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataTaxonomy

rpc GetDataTaxonomy(GetDataTaxonomyRequest) returns (DataTaxonomy)

Retrieves a DataTaxonomy resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDataAttributeBindings

rpc ListDataAttributeBindings(ListDataAttributeBindingsRequest) returns (ListDataAttributeBindingsResponse)

Lists DataAttributeBinding resources in a project and location.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDataAttributes

rpc ListDataAttributes(ListDataAttributesRequest) returns (ListDataAttributesResponse)

Lists DataAttribute resources in a DataTaxonomy.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDataTaxonomies

rpc ListDataTaxonomies(ListDataTaxonomiesRequest) returns (ListDataTaxonomiesResponse)

Lists DataTaxonomy resources in a project and location.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateDataAttribute

rpc UpdateDataAttribute(UpdateDataAttributeRequest) returns (Operation)

Updates a DataAttribute resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateDataAttributeBinding

rpc UpdateDataAttributeBinding(UpdateDataAttributeBindingRequest) returns (Operation)

Updates a DataAttributeBinding resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateDataTaxonomy

rpc UpdateDataTaxonomy(UpdateDataTaxonomyRequest) returns (Operation)

Updates a DataTaxonomy resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DataplexService

DataplexService provides data lakes as a service. Its primary resources are Lakes, Zones, and Assets, which together let data administrators organize, manage, secure, and catalog data located across cloud projects in their organization in a variety of storage systems, including Cloud Storage and BigQuery.

CancelJob

rpc CancelJob(CancelJobRequest) returns (Empty)

Cancels jobs running for the task resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateAsset

rpc CreateAsset(CreateAssetRequest) returns (Operation)

Creates an asset resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateEnvironment

rpc CreateEnvironment(CreateEnvironmentRequest) returns (Operation)

Creates an environment resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateLake

rpc CreateLake(CreateLakeRequest) returns (Operation)

Creates a lake resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateTask

rpc CreateTask(CreateTaskRequest) returns (Operation)

Creates a task resource within a lake.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateZone

rpc CreateZone(CreateZoneRequest) returns (Operation)

Creates a zone resource within a lake.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteAsset

rpc DeleteAsset(DeleteAssetRequest) returns (Operation)

Deletes an asset resource. The referenced storage resource is detached (default) or deleted based on the associated Lifecycle policy.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteEnvironment

rpc DeleteEnvironment(DeleteEnvironmentRequest) returns (Operation)

Deletes the environment resource. All child resources must be deleted before the environment can be deleted.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteLake

rpc DeleteLake(DeleteLakeRequest) returns (Operation)

Deletes a lake resource. All zones within the lake must be deleted before the lake can be deleted.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteTask

rpc DeleteTask(DeleteTaskRequest) returns (Operation)

Deletes the task resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteZone

rpc DeleteZone(DeleteZoneRequest) returns (Operation)

Deletes a zone resource. All assets within a zone must be deleted before the zone can be deleted.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
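The constraints on DeleteLake and DeleteZone imply a bottom-up teardown order: assets first, then their zone, then the lake. The sketch below computes such an order for an in-memory description of a lake. The name layout and the example lake, zone, and asset identifiers are hypothetical.

```python
from typing import Dict, List

# Hypothetical hierarchy description: zone id -> asset ids in that zone.
ZoneAssets = Dict[str, List[str]]


def deletion_order(lake_name: str, zones: ZoneAssets) -> List[str]:
    """Return resource names in an order that satisfies the constraints:
    assets before their zone, and all zones before the lake."""
    order: List[str] = []
    for zone, assets in zones.items():
        order.extend(f"{lake_name}/zones/{zone}/assets/{asset}" for asset in assets)
        order.append(f"{lake_name}/zones/{zone}")
    order.append(lake_name)
    return order


plan = deletion_order(
    "lakes/example-lake",
    {"raw": ["landing-bucket"], "curated": ["sales-dataset", "ops-dataset"]},
)
for name in plan:
    print("delete", name)
```

Each delete RPC returns an Operation, so a real teardown would wait for one step to finish before starting a step that depends on it.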

GetAsset

rpc GetAsset(GetAssetRequest) returns (Asset)

Retrieves an asset resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetEnvironment

rpc GetEnvironment(GetEnvironmentRequest) returns (Environment)

Gets an environment resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetJob

rpc GetJob(GetJobRequest) returns (Job)

Gets a job resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetLake

rpc GetLake(GetLakeRequest) returns (Lake)

Retrieves a lake resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetTask

rpc GetTask(GetTaskRequest) returns (Task)

Gets a task resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetZone

rpc GetZone(GetZoneRequest) returns (Zone)

Retrieves a zone resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListAssetActions

rpc ListAssetActions(ListAssetActionsRequest) returns (ListActionsResponse)

Lists action resources in an asset.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListAssets

rpc ListAssets(ListAssetsRequest) returns (ListAssetsResponse)

Lists asset resources in a zone.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListEnvironments

rpc ListEnvironments(ListEnvironmentsRequest) returns (ListEnvironmentsResponse)

Lists environments under the given lake.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListJobs

rpc ListJobs(ListJobsRequest) returns (ListJobsResponse)

Lists jobs under the given task.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListLakeActions

rpc ListLakeActions(ListLakeActionsRequest) returns (ListActionsResponse)

Lists action resources in a lake.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListLakes

rpc ListLakes(ListLakesRequest) returns (ListLakesResponse)

Lists lake resources in a project and location.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListSessions

rpc ListSessions(ListSessionsRequest) returns (ListSessionsResponse)

Lists session resources in an environment.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListTasks

rpc ListTasks(ListTasksRequest) returns (ListTasksResponse)

Lists tasks under the given lake.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListZoneActions

rpc ListZoneActions(ListZoneActionsRequest) returns (ListActionsResponse)

Lists action resources in a zone.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListZones

rpc ListZones(ListZonesRequest) returns (ListZonesResponse)

Lists zone resources in a lake.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

RunTask

rpc RunTask(RunTaskRequest) returns (RunTaskResponse)

Runs an on-demand execution of a task.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • dataplex.tasks.run

For more information, see the IAM documentation.

UpdateAsset

rpc UpdateAsset(UpdateAssetRequest) returns (Operation)

Updates an asset resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateEnvironment

rpc UpdateEnvironment(UpdateEnvironmentRequest) returns (Operation)

Updates the environment resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateLake

rpc UpdateLake(UpdateLakeRequest) returns (Operation)

Updates a lake resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateTask

rpc UpdateTask(UpdateTaskRequest) returns (Operation)

Update the task resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateZone

rpc UpdateZone(UpdateZoneRequest) returns (Operation)

Updates a zone resource.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

MetadataService

Metadata service manages metadata resources such as tables, filesets and partitions.

CreateEntity

rpc CreateEntity(CreateEntityRequest) returns (Entity)

Create a metadata entity.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreatePartition

rpc CreatePartition(CreatePartitionRequest) returns (Partition)

Create a metadata partition.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteEntity

rpc DeleteEntity(DeleteEntityRequest) returns (Empty)

Delete a metadata entity.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeletePartition

rpc DeletePartition(DeletePartitionRequest) returns (Empty)

Delete a metadata partition.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetEntity

rpc GetEntity(GetEntityRequest) returns (Entity)

Get a metadata entity.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetPartition

rpc GetPartition(GetPartitionRequest) returns (Partition)

Get a metadata partition of an entity.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListEntities

rpc ListEntities(ListEntitiesRequest) returns (ListEntitiesResponse)

List metadata entities in a zone.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListPartitions

rpc ListPartitions(ListPartitionsRequest) returns (ListPartitionsResponse)

List metadata partitions of an entity.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateEntity

rpc UpdateEntity(UpdateEntityRequest) returns (Entity)

Update a metadata entity. Only supports full resource update.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

Action

Action represents an issue requiring administrator action for resolution.

Fields
category

Category

The category of issue associated with the action.

issue

string

Detailed description of the issue requiring action.

detect_time

Timestamp

The time that the issue was detected.

name

string

Output only. The relative resource name of the action, in one of the following forms:

  • projects/{project}/locations/{location}/lakes/{lake}/actions/{action}
  • projects/{project}/locations/{location}/lakes/{lake}/zones/{zone}/actions/{action}
  • projects/{project}/locations/{location}/lakes/{lake}/zones/{zone}/assets/{asset}/actions/{action}
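The action name forms listed for this field differ only in whether a zone and an asset segment are present. As a minimal sketch (the helper parse_action_name is hypothetical and not part of any client library), one regular expression can tell the forms apart and extract the segments:

```python
import re
from typing import Dict, Optional

# One pattern for all three forms: the zone segment is optional, and the
# asset segment can only appear inside a zone.
_ACTION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/lakes/(?P<lake>[^/]+)"
    r"(?:/zones/(?P<zone>[^/]+)(?:/assets/(?P<asset>[^/]+))?)?"
    r"/actions/(?P<action>[^/]+)$"
)


def parse_action_name(name: str) -> Optional[Dict[str, Optional[str]]]:
    """Split an action resource name into its segments.

    Returns None when the name matches none of the documented forms.
    Segments that are absent (zone, asset) are reported as None.
    """
    match = _ACTION_NAME.match(name)
    return match.groupdict() if match else None
```

A lake-level action yields None for both zone and asset, which gives a simple way to tell at which level the issue was raised.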

lake

string

Output only. The relative resource name of the lake, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}.

zone

string

Output only. The relative resource name of the zone, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}.

asset

string

Output only. The relative resource name of the asset, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/assets/{asset_id}.

data_locations[]

string

The list of data locations associated with this action. Cloud Storage locations are represented as URI paths (for example, gs://bucket/table1/year=2020/month=Jan/). BigQuery locations refer to resource names (for example, bigquery.googleapis.com/projects/project-id/datasets/dataset-id).

Union field details. Additional details about the action based on the action category. details can be only one of the following:
invalid_data_format

InvalidDataFormat

Details for issues related to invalid or unsupported data formats.

incompatible_data_schema

IncompatibleDataSchema

Details for issues related to incompatible schemas detected within data.

invalid_data_partition

InvalidDataPartition

Details for issues related to invalid or unsupported data partition structure.

missing_data

MissingData

Details for issues related to absence of data within managed resources.

missing_resource

MissingResource

Details for issues related to absence of a managed resource.

unauthorized_resource

UnauthorizedResource

Details for issues related to lack of permissions to access data resources.

failed_security_policy_apply

FailedSecurityPolicyApply

Details for issues related to applying security policy.

invalid_data_organization

InvalidDataOrganization

Details for issues related to invalid data arrangement.

Category

The category of issues.

Enums
CATEGORY_UNSPECIFIED Unspecified category.
RESOURCE_MANAGEMENT Resource management related issues.
SECURITY_POLICY Security policy related issues.
DATA_DISCOVERY Data and discovery related issues.

FailedSecurityPolicyApply

Failed to apply security policy to the managed resource(s) under a lake, zone or an asset. For a lake or zone resource, one or more underlying assets have a failure applying security policy to the associated managed resource.

Fields
asset

string

Resource name of one of the assets with failing security policy application. Populated for a lake or zone resource only.

IncompatibleDataSchema

Action details for incompatible schemas detected by discovery.

Fields
table

string

The name of the table containing invalid data.

existing_schema

string

The existing and expected schema of the table. The schema is provided as a JSON formatted structure listing columns and data types.

new_schema

string

The new and incompatible schema within the table. The schema is provided as a JSON formatted structure listing columns and data types.

sampled_data_locations[]

string

The list of data locations sampled and used for format/schema inference.

schema_change

SchemaChange

Whether the action relates to a schema that is incompatible or modified.

SchemaChange

Whether the action relates to a schema that is incompatible or modified.

Enums
SCHEMA_CHANGE_UNSPECIFIED Schema change unspecified.
INCOMPATIBLE Newly discovered schema is incompatible with existing schema.
MODIFIED Newly discovered schema has changed from existing schema for data in a curated zone.

InvalidDataFormat

Action details for invalid or unsupported data files detected by discovery.

Fields
sampled_data_locations[]

string

The list of data locations sampled and used for format/schema inference.

expected_format

string

The expected data format of the entity.

new_format

string

The new unexpected data format within the entity.

InvalidDataOrganization

This type has no fields.

Action details for invalid data arrangement.

InvalidDataPartition

Action details for invalid or unsupported partitions detected by discovery.

Fields
expected_structure

PartitionStructure

The issue type of InvalidDataPartition.

PartitionStructure

The expected partition structure.

Enums
PARTITION_STRUCTURE_UNSPECIFIED PartitionStructure unspecified.
CONSISTENT_KEYS Consistent hive-style partition definition (both raw and curated zone).
HIVE_STYLE_KEYS Hive style partition definition (curated zone only).

MissingData

This type has no fields.

Action details for absence of data detected by discovery.

MissingResource

This type has no fields.

Action details for resource references in assets that cannot be located.

UnauthorizedResource

This type has no fields.

Action details for unauthorized resource issues raised to indicate that the service account associated with the lake instance is not authorized to access or manage the resource associated with an asset.

Aspect

An aspect is a single piece of metadata describing an entry.

Fields
aspect_type

string

Output only. The resource name of the type used to create this Aspect.

path

string

Output only. The path in the entry under which the aspect is attached.

create_time

Timestamp

Output only. The time when the Aspect was created.

update_time

Timestamp

Output only. The time when the Aspect was last updated.

data

Struct

Required. The content of the aspect, according to its aspect type schema. The maximum size of the field is 120KB (encoded as UTF-8).

aspect_source

AspectSource

Optional. Information related to the source system of the aspect.

AspectSource

Information related to the source system of the aspect.

Fields
create_time

Timestamp

The time the aspect was created in the source system.

update_time

Timestamp

The time the aspect was last updated in the source system.

AspectType

AspectType is a template for creating Aspects, and represents the JSON-schema for a given Entry, for example, BigQuery Table Schema.

Fields
name

string

Output only. The relative resource name of the AspectType, of the form: projects/{project_number}/locations/{location_id}/aspectTypes/{aspect_type_id}.

uid

string

Output only. System generated globally unique ID for the AspectType. If you delete and recreate the AspectType with the same name, then this ID will be different.

create_time

Timestamp

Output only. The time when the AspectType was created.

update_time

Timestamp

Output only. The time when the AspectType was last updated.

description

string

Optional. Description of the AspectType.

display_name

string

Optional. User friendly display name.

labels

map<string, string>

Optional. User-defined labels for the AspectType.

etag

string

The service computes this checksum. The client may send it on update and delete requests to ensure it has an up-to-date value before proceeding.

authorization

Authorization

Immutable. Defines the Authorization for this type.

metadata_template

MetadataTemplate

Required. MetadataTemplate of the aspect.

Authorization

Authorization for an AspectType.

Fields
alternate_use_permission

string

Immutable. The IAM permission grantable on the EntryGroup to allow access to instantiate Aspects of Dataplex owned AspectTypes, only settable for Dataplex owned Types.

MetadataTemplate

MetadataTemplate definition for an AspectType.

Fields
index

int32

Optional. Index is used to encode Template messages. The value of index can range between 1 and 2,147,483,647. Index must be unique within all fields in a Template. (Nested Templates can reuse indexes). Once a Template is defined, the index cannot be changed, because it identifies the field in the actual storage format. Index is a mandatory field, but it is optional for top level fields, and map/array "values" definitions.

name

string

Required. The name of the field.

type

string

Required. The datatype of this field. The following values are supported:

Primitive types:

  • string
  • integer
  • boolean
  • double
  • datetime. Must be of the format RFC3339 UTC "Zulu" (Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z").

Complex types:

  • enum
  • array
  • map
  • record
record_fields[]

MetadataTemplate

Optional. Field definition. You must specify it if the type is record. It defines the nested fields.

enum_values[]

EnumValue

Optional. The list of values for an enum type. You must define it if the type is enum.

map_items

MetadataTemplate

Optional. If the type is map, set map_items. map_items can refer to a primitive field or a complex (record only) field. To specify a primitive field, you only need to set name and type in the nested MetadataTemplate. The recommended value for the name field is item, as this isn't used in the actual payload.

array_items

MetadataTemplate

Optional. If the type is array, set array_items. array_items can refer to a primitive field or a complex (record only) field. To specify a primitive field, you only need to set name and type in the nested MetadataTemplate. The recommended value for the name field is item, as this isn't used in the actual payload.

type_id

string

Optional. You can use type id if this definition of the field needs to be reused later. The type id must be unique across the entire template. You can only specify it if the field type is record.

type_ref

string

Optional. A reference to another field definition (not an inline definition). The value must be equal to the value of an id field defined elsewhere in the MetadataTemplate. Only fields with record type can refer to other fields.

constraints

Constraints

Optional. Specifies the constraints on this field.

annotations

Annotations

Optional. Specifies annotations on this field.
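To make the index and nesting rules above concrete, here is a minimal sketch that models a MetadataTemplate as a plain Python dictionary and checks a few of the stated rules locally. The template contents (ownership, owner_email, tier, tags) and the helper check_template are hypothetical, and the check is not the validation that the service performs. It covers only the index range, the uniqueness of indexes among sibling record fields, and the presence of enum_values on enum fields.

```python
from typing import Any, Dict, List

MAX_INDEX = 2_147_483_647  # upper bound stated for the index field

# Hypothetical template: a record with a string, an enum and an array field.
template: Dict[str, Any] = {
    "name": "ownership",
    "type": "record",
    "record_fields": [
        {"name": "owner_email", "type": "string", "index": 1,
         "constraints": {"required": True}},
        {"name": "tier", "type": "enum", "index": 2,
         "enum_values": [{"name": "GOLD", "index": 1},
                         {"name": "SILVER", "index": 2}]},
        {"name": "tags", "type": "array", "index": 3,
         "array_items": {"name": "item", "type": "string"}},
    ],
}


def check_template(node: Dict[str, Any], index_required: bool = False) -> List[str]:
    """Return human-readable problems found in a template dictionary."""
    problems: List[str] = []
    name = node.get("name", "<unnamed>")
    index = node.get("index")
    if index is None:
        # Top-level fields and map/array item definitions may omit the index.
        if index_required:
            problems.append(f"{name}: index is missing")
    elif not 1 <= index <= MAX_INDEX:
        problems.append(f"{name}: index {index} is out of range")

    kind = node.get("type")
    if kind == "record":
        seen = set()
        for field in node.get("record_fields", []):
            field_index = field.get("index")
            if field_index is not None:
                if field_index in seen:
                    problems.append(f"{name}: duplicate index {field_index}")
                seen.add(field_index)
            # Nested records start a fresh scope, so indexes may be reused there.
            problems += check_template(field, index_required=True)
    elif kind == "enum" and not node.get("enum_values"):
        problems.append(f"{name}: enum type needs enum_values")

    for key in ("map_items", "array_items"):
        if key in node:
            problems += check_template(node[key], index_required=False)
    return problems
```

Calling check_template(template) on the example returns an empty list. Giving two sibling fields the same index produces a "duplicate index" entry.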

Annotations

Definition of the annotations of a field.

Fields
deprecated

string

Optional. Marks a field as deprecated. You can include a deprecation message.

display_name

string

Optional. Display name for a field.

description

string

Optional. Description for a field.

display_order

int32

Optional. Display order for a field. You can use this to reorder where a field is rendered.

string_type

string

Optional. You can use String Type annotations to specify special meaning to string fields. The following values are supported:

  • richText: The field must be interpreted as a rich text field.
  • url: A fully qualified URL link.
  • resource: A service qualified resource reference.
string_values[]

string

Optional. Suggested hints for string fields. You can use them to suggest values to users through console.

Constraints

Definition of the constraints of a field.

Fields
required

bool

Optional. Marks this field as optional or required.

EnumValue

Definition of an EnumValue, to be used for enum fields.

Fields
index

int32

Required. Index for the enum value. It can't be modified.

name

string

Required. Name of the enum value. This is the actual value that the aspect can contain.

deprecated

string

Optional. You can set this message if you need to deprecate an enum value.

Asset

An asset represents a cloud resource that is being managed within a lake as a member of a zone.

Fields
name

string

Output only. The relative resource name of the asset, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/assets/{asset_id}.

display_name

string

Optional. User friendly display name.

uid

string

Output only. System generated globally unique ID for the asset. This ID will be different if the asset is deleted and re-created with the same name.

create_time

Timestamp

Output only. The time when the asset was created.

update_time

Timestamp

Output only. The time when the asset was last updated.

labels

map<string, string>

Optional. User defined labels for the asset.

description

string

Optional. Description of the asset.

state

State

Output only. Current state of the asset.

resource_spec

ResourceSpec

Required. Specification of the resource that is referenced by this asset.

resource_status

ResourceStatus

Output only. Status of the resource referenced by this asset.

security_status

SecurityStatus

Output only. Status of the security policy applied to resource referenced by this asset.

discovery_spec

DiscoverySpec

Optional. Specification of the discovery feature applied to data referenced by this asset. When this spec is left unset, the asset will use the spec set on the parent zone.

discovery_status

DiscoveryStatus

Output only. Status of the discovery feature applied to data referenced by this asset.

DiscoverySpec

Settings to manage the metadata discovery and publishing for an asset.

Fields
enabled

bool

Optional. Whether discovery is enabled.

include_patterns[]

string

Optional. The list of patterns to apply for selecting data to include during discovery if only a subset of the data should be considered. For Cloud Storage bucket assets, these are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these are interpreted as patterns to match table names.

exclude_patterns[]

string

Optional. The list of patterns to apply for selecting data to exclude during discovery. For Cloud Storage bucket assets, these are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these are interpreted as patterns to match table names.
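The interaction of include_patterns and exclude_patterns can be previewed locally before configuring an asset. The sketch below rests on several assumptions that this reference does not confirm: glob matching behaves like Python's fnmatch (where * also matches /), an empty include list selects everything, and exclusion takes precedence over inclusion. The helper preview_selection is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence


def preview_selection(
    names: Iterable[str],
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> List[str]:
    """Return the names that the include/exclude patterns would select.

    Assumptions: fnmatch-style globs, empty include means "everything",
    and a name matching any exclude pattern is always dropped.
    """
    selected = []
    for name in names:
        included = not include or any(fnmatchcase(name, p) for p in include)
        excluded = any(fnmatchcase(name, p) for p in exclude)
        if included and not excluded:
            selected.append(name)
    return selected
```

For example, with object names raw/a.csv, raw/tmp/b.csv and logs/x.json, including raw/* and excluding raw/tmp/* leaves only raw/a.csv under these assumptions.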

csv_options

CsvOptions

Optional. Configuration for CSV data.

json_options

JsonOptions

Optional. Configuration for JSON data.

Union field trigger. Determines when discovery is triggered. trigger can be only one of the following:
schedule

string

Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running discovery periodically. Successive discovery runs must be scheduled at least 60 minutes apart. The default value is to run discovery every 60 minutes. To explicitly set a timezone to the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} must be a valid string from the IANA time zone database. For example, CRON_TZ=America/New_York 1 * * * *, or TZ=America/New_York 1 * * * *.
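The time zone prefix described for the schedule field is plain string composition. A minimal sketch, assuming a five-field cron expression as in the examples for this field (the helper schedule_with_time_zone is hypothetical, and it does not verify the time zone name or the 60-minute spacing rule):

```python
def schedule_with_time_zone(cron: str, time_zone: str, prefix: str = "CRON_TZ") -> str:
    """Prefix a cron expression with an explicit IANA time zone.

    Only the shape of the input is checked here; the time zone name itself
    is passed through unverified.
    """
    if prefix not in ("CRON_TZ", "TZ"):
        raise ValueError("prefix must be CRON_TZ or TZ")
    if len(cron.split()) != 5:
        # Assumption: five fields, matching the examples for this field.
        raise ValueError("expected a five-field cron expression")
    return f"{prefix}={time_zone} {cron.strip()}"
```

For instance, schedule_with_time_zone("1 * * * *", "America/New_York") reproduces the first example string for this field.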

CsvOptions

Describe CSV and similar semi-structured data formats.

Fields
header_rows

int32

Optional. The number of rows to interpret as header rows that should be skipped when reading data rows.

delimiter

string

Optional. The delimiter being used to separate values. This defaults to ','.

encoding

string

Optional. The character encoding of the data. The default is UTF-8.

disable_type_inference

bool

Optional. Whether to disable the inference of data type for CSV data. If true, all columns will be registered as strings.
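The effect of the CsvOptions fields can be illustrated with Python's standard csv module. This sketch only mirrors the meaning of header_rows, delimiter and encoding as described here. It says nothing about how discovery reads files, and the function read_data_rows is hypothetical. Every value comes back as a string, which corresponds to the behavior described for disable_type_inference set to true.

```python
import csv
import io
from typing import List


def read_data_rows(
    raw: bytes,
    header_rows: int = 0,
    delimiter: str = ",",
    encoding: str = "utf-8",
) -> List[List[str]]:
    """Decode CSV bytes and return the data rows, skipping header rows."""
    text = raw.decode(encoding)
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    # header_rows counts the leading rows that are not data.
    return rows[header_rows:]
```

With a pipe-delimited file whose first line is a header, read_data_rows(data, header_rows=1, delimiter="|") returns only the data rows.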

JsonOptions

Describe JSON data format.

Fields
encoding

string

Optional. The character encoding of the data. The default is UTF-8.

disable_type_inference

bool

Optional. Whether to disable the inference of data type for JSON data. If true, all columns will be registered as their primitive types (string, number or boolean).

DiscoveryStatus

Status of discovery for an asset.

Fields
state

State

The current status of the discovery feature.

message

string

Additional information about the current state.

update_time

Timestamp

Last update time of the status.

last_run_time

Timestamp

The start time of the last discovery run.

stats

Stats

Data Stats of the asset reported by discovery.

last_run_duration

Duration

The duration of the last discovery run.

State

Current state of discovery.

Enums
STATE_UNSPECIFIED State is unspecified.
SCHEDULED Discovery for the asset is scheduled.
IN_PROGRESS Discovery for the asset is running.
PAUSED Discovery for the asset is currently paused (e.g. due to a lack of available resources). It will be automatically resumed.
DISABLED Discovery for the asset is disabled.

Stats

The aggregated data statistics for the asset reported by discovery.

Fields
data_items

int64

The count of data items within the referenced resource.

data_size

int64

The number of stored data bytes within the referenced resource.

tables

int64

The count of table entities within the referenced resource.

filesets

int64

The count of fileset entities within the referenced resource.

ResourceSpec

Identifies the cloud resource that is referenced by this asset.

Fields
name

string

Immutable. Relative name of the cloud resource that contains the data that is being managed within a lake. For example: projects/{project_number}/buckets/{bucket_id} or projects/{project_number}/datasets/{dataset_id}.

type

Type

Required. Immutable. Type of resource.

read_access_mode

AccessMode

Optional. Determines how read permissions are handled for each asset and their associated tables. Only available to storage bucket assets.

AccessMode

Access Mode determines how data stored within the resource is read. This is only applicable to storage bucket assets.

Enums
ACCESS_MODE_UNSPECIFIED Access mode unspecified.
DIRECT Default. Data is accessed directly using storage APIs.
MANAGED Data is accessed through a managed interface using BigQuery APIs.

Type

Type of resource.

Enums
TYPE_UNSPECIFIED Type not specified.
STORAGE_BUCKET Cloud Storage bucket.
BIGQUERY_DATASET BigQuery dataset.

ResourceStatus

Status of the resource referenced by an asset.

Fields
state

State

The current state of the managed resource.

message

string

Additional information about the current state.

update_time

Timestamp

Last update time of the status.

managed_access_identity

string

Output only. Service account associated with the BigQuery Connection.

State

The state of a resource.

Enums
STATE_UNSPECIFIED State unspecified.
READY Resource does not have any errors.
ERROR Resource has errors.

SecurityStatus

Security policy status of the asset. Data security policy, i.e., readers, writers & owners, should be specified in the lake/zone/asset IAM policy.

Fields
state

State

The current state of the security policy applied to the attached resource.

message

string

Additional information about the current state.

update_time

Timestamp

Last update time of the status.

State

The state of the security policy.

Enums
STATE_UNSPECIFIED State unspecified.
READY Security policy has been successfully applied to the attached resource.
APPLYING Security policy is in the process of being applied to the attached resource.
ERROR Security policy could not be applied to the attached resource due to errors.

AssetStatus

Aggregated status of the underlying assets of a lake or zone.

Fields
update_time

Timestamp

Last update time of the status.

active_assets

int32

Number of active assets.

security_policy_applying_assets

int32

Number of assets that are in process of updating the security policy on attached resources.

CancelJobRequest

Cancel task jobs.

Fields
name

string

Required. The resource name of the job: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/task/{task_id}/job/{job_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.tasks.cancel

CancelMetadataJobRequest

Cancel metadata job request.

Fields
name

string

Required. The resource name of the job, in the format projects/{project_id_or_number}/locations/{location_id}/metadataJobs/{metadata_job_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.metadataJobs.cancel

Content

Content represents a user-visible notebook or a SQL script.

Fields
name

string

Output only. The relative resource name of the content, of the form: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/content/{content_id}

uid

string

Output only. System generated globally unique ID for the content. This ID will be different if the content is deleted and re-created with the same name.

path

string

Required. The path for the Content file, represented as directory structure. Unique within a lake. Limited to alphanumerics, hyphens, underscores, dots and slashes.

create_time

Timestamp

Output only. Content creation time.

update_time

Timestamp

Output only. The time when the content was last updated.

labels

map<string, string>

Optional. User defined labels for the content.

description

string

Optional. Description of the content.

Union field data. Only returned in GetContent requests and not in ListContent requests. data can be only one of the following:
data_text

string

Required. Content data in string format.

Union field content. Types of content. content can be only one of the following:
sql_script

SqlScript

Sql Script related configurations.

notebook

Notebook

Notebook related configurations.

Notebook

Configuration for Notebook content.

Fields
kernel_type

KernelType

Required. Kernel Type of the notebook.

KernelType

Kernel Type of the Jupyter notebook.

Enums
KERNEL_TYPE_UNSPECIFIED Kernel Type unspecified.
PYTHON3 Python 3 Kernel.

SqlScript

Configuration for the Sql Script content.

Fields
engine

QueryEngine

Required. Query Engine to be used for the Sql Query.

QueryEngine

Query Engine Type of the SQL Script.

Enums
QUERY_ENGINE_UNSPECIFIED Value was unspecified.
SPARK Spark SQL Query.

CreateAspectTypeRequest

Create AspectType Request.

Fields
parent

string

Required. The resource name of the AspectType, of the form: projects/{project_number}/locations/{location_id} where location_id refers to a Google Cloud region.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.aspectTypes.create
aspect_type_id

string

Required. AspectType identifier.

aspect_type

AspectType

Required. AspectType Resource.

validate_only

bool

Optional. The service validates the request without performing any mutations. The default is false.

CreateAssetRequest

Create asset request.

Fields
parent

string

Required. The resource name of the parent zone: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.assets.create
asset_id

string

Required. Asset identifier. This ID will be used to generate names such as table names when publishing metadata to Hive Metastore and BigQuery.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must end with a number or a letter.
  • Must be between 1-63 characters.
  • Must be unique within the zone.

asset

Asset

Required. Asset resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateContentRequest

Create content request.

Fields
parent

string

Required. The resource name of the parent lake: projects/{project_id}/locations/{location_id}/lakes/{lake_id}

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.content.create
content

Content

Required. Content resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateDataAttributeBindingRequest

Create DataAttributeBinding request.

Fields
parent

string

Required. The resource name of the parent data taxonomy: projects/{project_number}/locations/{location_id}

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.dataAttributeBindings.create
data_attribute_binding_id

string

Required. DataAttributeBinding identifier.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must be between 1-63 characters.
  • Must end with a number or a letter.
  • Must be unique within the Location.

data_attribute_binding

DataAttributeBinding

Required. DataAttributeBinding resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateDataAttributeRequest

Create DataAttribute request.

Fields
parent

string

Required. The resource name of the parent data taxonomy: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id}

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.dataAttributes.create
data_attribute_id

string

Required. DataAttribute identifier.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must be between 1-63 characters.
  • Must end with a number or a letter.
  • Must be unique within the DataTaxonomy.

data_attribute

DataAttribute

Required. DataAttribute resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateDataScanRequest

Create dataScan request.

Fields
parent

string

Required. The resource name of the parent location: projects/{project}/locations/{location_id} where project refers to a project_id or project_number and location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.datascans.create
data_scan

DataScan

Required. DataScan resource.

data_scan_id

string

Required. DataScan identifier.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must end with a number or a letter.
  • Must be between 1-63 characters.
  • Must be unique within the customer project / location.
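The character and length rules for identifiers such as this one can be checked on the client before a request is sent. The sketch below encodes the listed rules in one regular expression. The function name is_valid_resource_id is hypothetical, and uniqueness within the project and location can only be verified by the service.

```python
import re

# Starts with a lowercase letter, ends with a lowercase letter or digit,
# contains only lowercase letters, digits and hyphens, 1 to 63 characters.
_RESOURCE_ID = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_resource_id(candidate: str) -> bool:
    """Check the documented character and length rules for an identifier."""
    return _RESOURCE_ID.fullmatch(candidate) is not None
```

The optional group keeps single-letter identifiers valid while still rejecting a trailing hyphen on longer ones.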
validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateDataTaxonomyRequest

Create DataTaxonomy request.

Fields
parent

string

Required. The resource name of the data taxonomy location, of the form: projects/{project_number}/locations/{location_id} where location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.dataTaxonomies.create
data_taxonomy_id

string

Required. DataTaxonomy identifier.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must be between 1-63 characters.
  • Must end with a number or a letter.
  • Must be unique within the Project.

data_taxonomy

DataTaxonomy

Required. DataTaxonomy resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateEntityRequest

Create a metadata entity request.

Fields
parent

string

Required. The resource name of the parent zone: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.entities.create
entity

Entity

Required. Entity resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateEntryGroupRequest

Create EntryGroup Request.

Fields
parent

string

Required. The resource name of the entryGroup, of the form: projects/{project_number}/locations/{location_id} where location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.entryGroups.create
entry_group_id

string

Required. EntryGroup identifier.

entry_group

EntryGroup

Required. EntryGroup Resource.

validate_only

bool

Optional. The service validates the request without performing any mutations. The default is false.

CreateEntryRequest

Create Entry request.

Fields
parent

string

Required. The resource name of the parent Entry Group: projects/{project}/locations/{location}/entryGroups/{entry_group}.

entry_id

string

Required. Entry identifier. It has to be unique within an Entry Group.

Entries corresponding to Google Cloud resources use an Entry ID format based on full resource names. The format is a full resource name of the resource without the prefix double slashes in the API service name part of the full resource name. This allows retrieval of entries using their associated resource name.

For example, if the full resource name of a resource is //library.googleapis.com/shelves/shelf1/books/book2, then the suggested entry_id is library.googleapis.com/shelves/shelf1/books/book2.

It is also suggested to follow the same convention for entries corresponding to resources from providers or systems other than Google Cloud.

The maximum size of the field is 4000 characters.
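
The entry_id convention described above can be sketched as a small helper. The function name suggested_entry_id is hypothetical and not part of any client library; it only strips the leading double slash and applies the 4000-character limit.

```python
MAX_ENTRY_ID_LENGTH = 4000  # limit stated in the entry_id field description


def suggested_entry_id(full_resource_name: str) -> str:
    """Derive an entry_id from a full resource name (hypothetical helper)."""
    entry_id = full_resource_name
    # Drop only the leading "//" that precedes the API service name.
    if entry_id.startswith("//"):
        entry_id = entry_id[2:]
    if not entry_id:
        raise ValueError("full resource name is empty")
    if len(entry_id) > MAX_ENTRY_ID_LENGTH:
        raise ValueError("entry_id exceeds 4000 characters")
    return entry_id
```

Uniqueness within the Entry Group is not checked here, because only the service can decide it.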

entry

Entry

Required. Entry resource.

CreateEntryTypeRequest

Create EntryType Request.

Fields
parent

string

Required. The resource name of the EntryType, of the form: projects/{project_number}/locations/{location_id} where location_id refers to a Google Cloud region.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.entryTypes.create
entry_type_id

string

Required. EntryType identifier.

entry_type

EntryType

Required. EntryType Resource.

validate_only

bool

Optional. The service validates the request without performing any mutations. The default is false.

CreateEnvironmentRequest

Create environment request.

Fields
parent

string

Required. The resource name of the parent lake: projects/{project_id}/locations/{location_id}/lakes/{lake_id}.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.environments.create
environment_id

string

Required. Environment identifier.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must be between 1 and 63 characters.
  • Must end with a number or a letter.
  • Must be unique within the lake.

environment

Environment

Required. Environment resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateLakeRequest

Create lake request.

Fields
parent

string

Required. The resource name of the lake location, of the form: projects/{project_number}/locations/{location_id} where location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.lakes.create
lake_id

string

Required. Lake identifier. This ID will be used to generate names such as database and dataset names when publishing metadata to Hive Metastore and BigQuery.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must end with a number or a letter.
  • Must be between 1 and 63 characters.
  • Must be unique within the customer project / location.

lake

Lake

Required. Lake resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateMetadataJobRequest

Create metadata job request.

Fields
parent

string

Required. The resource name of the parent location, in the format projects/{project_id_or_number}/locations/{location_id}

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.metadataJobs.create
metadata_job

MetadataJob

Required. The metadata job resource.

metadata_job_id

string

Optional. The metadata job ID. If not provided, a unique ID is generated with the prefix metadata-job-.

CreatePartitionRequest

Create metadata partition request.

Fields
parent

string

Required. The resource name of the parent entity: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.partitions.create
partition

Partition

Required. Partition resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateTaskRequest

Create task request.

Fields
parent

string

Required. The resource name of the parent lake: projects/{project_number}/locations/{location_id}/lakes/{lake_id}.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.tasks.create
task_id

string

Required. Task identifier.

task

Task

Required. Task resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

CreateZoneRequest

Create zone request.

Fields
parent

string

Required. The resource name of the parent lake: projects/{project_number}/locations/{location_id}/lakes/{lake_id}.

Authorization requires the following IAM permission on the specified resource parent:

  • dataplex.zones.create
zone_id

string

Required. Zone identifier. This ID will be used to generate names such as database and dataset names when publishing metadata to Hive Metastore and BigQuery.

  • Must contain only lowercase letters, numbers and hyphens.
  • Must start with a letter.
  • Must end with a number or a letter.
  • Must be between 1 and 63 characters.
  • Must be unique across all lakes from all locations in a project.
  • Must not be one of the reserved IDs (i.e. "default", "global-temp").
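
The syntactic rules for zone_id can be checked on the client before a request is sent. The following is a minimal sketch; is_valid_zone_id is a hypothetical helper, and the uniqueness rule is left out because only the service can evaluate it.

```python
import re

# Lowercase letter first, letter or digit last, 1 to 63 characters overall.
_ZONE_ID_PATTERN = re.compile(r"[a-z]([a-z0-9-]{0,61}[a-z0-9])?")
_RESERVED_ZONE_IDS = {"default", "global-temp"}


def is_valid_zone_id(zone_id: str) -> bool:
    """Check the syntactic zone_id rules only (hypothetical helper)."""
    if zone_id in _RESERVED_ZONE_IDS:
        return False
    return _ZONE_ID_PATTERN.fullmatch(zone_id) is not None
```

The same pattern, without the reserved-ID check, matches the character and length rules listed for lake_id and environment_id.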

zone

Zone

Required. Zone resource.

validate_only

bool

Optional. Only validate the request, but do not perform mutations. The default is false.

DataAccessSpec

DataAccessSpec holds the access control configuration to be enforced on data stored within resources (for example, rows and columns in BigQuery tables). When associated with data, the data is only accessible to principals explicitly granted access through the DataAccessSpec. Principals with access to the containing resource are not implicitly granted access.

Fields
readers[]

string

Optional. The set of principals to be granted the reader role on data stored within resources. Each string follows the format that IAM uses in bindings: user:{email}, serviceAccount:{email}, group:{email}.
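
A rough client-side check of the readers[] string format might look like the sketch below. It accepts only the three prefixes named in the field description, which is an assumption of this example, and the email check is deliberately loose. The helper name is hypothetical.

```python
import re

# Prefixes named in the readers[] description, followed by something email-like.
_PRINCIPAL_PATTERN = re.compile(r"(user|serviceAccount|group):[^@\s]+@[^@\s]+")


def is_reader_principal(value: str) -> bool:
    """Return True if value looks like user:, serviceAccount: or group: plus an email."""
    return _PRINCIPAL_PATTERN.fullmatch(value) is not None
```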

DataAttribute

Denotes one dataAttribute in a dataTaxonomy, for example, PII. DataAttribute resources can be defined in a hierarchy. A single dataAttribute resource can contain specs of multiple types, for example:

PII
  - ResourceAccessSpec :
                - readers :foo@bar.com
  - DataAccessSpec :
                - readers :bar@foo.com
Fields
name

string

Output only. The relative resource name of the dataAttribute, of the form: projects/{project_number}/locations/{location_id}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id}.

uid

string

Output only. System generated globally unique ID for the DataAttribute. This ID will be different if the DataAttribute is deleted and re-created with the same name.

create_time

Timestamp

Output only. The time when the DataAttribute was created.

update_time

Timestamp

Output only. The time when the DataAttribute was last updated.

description

string

Optional. Description of the DataAttribute.

display_name

string

Optional. User friendly display name.

labels

map<string, string>

Optional. User-defined labels for the DataAttribute.

parent_id

string

Optional. The ID of the parent DataAttribute resource, should belong to the same data taxonomy. Circular dependency in parent chain is not valid. Maximum depth of the hierarchy allowed is 4. [a -> b -> c -> d -> e, depth = 4]
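
The two parent_id constraints (no circular dependency, maximum depth of 4) can be illustrated with a short sketch. The parent_of mapping and both function names are hypothetical; depth is counted as the number of links, so a chain of five attributes has depth 4, matching the example above.

```python
MAX_HIERARCHY_DEPTH = 4  # a chain of five attributes has four links


def hierarchy_depth(attribute_id, parent_of):
    """Count links from attribute_id up to its root, rejecting cycles.

    parent_of maps an attribute ID to its parent_id (hypothetical input).
    """
    seen = {attribute_id}
    depth = 0
    current = attribute_id
    while parent_of.get(current):
        current = parent_of[current]
        if current in seen:
            raise ValueError("circular dependency in parent chain")
        seen.add(current)
        depth += 1
    return depth


def is_valid_hierarchy(attribute_id, parent_of):
    """Return True if the chain above attribute_id stays within the depth limit."""
    return hierarchy_depth(attribute_id, parent_of) <= MAX_HIERARCHY_DEPTH
```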

attribute_count

int32

Output only. The number of child attributes present for this attribute.

etag

string

This checksum is computed by the server based on the value of other fields, and may be sent on update and delete requests to ensure the client has an up-to-date value before proceeding.

resource_access_spec

ResourceAccessSpec

Optional. Specified when applied to a resource (eg: Cloud Storage bucket, BigQuery dataset, BigQuery table).

data_access_spec

DataAccessSpec

Optional. Specified when applied to data stored on the resource (eg: rows, columns in BigQuery Tables).

DataAttributeBinding

DataAttributeBinding represents the binding of attributes to resources. For example, bind the 'CustomerInfo' entity with the 'PII' attribute.

Fields
name

string

Output only. The relative resource name of the Data Attribute Binding, of the form: projects/{project_number}/locations/{location}/dataAttributeBindings/{data_attribute_binding_id}

uid

string

Output only. System generated globally unique ID for the DataAttributeBinding. This ID will be different if the DataAttributeBinding is deleted and re-created with the same name.

create_time

Timestamp

Output only. The time when the DataAttributeBinding was created.

update_time

Timestamp

Output only. The time when the DataAttributeBinding was last updated.

description

string

Optional. Description of the DataAttributeBinding.

display_name

string

Optional. User friendly display name.

labels

map<string, string>

Optional. User-defined labels for the DataAttributeBinding.

etag

string

This checksum is computed by the server based on the value of other fields, and may be sent on update and delete requests to ensure the client has an up-to-date value before proceeding. Etags must be used when calling the DeleteDataAttributeBinding and UpdateDataAttributeBinding methods.

attributes[]

string

Optional. List of attributes to be associated with the resource, provided in the form: projects/{project}/locations/{location}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id}

paths[]

Path

Optional. The list of paths for items within the associated resource (for example, columns and partitions within a table) along with attribute bindings.

Union field resource_reference. The reference to the resource that is associated to attributes, or the query to match resources and associate attributes. resource_reference can be only one of the following:
resource

string

Optional. Immutable. The resource name of the resource that is associated to attributes. Presently, only entity resource is supported in the form: projects/{project}/locations/{location}/lakes/{lake}/zones/{zone}/entities/{entity_id}. The resource must belong to the same project and region as the attribute binding, and only one active binding can exist for a resource.

Path

Represents a subresource of the given resource, and associated bindings with it. Currently supported subresources are column and partition schema fields within a table.

Fields
name

string

Required. The name identifier of the path. Nested columns should be of the form: 'address.city'.

attributes[]

string

Optional. List of attributes to be associated with the path of the resource, provided in the form: projects/{project}/locations/{location}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id}

DataProfileResult

DataProfileResult defines the output of DataProfileScan. Each field of the table will have field type specific profile result.

Fields
row_count

int64

The count of rows scanned.

profile

Profile

The profile information per field.

scanned_data

ScannedData

The data scanned for this result.

post_scan_actions_result

PostScanActionsResult

Output only. The result of post scan actions.

PostScanActionsResult

The result of post scan actions of DataProfileScan job.

Fields
bigquery_export_result

BigQueryExportResult

Output only. The result of BigQuery export post scan action.

BigQueryExportResult

The result of BigQuery export post scan action.

Fields
state

State

Output only. Execution state for the BigQuery exporting.

message

string

Output only. Additional information about the BigQuery exporting.

State

Execution state for the exporting.

Enums
STATE_UNSPECIFIED The exporting state is unspecified.
SUCCEEDED The exporting completed successfully.
FAILED The exporting is no longer running due to an error.
SKIPPED The exporting is skipped because there is no valid scan result to export (usually because the scan failed).

Profile

Contains name, type, mode and field type specific profile information.

Fields
fields[]

Field

List of fields with structural and profile information for each field.

Field

A field within a table.

Fields
name

string

The name of the field.

type

string

The data type retrieved from the schema of the data source. For instance, for a BigQuery native table, it is the BigQuery Table Schema. For a Dataplex Entity, it is the Entity Schema.

mode

string

The mode of the field. Possible values include:

  • REQUIRED, if it is a required field.
  • NULLABLE, if it is an optional field.
  • REPEATED, if it is a repeated field.
profile

ProfileInfo

Profile information for the corresponding field.

ProfileInfo

The profile information for each field type.

Fields
null_ratio

double

Ratio of rows with null value against total scanned rows.

distinct_ratio

double

Ratio of rows with distinct values against total scanned rows. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATED mode.

top_n_values[]

TopNValue

The list of top N non-null values, with the frequency and ratio with which they occur in the scanned data. N is 10 or equal to the number of distinct values in the field, whichever is smaller. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATED mode.

Union field field_info. Structural and profile information for a specific field type. Not available if mode is REPEATED. field_info can be only one of the following:
string_profile

StringFieldInfo

String type field information.

integer_profile

IntegerFieldInfo

Integer type field information.

double_profile

DoubleFieldInfo

Double type field information.

DoubleFieldInfo

The profile information for a double type field.

Fields
average

double

Average of non-null values in the scanned data. NaN, if the field has a NaN.

standard_deviation

double

Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN.

min

double

Minimum of non-null values in the scanned data. NaN, if the field has a NaN.

quartiles[]

double

A quartile divides the data points into four parts, or quarters, of roughly equal size. The three main quartiles are:

  • The first quartile (Q1) splits off the lowest 25% of data from the highest 75%. It is also known as the lower or 25th empirical quartile, as 25% of the data lies below this point.
  • The second quartile (Q2) is the median of the data set, so 50% of the data lies below this point.
  • The third quartile (Q3) splits off the highest 25% of data from the lowest 75%. It is also known as the upper or 75th empirical quartile, as 75% of the data lies below this point.

The quartiles are provided as an ordered list of quartile values for the scanned data, in the order Q1, median, Q3.

max

double

Maximum of non-null values in the scanned data. NaN, if the field has a NaN.
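
To make the quartiles field concrete, the sketch below computes Q1, the median and Q3 for a small list of values. The reference does not say which quantile method the service uses, so the default method of Python's statistics.quantiles is an assumption here, and service results may differ. The function name is hypothetical.

```python
import statistics


def quartiles(values):
    """Return [Q1, median, Q3] for the non-null values (illustrative only)."""
    non_null = [v for v in values if v is not None]
    # n=4 yields the three cut points that separate the four quarters.
    return statistics.quantiles(non_null, n=4)
```

For the values 1 through 7 this returns 2.0, 4.0 and 6.0, in the same Q1, median, Q3 order that the field uses.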

IntegerFieldInfo

The profile information for an integer type field.

Fields
average

double

Average of non-null values in the scanned data. NaN, if the field has a NaN.

standard_deviation

double

Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN.

min

int64

Minimum of non-null values in the scanned data. NaN, if the field has a NaN.

quartiles[]

int64

A quartile divides the data points into four parts, or quarters, of roughly equal size. The three main quartiles are:

  • The first quartile (Q1) splits off the lowest 25% of data from the highest 75%. It is also known as the lower or 25th empirical quartile, as 25% of the data lies below this point.
  • The second quartile (Q2) is the median of the data set, so 50% of the data lies below this point.
  • The third quartile (Q3) splits off the highest 25% of data from the lowest 75%. It is also known as the upper or 75th empirical quartile, as 75% of the data lies below this point.

The quartiles are provided as an ordered list of approximate quartile values for the scanned data, in the order Q1, median, Q3.

max

int64

Maximum of non-null values in the scanned data. NaN, if the field has a NaN.

StringFieldInfo

The profile information for a string type field.

Fields
min_length

int64

Minimum length of non-null values in the scanned data.

max_length

int64

Maximum length of non-null values in the scanned data.

average_length

double

Average length of non-null values in the scanned data.

TopNValue

Top N non-null values in the scanned data.

Fields
value

string

String value of a top N non-null value.

count

int64

Count of the corresponding value in the scanned data.

ratio

double

Ratio of the corresponding value in the field against the total number of rows in the scanned data.
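
The relationship between value, count and ratio in TopNValue can be sketched as follows. This is an illustration on an in-memory list, not the service's implementation; the function name and the dict shape are hypothetical, and the way ties are ordered is an assumption.

```python
from collections import Counter


def top_n_values(values, n=10):
    """Return up to n most frequent non-null values with count and ratio."""
    total_rows = len(values)  # ratio is taken against all scanned rows
    counts = Counter(v for v in values if v is not None)
    # most_common(n) returns fewer items when there are fewer distinct values.
    return [
        {"value": str(value), "count": count, "ratio": count / total_rows}
        for value, count in counts.most_common(n)
    ]
```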

DataProfileSpec

DataProfileScan related setting.

Fields
sampling_percent

float

Optional. The percentage of the records to be selected from the dataset for DataScan.

  • Value can range between 0.0 and 100.0 with up to 3 significant decimal digits.
  • Sampling is not applied if sampling_percent is not specified, 0 or 100.
row_filter

string

Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10

post_scan_actions

PostScanActions

Optional. Actions to take upon job completion.

include_fields

SelectedFields

Optional. The fields to include in data profile.

If not specified, all fields at the time of profile scan job execution are included, except for ones listed in exclude_fields.

exclude_fields

SelectedFields

Optional. The fields to exclude from data profile.

If specified, the fields will be excluded from data profile, regardless of include_fields value.
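
How include_fields and exclude_fields combine can be sketched with plain lists. The helper below is hypothetical and assumes the field names are given as simple strings; it does not model how the service treats names that are missing from the schema.

```python
def fields_to_profile(schema_fields, include_fields=None, exclude_fields=None):
    """Resolve which top-level fields a profile scan would cover."""
    excluded = set(exclude_fields or [])
    # An unset include list means "all fields present at execution time".
    candidates = include_fields if include_fields else schema_fields
    # Exclusion always wins, regardless of the include list.
    return [name for name in candidates if name not in excluded]
```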

PostScanActions

The configuration of post scan actions of DataProfileScan job.

Fields
bigquery_export

BigQueryExport

Optional. If set, results will be exported to the provided BigQuery table.

BigQueryExport

The configuration of BigQuery export post scan action.

Fields
results_table

string

Optional. The BigQuery table to export DataProfileScan results to. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID
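
The results_table format can be built and checked with a few lines of string handling. Both helper names below are hypothetical, and the pattern only checks the overall shape of the resource name, not whether the IDs are valid in BigQuery.

```python
import re

_RESULTS_TABLE_PATTERN = re.compile(
    r"//bigquery\.googleapis\.com/projects/(?P<project>[^/]+)"
    r"/datasets/(?P<dataset>[^/]+)/tables/(?P<table>[^/]+)"
)


def results_table_name(project_id, dataset_id, table_id):
    """Build a results_table value in the documented format."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def parse_results_table(value):
    """Split a results_table value into project, dataset and table IDs."""
    match = _RESULTS_TABLE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a BigQuery table resource name: {value!r}")
    return match.groupdict()
```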

SelectedFields

The specification for fields to include or exclude in data profile scan.

Fields
field_names[]

string

Optional. Expected input is a list of fully qualified names of fields as in the schema.

Only top-level field names for nested fields are supported. For instance, if 'x' is of nested field type, listing 'x' is supported but 'x.y.z' is not supported. Here 'y' and 'y.z' are nested fields of 'x'.

DataQualityColumnResult

DataQualityColumnResult provides a more detailed, per-column view of the results.

Fields
column

string

Output only. The column specified in the DataQualityRule.

score

float

Output only. The column-level data quality score for this data scan job if and only if the 'column' field is set.

The score ranges between [0, 100] (up to two decimal points).

DataQualityDimension

A dimension captures data quality intent about a defined subset of the rules specified.

Fields
name

string

The dimension name a rule belongs to. Supported dimensions are ["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "INTEGRITY"]

DataQualityDimensionResult

DataQualityDimensionResult provides a more detailed, per-dimension view of the results.

Fields
dimension

DataQualityDimension

Output only. The dimension config specified in the DataQualitySpec, as is.

passed

bool

Whether the dimension passed or failed.

score

float

Output only. The dimension-level data quality score for this data scan job if and only if the 'dimension' field is set.

The score ranges between [0, 100] (up to two decimal points).

DataQualityResult

The output of a DataQualityScan.

Fields
passed

bool

Overall data quality result -- true if all rules passed.

dimensions[]

DataQualityDimensionResult

A list of results at the dimension level.

A dimension will have a corresponding DataQualityDimensionResult if and only if there is at least one rule with the 'dimension' field set to it.

columns[]

DataQualityColumnResult

Output only. A list of results at the column level.

A column will have a corresponding DataQualityColumnResult if and only if there is at least one rule with the 'column' field set to it.

rules[]

DataQualityRuleResult

A list of all the rules in a job, and their results.

row_count

int64

The count of rows processed.

scanned_data

ScannedData

The data scanned for this result.

post_scan_actions_result

PostScanActionsResult

Output only. The result of post scan actions.

score

float

Output only. The overall data quality score.

The score ranges between [0, 100] (up to two decimal points).
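
The "if and only if" conditions for dimensions[] and columns[] can be read as a grouping over the rule results. The sketch below uses plain dicts with hypothetical keys to show which dimension-level and column-level results would exist, and how the overall passed flag follows from the rules. It does not attempt to reproduce the score calculation.

```python
def summarize_rule_results(rule_results):
    """rule_results: dicts with 'passed' plus optional 'dimension' and 'column'."""
    # A dimension or column appears only if at least one rule refers to it.
    dimensions = sorted({r["dimension"] for r in rule_results if r.get("dimension")})
    columns = sorted({r["column"] for r in rule_results if r.get("column")})
    return {
        "passed": all(r["passed"] for r in rule_results),  # true if all rules passed
        "dimensions": dimensions,
        "columns": columns,
    }
```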

PostScanActionsResult

The result of post scan actions of DataQualityScan job.

Fields
bigquery_export_result

BigQueryExportResult

Output only. The result of BigQuery export post scan action.

BigQueryExportResult

The result of BigQuery export post scan action.

Fields
state

State

Output only. Execution state for the BigQuery exporting.

message

string

Output only. Additional information about the BigQuery exporting.

State

Execution state for the exporting.

Enums
STATE_UNSPECIFIED The exporting state is unspecified.
SUCCEEDED The exporting completed successfully.
FAILED The exporting is no longer running due to an error.
SKIPPED The exporting is skipped because there is no valid scan result to export (usually because the scan failed).

DataQualityRule

A rule captures data quality intent about a data source.

Fields
column

string

Optional. The unnested column which this rule is evaluated against.

ignore_null

bool

Optional. Rows with null values will automatically fail a rule, unless ignore_null is true. In that case, such null rows are trivially considered passing.

This field is only valid for the following type of rules:

  • RangeExpectation
  • RegexExpectation
  • SetExpectation
  • UniquenessExpectation
dimension

string

Required. The dimension a rule belongs to. Results are also aggregated at the dimension level. Supported dimensions are ["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "INTEGRITY"]

threshold

double

Optional. The minimum ratio of passing_rows / total_rows required to pass this rule, with a range of [0.0, 1.0].

0 indicates default value (i.e. 1.0).

This field is only valid for row-level type rules.

name

string

Optional. A mutable name for the rule.

  • The name must contain only letters (a-z, A-Z), numbers (0-9), or hyphens (-).
  • The maximum length is 63 characters.
  • Must start with a letter.
  • Must end with a number or a letter.
description

string

Optional. Description of the rule.

  • The maximum length is 1,024 characters.
suspended

bool

Optional. Whether the Rule is active or suspended. Default is false.

Union field rule_type. The rule-specific configuration. rule_type can be only one of the following:
range_expectation

RangeExpectation

Row-level rule which evaluates whether each column value lies within a specified range.

non_null_expectation

NonNullExpectation

Row-level rule which evaluates whether each column value is null.

set_expectation

SetExpectation

Row-level rule which evaluates whether each column value is contained by a specified set.

regex_expectation

RegexExpectation

Row-level rule which evaluates whether each column value matches a specified regex.

uniqueness_expectation

UniquenessExpectation

Row-level rule which evaluates whether each column value is unique.

statistic_range_expectation

StatisticRangeExpectation

Aggregate rule which evaluates whether the column aggregate statistic lies within a specified range.

row_condition_expectation

RowConditionExpectation

Row-level rule which evaluates whether each row in a table passes the specified condition.

table_condition_expectation

TableConditionExpectation

Aggregate rule which evaluates whether the provided expression is true for a table.

sql_assertion

SqlAssertion

Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails.

NonNullExpectation

This type has no fields.

Evaluates whether each column value is null.

RangeExpectation

Evaluates whether each column value lies within a specified range.

Fields
min_value

string

Optional. The minimum column value allowed for a row to pass this validation. At least one of min_value and max_value needs to be provided.

max_value

string

Optional. The maximum column value allowed for a row to pass this validation. At least one of min_value and max_value needs to be provided.

strict_min_enabled

bool

Optional. Whether each value needs to be strictly greater than ('>') the minimum, or if equality is allowed.

Only relevant if a min_value has been defined. Default = false.

strict_max_enabled

bool

Optional. Whether each value needs to be strictly less than ('<') the maximum, or if equality is allowed.

Only relevant if a max_value has been defined. Default = false.
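
The interaction of min_value, max_value and the two strict flags can be sketched for numeric columns as follows. The bounds are strings in the message, so this example parses them with float, which is an assumption that only fits numeric data. The function name is hypothetical.

```python
def in_range(value, min_value=None, max_value=None,
             strict_min_enabled=False, strict_max_enabled=False):
    """Return True when a numeric value satisfies the configured bounds."""
    if min_value is None and max_value is None:
        raise ValueError("at least one of min_value and max_value is required")
    if min_value is not None:
        lower = float(min_value)  # bounds arrive as strings
        if value < lower or (strict_min_enabled and value == lower):
            return False
    if max_value is not None:
        upper = float(max_value)
        if value > upper or (strict_max_enabled and value == upper):
            return False
    return True
```

With both strict flags left at their default of false, values equal to either bound pass.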

RegexExpectation

Evaluates whether each column value matches a specified regex.

Fields
regex

string

Optional. A regular expression the column value is expected to match.

RowConditionExpectation

Evaluates whether each row passes the specified condition.

The SQL expression needs to use BigQuery standard SQL syntax and should produce a boolean value per row as the result.

Example: col1 >= 0 AND col2 < 10

Fields
sql_expression

string

Optional. The SQL expression.

SetExpectation

Evaluates whether each column value is contained by a specified set.

Fields
values[]

string

Optional. Expected values for the column value.

SqlAssertion

A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.

The SQL statement must use BigQuery standard SQL syntax, and must not contain any semicolons.

You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter.

Example: SELECT * FROM ${data()} WHERE price < 0

Fields
sql_statement

string

Optional. The SQL statement.
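
The pass/fail logic of a SQL assertion can be tried out locally. The sketch below uses SQLite from the Python standard library as a stand-in for BigQuery, and replaces ${data()} with a plain table name, whereas the service substitutes the source table with its precondition filters applied. Both choices are assumptions made for illustration, and the function name is hypothetical.

```python
import sqlite3


def run_sql_assertion(conn, sql_statement, source_table):
    """Run an assertion statement; the rule fails if any rows come back."""
    if ";" in sql_statement:
        raise ValueError("sql_statement must not contain semicolons")
    # Stand-in for the data reference parameter; source_table must be trusted.
    query = sql_statement.replace("${data()}", source_table)
    rows = conn.execute(query).fetchall()
    return {"assertion_row_count": len(rows), "passed": len(rows) == 0}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (price REAL)")
conn.executemany("INSERT INTO items VALUES (?)", [(5.0,), (-1.0,)])
result = run_sql_assertion(conn, "SELECT * FROM ${data()} WHERE price < 0", "items")
```

Here one row has a negative price, so the statement returns one row and the rule fails.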

StatisticRangeExpectation

Evaluates whether the column aggregate statistic lies within a specified range.

Fields
statistic

ColumnStatistic

Optional. The aggregate metric to evaluate.

min_value

string

Optional. The minimum column statistic value allowed for a row to pass this validation.

At least one of min_value and max_value needs to be provided.

max_value

string

Optional. The maximum column statistic value allowed for a row to pass this validation.

At least one of min_value and max_value needs to be provided.

strict_min_enabled

bool

Optional. Whether column statistic needs to be strictly greater than ('>') the minimum, or if equality is allowed.

Only relevant if a min_value has been defined. Default = false.

strict_max_enabled

bool

Optional. Whether column statistic needs to be strictly less than ('<') the maximum, or if equality is allowed.

Only relevant if a max_value has been defined. Default = false.

ColumnStatistic

The list of aggregate metrics a rule can be evaluated against.

Enums
STATISTIC_UNDEFINED Unspecified statistic type
MEAN Evaluate the column mean
MIN Evaluate the column min
MAX Evaluate the column max

TableConditionExpectation

Evaluates whether the provided expression is true.

The SQL expression needs to use BigQuery standard SQL syntax and should produce a scalar boolean result.

Example: MIN(col1) >= 0

Fields
sql_expression

string

Optional. The SQL expression.

UniquenessExpectation

This type has no fields.

Evaluates whether the column has duplicates.

DataQualityRuleResult

DataQualityRuleResult provides a more detailed, per-rule view of the results.

Fields
rule

DataQualityRule

The rule specified in the DataQualitySpec, as is.

passed

bool

Whether the rule passed or failed.

evaluated_count

int64

The number of rows a rule was evaluated against.

This field is only valid for row-level type rules.

Evaluated count can be configured to either

  • include all rows (default) - with null rows automatically failing rule evaluation, or
  • exclude null rows from the evaluated_count, by setting ignore_null = true.
passed_count

int64

The number of rows which passed a rule evaluation.

This field is only valid for row-level type rules.

null_count

int64

The number of rows with null values in the specified column.

pass_ratio

double

The ratio of passed_count / evaluated_count.

This field is only valid for row-level type rules.

failing_rows_query

string

The query to find rows that did not pass this rule.

This field is only valid for row-level type rules.

assertion_row_count

int64

Output only. The number of rows returned by the SQL statement in a SQL assertion rule.

This field is only valid for SQL assertion rules.
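
The row-level counters in DataQualityRuleResult relate to each other as in the sketch below, which evaluates a predicate over an in-memory column. It follows the evaluated_count description (null rows either fail or are excluded) and the threshold rule that 0 means the default of 1.0. The function name is hypothetical, and treating an empty evaluation as a full pass is an assumption.

```python
def evaluate_row_rule(values, predicate, ignore_null=False, threshold=0.0):
    """Compute DataQualityRuleResult-style counts for a row-level rule."""
    null_count = sum(1 for v in values if v is None)
    non_null = [v for v in values if v is not None]
    # Null rows either fail automatically or drop out of the evaluation.
    evaluated_count = len(non_null) if ignore_null else len(values)
    passed_count = sum(1 for v in non_null if predicate(v))
    # Assumption: an empty evaluation counts as a full pass.
    pass_ratio = passed_count / evaluated_count if evaluated_count else 1.0
    effective_threshold = threshold if threshold else 1.0  # 0 means default 1.0
    return {
        "evaluated_count": evaluated_count,
        "passed_count": passed_count,
        "null_count": null_count,
        "pass_ratio": pass_ratio,
        "passed": pass_ratio >= effective_threshold,
    }
```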

DataQualityScanRuleResult

Information about the result of a data quality rule for data quality scan. The monitored resource is 'DataScan'.

Fields
job_id

string

Identifier of the specific data scan job this log entry is for.

data_source

string

The data source of the data scan (e.g. BigQuery table name).

column

string

The column which this rule is evaluated against.

rule_name

string

The name of the data quality rule.

rule_type

RuleType

The type of the data quality rule.

evalution_type

EvaluationType

The evaluation type of the data quality rule.

rule_dimension

string

The dimension of the data quality rule.

threshold_percent

double

The passing threshold ([0.0, 100.0]) of the data quality rule.

result

Result

The result of the data quality rule.

evaluated_row_count

int64

The number of rows evaluated against the data quality rule. This field is only valid for rules of PER_ROW evaluation type.

passed_row_count

int64

The number of rows which passed a rule evaluation. This field is only valid for rules of PER_ROW evaluation type.

null_row_count

int64

The number of rows with null values in the specified column.

assertion_row_count

int64

The number of rows returned by the SQL statement in a SQL assertion rule. This field is only valid for SQL assertion rules.

EvaluationType

The evaluation type of the data quality rule.

Enums
EVALUATION_TYPE_UNSPECIFIED An unspecified evaluation type.
PER_ROW The rule evaluation is done at per row level.
AGGREGATE The rule evaluation is done for an aggregate of rows.

Result

Whether the data quality rule passed or failed.

Enums
RESULT_UNSPECIFIED An unspecified result.
PASSED The data quality rule passed.
FAILED The data quality rule failed.

RuleType

The type of the data quality rule.

Enums
RULE_TYPE_UNSPECIFIED An unspecified rule type.
NON_NULL_EXPECTATION See DataQualityRule.NonNullExpectation.
RANGE_EXPECTATION See DataQualityRule.RangeExpectation.
REGEX_EXPECTATION See DataQualityRule.RegexExpectation.
ROW_CONDITION_EXPECTATION See DataQualityRule.RowConditionExpectation.
SET_EXPECTATION See DataQualityRule.SetExpectation.
STATISTIC_RANGE_EXPECTATION See DataQualityRule.StatisticRangeExpectation.
TABLE_CONDITION_EXPECTATION See DataQualityRule.TableConditionExpectation.
UNIQUENESS_EXPECTATION See DataQualityRule.UniquenessExpectation.
SQL_ASSERTION See DataQualityRule.SqlAssertion.

DataQualitySpec

DataQualityScan related setting.

Fields
rules[]

DataQualityRule

Required. The list of rules to evaluate against a data source. At least one rule is required.

sampling_percent

float

Optional. The percentage of the records to be selected from the dataset for DataScan.

  • Value can range between 0.0 and 100.0 with up to 3 significant decimal digits.
  • Sampling is not applied if sampling_percent is not specified, 0 or 100.
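
The two sampling_percent rules can be expressed as a small check. This is a sketch with a hypothetical function name; None stands for "not specified", and the limit on decimal digits is not enforced.

```python
def sampling_applies(sampling_percent=None):
    """Return True when a scan would sample rather than read every row."""
    if sampling_percent is None:
        return False  # not specified: no sampling
    if not 0.0 <= sampling_percent <= 100.0:
        raise ValueError("sampling_percent must be between 0.0 and 100.0")
    # 0 and 100 both mean "no sampling".
    return sampling_percent not in (0.0, 100.0)
```
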
row_filter

string

Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10

post_scan_actions

PostScanActions

Optional. Actions to take upon job completion.

PostScanActions

The configuration of post scan actions of DataQualityScan.

Fields
bigquery_export

BigQueryExport

Optional. If set, results will be exported to the provided BigQuery table.

notification_report

NotificationReport

Optional. If set, results will be sent to the provided notification recipients when a trigger fires.

BigQueryExport

The configuration of BigQuery export post scan action.

Fields
results_table

string

Optional. The BigQuery table to export DataQualityScan results to. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID

JobEndTrigger

This type has no fields.

This trigger fires whenever a scan job run ends, regardless of the result.

JobFailureTrigger

This type has no fields.

This trigger fires when the scan job itself fails, regardless of the result.

NotificationReport

The configuration of notification report post scan action.

Fields
recipients

Recipients

Required. The recipients who will receive the notification report.

score_threshold_trigger

ScoreThresholdTrigger

Optional. If set, report will be sent when score threshold is met.

job_failure_trigger

JobFailureTrigger

Optional. If set, report will be sent when a scan job fails.

job_end_trigger

JobEndTrigger

Optional. If set, report will be sent when a scan job ends.

Recipients

The individuals or groups who are designated to receive notifications upon triggers.

Fields
emails[]

string

Optional. The email recipients who will receive the DataQualityScan results report.

ScoreThresholdTrigger

This trigger fires when the data quality (DQ) score in the job result is less than the specified score threshold.

Fields
score_threshold

float

Optional. The score range is in [0,100].
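
The comparison described for this trigger is a strict "less than". The sketch below restates it in code for clarity; it is an illustration of the documented rule, not the service's implementation, and the function name is hypothetical.

```python
def score_threshold_met(score: float, score_threshold: float) -> bool:
    """Return True when a report would be sent for the given DQ score."""
    if not 0 <= score_threshold <= 100:
        raise ValueError("score_threshold must be in [0, 100]")
    # The trigger fires only when the score is strictly below the threshold.
    return score < score_threshold
```

Under this reading, a job whose score equals the threshold does not trigger a report.
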

DataScan

Represents a user-visible job which provides the insights for the related data source.

For example:

  • Data Quality: generates queries based on the rules and runs against the data to get data quality check results.
  • Data Profile: analyzes the data in table(s) and generates insights about the structure, content and relationships (such as null percent, cardinality, min/max/mean, etc).
Fields
name

string

Output only. The relative resource name of the scan, of the form: projects/{project}/locations/{location_id}/dataScans/{datascan_id}, where project refers to a project_id or project_number and location_id refers to a GCP region.
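
A resource name in this form can be split into its components with a regular expression. This is a minimal sketch: the helper name is hypothetical, and the pattern only checks the path shape, not the character rules for each ID.

```python
import re

# Shape of projects/{project}/locations/{location_id}/dataScans/{datascan_id}.
_DATA_SCAN_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/dataScans/(?P<datascan_id>[^/]+)$"
)


def parse_data_scan_name(name: str) -> dict:
    """Split a DataScan resource name into its parts (hypothetical helper)."""
    match = _DATA_SCAN_NAME.match(name)
    if match is None:
        raise ValueError(f"not a DataScan resource name: {name!r}")
    return match.groupdict()
```
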

uid

string

Output only. System generated globally unique ID for the scan. This ID will be different if the scan is deleted and re-created with the same name.

description

string

Optional. Description of the scan.

  • Must be between 1-1024 characters.
display_name

string

Optional. User friendly display name.

  • Must be between 1-256 characters.
labels

map<string, string>

Optional. User-defined labels for the scan.

state

State

Output only. Current state of the DataScan.

create_time

Timestamp

Output only. The time when the scan was created.

update_time

Timestamp

Output only. The time when the scan was last updated.

data

DataSource

Required. The data source for DataScan.

execution_spec

ExecutionSpec

Optional. DataScan execution settings.

If not specified, the fields in it will use their default values.

execution_status

ExecutionStatus

Output only. Status of the data scan execution.

type

DataScanType

Output only. The type of DataScan.

Union field spec. Data Scan related setting. It is required and immutable, which means that once data_quality_spec is set, it cannot be changed to data_profile_spec. spec can be only one of the following:
data_quality_spec

DataQualitySpec

DataQualityScan related setting.

data_profile_spec

DataProfileSpec

DataProfileScan related setting.

Union field result. The result of the data scan. result can be only one of the following:
data_quality_result

DataQualityResult

Output only. The result of the data quality scan.

data_profile_result

DataProfileResult

Output only. The result of the data profile scan.

ExecutionSpec

DataScan execution settings.

Fields
trigger

Trigger

Optional. Spec related to how often and when a scan should be triggered.

If not specified, the default is OnDemand, which means the scan will not run until the user calls the RunDataScan API.

Union field incremental. Spec related to incremental scan of the data.

When an option is selected for incremental scan, it cannot be unset or changed. If not specified, a data scan will run for all data in the table. incremental can be only one of the following:

field

string

Immutable. The unnested field (of type Date or Timestamp) that contains values which monotonically increase over time.

If not specified, a data scan will run for all data in the table.

ExecutionStatus

Status of the data scan execution.

Fields
latest_job_start_time

Timestamp

The time when the latest DataScanJob started.

latest_job_end_time

Timestamp

The time when the latest DataScanJob ended.

latest_job_create_time

Timestamp

Optional. The time when the DataScanJob execution was created.

DataScanEvent

These messages contain information about the execution of a data scan. The monitored resource is 'DataScan'.

Fields
data_source

string

The data source of the data scan.

job_id

string

The identifier of the specific data scan job this log entry is for.

create_time

Timestamp

The time when the data scan job was created.

start_time

Timestamp

The time when the data scan job started to run.

end_time

Timestamp

The time when the data scan job finished.

type

ScanType

The type of the data scan.

state

State

The status of the data scan job.

message

string

The message describing the data scan job event.

spec_version

string

A version identifier of the spec which was used to execute this job.

trigger

Trigger

The trigger type of the data scan job.

scope

Scope

The scope of the data scan (e.g. full, incremental).

post_scan_actions_result

PostScanActionsResult

The result of post scan actions.

Union field result. The result of the data scan job. result can be only one of the following:
data_profile

DataProfileResult

Data profile result for data profile type data scan.

data_quality

DataQualityResult

Data quality result for data quality type data scan.

Union field appliedConfigs. The applied configs in the data scan job. appliedConfigs can be only one of the following:
data_profile_configs

DataProfileAppliedConfigs

Applied configs for data profile type data scan.

data_quality_configs

DataQualityAppliedConfigs

Applied configs for data quality type data scan.

DataProfileAppliedConfigs

Applied configs for data profile type data scan job.

Fields
sampling_percent

float

The percentage of the records selected from the dataset for DataScan.

  • Value ranges between 0.0 and 100.0.
  • A value of 0.0 or 100.0 implies that sampling was not applied.
row_filter_applied

bool

Boolean indicating whether a row filter was applied in the DataScan job.

column_filter_applied

bool

Boolean indicating whether a column filter was applied in the DataScan job.

DataProfileResult

Data profile result for data scan job.

Fields
row_count

int64

The count of rows processed in the data scan job.

DataQualityAppliedConfigs

Applied configs for data quality type data scan job.

Fields
sampling_percent

float

The percentage of the records selected from the dataset for DataScan.

  • Value ranges between 0.0 and 100.0.
  • A value of 0.0 or 100.0 implies that sampling was not applied.
row_filter_applied

bool

Boolean indicating whether a row filter was applied in the DataScan job.

DataQualityResult

Data quality result for data scan job.

Fields
row_count

int64

The count of rows processed in the data scan job.

passed

bool

Whether the data quality scan passed.

dimension_passed

map<string, bool>

The result of each dimension for data quality result. The key of the map is the name of the dimension. The value is a bool indicating whether the dimension passed.

score

float

The table-level data quality score for the data scan job.

The data quality score ranges between [0, 100] (up to two decimal points).

dimension_score

map<string, float>

The score of each dimension for data quality result. The key of the map is the name of the dimension. The value is the data quality score for the dimension.

The score ranges between [0, 100] (up to two decimal points).

column_score

map<string, float>

The score of each column scanned in the data scan job. The key of the map is the name of the column. The value is the data quality score for the column.

The score ranges between [0, 100] (up to two decimal points).
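
The dimension_passed and dimension_score maps share the same keys, so they can be read together. The sketch below lists failing dimensions, lowest score first, from a result represented as a plain dictionary. The sample values and dimension names are made up for illustration, and the helper name is hypothetical.

```python
from typing import List


def failed_dimensions(result: dict) -> List[str]:
    """Return names of dimensions that did not pass, lowest score first."""
    passed = result.get("dimension_passed", {})
    scores = result.get("dimension_score", {})
    failed = [name for name, ok in passed.items() if not ok]
    # Dimensions without a recorded score sort as 0.0.
    return sorted(failed, key=lambda name: scores.get(name, 0.0))


# Hypothetical result payload; values are not from a real scan.
sample_result = {
    "passed": False,
    "score": 87.5,
    "dimension_passed": {"COMPLETENESS": True, "VALIDITY": False, "UNIQUENESS": False},
    "dimension_score": {"COMPLETENESS": 100.0, "VALIDITY": 72.25, "UNIQUENESS": 90.0},
}
worst_first = failed_dimensions(sample_result)
```
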

PostScanActionsResult

Post scan actions result for data scan job.

Fields
bigquery_export_result

BigQueryExportResult

The result of BigQuery export post scan action.

BigQueryExportResult

The result of BigQuery export post scan action.

Fields
state

State

Execution state for the BigQuery exporting.

message

string

Additional information about the BigQuery exporting.

State

Execution state for the exporting.

Enums
STATE_UNSPECIFIED The exporting state is unspecified.
SUCCEEDED The exporting completed successfully.
FAILED The exporting is no longer running due to an error.
SKIPPED The exporting is skipped because there is no valid scan result to export (usually because the scan failed).

ScanType

The type of the data scan.

Enums
SCAN_TYPE_UNSPECIFIED An unspecified data scan type.
DATA_PROFILE Data scan for data profile.
DATA_QUALITY Data scan for data quality.

Scope

The scope of job for the data scan.

Enums
SCOPE_UNSPECIFIED An unspecified scope type.
FULL Data scan runs on all of the data.
INCREMENTAL Data scan runs on incremental data.

State

The job state of the data scan.

Enums
STATE_UNSPECIFIED Unspecified job state.
STARTED Data scan job started.
SUCCEEDED Data scan job successfully completed.
FAILED Data scan job was unsuccessful.
CANCELLED Data scan job was cancelled.
CREATED Data scan job was created.

Trigger

The trigger type for the data scan.

Enums
TRIGGER_UNSPECIFIED An unspecified trigger type.
ON_DEMAND Data scan triggers on demand.
SCHEDULE Data scan triggers as per schedule.

DataScanJob

A DataScanJob represents an instance of DataScan execution.

Fields
name

string

Output only. The relative resource name of the DataScanJob, of the form: projects/{project}/locations/{location_id}/dataScans/{datascan_id}/jobs/{job_id}, where project refers to a project_id or project_number and location_id refers to a GCP region.

uid

string

Output only. System generated globally unique ID for the DataScanJob.

create_time

Timestamp

Output only. The time when the DataScanJob was created.

start_time

Timestamp

Output only. The time when the DataScanJob was started.

end_time

Timestamp

Output only. The time when the DataScanJob ended.

state

State

Output only. Execution state for the DataScanJob.

message

string

Output only. Additional information about the current state.

type

DataScanType

Output only. The type of the parent DataScan.

Union field spec. Data Scan related setting. spec can be only one of the following:
data_quality_spec

DataQualitySpec

Output only. DataQualityScan related setting.

data_profile_spec

DataProfileSpec

Output only. DataProfileScan related setting.

Union field result. The result of the data scan. result can be only one of the following:
data_quality_result

DataQualityResult

Output only. The result of the data quality scan.

data_profile_result

DataProfileResult

Output only. The result of the data profile scan.

State

Execution state for the DataScanJob.

Enums
STATE_UNSPECIFIED The DataScanJob state is unspecified.
RUNNING The DataScanJob is running.
CANCELING The DataScanJob is canceling.
CANCELLED The DataScanJob cancellation was successful.
SUCCEEDED The DataScanJob completed successfully.
FAILED The DataScanJob is no longer running due to an error.
PENDING The DataScanJob has been created but not started to run yet.
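
When polling a job, it helps to know which states mean the job will not change again. The sketch below mirrors the enum names above and treats CANCELLED, SUCCEEDED, and FAILED as terminal. That grouping is an assumption drawn from the descriptions, the string values are placeholders rather than wire values, and the class and function names are hypothetical.

```python
from enum import Enum


class DataScanJobState(Enum):
    """Local mirror of the DataScanJob State names (values are placeholders)."""

    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    RUNNING = "RUNNING"
    CANCELING = "CANCELING"
    CANCELLED = "CANCELLED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    PENDING = "PENDING"


# Assumption: these three states are final; the others may still change.
_TERMINAL_STATES = frozenset(
    {DataScanJobState.CANCELLED, DataScanJobState.SUCCEEDED, DataScanJobState.FAILED}
)


def is_terminal(state: DataScanJobState) -> bool:
    """Return True when no further state change is expected."""
    return state in _TERMINAL_STATES
```
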

DataScanType

The type of DataScan.

Enums
DATA_SCAN_TYPE_UNSPECIFIED The DataScan type is unspecified.
DATA_QUALITY Data Quality scan.
DATA_PROFILE Data Profile scan.

DataSource

The data source for DataScan.

Fields
Union field source. The source is required and immutable. Once it is set, it cannot be changed to another source. source can be only one of the following:
entity

string

Immutable. The Dataplex entity that represents the data source (e.g. BigQuery table) for DataScan, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}.

resource

string

Immutable. The service-qualified full resource name of the cloud resource for a DataScan job to scan against. The field could be: BigQuery table of type "TABLE" for DataProfileScan/DataQualityScan Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID
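
Because source is a union, exactly one of entity or resource should be set. The following sketch checks that rule on a plain dictionary before a request is built. It is a client-side illustration with a hypothetical helper name, not a description of how the service validates requests.

```python
def data_source_kind(source: dict) -> str:
    """Return which union member is set: 'entity' or 'resource'."""
    set_fields = [key for key in ("entity", "resource") if source.get(key)]
    if len(set_fields) != 1:
        raise ValueError("exactly one of 'entity' or 'resource' must be set")
    return set_fields[0]


# Example with a made-up table name.
kind = data_source_kind(
    {"resource": "//bigquery.googleapis.com/projects/p/datasets/d/tables/t"}
)
```
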

DataTaxonomy

DataTaxonomy represents a set of hierarchical DataAttributes resources, grouped with a common theme. For example, 'SensitiveDataTaxonomy' can have attributes to manage PII data. It is defined at project level.

Fields
name

string

Output only. The relative resource name of the DataTaxonomy, of the form: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id}.

uid

string

Output only. System generated globally unique ID for the dataTaxonomy. This ID will be different if the DataTaxonomy is deleted and re-created with the same name.

create_time

Timestamp

Output only. The time when the DataTaxonomy was created.

update_time

Timestamp

Output only. The time when the DataTaxonomy was last updated.

description

string

Optional. Description of the DataTaxonomy.

display_name

string

Optional. User friendly display name.

labels

map<string, string>

Optional. User-defined labels for the DataTaxonomy.

attribute_count

int32

Output only. The number of attributes in the DataTaxonomy.

etag

string

This checksum is computed by the server based on the value of other fields, and may be sent on update and delete requests to ensure the client has an up-to-date value before proceeding.

class_count

int32

Output only. The number of classes in the DataTaxonomy.

DeleteAspectTypeRequest

Delete AspectType Request.

Fields
name

string

Required. The resource name of the AspectType: projects/{project_number}/locations/{location_id}/aspectTypes/{aspect_type_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.aspectTypes.delete
etag

string

Optional. If the client provided etag value does not match the current etag value, the DeleteAspectTypeRequest method returns an ABORTED error response.
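
The etag check is a form of optimistic concurrency control. The sketch below simulates it against an in-memory dictionary so the behavior can be seen without calling the service. The store, the AbortedError class, and the function name are all hypothetical, and it assumes that an empty etag skips the check because the field is optional.

```python
class AbortedError(Exception):
    """Stand-in for the ABORTED error response described above."""


def delete_with_etag(store: dict, name: str, etag: str = "") -> None:
    """Delete store[name] only if the supplied etag matches the stored one."""
    current = store[name]
    # Assumption: an empty etag means the caller opted out of the check.
    if etag and etag != current["etag"]:
        raise AbortedError(f"etag mismatch for {name}")
    del store[name]


# Simulated server state with one resource.
store = {"aspectTypes/example": {"etag": "v2"}}
try:
    delete_with_etag(store, "aspectTypes/example", etag="v1")  # stale etag
    outcome = "deleted"
except AbortedError:
    outcome = "aborted"
```

A client that receives the error would typically re-read the resource to obtain the current etag before retrying.
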

DeleteAssetRequest

Delete asset request.

Fields
name

string

Required. The resource name of the asset: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/assets/{asset_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.assets.delete

DeleteContentRequest

Delete content request.

Fields
name

string

Required. The resource name of the content: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/content/{content_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.content.delete

DeleteDataAttributeBindingRequest

Delete DataAttributeBinding request.

Fields
name

string

Required. The resource name of the DataAttributeBinding: projects/{project_number}/locations/{location_id}/dataAttributeBindings/{data_attribute_binding_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.dataAttributeBindings.delete
etag

string

Required. If the client provided etag value does not match the current etag value, the DeleteDataAttributeBindingRequest method returns an ABORTED error response. Etags must be used when calling the DeleteDataAttributeBinding.

DeleteDataAttributeRequest

Delete DataAttribute request.

Fields
name

string

Required. The resource name of the DataAttribute: projects/{project_number}/locations/{location_id}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.dataAttributes.delete
etag

string

Optional. If the client provided etag value does not match the current etag value, the DeleteDataAttribute method returns an ABORTED error response.

DeleteDataScanRequest

Delete dataScan request.

Fields
name

string

Required. The resource name of the dataScan: projects/{project}/locations/{location_id}/dataScans/{data_scan_id} where project refers to a project_id or project_number and location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.datascans.delete

DeleteDataTaxonomyRequest

Delete DataTaxonomy request.

Fields
name

string

Required. The resource name of the DataTaxonomy: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.dataTaxonomies.delete
etag

string

Optional. If the client provided etag value does not match the current etag value, the DeleteDataTaxonomy method returns an ABORTED error.

DeleteEntityRequest

Delete a metadata entity request.

Fields
name

string

Required. The resource name of the entity: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.entities.delete
etag

string

Required. The etag associated with the entity, which can be retrieved with a GetEntity request.

DeleteEntryGroupRequest

Delete EntryGroup Request.

Fields
name

string

Required. The resource name of the EntryGroup: projects/{project_number}/locations/{location_id}/entryGroups/{entry_group_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.entryGroups.delete
etag

string

Optional. If the client provided etag value does not match the current etag value, the DeleteEntryGroupRequest method returns an ABORTED error response.

DeleteEntryRequest

Delete Entry request.

Fields
name

string

Required. The resource name of the Entry: projects/{project}/locations/{location}/entryGroups/{entry_group}/entries/{entry}.

DeleteEntryTypeRequest

Delete EntryType Request.

Fields
name

string

Required. The resource name of the EntryType: projects/{project_number}/locations/{location_id}/entryTypes/{entry_type_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.entryTypes.delete
etag

string

Optional. If the client provided etag value does not match the current etag value, the DeleteEntryTypeRequest method returns an ABORTED error response.

DeleteEnvironmentRequest

Delete environment request.

Fields
name

string

Required. The resource name of the environment: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/environments/{environment_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.environments.delete

DeleteLakeRequest

Delete lake request.

Fields
name

string

Required. The resource name of the lake: projects/{project_number}/locations/{location_id}/lakes/{lake_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.lakes.delete

DeletePartitionRequest

Delete metadata partition request.

Fields
name

string

Required. The resource name of the partition. format: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}/partitions/{partition_value_path}. The {partition_value_path} segment consists of an ordered sequence of partition values separated by "/". All values must be provided.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.partitions.delete
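
To illustrate the {partition_value_path} segment, the sketch below joins an ordered sequence of partition values onto an entity name. The helper name and sample values are hypothetical. It does not escape special characters, so it assumes the values contain none.

```python
from typing import Sequence


def partition_name(entity_name: str, partition_values: Sequence[str]) -> str:
    """Build a partition resource name from ordered partition values."""
    # All values must be provided, so empty lists and empty values are rejected.
    if not partition_values or any(value == "" for value in partition_values):
        raise ValueError("every partition value must be provided")
    return f"{entity_name}/partitions/" + "/".join(partition_values)


# Example with made-up identifiers and a year/month partitioning.
name = partition_name(
    "projects/123/locations/us-central1/lakes/l1/zones/z1/entities/orders",
    ["2024", "05"],
)
```
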
etag
(deprecated)

string

Optional. The etag associated with the partition.

DeleteTaskRequest

Delete task request.

Fields
name

string

Required. The resource name of the task: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/task/{task_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.tasks.delete

DeleteZoneRequest

Delete zone request.

Fields
name

string

Required. The resource name of the zone: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.zones.delete

DiscoveryEvent

The payload associated with Discovery data processing.

Fields
message

string

The log message.

lake_id

string

The id of the associated lake.

zone_id

string

The id of the associated zone.

asset_id

string

The id of the associated asset.

data_location

string

The data location associated with the event.

type

EventType

The type of the event being logged.

Union field details. Additional details about the event. details can be only one of the following:
config

ConfigDetails

Details about discovery configuration in effect.

entity

EntityDetails

Details about the entity associated with the event.

partition

PartitionDetails

Details about the partition associated with the event.

action

ActionDetails

Details about the action associated with the event.

ActionDetails

Details about the action.

Fields
type

string

The type of action, for example IncompatibleDataSchema or InvalidDataFormat.

ConfigDetails

Details about configuration events.

Fields
parameters

map<string, string>

A list of discovery configuration parameters in effect. The keys are the field paths within DiscoverySpec, for example includePatterns, excludePatterns, and csvOptions.disableTypeInference.

EntityDetails

Details about the entity.

Fields
entity

string

The name of the entity resource. The name is the fully-qualified resource name.

type

EntityType

The type of the entity resource.

EntityType

The type of the entity.

Enums
ENTITY_TYPE_UNSPECIFIED An unspecified entity type.
TABLE Entities representing structured data.
FILESET Entities representing unstructured data.

EventType

The type of the event.

Enums
EVENT_TYPE_UNSPECIFIED An unspecified event type.
CONFIG An event representing discovery configuration in effect.
ENTITY_CREATED An event representing a metadata entity being created.
ENTITY_UPDATED An event representing a metadata entity being updated.
ENTITY_DELETED An event representing a metadata entity being deleted.
PARTITION_CREATED An event representing a partition being created.
PARTITION_UPDATED An event representing a partition being updated.
PARTITION_DELETED An event representing a partition being deleted.

PartitionDetails

Details about the partition.

Fields
partition

string

The name of the partition resource. The name is the fully-qualified resource name.

entity

string

The name of the containing entity resource. The name is the fully-qualified resource name.

type

EntityType

The type of the containing entity resource.

sampled_data_locations[]

string

The locations of the data items (e.g., Cloud Storage objects) sampled for metadata inference.

Entity

Represents tables and fileset metadata contained within a zone.

Fields
name

string

Output only. The resource name of the entity, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{id}.

display_name

string

Optional. Display name must be shorter than or equal to 256 characters.

description

string

Optional. User friendly longer description text. Must be shorter than or equal to 1024 characters.

create_time

Timestamp

Output only. The time when the entity was created.

update_time

Timestamp

Output only. The time when the entity was last updated.

id

string

Required. A user-provided entity ID. It is mutable, and will be used as the published table name. Specifying a new ID in an update entity request will override the existing value. The ID must contain only letters (a-z, A-Z), numbers (0-9), and underscores, and consist of 256 or fewer characters.
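
The ID rules above translate directly into a regular expression. This is a minimal sketch with a hypothetical helper name. It assumes that an empty ID is invalid because the field is required.

```python
import re

# Letters, digits, and underscores only; 1 to 256 characters.
_ENTITY_ID = re.compile(r"[A-Za-z0-9_]{1,256}")


def is_valid_entity_id(entity_id: str) -> bool:
    """Check an entity ID against the documented character and length rules."""
    return _ENTITY_ID.fullmatch(entity_id) is not None
```
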

etag

string

Optional. The etag associated with the entity, which can be retrieved with a GetEntity request. Required for update and delete requests.

type

Type

Required. Immutable. The type of entity.

asset

string

Required. Immutable. The ID of the asset associated with the storage location containing the entity data. The entity must be within the same zone as the asset.

data_path

string

Required. Immutable. The storage path of the entity data. For Cloud Storage data, this is the fully-qualified path to the entity, such as gs://bucket/path/to/data. For BigQuery data, this is the name of the table resource, such as projects/project_id/datasets/dataset_id/tables/table_id.

data_path_pattern

string

Optional. The set of items within the data path constituting the data in the entity, represented as a glob path. Example: gs://bucket/path/to/data/**/*.csv.

catalog_entry

string

Output only. The name of the associated Data Catalog entry.

system

StorageSystem

Required. Immutable. Identifies the storage system of the entity data.

format

StorageFormat

Required. Identifies the storage format of the entity data. It does not apply to entities with data stored in BigQuery.

compatibility

CompatibilityStatus

Output only. Metadata stores that the entity is compatible with.

access

StorageAccess

Output only. Identifies the access mechanism to the entity. Not user settable.

uid

string

Output only. System generated unique ID for the Entity. This ID will be different if the Entity is deleted and re-created with the same name.

schema

Schema

Required. The description of the data structure and layout. The schema is not included in list responses. It is only included in SCHEMA and FULL entity views of a GetEntity response.

CompatibilityStatus

Provides compatibility information for various metadata stores.

Fields
hive_metastore

Compatibility

Output only. Whether this entity is compatible with Hive Metastore.

bigquery

Compatibility

Output only. Whether this entity is compatible with BigQuery.

Compatibility

Provides compatibility information for a specific metadata store.

Fields
compatible

bool

Output only. Whether the entity is compatible and can be represented in the metadata store.

reason

string

Output only. Provides additional detail if the entity is incompatible with the metadata store.

Type

The type of entity.

Enums
TYPE_UNSPECIFIED Type unspecified.
TABLE Structured and semi-structured data.
FILESET Unstructured data.

Entry

An entry is a representation of a data resource that can be described by various metadata.

Fields
name

string

Identifier. The relative resource name of the entry, in the format projects/{project_id_or_number}/locations/{location_id}/entryGroups/{entry_group_id}/entries/{entry_id}.

entry_type

string

Required. Immutable. The relative resource name of the entry type that was used to create this entry, in the format projects/{project_id_or_number}/locations/{location_id}/entryTypes/{entry_type_id}.

create_time

Timestamp

Output only. The time when the entry was created in Dataplex.

update_time

Timestamp

Output only. The time when the entry was last updated in Dataplex.

aspects

map<string, Aspect>

Optional. The aspects that are attached to the entry. Depending on how the aspect is attached to the entry, the format of the aspect key can be one of the following:

  • If the aspect is attached directly to the entry: {project_id_or_number}.{location_id}.{aspect_type_id}
  • If the aspect is attached to an entry's path: {project_id_or_number}.{location_id}.{aspect_type_id}@{path}
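
The two key formats differ only in the optional @{path} suffix. The sketch below builds either form. The function name, the sample identifiers, and the sample path are hypothetical.

```python
from typing import Optional


def aspect_key(
    project: str, location_id: str, aspect_type_id: str, path: Optional[str] = None
) -> str:
    """Build an aspect map key, with an @path suffix when a path is given."""
    key = f"{project}.{location_id}.{aspect_type_id}"
    return f"{key}@{path}" if path else key


# Made-up identifiers: one aspect on the entry, one on a path within it.
entry_level_key = aspect_key("my-project", "us-central1", "data-owner")
path_level_key = aspect_key("my-project", "us-central1", "data-owner", path="customer_id")
```
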
parent_entry

string

Optional. Immutable. The resource name of the parent entry.

fully_qualified_name

string

Optional. A name for the entry that can be referenced by an external system. For more information, see Fully qualified names. The maximum size of the field is 4000 characters.

entry_source

EntrySource

Optional. Information related to the source system of the data resource that is represented by the entry.

EntryGroup

An Entry Group represents a logical grouping of one or more Entries.

Fields
name

string

Output only. The relative resource name of the EntryGroup, in the format projects/{project_id_or_number}/locations/{location_id}/entryGroups/{entry_group_id}.

uid

string

Output only. System generated globally unique ID for the EntryGroup. If you delete and recreate the EntryGroup with the same name, this ID will be different.

create_time

Timestamp

Output only. The time when the EntryGroup was created.

update_time

Timestamp

Output only. The time when the EntryGroup was last updated.

description

string

Optional. Description of the EntryGroup.

display_name

string

Optional. User friendly display name.

labels

map<string, string>

Optional. User-defined labels for the EntryGroup.

etag

string

This checksum is computed by the service, and might be sent on update and delete requests to ensure the client has an up-to-date value before proceeding.

EntrySource

Information related to the source system of the data resource that is represented by the entry.

Fields
resource

string

The name of the resource in the source system. Maximum length is 4,000 characters.

system

string

The name of the source system. Maximum length is 64 characters.

platform

string

The platform containing the source system. Maximum length is 64 characters.

display_name

string

A user-friendly display name. Maximum length is 500 characters.

description

string

A description of the data resource. Maximum length is 2,000 characters.

labels

map<string, string>

User-defined labels. The maximum size of keys and values is 128 characters each.

ancestors[]

Ancestor

Immutable. The entries representing the ancestors of the data resource in the source system.

create_time

Timestamp

The time when the resource was created in the source system.

update_time

Timestamp

The time when the resource was last updated in the source system. If the entry exists in the system and its EntrySource has update_time populated, further updates to the EntrySource of the entry must provide incremental updates to its update_time.
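
One way to read the update_time rule is that each new value must be later than the stored one. The sketch below encodes that reading as a client-side check before sending an update. The strict "later than" comparison is an assumption, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def update_time_is_acceptable(
    stored: Optional[datetime], proposed: Optional[datetime]
) -> bool:
    """Return True when the proposed update_time may replace the stored one."""
    if stored is None:
        # Nothing populated yet, so any value (or none) is acceptable.
        return True
    # Assumption: "incremental" means strictly later than the stored value.
    return proposed is not None and proposed > stored


stored_time = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
newer_time = datetime(2024, 5, 2, 9, 30, tzinfo=timezone.utc)
ok = update_time_is_acceptable(stored_time, newer_time)
```
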

location

string

Output only. Location of the resource in the source system. You can search the entry by this location. By default, this should match the location of the entry group containing this entry. A different value allows capturing the source location for data external to Google Cloud.

Ancestor

Information about individual items in the hierarchy that is associated with the data resource.

Fields
name

string

Optional. The name of the ancestor resource.

type

string

Optional. The type of the ancestor resource.

EntryType

Entry Type is a template for creating Entries.

Fields
name

string

Output only. The relative resource name of the EntryType, of the form: projects/{project_number}/locations/{location_id}/entryTypes/{entry_type_id}.

uid

string

Output only. System generated globally unique ID for the EntryType. This ID will be different if the EntryType is deleted and re-created with the same name.

create_time

Timestamp

Output only. The time when the EntryType was created.

update_time

Timestamp

Output only. The time when the EntryType was last updated.

description

string

Optional. Description of the EntryType.

display_name

string

Optional. User friendly display name.

labels

map<string, string>

Optional. User-defined labels for the EntryType.

etag

string

Optional. This checksum is computed by the service, and might be sent on update and delete requests to ensure the client has an up-to-date value before proceeding.

type_aliases[]

string

Optional. Indicates the classes this Entry Type belongs to, for example, TABLE, DATABASE, MODEL.

platform

string

Optional. The platform that Entries of this type belong to.

system

string

Optional. The system that Entries of this type belong to. Examples include CloudSQL and MariaDB.

required_aspects[]

AspectInfo

AspectInfo for the entry type.

authorization

Authorization

Immutable. Authorization defined for this type.

AspectInfo

Fields
type

string

Required aspect type for the entry type.

Authorization

Authorization for an Entry Type.

Fields
alternate_use_permission

string

Immutable. The IAM permission that can be granted on the Entry Group to allow instantiating Entries of Dataplex-owned Entry Types. It can be set only for Dataplex-owned Entry Types.

EntryView

View for controlling which parts of an entry are to be returned.

Enums
ENTRY_VIEW_UNSPECIFIED Unspecified EntryView. Defaults to FULL.
BASIC Returns entry only, without aspects.
FULL Returns all required aspects as well as the keys of all non-required aspects.
CUSTOM Returns aspects matching custom fields in GetEntryRequest. If the number of aspects exceeds 100, the first 100 will be returned.
ALL Returns all aspects. If the number of aspects exceeds 100, the first 100 will be returned.

Environment

Environment represents a user-visible compute infrastructure for analytics within a lake.

Fields
name

string

Output only. The relative resource name of the environment, of the form: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/environments/{environment_id}

display_name

string

Optional. User friendly display name.

uid

string

Output only. System generated globally unique ID for the environment. This ID will be different if the environment is deleted and re-created with the same name.

create_time

Timestamp

Output only. Environment creation time.

update_time

Timestamp

Output only. The time when the environment was last updated.

labels

map<string, string>

Optional. User defined labels for the environment.

description

string

Optional. Description of the environment.

state

State

Output only. Current state of the environment.

infrastructure_spec

InfrastructureSpec

Required. Infrastructure specification for the Environment.

session_spec

SessionSpec

Optional. Configuration for sessions created for this environment.

session_status

SessionStatus

Output only. Status of sessions created for this environment.

endpoints

Endpoints

Output only. URI Endpoints to access sessions associated with the Environment.

Endpoints

URI Endpoints to access sessions associated with the Environment.

Fields
notebooks

string

Output only. URI to serve notebook APIs.

sql

string

Output only. URI to serve SQL APIs.

InfrastructureSpec

Configuration for the underlying infrastructure used to run workloads.

Fields
Union field resources. Hardware config resources can be only one of the following:
compute

ComputeResources

Optional. Compute resources needed for analyze interactive workloads.

Union field runtime. Software config runtime can be only one of the following:
os_image

OsImageRuntime

Required. Software Runtime Configuration for analyze interactive workloads.

ComputeResources

Compute resources associated with the analyze interactive workloads.

Fields
disk_size_gb

int32

Optional. Size in GB of the disk. Default is 100 GB.

node_count

int32

Optional. Total number of nodes in the sessions created for this environment.

max_node_count

int32

Optional. Max configurable nodes. If max_node_count > node_count, then auto-scaling is enabled.
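The auto-scaling condition above is a plain comparison between two fields. The following is a minimal sketch of that rule. The ComputeResources dataclass is a local stand-in that mirrors the documented field names; it is not the client library type, and the helper name autoscaling_enabled is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ComputeResources:
    """Local stand-in mirroring the documented fields (not the client library type)."""

    node_count: int
    max_node_count: int
    disk_size_gb: int = 100  # documented default disk size in GB


def autoscaling_enabled(resources: ComputeResources) -> bool:
    # Documented rule: auto-scaling is enabled when max_node_count > node_count.
    return resources.max_node_count > resources.node_count
```

For example, a configuration with node_count=2 and max_node_count=5 satisfies the condition, while equal values do not.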

OsImageRuntime

Software Runtime Configuration to run Analyze.

Fields
image_version

string

Required. Dataplex Image version.

java_libraries[]

string

Optional. List of Java jars to be included in the runtime environment. Valid input includes Cloud Storage URIs to Jar binaries. For example, gs://bucket-name/my/path/to/file.jar

python_packages[]

string

Optional. A list of Python packages to be installed. Valid formats include a Cloud Storage URI to a pip-installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz

properties

map<string, string>

Optional. Spark properties that configure sessions created for this environment. These properties are set on daemon config files. Property keys are specified in prefix:property format. The prefix must be "spark".
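The key format described for properties can be checked on the client before sending a request. The following is a minimal sketch, assuming only what is stated above: keys use prefix:property format and the prefix must be "spark". The helper name invalid_property_keys and the sample keys are illustrative, and the service may apply further validation that is not shown here.

```python
from __future__ import annotations


def invalid_property_keys(properties: dict[str, str]) -> list[str]:
    """Return keys that do not follow prefix:property format with prefix "spark"."""
    bad = []
    for key in properties:
        prefix, sep, name = key.partition(":")
        # A valid key has the literal prefix "spark", a colon, and a non-empty property.
        if prefix != "spark" or sep != ":" or not name:
            bad.append(key)
    return bad


# Illustrative input: the first key follows the format, the second lacks the prefix.
sample = {
    "spark:spark.executor.memory": "4g",
    "spark.executor.cores": "2",
}
problems = invalid_property_keys(sample)
```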

SessionSpec

Configuration for sessions created for this environment.

Fields
max_idle_duration

Duration

Optional. The idle time configuration of the session. The session will be auto-terminated at the end of this period.

enable_fast_startup

bool

Optional. If True, sessions are pre-created and available for faster startup to enable interactive exploration use cases. Defaults to False to avoid additional billed charges. This can be set to True only for the environment named "default", and only with default configuration.

SessionStatus

Status of sessions created for this environment.

Fields
active

bool

Output only. Whether the environment is currently active, as determined by querying its sessions.

GenerateDataQualityRulesRequest

Request details for generating data quality rule recommendations.

Fields
name

string

Required. The name must be one of the following:

  • The name of a data scan with at least one successful, completed data profiling job
  • The name of a successful, completed data profiling job (a data scan job where the job type is data profiling)

GenerateDataQualityRulesResponse

Response details for data quality rule recommendations.

Fields
rule[]

DataQualityRule

The data quality rules that Dataplex generates based on the results of a data profiling scan.

GetAspectTypeRequest

Get AspectType request.

Fields
name

string

Required. The resource name of the AspectType: projects/{project_number}/locations/{location_id}/aspectTypes/{aspect_type_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.aspectTypes.get

GetAssetRequest

Get asset request.

Fields
name

string

Required. The resource name of the asset: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/assets/{asset_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.assets.get

GetContentRequest

Get content request.

Fields
name

string

Required. The resource name of the content: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/content/{content_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.content.get

view

ContentView

Optional. Specify content view to make a partial request.

ContentView

Specifies whether the request should return the full or the partial representation.

Enums
CONTENT_VIEW_UNSPECIFIED Content view not specified. The API will default to the BASIC view.
BASIC Will not return the data_text field.
FULL Returns the complete proto.

GetDataAttributeBindingRequest

Get DataAttributeBinding request.

Fields
name

string

Required. The resource name of the DataAttributeBinding: projects/{project_number}/locations/{location_id}/dataAttributeBindings/{data_attribute_binding_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.dataAttributeBindings.get

GetDataAttributeRequest

Get DataAttribute request.

Fields
name

string

Required. The resource name of the dataAttribute: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id}/attributes/{data_attribute_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.dataAttributes.get

GetDataScanJobRequest

Get DataScanJob request.

Fields
name

string

Required. The resource name of the DataScanJob: projects/{project}/locations/{location_id}/dataScans/{data_scan_id}/jobs/{data_scan_job_id} where project refers to a project_id or project_number and location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource name:

  • iam.permissions.none

view

DataScanJobView

Optional. Select the DataScanJob view to return. Defaults to BASIC.
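The name format above can be split into its components with a regular expression. The following is a minimal sketch that uses the placeholder names from the documented pattern as group names. The helper name parse_data_scan_job_name is hypothetical, and the sketch assumes that no component contains a "/" character.

```python
import re

# Pattern derived from the documented form:
# projects/{project}/locations/{location_id}/dataScans/{data_scan_id}/jobs/{data_scan_job_id}
_DATA_SCAN_JOB_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/dataScans/(?P<data_scan_id>[^/]+)"
    r"/jobs/(?P<data_scan_job_id>[^/]+)$"
)


def parse_data_scan_job_name(name: str) -> dict:
    """Split a DataScanJob resource name into its documented components."""
    match = _DATA_SCAN_JOB_NAME.match(name)
    if match is None:
        raise ValueError(f"not a DataScanJob resource name: {name!r}")
    return match.groupdict()
```

Because project may be either a project_id or a project_number, the sketch returns it as a string and does not try to distinguish the two.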

DataScanJobView

DataScanJob view options.

Enums
DATA_SCAN_JOB_VIEW_UNSPECIFIED The API will default to the BASIC view.
BASIC Basic view that does not include spec and result.
FULL Include everything.

GetDataScanRequest

Get dataScan request.

Fields
name

string

Required. The resource name of the dataScan: projects/{project}/locations/{location_id}/dataScans/{data_scan_id} where project refers to a project_id or project_number and location_id refers to a GCP region.

Authorization requires the following IAM permission on the specified resource name:

  • iam.permissions.none

view

DataScanView

Optional. Select the DataScan view to return. Defaults to BASIC.

DataScanView

DataScan view options.

Enums
DATA_SCAN_VIEW_UNSPECIFIED The API will default to the BASIC view.
BASIC Basic view that does not include spec and result.
FULL Include everything.

GetDataTaxonomyRequest

Get DataTaxonomy request.

Fields
name

string

Required. The resource name of the DataTaxonomy: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id}

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.dataTaxonomies.get

GetEntityRequest

Get metadata entity request.

Fields
name

string

Required. The resource name of the entity: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.entities.get

view

EntityView

Optional. Used to select the subset of entity information to return. Defaults to BASIC.

EntityView

Entity views for get entity partial result.

Enums
ENTITY_VIEW_UNSPECIFIED The API will default to the BASIC view.
BASIC Minimal view that does not include the schema.
SCHEMA Include basic information and schema.
FULL Include everything. Currently, this is the same as the SCHEMA view.

GetEntryGroupRequest

Get EntryGroup request.

Fields
name

string

Required. The resource name of the EntryGroup: projects/{project_number}/locations/{location_id}/entryGroups/{entry_group_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.entryGroups.get

GetEntryRequest

Get Entry request.

Fields
name

string

Required. The resource name of the Entry: projects/{project}/locations/{location}/entryGroups/{entry_group}/entries/{entry}.

view

EntryView

Optional. View to control which parts of an entry the service should return.

aspect_types[]

string

Optional. Limits the aspects returned to the provided aspect types. It only works for CUSTOM view.

paths[]

string

Optional. Limits the aspects returned to those associated with the provided paths within the Entry. It only works for CUSTOM view.
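Because aspect_types and paths only work with the CUSTOM view, a client can flag requests that set them under another view. The following is a minimal sketch that models the request as a plain dictionary keyed by the documented field names; it does not use the client library. The helper name filters_without_custom_view and all sample values are hypothetical. This section does not state whether the service ignores or rejects such requests, so the sketch only reports the mismatch.

```python
from __future__ import annotations

# Fields that the reference says only work with the CUSTOM view.
CUSTOM_ONLY_FIELDS = ("aspect_types", "paths")


def filters_without_custom_view(request: dict) -> list[str]:
    """Return the CUSTOM-only fields that are set while view is not CUSTOM."""
    if request.get("view") == "CUSTOM":
        return []
    return [field for field in CUSTOM_ONLY_FIELDS if request.get(field)]


# Illustrative request: aspect_types is set, but the view is FULL.
request = {
    "name": "projects/my-project/locations/us-central1/entryGroups/my-group/entries/my-entry",
    "view": "FULL",
    "aspect_types": ["example-aspect-type"],
}
mismatched = filters_without_custom_view(request)
```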

GetEntryTypeRequest

Get EntryType request.

Fields
name

string

Required. The resource name of the EntryType: projects/{project_number}/locations/{location_id}/entryTypes/{entry_type_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.entryTypes.get

GetEnvironmentRequest

Get environment request.

Fields
name

string

Required. The resource name of the environment: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/environments/{environment_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.environments.get

GetJobRequest

Get job request.

Fields
name

string

Required. The resource name of the job: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}/jobs/{job_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.tasks.get

GetLakeRequest

Get lake request.

Fields
name

string

Required. The resource name of the lake: projects/{project_number}/locations/{location_id}/lakes/{lake_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.lakes.get

GetMetadataJobRequest

Get metadata job request.

Fields
name

string

Required. The resource name of the metadata job, in the format projects/{project_id_or_number}/locations/{location_id}/metadataJobs/{metadata_job_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.metadataJobs.get

GetPartitionRequest

Get metadata partition request.

Fields
name

string

Required. The resource name of the partition: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}/partitions/{partition_value_path}. The {partition_value_path} segment consists of an ordered sequence of partition values separated by "/". All values must be provided.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.partitions.get
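The partition name above ends in an ordered, "/"-separated sequence of partition values. The following is a minimal sketch of assembling such a name from its parts. The helper name partition_name and the sample identifiers are hypothetical. The sketch does no escaping of the values, since this section does not say whether any encoding is required.

```python
def partition_name(project_number, location_id, lake_id, zone_id, entity_id, partition_values):
    """Build a partition resource name from ordered partition values."""
    if not partition_values:
        # The reference states that all partition values must be provided.
        raise ValueError("at least one partition value is required")
    # Values are joined in order with "/" to form {partition_value_path}.
    value_path = "/".join(partition_values)
    return (
        f"projects/{project_number}/locations/{location_id}"
        f"/lakes/{lake_id}/zones/{zone_id}"
        f"/entities/{entity_id}/partitions/{value_path}"
    )


# Illustrative identifiers for an entity partitioned by two values.
name = partition_name("123", "us-central1", "lake1", "zone1", "orders", ["2024", "01"])
```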

GetTaskRequest

Get task request.

Fields
name

string

Required. The resource name of the task: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/tasks/{task_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.tasks.get

GetZoneRequest

Get zone request.

Fields
name

string

Required. The resource name of the zone: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}.

Authorization requires the following IAM permission on the specified resource name:

  • dataplex.zones.get

GovernanceEvent

Payload associated with Governance related log events.

Fields
message

string

The log message.

event_type

EventType

The type of the event.

entity

Entity

Entity resource information if the log event is associated with a specific entity.

Entity

Information about Entity resource that the log event is associated with.

Fields
entity

string

The Entity resource the log event is associated with. Format: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}

entity_type