Index
CatalogService (interface)
ContentService (interface)
DataScanService (interface)
DataTaxonomyService (interface)
DataplexService (interface)
MetadataService (interface)
Action (message)
Action.Category (enum)
Action.FailedSecurityPolicyApply (message)
Action.IncompatibleDataSchema (message)
Action.IncompatibleDataSchema.SchemaChange (enum)
Action.InvalidDataFormat (message)
Action.InvalidDataOrganization (message)
Action.InvalidDataPartition (message)
Action.InvalidDataPartition.PartitionStructure (enum)
Action.MissingData (message)
Action.MissingResource (message)
Action.UnauthorizedResource (message)
Aspect (message)
AspectSource (message)
AspectType (message)
AspectType.Authorization (message)
AspectType.MetadataTemplate (message)
AspectType.MetadataTemplate.Annotations (message)
AspectType.MetadataTemplate.Constraints (message)
AspectType.MetadataTemplate.EnumValue (message)
Asset (message)
Asset.DiscoverySpec (message)
Asset.DiscoverySpec.CsvOptions (message)
Asset.DiscoverySpec.JsonOptions (message)
Asset.DiscoveryStatus (message)
Asset.DiscoveryStatus.State (enum)
Asset.DiscoveryStatus.Stats (message)
Asset.ResourceSpec (message)
Asset.ResourceSpec.AccessMode (enum)
Asset.ResourceSpec.Type (enum)
Asset.ResourceStatus (message)
Asset.ResourceStatus.State (enum)
Asset.SecurityStatus (message)
Asset.SecurityStatus.State (enum)
AssetStatus (message)
CancelJobRequest (message)
CancelMetadataJobRequest (message)
Content (message)
Content.Notebook (message)
Content.Notebook.KernelType (enum)
Content.SqlScript (message)
Content.SqlScript.QueryEngine (enum)
CreateAspectTypeRequest (message)
CreateAssetRequest (message)
CreateContentRequest (message)
CreateDataAttributeBindingRequest (message)
CreateDataAttributeRequest (message)
CreateDataScanRequest (message)
CreateDataTaxonomyRequest (message)
CreateEntityRequest (message)
CreateEntryGroupRequest (message)
CreateEntryRequest (message)
CreateEntryTypeRequest (message)
CreateEnvironmentRequest (message)
CreateLakeRequest (message)
CreateMetadataJobRequest (message)
CreatePartitionRequest (message)
CreateTaskRequest (message)
CreateZoneRequest (message)
DataAccessSpec (message)
DataAttribute (message)
DataAttributeBinding (message)
DataAttributeBinding.Path (message)
DataDiscoveryResult (message)
DataDiscoveryResult.BigQueryPublishing (message)
DataDiscoverySpec (message)
DataDiscoverySpec.BigQueryPublishingConfig (message)
DataDiscoverySpec.BigQueryPublishingConfig.TableType (enum)
DataDiscoverySpec.StorageConfig (message)
DataDiscoverySpec.StorageConfig.CsvOptions (message)
DataDiscoverySpec.StorageConfig.JsonOptions (message)
DataProfileResult (message)
DataProfileResult.PostScanActionsResult (message)
DataProfileResult.PostScanActionsResult.BigQueryExportResult (message)
DataProfileResult.PostScanActionsResult.BigQueryExportResult.State (enum)
DataProfileResult.Profile (message)
DataProfileResult.Profile.Field (message)
DataProfileResult.Profile.Field.ProfileInfo (message)
DataProfileResult.Profile.Field.ProfileInfo.DoubleFieldInfo (message)
DataProfileResult.Profile.Field.ProfileInfo.IntegerFieldInfo (message)
DataProfileResult.Profile.Field.ProfileInfo.StringFieldInfo (message)
DataProfileResult.Profile.Field.ProfileInfo.TopNValue (message)
DataProfileSpec (message)
DataProfileSpec.PostScanActions (message)
DataProfileSpec.PostScanActions.BigQueryExport (message)
DataProfileSpec.SelectedFields (message)
DataQualityColumnResult (message)
DataQualityDimension (message)
DataQualityDimensionResult (message)
DataQualityResult (message)
DataQualityResult.PostScanActionsResult (message)
DataQualityResult.PostScanActionsResult.BigQueryExportResult (message)
DataQualityResult.PostScanActionsResult.BigQueryExportResult.State (enum)
DataQualityRule (message)
DataQualityRule.NonNullExpectation (message)
DataQualityRule.RangeExpectation (message)
DataQualityRule.RegexExpectation (message)
DataQualityRule.RowConditionExpectation (message)
DataQualityRule.SetExpectation (message)
DataQualityRule.SqlAssertion (message)
DataQualityRule.StatisticRangeExpectation (message)
DataQualityRule.StatisticRangeExpectation.ColumnStatistic (enum)
DataQualityRule.TableConditionExpectation (message)
DataQualityRule.UniquenessExpectation (message)
DataQualityRuleResult (message)
DataQualityScanRuleResult (message)
DataQualityScanRuleResult.EvaluationType (enum)
DataQualityScanRuleResult.Result (enum)
DataQualityScanRuleResult.RuleType (enum)
DataQualitySpec (message)
DataQualitySpec.PostScanActions (message)
DataQualitySpec.PostScanActions.BigQueryExport (message)
DataQualitySpec.PostScanActions.JobEndTrigger (message)
DataQualitySpec.PostScanActions.JobFailureTrigger (message)
DataQualitySpec.PostScanActions.NotificationReport (message)
DataQualitySpec.PostScanActions.Recipients (message)
DataQualitySpec.PostScanActions.ScoreThresholdTrigger (message)
DataScan (message)
DataScan.ExecutionSpec (message)
DataScan.ExecutionStatus (message)
DataScanEvent (message)
DataScanEvent.DataProfileAppliedConfigs (message)
DataScanEvent.DataProfileResult (message)
DataScanEvent.DataQualityAppliedConfigs (message)
DataScanEvent.DataQualityResult (message)
DataScanEvent.PostScanActionsResult (message)
DataScanEvent.PostScanActionsResult.BigQueryExportResult (message)
DataScanEvent.PostScanActionsResult.BigQueryExportResult.State (enum)
DataScanEvent.ScanType (enum)
DataScanEvent.Scope (enum)
DataScanEvent.State (enum)
DataScanEvent.Trigger (enum)
DataScanJob (message)
DataScanJob.State (enum)
DataScanType (enum)
DataSource (message)
DataTaxonomy (message)
DeleteAspectTypeRequest (message)
DeleteAssetRequest (message)
DeleteContentRequest (message)
DeleteDataAttributeBindingRequest (message)
DeleteDataAttributeRequest (message)
DeleteDataScanRequest (message)
DeleteDataTaxonomyRequest (message)
DeleteEntityRequest (message)
DeleteEntryGroupRequest (message)
DeleteEntryRequest (message)
DeleteEntryTypeRequest (message)
DeleteEnvironmentRequest (message)
DeleteLakeRequest (message)
DeletePartitionRequest (message)
DeleteTaskRequest (message)
DeleteZoneRequest (message)
DiscoveryEvent (message)
DiscoveryEvent.ActionDetails (message)
DiscoveryEvent.ConfigDetails (message)
DiscoveryEvent.EntityDetails (message)
DiscoveryEvent.EntityType (enum)
DiscoveryEvent.EventType (enum)
DiscoveryEvent.PartitionDetails (message)
DiscoveryEvent.TableDetails (message)
DiscoveryEvent.TableType (enum)
Entity (message)
Entity.CompatibilityStatus (message)
Entity.CompatibilityStatus.Compatibility (message)
Entity.Type (enum)
Entry (message)
EntryGroup (message)
EntrySource (message)
EntrySource.Ancestor (message)
EntryType (message)
EntryType.AspectInfo (message)
EntryType.Authorization (message)
EntryView (enum)
Environment (message)
Environment.Endpoints (message)
Environment.InfrastructureSpec (message)
Environment.InfrastructureSpec.ComputeResources (message)
Environment.InfrastructureSpec.OsImageRuntime (message)
Environment.SessionSpec (message)
Environment.SessionStatus (message)
GenerateDataQualityRulesRequest (message)
GenerateDataQualityRulesResponse (message)
GetAspectTypeRequest (message)
GetAssetRequest (message)
GetContentRequest (message)
GetContentRequest.ContentView (enum)
GetDataAttributeBindingRequest (message)
GetDataAttributeRequest (message)
GetDataScanJobRequest (message)
GetDataScanJobRequest.DataScanJobView (enum)
GetDataScanRequest (message)
GetDataScanRequest.DataScanView (enum)
GetDataTaxonomyRequest (message)
GetEntityRequest (message)
GetEntityRequest.EntityView (enum)
GetEntryGroupRequest (message)
GetEntryRequest (message)
GetEntryTypeRequest (message)
GetEnvironmentRequest (message)
GetJobRequest (message)
GetLakeRequest (message)
GetMetadataJobRequest (message)
GetPartitionRequest (message)
GetTaskRequest (message)
GetZoneRequest (message)
GovernanceEvent (message)
GovernanceEvent.Entity (message)
GovernanceEvent.Entity.EntityType (enum)
GovernanceEvent.EventType (enum)
ImportItem (message)
Job (message)
Job.Service (enum)
Job.State (enum)
Job.Trigger (enum)
JobEvent (message)
JobEvent.ExecutionTrigger (enum)
JobEvent.Service (enum)
JobEvent.State (enum)
JobEvent.Type (enum)
Lake (message)
Lake.Metastore (message)
Lake.MetastoreStatus (message)
Lake.MetastoreStatus.State (enum)
ListActionsResponse (message)
ListAspectTypesRequest (message)
ListAspectTypesResponse (message)
ListAssetActionsRequest (message)
ListAssetsRequest (message)
ListAssetsResponse (message)
ListContentRequest (message)
ListContentResponse (message)
ListDataAttributeBindingsRequest (message)
ListDataAttributeBindingsResponse (message)
ListDataAttributesRequest (message)
ListDataAttributesResponse (message)
ListDataScanJobsRequest (message)
ListDataScanJobsResponse (message)
ListDataScansRequest (message)
ListDataScansResponse (message)
ListDataTaxonomiesRequest (message)
ListDataTaxonomiesResponse (message)
ListEntitiesRequest (message)
ListEntitiesRequest.EntityView (enum)
ListEntitiesResponse (message)
ListEntriesRequest (message)
ListEntriesResponse (message)
ListEntryGroupsRequest (message)
ListEntryGroupsResponse (message)
ListEntryTypesRequest (message)
ListEntryTypesResponse (message)
ListEnvironmentsRequest (message)
ListEnvironmentsResponse (message)
ListJobsRequest (message)
ListJobsResponse (message)
ListLakeActionsRequest (message)
ListLakesRequest (message)
ListLakesResponse (message)
ListMetadataJobsRequest (message)
ListMetadataJobsResponse (message)
ListPartitionsRequest (message)
ListPartitionsResponse (message)
ListSessionsRequest (message)
ListSessionsResponse (message)
ListTasksRequest (message)
ListTasksResponse (message)
ListZoneActionsRequest (message)
ListZonesRequest (message)
ListZonesResponse (message)
LookupEntryRequest (message)
MetadataJob (message)
MetadataJob.ImportJobResult (message)
MetadataJob.ImportJobSpec (message)
MetadataJob.ImportJobSpec.ImportJobScope (message)
MetadataJob.ImportJobSpec.LogLevel (enum)
MetadataJob.ImportJobSpec.SyncMode (enum)
MetadataJob.Status (message)
MetadataJob.Status.State (enum)
MetadataJob.Type (enum)
OperationMetadata (message)
Partition (message)
ResourceAccessSpec (message)
RunDataScanRequest (message)
RunDataScanResponse (message)
RunTaskRequest (message)
RunTaskResponse (message)
ScannedData (message)
ScannedData.IncrementalField (message)
Schema (message)
Schema.Mode (enum)
Schema.PartitionField (message)
Schema.PartitionStyle (enum)
Schema.SchemaField (message)
Schema.Type (enum)
SearchEntriesRequest (message)
SearchEntriesResponse (message)
SearchEntriesResult (message)
SearchEntriesResult.Snippets (message) (deprecated)
Session (message)
SessionEvent (message)
SessionEvent.EventType (enum)
SessionEvent.QueryDetail (message)
SessionEvent.QueryDetail.Engine (enum)
State (enum)
StorageAccess (message)
StorageAccess.AccessMode (enum)
StorageFormat (message)
StorageFormat.CompressionFormat (enum)
StorageFormat.CsvOptions (message)
StorageFormat.Format (enum)
StorageFormat.IcebergOptions (message)
StorageFormat.JsonOptions (message)
StorageSystem (enum)
Task (message)
Task.ExecutionSpec (message)
Task.ExecutionStatus (message)
Task.InfrastructureSpec (message)
Task.InfrastructureSpec.BatchComputeResources (message)
Task.InfrastructureSpec.ContainerImageRuntime (message)
Task.InfrastructureSpec.VpcNetwork (message)
Task.NotebookTaskConfig (message)
Task.SparkTaskConfig (message)
Task.TriggerSpec (message)
Task.TriggerSpec.Type (enum)
TransferStatus (enum)
Trigger (message)
Trigger.OnDemand (message)
Trigger.Schedule (message)
UpdateAspectTypeRequest (message)
UpdateAssetRequest (message)
UpdateContentRequest (message)
UpdateDataAttributeBindingRequest (message)
UpdateDataAttributeRequest (message)
UpdateDataScanRequest (message)
UpdateDataTaxonomyRequest (message)
UpdateEntityRequest (message)
UpdateEntryGroupRequest (message)
UpdateEntryRequest (message)
UpdateEntryTypeRequest (message)
UpdateEnvironmentRequest (message)
UpdateLakeRequest (message)
UpdateTaskRequest (message)
UpdateZoneRequest (message)
Zone (message)
Zone.DiscoverySpec (message)
Zone.DiscoverySpec.CsvOptions (message)
Zone.DiscoverySpec.JsonOptions (message)
Zone.ResourceSpec (message)
Zone.ResourceSpec.LocationType (enum)
Zone.Type (enum)
CatalogService
The primary resources offered by this service are EntryGroups, EntryTypes, AspectTypes, and Entries. They collectively let data administrators organize, manage, secure, and catalog data located across cloud projects in their organization in a variety of storage systems, including Cloud Storage and BigQuery.
CancelMetadataJob |
---|
Cancels a metadata job. If you cancel a metadata import job that is in progress, the changes in the job might be partially applied. We recommend that you reset the state of the entry groups in your project by running another metadata job that reverts the changes from the canceled job.
|
CreateAspectType |
---|
Creates an AspectType.
|
CreateEntry |
---|
Creates an Entry.
|
CreateEntryGroup |
---|
Creates an EntryGroup.
|
CreateEntryType |
---|
Creates an EntryType.
|
CreateMetadataJob |
---|
Creates a metadata job. For example, use a metadata job to import Dataplex Catalog entries and aspects from a third-party system into Dataplex.
|
DeleteAspectType |
---|
Deletes an AspectType.
|
DeleteEntry |
---|
Deletes an Entry.
|
DeleteEntryGroup |
---|
Deletes an EntryGroup.
|
DeleteEntryType |
---|
Deletes an EntryType.
|
GetAspectType |
---|
Gets an AspectType.
|
GetEntry |
---|
Gets an Entry. Caution: The BigQuery metadata that is stored in Dataplex Catalog is changing. For more information, see Changes to BigQuery metadata stored in Dataplex Catalog.
|
GetEntryGroup |
---|
Gets an EntryGroup.
|
GetEntryType |
---|
Gets an EntryType.
|
GetMetadataJob |
---|
Gets a metadata job.
|
ListAspectTypes |
---|
Lists AspectType resources in a project and location.
|
ListEntries |
---|
Lists Entries within an EntryGroup.
|
ListEntryGroups |
---|
Lists EntryGroup resources in a project and location.
|
ListEntryTypes |
---|
Lists EntryType resources in a project and location.
|
ListMetadataJobs |
---|
Lists metadata jobs.
|
LookupEntry |
---|
Looks up a single Entry by name using the permission on the source system. Caution: The BigQuery metadata that is stored in Dataplex Catalog is changing. For more information, see Changes to BigQuery metadata stored in Dataplex Catalog.
|
SearchEntries |
---|
Searches for Entries matching the given query and scope.
|
UpdateAspectType |
---|
Updates an AspectType.
|
UpdateEntry |
---|
Updates an Entry.
|
UpdateEntryGroup |
---|
Updates an EntryGroup.
|
UpdateEntryType |
---|
Updates an EntryType.
|
ContentService
ContentService manages notebooks and SQL scripts for Dataplex.
CreateContent |
---|
Creates a content resource.
|
DeleteContent |
---|
Deletes a content resource.
|
GetContent |
---|
Get a content resource.
|
GetIamPolicy |
---|
Gets the access control policy for a contentitem resource. A Caller must have Google IAM
|
ListContent |
---|
List content.
|
SetIamPolicy |
---|
Sets the access control policy on the specified contentitem resource. Replaces any existing policy. Caller must have Google IAM
|
TestIamPermissions |
---|
Returns the caller's permissions on a resource. If the resource does not exist, an empty set of permissions is returned. A caller is not required to have Google IAM permission to make this request. Note: This operation is designed to be used for building permission-aware UIs and command-line tools, not for authorization checking. This operation may "fail open" without warning.
|
UpdateContent |
---|
Updates a content resource. Only supports full resource update.
|
DataScanService
DataScanService manages DataScan resources, which can be configured to run various types of data scanning workloads and generate enriched metadata (e.g. Data Profile, Data Quality) for the data source.
CreateDataScan |
---|
Creates a DataScan resource.
|
DeleteDataScan |
---|
Deletes a DataScan resource.
|
GenerateDataQualityRules |
---|
Generates recommended data quality rules based on the results of a data profiling scan. Use the recommendations to build rules for a data quality scan.
|
GetDataScan |
---|
Gets a DataScan resource.
|
GetDataScanJob |
---|
Gets a DataScanJob resource.
|
ListDataScanJobs |
---|
Lists DataScanJobs under the given DataScan.
|
ListDataScans |
---|
Lists DataScans.
|
RunDataScan |
---|
Runs an on-demand execution of a DataScan.
|
UpdateDataScan |
---|
Updates a DataScan resource.
|
DataTaxonomyService
DataTaxonomyService enables attribute-based governance. The resources currently offered include DataTaxonomy and DataAttribute.
CreateDataAttribute |
---|
Create a DataAttribute resource.
|
CreateDataAttributeBinding |
---|
Create a DataAttributeBinding resource.
|
CreateDataTaxonomy |
---|
Create a DataTaxonomy resource.
|
DeleteDataAttribute |
---|
Deletes a Data Attribute resource.
|
DeleteDataAttributeBinding |
---|
Deletes a DataAttributeBinding resource. All attributes within the DataAttributeBinding must be deleted before the DataAttributeBinding can be deleted.
|
DeleteDataTaxonomy |
---|
Deletes a DataTaxonomy resource. All attributes within the DataTaxonomy must be deleted before the DataTaxonomy can be deleted.
|
GetDataAttribute |
---|
Retrieves a Data Attribute resource.
|
GetDataAttributeBinding |
---|
Retrieves a DataAttributeBinding resource.
|
GetDataTaxonomy |
---|
Retrieves a DataTaxonomy resource.
|
ListDataAttributeBindings |
---|
Lists DataAttributeBinding resources in a project and location.
|
ListDataAttributes |
---|
Lists Data Attribute resources in a DataTaxonomy.
|
ListDataTaxonomies |
---|
Lists DataTaxonomy resources in a project and location.
|
UpdateDataAttribute |
---|
Updates a DataAttribute resource.
|
UpdateDataAttributeBinding |
---|
Updates a DataAttributeBinding resource.
|
UpdateDataTaxonomy |
---|
Updates a DataTaxonomy resource.
|
DataplexService
Dataplex service provides data lakes as a service. The primary resources offered by this service are Lakes, Zones, and Assets, which collectively let a data administrator organize, manage, secure, and catalog data located across cloud projects in their organization in a variety of storage systems, including Cloud Storage and BigQuery.
CancelJob |
---|
Cancel jobs running for the task resource.
|
CreateAsset |
---|
Creates an asset resource.
|
CreateEnvironment |
---|
Create an environment resource.
|
CreateLake |
---|
Creates a lake resource.
|
CreateTask |
---|
Creates a task resource within a lake.
|
CreateZone |
---|
Creates a zone resource within a lake.
|
DeleteAsset |
---|
Deletes an asset resource. The referenced storage resource is detached (default) or deleted based on the associated Lifecycle policy.
|
DeleteEnvironment |
---|
Delete the environment resource. All the child resources must have been deleted before environment deletion can be initiated.
|
DeleteLake |
---|
Deletes a lake resource. All zones within the lake must be deleted before the lake can be deleted.
|
DeleteTask |
---|
Delete the task resource.
|
DeleteZone |
---|
Deletes a zone resource. All assets within a zone must be deleted before the zone can be deleted.
|
GetAsset |
---|
Retrieves an asset resource.
|
GetEnvironment |
---|
Get environment resource.
|
GetJob |
---|
Get job resource.
|
GetLake |
---|
Retrieves a lake resource.
|
GetTask |
---|
Get task resource.
|
GetZone |
---|
Retrieves a zone resource.
|
ListAssetActions |
---|
Lists action resources in an asset.
|
ListAssets |
---|
Lists asset resources in a zone.
|
ListEnvironments |
---|
Lists environments under the given lake.
|
ListJobs |
---|
Lists Jobs under the given task.
|
ListLakeActions |
---|
Lists action resources in a lake.
|
ListLakes |
---|
Lists lake resources in a project and location.
|
ListSessions |
---|
Lists session resources in an environment.
|
ListTasks |
---|
Lists tasks under the given lake.
|
ListZoneActions |
---|
Lists action resources in a zone.
|
ListZones |
---|
Lists zone resources in a lake.
|
RunTask |
---|
Runs an on-demand execution of a Task.
|
UpdateAsset |
---|
Updates an asset resource.
|
UpdateEnvironment |
---|
Update the environment resource.
|
UpdateLake |
---|
Updates a lake resource.
|
UpdateTask |
---|
Update the task resource.
|
UpdateZone |
---|
Updates a zone resource.
|
MetadataService
Metadata service manages metadata resources such as tables, filesets and partitions.
CreateEntity |
---|
Create a metadata entity.
|
CreatePartition |
---|
Create a metadata partition.
|
DeleteEntity |
---|
Delete a metadata entity.
|
DeletePartition |
---|
Delete a metadata partition.
|
GetEntity |
---|
Get a metadata entity.
|
GetPartition |
---|
Get a metadata partition of an entity.
|
ListEntities |
---|
List metadata entities in a zone.
|
ListPartitions |
---|
List metadata partitions of an entity.
|
UpdateEntity |
---|
Update a metadata entity. Only supports full resource update.
|
Action
Action represents an issue requiring administrator action for resolution.
Fields | |
---|---|
category |
The category of issue associated with the action. |
issue |
Detailed description of the issue requiring action. |
detect_ |
The time that the issue was detected. |
name |
Output only. The relative resource name of the action, of the form: |
lake |
Output only. The relative resource name of the lake, of the form: |
zone |
Output only. The relative resource name of the zone, of the form: |
asset |
Output only. The relative resource name of the asset, of the form: |
data_ |
The list of data locations associated with this action. Cloud Storage locations are represented as URI paths(E.g. |
Union field details. Additional details about the action based on the action category. details can be only one of the following: |
|
invalid_ |
Details for issues related to invalid or unsupported data formats. |
incompatible_ |
Details for issues related to incompatible schemas detected within data. |
invalid_ |
Details for issues related to invalid or unsupported data partition structure. |
missing_ |
Details for issues related to absence of data within managed resources. |
missing_ |
Details for issues related to absence of a managed resource. |
unauthorized_ |
Details for issues related to lack of permissions to access data resources. |
failed_ |
Details for issues related to applying security policy. |
invalid_ |
Details for issues related to invalid data arrangement. |
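
The "only one of the following" rule on the details union can be illustrated with a small check. This is a minimal sketch rather than the service's own validation: the helper `which_detail` is hypothetical, and the keys used in the example (`missing_data`, `invalid_data_format`) are illustrative names derived from the message types listed in the index, not confirmed field names.

```python
from typing import Any, Mapping, Optional


def which_detail(details: Mapping[str, Any]) -> Optional[str]:
    """Return the name of the single populated detail, or None if none is set.

    Raises ValueError when more than one detail is populated, mirroring the
    "can be only one of the following" rule for a union field.
    """
    populated = [name for name, value in details.items() if value is not None]
    if len(populated) > 1:
        raise ValueError(f"only one detail may be set, got: {sorted(populated)}")
    return populated[0] if populated else None


# Hypothetical payload: an empty message still counts as "set".
action_details = {"missing_data": {}, "invalid_data_format": None}
print(which_detail(action_details))
```

Treating an empty dictionary as populated reflects the fact that some detail messages have no fields, so their presence alone carries the information.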
Category
The category of issues.
Enums | |
---|---|
CATEGORY_UNSPECIFIED |
Unspecified category. |
RESOURCE_MANAGEMENT |
Resource management related issues. |
SECURITY_POLICY |
Security policy related issues. |
DATA_DISCOVERY |
Data and discovery related issues. |
FailedSecurityPolicyApply
Failed to apply security policy to the managed resource(s) under a lake, zone or an asset. For a lake or zone resource, one or more underlying assets have failed to apply security policy to the associated managed resource.
Fields | |
---|---|
asset |
Resource name of one of the assets with failing security policy application. Populated for a lake or zone resource only. |
IncompatibleDataSchema
Action details for incompatible schemas detected by discovery.
Fields | |
---|---|
table |
The name of the table containing invalid data. |
existing_ |
The existing and expected schema of the table. The schema is provided as a JSON-formatted structure listing columns and data types. |
new_ |
The new and incompatible schema within the table. The schema is provided as a JSON-formatted structure listing columns and data types. |
sampled_ |
The list of data locations sampled and used for format/schema inference. |
schema_ |
Whether the action relates to a schema that is incompatible or modified. |
SchemaChange
Whether the action relates to a schema that is incompatible or modified.
Enums | |
---|---|
SCHEMA_CHANGE_UNSPECIFIED |
Schema change unspecified. |
INCOMPATIBLE |
Newly discovered schema is incompatible with existing schema. |
MODIFIED |
Newly discovered schema has changed from existing schema for data in a curated zone. |
InvalidDataFormat
Action details for invalid or unsupported data files detected by discovery.
Fields | |
---|---|
sampled_ |
The list of data locations sampled and used for format/schema inference. |
expected_ |
The expected data format of the entity. |
new_ |
The new unexpected data format within the entity. |
InvalidDataOrganization
Action details for invalid data arrangement.
This type has no fields.
InvalidDataPartition
Action details for invalid or unsupported partitions detected by discovery.
Fields | |
---|---|
expected_ |
The issue type of InvalidDataPartition. |
PartitionStructure
The expected partition structure.
Enums | |
---|---|
PARTITION_STRUCTURE_UNSPECIFIED |
PartitionStructure unspecified. |
CONSISTENT_KEYS |
Consistent hive-style partition definition (both raw and curated zone). |
HIVE_STYLE_KEYS |
Hive style partition definition (curated zone only). |
MissingData
Action details for absence of data detected by discovery.
This type has no fields.
MissingResource
Action details for resource references in assets that cannot be located.
This type has no fields.
Aspect
An aspect is a single piece of metadata describing an entry.
Fields | |
---|---|
aspect_ |
Output only. The resource name of the type used to create this Aspect. |
path |
Output only. The path in the entry under which the aspect is attached. |
create_ |
Output only. The time when the Aspect was created. |
update_ |
Output only. The time when the Aspect was last updated. |
data |
Required. The content of the aspect, according to its aspect type schema. The maximum size of the field is 120KB (encoded as UTF-8). |
aspect_ |
Optional. Information related to the source system of the aspect. |
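
The 120KB limit on the data field can be checked on the client before sending an aspect. The sketch below is only an approximation: it assumes the size is measured on a JSON serialization of the content and that 1 KB means 1024 bytes, neither of which is stated above. The helper names are hypothetical.

```python
import json

# Assumption: 1 KB = 1024 bytes; the reference only says "120KB (encoded as UTF-8)".
MAX_ASPECT_DATA_BYTES = 120 * 1024


def aspect_data_size(data: dict) -> int:
    """Size in bytes of the aspect content when serialized to JSON as UTF-8."""
    return len(json.dumps(data, ensure_ascii=False).encode("utf-8"))


def fits_aspect_limit(data: dict) -> bool:
    """True if the serialized content is within the assumed 120KB limit."""
    return aspect_data_size(data) <= MAX_ASPECT_DATA_BYTES


print(fits_aspect_limit({"owner": "data-team", "tier": "gold"}))
```

Counting encoded bytes rather than characters matters for non-ASCII content, where one character can take several bytes.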
AspectSource
Information related to the source system of the aspect.
Fields | |
---|---|
create_ |
The time the aspect was created in the source system. |
update_ |
The time the aspect was last updated in the source system. |
data_ |
The version of the data format used to produce this data. This field is used to indicate when the underlying data format changes (e.g., schema modifications, changes to the source URL format definition, etc). |
AspectType
AspectType is a template for creating Aspects, and represents the JSON-schema for a given Entry, for example, BigQuery Table Schema.
Fields | |
---|---|
name |
Output only. The relative resource name of the AspectType, of the form: projects/{project_number}/locations/{location_id}/aspectTypes/{aspect_type_id}. |
uid |
Output only. System generated globally unique ID for the AspectType. If you delete and recreate the AspectType with the same name, then this ID will be different. |
create_ |
Output only. The time when the AspectType was created. |
update_ |
Output only. The time when the AspectType was last updated. |
description |
Optional. Description of the AspectType. |
display_ |
Optional. User friendly display name. |
labels |
Optional. User-defined labels for the AspectType. |
etag |
The service computes this checksum. The client may send it on update and delete requests to ensure it has an up-to-date value before proceeding. |
authorization |
Immutable. Defines the Authorization for this type. |
metadata_ |
Required. MetadataTemplate of the aspect. |
transfer_ |
Output only. Denotes the transfer status of the Aspect Type. It is unspecified for Aspect Types created from Dataplex API. |
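
The resource name format given for the name field can be parsed and rebuilt with a small helper. This is a sketch under the assumption that none of the three segments contains a slash. The class and function names are hypothetical and are not part of any client library.

```python
import re
from typing import NamedTuple


class AspectTypeName(NamedTuple):
    project_number: str
    location_id: str
    aspect_type_id: str


# Mirrors projects/{project_number}/locations/{location_id}/aspectTypes/{aspect_type_id}
_ASPECT_TYPE_NAME = re.compile(
    r"^projects/(?P<project_number>[^/]+)"
    r"/locations/(?P<location_id>[^/]+)"
    r"/aspectTypes/(?P<aspect_type_id>[^/]+)$"
)


def parse_aspect_type_name(name: str) -> AspectTypeName:
    """Split a relative AspectType resource name into its three segments."""
    match = _ASPECT_TYPE_NAME.match(name)
    if match is None:
        raise ValueError(f"not an AspectType resource name: {name!r}")
    return AspectTypeName(**match.groupdict())


def format_aspect_type_name(parts: AspectTypeName) -> str:
    """Rebuild the relative resource name from its segments."""
    return (
        f"projects/{parts.project_number}/locations/{parts.location_id}"
        f"/aspectTypes/{parts.aspect_type_id}"
    )


# Illustrative values only.
print(parse_aspect_type_name("projects/123/locations/us-central1/aspectTypes/my-aspect"))
```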
Authorization
Authorization for an AspectType.
Fields | |
---|---|
alternate_ |
Immutable. The IAM permission grantable on the EntryGroup to allow access to instantiate Aspects of Dataplex owned AspectTypes, only settable for Dataplex owned Types. |
MetadataTemplate
MetadataTemplate definition for an AspectType.
Fields | |
---|---|
index |
Optional. Index is used to encode Template messages. The value of index can range between 1 and 2,147,483,647. Index must be unique within all fields in a Template. (Nested Templates can reuse indexes). Once a Template is defined, the index cannot be changed, because it identifies the field in the actual storage format. Index is a mandatory field, but it is optional for top level fields, and map/array "values" definitions. |
name |
Required. The name of the field. |
type |
Required. The datatype of this field. The following values are supported: Primitive types:
Complex types:
|
record_ |
Optional. Field definition. You must specify it if the type is record. It defines the nested fields. |
enum_ |
Optional. The list of values for an enum type. You must define it if the type is enum. |
map_ |
Optional. If the type is map, set map_items. map_items can refer to a primitive field or a complex (record only) field. To specify a primitive field, you only need to set name and type in the nested MetadataTemplate. The recommended value for the name field is item, as this isn't used in the actual payload. |
array_ |
Optional. If the type is array, set array_items. array_items can refer to a primitive field or a complex (record only) field. To specify a primitive field, you only need to set name and type in the nested MetadataTemplate. The recommended value for the name field is item, as this isn't used in the actual payload. |
type_ |
Optional. You can use type id if this definition of the field needs to be reused later. The type id must be unique across the entire template. You can only specify it if the field type is record. |
type_ |
Optional. A reference to another field definition (not an inline definition). The value must be equal to the value of an id field defined elsewhere in the MetadataTemplate. Only fields with record type can refer to other fields. |
constraints |
Optional. Specifies the constraints on this field. |
annotations |
Optional. Specifies annotations on this field. |
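
The index rules described for MetadataTemplate (range 1 to 2,147,483,647, unique within one template, reusable in nested templates) can be checked before a template is submitted. The following is a minimal sketch, assuming the template is held as a plain dictionary. The key `record_fields` for the nested field list is an assumed name, and `check_indexes` is a hypothetical helper, not part of the API.

```python
from typing import Any, Dict, List

MAX_INDEX = 2_147_483_647  # upper bound stated in the index field description


def check_indexes(template: Dict[str, Any], path: str = "") -> List[str]:
    """Return human-readable problems with the index values in a template."""
    here = f"{path}/{template.get('name', '?')}"
    problems: List[str] = []
    seen: Dict[int, str] = {}  # index -> field name, for this nesting level only

    # "record_fields" is an assumed key name for the nested field list.
    for field in template.get("record_fields", []):
        name = field.get("name", "?")
        index = field.get("index")
        if index is None:
            problems.append(f"{here}/{name}: nested field has no index")
        elif not 1 <= index <= MAX_INDEX:
            problems.append(f"{here}/{name}: index {index} is out of range")
        elif index in seen:
            problems.append(
                f"{here}/{name}: index {index} already used by {seen[index]}"
            )
        else:
            seen[index] = name
        # Each nested template starts with a fresh `seen`, so it may reuse indexes.
        problems.extend(check_indexes(field, here))

    # Map and array value definitions do not need an index of their own.
    for key in ("map_items", "array_items"):
        if key in template:
            problems.extend(check_indexes(template[key], here))
    return problems


# Illustrative template: "b" repeats index 1, while nested "d" may reuse it.
template = {
    "name": "t",
    "type": "record",
    "record_fields": [
        {"name": "a", "type": "string", "index": 1},
        {"name": "b", "type": "string", "index": 1},
        {"name": "c", "type": "record", "index": 2,
         "record_fields": [{"name": "d", "type": "string", "index": 1}]},
    ],
}
print(check_indexes(template))
```

Keeping the `seen` map local to each call is what lets nested templates reuse index values while duplicates at the same level are still reported.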
Annotations
Definition of the annotations of a field.
Fields | |
---|---|
deprecated |
Optional. Marks a field as deprecated. You can include a deprecation message. |
display_ |
Optional. Display name for a field. |
description |
Optional. Description for a field. |
display_ |
Optional. Display order for a field. You can use this to reorder where a field is rendered. |
string_ |
Optional. You can use String Type annotations to specify special meaning to string fields. The following values are supported:
|
string_ |
Optional. Suggested hints for string fields. You can use them to suggest values to users through the console. |
Constraints
Definition of the constraints of a field.
Fields | |
---|---|
required |
Optional. Marks this field as optional or required. |
EnumValue
Definition of EnumValue, to be used for enum fields.
Fields | |
---|---|
index |
Required. Index for the enum value. It can't be modified. |
name |
Required. Name of the enum value. This is the actual value that the aspect can contain. |
deprecated |
Optional. You can set this message if you need to deprecate an enum value. |
Asset
An asset represents a cloud resource that is being managed within a lake as a member of a zone.
Fields | |
---|---|
name |
Output only. The relative resource name of the asset, of the form: |
display_ |
Optional. User friendly display name. |
uid |
Output only. System generated globally unique ID for the asset. This ID will be different if the asset is deleted and re-created with the same name. |
create_ |
Output only. The time when the asset was created. |
update_ |
Output only. The time when the asset was last updated. |
labels |
Optional. User defined labels for the asset. |
description |
Optional. Description of the asset. |
state |
Output only. Current state of the asset. |
resource_ |
Required. Specification of the resource that is referenced by this asset. |
resource_ |
Output only. Status of the resource referenced by this asset. |
security_ |
Output only. Status of the security policy applied to resource referenced by this asset. |
discovery_ |
Optional. Specification of the discovery feature applied to data referenced by this asset. When this spec is left unset, the asset will use the spec set on the parent zone. |
discovery_ |
Output only. Status of the discovery feature applied to data referenced by this asset. |
DiscoverySpec
Settings to manage the metadata discovery and publishing for an asset.
Fields | |
---|---|
enabled |
Optional. Whether discovery is enabled. |
include_ |
Optional. The list of patterns to apply for selecting data to include during discovery if only a subset of the data should be considered. For Cloud Storage bucket assets, these are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these are interpreted as patterns to match table names. |
exclude_ |
Optional. The list of patterns to apply for selecting data to exclude during discovery. For Cloud Storage bucket assets, these are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these are interpreted as patterns to match table names. |
csv_ |
Optional. Configuration for CSV data. |
json_ |
Optional. Configuration for JSON data. |
Union field trigger . Determines when discovery is triggered. trigger can be only one of the following: |
|
schedule |
Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running discovery periodically. Successive discovery runs must be scheduled at least 60 minutes apart. The default value is to run discovery every 60 minutes. To explicitly set a timezone to the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from IANA time zone database. For example, |
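The time zone prefix described for schedule can be illustrated with a small sketch. It is not part of the API: the helper names are hypothetical, and the single space between the prefix and the five-field cron expression is an assumption about the expected layout. The sketch does not check the 60-minute minimum interval or whether the zone exists in the IANA database.

```python
import re
from typing import Optional, Tuple

# Assumption: "<PREFIX>=<zone>" is followed by whitespace and then the cron expression.
_TZ_PREFIX = re.compile(r"^(?:CRON_TZ|TZ)=(?P<tz>\S+)\s+(?P<cron>.+)$")


def with_time_zone(cron: str, time_zone: str, prefix: str = "CRON_TZ") -> str:
    """Prepend a CRON_TZ= or TZ= prefix to a cron expression (hypothetical helper)."""
    if prefix not in ("CRON_TZ", "TZ"):
        raise ValueError("prefix must be 'CRON_TZ' or 'TZ'")
    return f"{prefix}={time_zone} {cron}"


def split_schedule(schedule: str) -> Tuple[Optional[str], str]:
    """Split a schedule into (time zone or None, cron expression)."""
    match = _TZ_PREFIX.match(schedule)
    if match:
        return match.group("tz"), match.group("cron")
    return None, schedule


# Hourly discovery, interpreted in the New York time zone.
hourly = with_time_zone("0 * * * *", "America/New_York")
```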
CsvOptions
Describes CSV and similar semi-structured data formats.
Fields | |
---|---|
header_ |
Optional. The number of rows to interpret as header rows that should be skipped when reading data rows. |
delimiter |
Optional. The delimiter being used to separate values. This defaults to ','. |
encoding |
Optional. The character encoding of the data. The default is UTF-8. |
disable_ |
Optional. Whether to disable the inference of data type for CSV data. If true, all columns will be registered as strings. |
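To make the effect of these options concrete, here is a toy reader written with Python's standard csv module. It is only a sketch of what the four settings mean, not how discovery is implemented: the function and parameter names are hypothetical, and the per-value inference (int, then float, else string) is a simplification chosen for illustration.

```python
import csv
import io


def _infer(value: str):
    """Toy type inference: try int, then float, otherwise keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def read_data_rows(raw: bytes, header_rows: int = 0, delimiter: str = ",",
                   encoding: str = "utf-8", disable_type_inference: bool = False):
    """Decode raw CSV bytes, skip the header rows, and return the data rows."""
    text = raw.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = list(reader)[header_rows:]
    if disable_type_inference:
        # Every column stays a string, mirroring the "registered as strings" case.
        return rows
    return [[_infer(value) for value in row] for row in rows]


sample = b"id|score\n1|2.5\n2|x\n"
typed = read_data_rows(sample, header_rows=1, delimiter="|")
as_strings = read_data_rows(sample, header_rows=1, delimiter="|",
                            disable_type_inference=True)
```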
JsonOptions
Describes the JSON data format.
Fields | |
---|---|
encoding |
Optional. The character encoding of the data. The default is UTF-8. |
disable_ |
Optional. Whether to disable the inference of data type for JSON data. If true, all columns will be registered as their primitive types (strings, number or boolean). |
DiscoveryStatus
Status of discovery for an asset.
Fields | |
---|---|
state |
The current status of the discovery feature. |
message |
Additional information about the current state. |
update_ |
Last update time of the status. |
last_ |
The start time of the last discovery run. |
stats |
Data Stats of the asset reported by discovery. |
last_ |
The duration of the last discovery run. |
State
Current state of discovery.
Enums | |
---|---|
STATE_UNSPECIFIED |
State is unspecified. |
SCHEDULED |
Discovery for the asset is scheduled. |
IN_PROGRESS |
Discovery for the asset is running. |
PAUSED |
Discovery for the asset is currently paused (e.g. due to a lack of available resources). It will be automatically resumed. |
DISABLED |
Discovery for the asset is disabled. |
Stats
The aggregated data statistics for the asset reported by discovery.
Fields | |
---|---|
data_ |
The count of data items within the referenced resource. |
data_ |
The number of stored data bytes within the referenced resource. |
tables |
The count of table entities within the referenced resource. |
filesets |
The count of fileset entities within the referenced resource. |
ResourceSpec
Identifies the cloud resource that is referenced by this asset.
Fields | |
---|---|
name |
Immutable. Relative name of the cloud resource that contains the data that is being managed within a lake. For example: |
type |
Required. Immutable. Type of resource. |
read_ |
Optional. Determines how read permissions are handled for each asset and their associated tables. Only available to storage bucket assets. |
AccessMode
Access Mode determines how data stored within the resource is read. This is only applicable to storage bucket assets.
Enums | |
---|---|
ACCESS_MODE_UNSPECIFIED |
Access mode unspecified. |
DIRECT |
Default. Data is accessed directly using storage APIs. |
MANAGED |
Data is accessed through a managed interface using BigQuery APIs. |
Type
Type of resource.
Enums | |
---|---|
TYPE_UNSPECIFIED |
Type not specified. |
STORAGE_BUCKET |
Cloud Storage bucket. |
BIGQUERY_DATASET |
BigQuery dataset. |
ResourceStatus
Status of the resource referenced by an asset.
Fields | |
---|---|
state |
The current state of the managed resource. |
message |
Additional information about the current state. |
update_ |
Last update time of the status. |
managed_ |
Output only. Service account associated with the BigQuery Connection. |
State
The state of a resource.
Enums | |
---|---|
STATE_UNSPECIFIED |
State unspecified. |
READY |
Resource does not have any errors. |
ERROR |
Resource has errors. |
SecurityStatus
Security policy status of the asset. Data security policy, i.e., readers, writers & owners, should be specified in the lake/zone/asset IAM policy.
Fields | |
---|---|
state |
The current state of the security policy applied to the attached resource. |
message |
Additional information about the current state. |
update_ |
Last update time of the status. |
State
The state of the security policy.
Enums | |
---|---|
STATE_UNSPECIFIED |
State unspecified. |
READY |
Security policy has been successfully applied to the attached resource. |
APPLYING |
Security policy is in the process of being applied to the attached resource. |
ERROR |
Security policy could not be applied to the attached resource due to errors. |
AssetStatus
Aggregated status of the underlying assets of a lake or zone.
Fields | |
---|---|
update_ |
Last update time of the status. |
active_ |
Number of active assets. |
security_ |
Number of assets that are in the process of updating the security policy on attached resources. |
CancelJobRequest
Cancel task jobs.
Fields | |
---|---|
name |
Required. The resource name of the job: Authorization requires the following IAM permission on the specified resource
|
CancelMetadataJobRequest
Cancel metadata job request.
Fields | |
---|---|
name |
Required. The resource name of the job, in the format Authorization requires the following IAM permission on the specified resource
|
Content
Content represents a user-visible notebook or a SQL script.
Fields | |
---|---|
name |
Output only. The relative resource name of the content, of the form: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/content/{content_id} |
uid |
Output only. System generated globally unique ID for the content. This ID will be different if the content is deleted and re-created with the same name. |
path |
Required. The path for the Content file, represented as directory structure. Unique within a lake. Limited to alphanumerics, hyphens, underscores, dots and slashes. |
create_ |
Output only. Content creation time. |
update_ |
Output only. The time when the content was last updated. |
labels |
Optional. User defined labels for the content. |
description |
Optional. Description of the content. |
Union field data . Only returned in GetContent requests and not in ListContent requests. data can be only one of the following: |
|
data_ |
Required. Content data in string format. |
Union field content . Types of content. content can be only one of the following: |
|
sql_ |
SQL script related configurations. |
notebook |
Notebook related configurations. |
Notebook
Configuration for Notebook content.
Fields | |
---|---|
kernel_ |
Required. Kernel Type of the notebook. |
KernelType
Kernel Type of the Jupyter notebook.
Enums | |
---|---|
KERNEL_TYPE_UNSPECIFIED |
Kernel Type unspecified. |
PYTHON3 |
Python 3 Kernel. |
SqlScript
Configuration for the SQL script content.
Fields | |
---|---|
engine |
Required. Query Engine to be used for the SQL query. |
QueryEngine
Query Engine Type of the SQL Script.
Enums | |
---|---|
QUERY_ENGINE_UNSPECIFIED |
Value was unspecified. |
SPARK |
Spark SQL Query. |
CreateAspectTypeRequest
Create AspectType Request.
Fields | |
---|---|
parent |
Required. The resource name of the AspectType, of the form: projects/{project_number}/locations/{location_id} where Authorization requires the following IAM permission on the specified resource
|
aspect_ |
Required. AspectType identifier. |
aspect_ |
Required. AspectType Resource. |
validate_ |
Optional. The service validates the request without performing any mutations. The default is false. |
CreateAssetRequest
Create asset request.
Fields | |
---|---|
parent |
Required. The resource name of the parent zone: Authorization requires the following IAM permission on the specified resource
|
asset_ |
Required. Asset identifier. This ID will be used to generate names such as table names when publishing metadata to Hive Metastore and BigQuery. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must end with a number or a letter. * Must be between 1-63 characters. * Must be unique within the zone. |
asset |
Required. Asset resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateContentRequest
Create content request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: projects/{project_id}/locations/{location_id}/lakes/{lake_id} Authorization requires the following IAM permission on the specified resource
|
content |
Required. Content resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateDataAttributeBindingRequest
Create DataAttributeBinding request.
Fields | |
---|---|
parent |
Required. The resource name of the parent data taxonomy projects/{project_number}/locations/{location_id} Authorization requires the following IAM permission on the specified resource
|
data_ |
Required. DataAttributeBinding identifier. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must be between 1-63 characters. * Must end with a number or a letter. * Must be unique within the Location. |
data_ |
Required. DataAttributeBinding resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateDataAttributeRequest
Create DataAttribute request.
Fields | |
---|---|
parent |
Required. The resource name of the parent data taxonomy projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id} Authorization requires the following IAM permission on the specified resource
|
data_ |
Required. DataAttribute identifier. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must be between 1-63 characters. * Must end with a number or a letter. * Must be unique within the DataTaxonomy. |
data_ |
Required. DataAttribute resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateDataScanRequest
Create dataScan request.
Fields | |
---|---|
parent |
Required. The resource name of the parent location: Authorization requires the following IAM permission on the specified resource
|
data_ |
Required. DataScan resource. |
data_ |
Required. DataScan identifier.
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is |
CreateDataTaxonomyRequest
Create DataTaxonomy request.
Fields | |
---|---|
parent |
Required. The resource name of the data taxonomy location, of the form: projects/{project_number}/locations/{location_id} where Authorization requires the following IAM permission on the specified resource
|
data_ |
Required. DataTaxonomy identifier. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must be between 1-63 characters. * Must end with a number or a letter. * Must be unique within the Project. |
data_ |
Required. DataTaxonomy resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateEntityRequest
Create a metadata entity request.
Fields | |
---|---|
parent |
Required. The resource name of the parent zone: Authorization requires the following IAM permission on the specified resource
|
entity |
Required. Entity resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateEntryGroupRequest
Create EntryGroup Request.
Fields | |
---|---|
parent |
Required. The resource name of the entryGroup, of the form: projects/{project_number}/locations/{location_id} where Authorization requires the following IAM permission on the specified resource
|
entry_ |
Required. EntryGroup identifier. |
entry_ |
Required. EntryGroup Resource. |
validate_ |
Optional. The service validates the request without performing any mutations. The default is false. |
CreateEntryRequest
Create Entry request.
Fields | |
---|---|
parent |
Required. The resource name of the parent Entry Group: |
entry_ |
Required. Entry identifier. It has to be unique within an Entry Group. Entries corresponding to Google Cloud resources use an Entry ID format based on full resource names. The format is a full resource name of the resource without the prefix double slashes in the API service name part of the full resource name. This allows retrieval of entries using their associated resource name. For example, if the full resource name of a resource is It is also suggested to follow the same convention for entries corresponding to resources from providers or systems other than Google Cloud. The maximum size of the field is 4000 characters. |
entry |
Required. Entry resource. |
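The entry ID convention for Google Cloud resources (the full resource name without its leading double slashes, at most 4000 characters) can be sketched as follows. The helper name and the sample resource name are hypothetical and only illustrate the rule as described above.

```python
MAX_ENTRY_ID_LENGTH = 4000  # maximum size stated for the entry ID field


def entry_id_from_full_resource_name(full_resource_name: str) -> str:
    """Derive an entry ID by dropping the leading '//' of a full resource name."""
    if full_resource_name.startswith("//"):
        entry_id = full_resource_name[2:]
    else:
        entry_id = full_resource_name
    if not entry_id:
        raise ValueError("entry ID must not be empty")
    if len(entry_id) > MAX_ENTRY_ID_LENGTH:
        raise ValueError("entry ID exceeds 4000 characters")
    return entry_id


# Hypothetical full resource name, used only for illustration.
example_id = entry_id_from_full_resource_name(
    "//example.googleapis.com/projects/my-project/things/my-thing"
)
```

Because the ID mirrors the resource name, an entry can later be looked up from the resource name alone by applying the same transformation.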
CreateEntryTypeRequest
Create EntryType Request.
Fields | |
---|---|
parent |
Required. The resource name of the EntryType, of the form: projects/{project_number}/locations/{location_id} where Authorization requires the following IAM permission on the specified resource
|
entry_ |
Required. EntryType identifier. |
entry_ |
Required. EntryType Resource. |
validate_ |
Optional. The service validates the request without performing any mutations. The default is false. |
CreateEnvironmentRequest
Create environment request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
environment_ |
Required. Environment identifier. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must be between 1-63 characters. * Must end with a number or a letter. * Must be unique within the lake. |
environment |
Required. Environment resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateLakeRequest
Create lake request.
Fields | |
---|---|
parent |
Required. The resource name of the lake location, of the form: projects/{project_number}/locations/{location_id} where Authorization requires the following IAM permission on the specified resource
|
lake_ |
Required. Lake identifier. This ID will be used to generate names such as database and dataset names when publishing metadata to Hive Metastore and BigQuery. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must end with a number or a letter. * Must be between 1-63 characters. * Must be unique within the customer project / location. |
lake |
Required. Lake resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateMetadataJobRequest
Create metadata job request.
Fields | |
---|---|
parent |
Required. The resource name of the parent location, in the format Authorization requires the following IAM permission on the specified resource
|
metadata_ |
Required. The metadata job resource. |
metadata_ |
Optional. The metadata job ID. If not provided, a unique ID is generated with the prefix |
validate_ |
Optional. The service validates the request without performing any mutations. The default is false. |
CreatePartitionRequest
Create metadata partition request.
Fields | |
---|---|
parent |
Required. The resource name of the parent zone: Authorization requires the following IAM permission on the specified resource
|
partition |
Required. Partition resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateTaskRequest
Create task request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
task_ |
Required. Task identifier. |
task |
Required. Task resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
CreateZoneRequest
Create zone request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
zone_ |
Required. Zone identifier. This ID will be used to generate names such as database and dataset names when publishing metadata to Hive Metastore and BigQuery. * Must contain only lowercase letters, numbers and hyphens. * Must start with a letter. * Must end with a number or a letter. * Must be between 1-63 characters. * Must be unique across all lakes from all locations in a project. * Must not be one of the reserved IDs (i.e. "default", "global-temp") |
zone |
Required. Zone resource. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
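The syntactic rules for zone_id can be checked on the client before sending a request. The sketch below is a hypothetical helper, not part of the API; it covers the character, length, and reserved-ID rules listed above, while uniqueness across lakes can only be verified by the service.

```python
import re

# Starts with a lowercase letter, ends with a letter or digit, 1-63 characters,
# and contains only lowercase letters, digits, and hyphens.
_ID_PATTERN = re.compile(r"[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?")

# Reserved IDs named in the zone_id description.
RESERVED_ZONE_IDS = frozenset({"default", "global-temp"})


def is_valid_zone_id(zone_id: str) -> bool:
    """Return True if zone_id satisfies the locally checkable rules."""
    if zone_id in RESERVED_ZONE_IDS:
        return False
    return _ID_PATTERN.fullmatch(zone_id) is not None
```

The same pattern applies to the other identifiers in this section that share the lowercase, hyphen, and 1-63 character rules (for example asset, lake, and environment IDs), minus the reserved-ID check.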
DataAccessSpec
DataAccessSpec holds the access control configuration to be enforced on data stored within resources (eg: rows, columns in BigQuery Tables). When associated with data, the data is only accessible to principals explicitly granted access through the DataAccessSpec. Principals with access to the containing resource are not implicitly granted access.
Fields | |
---|---|
readers[] |
Optional. The set of principals to be granted reader role on data stored within resources. The format of strings follows the pattern followed by IAM in the bindings: user:{email}, serviceAccount:{email}, group:{email}. |
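A reader string can be split into its principal type and email with a short check like the one below. This is a sketch with a hypothetical helper name; the email pattern is deliberately loose and only confirms that something of the shape name@domain follows the type prefix.

```python
import re
from typing import Tuple

# The three principal types listed for readers[].
_READER = re.compile(r"(user|serviceAccount|group):([^\s@]+@[^\s@]+)")


def parse_reader(reader: str) -> Tuple[str, str]:
    """Split 'type:email' into (type, email), or raise ValueError."""
    match = _READER.fullmatch(reader)
    if match is None:
        raise ValueError(f"not a valid reader string: {reader!r}")
    return match.group(1), match.group(2)
```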
DataAttribute
Denotes one dataAttribute in a dataTaxonomy, for example, PII. DataAttribute resources can be defined in a hierarchy. A single dataAttribute resource can contain specs of multiple types:
PII
- ResourceAccessSpec :
- readers :foo@bar.com
- DataAccessSpec :
- readers :bar@foo.com
Fields | |
---|---|
name |
Output only. The relative resource name of the dataAttribute, of the form: projects/{project_number}/locations/{location_id}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id}. |
uid |
Output only. System generated globally unique ID for the DataAttribute. This ID will be different if the DataAttribute is deleted and re-created with the same name. |
create_ |
Output only. The time when the DataAttribute was created. |
update_ |
Output only. The time when the DataAttribute was last updated. |
description |
Optional. Description of the DataAttribute. |
display_ |
Optional. User friendly display name. |
labels |
Optional. User-defined labels for the DataAttribute. |
parent_ |
Optional. The ID of the parent DataAttribute resource, should belong to the same data taxonomy. Circular dependency in parent chain is not valid. Maximum depth of the hierarchy allowed is 4. [a -> b -> c -> d -> e, depth = 4] |
attribute_ |
Output only. The number of child attributes present for this attribute. |
etag |
This checksum is computed by the server based on the value of other fields, and may be sent on update and delete requests to ensure the client has an up-to-date value before proceeding. |
resource_ |
Optional. Specified when applied to a resource (eg: Cloud Storage bucket, BigQuery dataset, BigQuery table). |
data_ |
Optional. Specified when applied to data stored on the resource (eg: rows, columns in BigQuery Tables). |
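The hierarchy constraints on parent_id (no circular parent chain, maximum depth of 4) can be modeled with a plain mapping from each attribute ID to its parent ID. The sketch below uses hypothetical names and counts depth as the number of ancestors, which matches the example chain of five attributes having depth 4.

```python
from typing import Dict, Optional

MAX_DEPTH = 4  # maximum hierarchy depth stated for parent_id


def depth_of(attribute_id: str, parent_of: Dict[str, Optional[str]]) -> int:
    """Count the ancestors of an attribute; raise ValueError on a cycle."""
    seen = {attribute_id}
    depth = 0
    current = parent_of.get(attribute_id)
    while current is not None:
        if current in seen:
            raise ValueError("circular dependency in parent chain")
        seen.add(current)
        depth += 1
        current = parent_of.get(current)
    return depth


def is_valid_hierarchy(parent_of: Dict[str, Optional[str]]) -> bool:
    """True if no chain is circular and no attribute is deeper than MAX_DEPTH."""
    try:
        return all(depth_of(attr, parent_of) <= MAX_DEPTH for attr in parent_of)
    except ValueError:
        return False


# Five attributes in one chain: the deepest one has four ancestors.
chain = {"a": None, "b": "a", "c": "b", "d": "c", "e": "d"}
```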
DataAttributeBinding
DataAttributeBinding represents binding of attributes to resources. Eg: Bind 'CustomerInfo' entity with 'PII' attribute.
Fields | |
---|---|
name |
Output only. The relative resource name of the Data Attribute Binding, of the form: projects/{project_number}/locations/{location}/dataAttributeBindings/{data_attribute_binding_id} |
uid |
Output only. System generated globally unique ID for the DataAttributeBinding. This ID will be different if the DataAttributeBinding is deleted and re-created with the same name. |
create_ |
Output only. The time when the DataAttributeBinding was created. |
update_ |
Output only. The time when the DataAttributeBinding was last updated. |
description |
Optional. Description of the DataAttributeBinding. |
display_ |
Optional. User friendly display name. |
labels |
Optional. User-defined labels for the DataAttributeBinding. |
etag |
This checksum is computed by the server based on the value of other fields, and may be sent on update and delete requests to ensure the client has an up-to-date value before proceeding. Etags must be used when calling the DeleteDataAttributeBinding and the UpdateDataAttributeBinding method. |
attributes[] |
Optional. List of attributes to be associated with the resource, provided in the form: projects/{project}/locations/{location}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id} |
paths[] |
Optional. The list of paths for items within the associated resource (eg. columns and partitions within a table) along with attribute bindings. |
Union field resource_reference . The reference to the resource that is associated to attributes, or the query to match resources and associate attributes. resource_reference can be only one of the following: |
|
resource |
Optional. Immutable. The resource name of the resource that is associated to attributes. Presently, only entity resource is supported in the form: projects/{project}/locations/{location}/lakes/{lake}/zones/{zone}/entities/{entity_id} Must belong in the same project and region as the attribute binding, and there can only exist one active binding for a resource. |
Path
Represents a subresource of the given resource, and associated bindings with it. Currently supported subresources are column and partition schema fields within a table.
Fields | |
---|---|
name |
Required. The name identifier of the path. Nested columns should be of the form: 'address.city'. |
attributes[] |
Optional. List of attributes to be associated with the path of the resource, provided in the form: projects/{project}/locations/{location}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id} |
DataDiscoveryResult
The output of a data discovery scan.
Fields | |
---|---|
bigquery_ |
Output only. Configuration for metadata publishing. |
BigQueryPublishing
Describes BigQuery publishing configurations.
Fields | |
---|---|
dataset |
Output only. The BigQuery dataset to publish to. It takes the form |
DataDiscoverySpec
Spec for a data discovery scan.
Fields | |
---|---|
bigquery_ |
Optional. Configuration for metadata publishing. |
Union field resource_config . The configurations of the data discovery scan resource. resource_config can be only one of the following: |
|
storage_ |
Cloud Storage related configurations. |
BigQueryPublishingConfig
Describes BigQuery publishing configurations.
Fields | |
---|---|
table_ |
Optional. Determines whether to publish discovered tables as BigLake external tables or non-BigLake external tables. |
connection |
Optional. The BigQuery connection used to create BigLake tables. Must be in the form |
TableType
Determines how discovered tables are published.
Enums | |
---|---|
TABLE_TYPE_UNSPECIFIED |
Table type unspecified. |
EXTERNAL |
Default. Discovered tables are published as BigQuery external tables whose data is accessed using the credentials of the user querying the table. |
BIGLAKE |
Discovered tables are published as BigLake external tables whose data is accessed using the credentials of the associated BigQuery connection. |
StorageConfig
Configurations related to Cloud Storage as the data source.
Fields | |
---|---|
include_ |
Optional. Defines the data to include during discovery when only a subset of the data should be considered. Provide a list of patterns that identify the data to include. For Cloud Storage bucket assets, these patterns are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these patterns are interpreted as patterns to match table names. |
exclude_ |
Optional. Defines the data to exclude during discovery. Provide a list of patterns that identify the data to exclude. For Cloud Storage bucket assets, these patterns are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these patterns are interpreted as patterns to match table names. |
csv_ |
Optional. Configuration for CSV data. |
json_ |
Optional. Configuration for JSON data. |
CsvOptions
Describes CSV and similar semi-structured data formats.
Fields | |
---|---|
header_ |
Optional. The number of rows to interpret as header rows that should be skipped when reading data rows. |
delimiter |
Optional. The delimiter that is used to separate values. The default is |
encoding |
Optional. The character encoding of the data. The default is UTF-8. |
type_ |
Optional. Whether to disable the inference of data types for CSV data. If true, all columns are registered as strings. |
quote |
Optional. The character used to quote column values. Accepts |
JsonOptions
Describes JSON data format.
Fields | |
---|---|
encoding |
Optional. The character encoding of the data. The default is UTF-8. |
type_ |
Optional. Whether to disable the inference of data types for JSON data. If true, all columns are registered as their primitive types (strings, number, or boolean). |
DataProfileResult
DataProfileResult defines the output of DataProfileScan. Each field of the table will have a field-type-specific profile result.
Fields | |
---|---|
row_ |
The count of rows scanned. |
profile |
The profile information per field. |
scanned_ |
The data scanned for this result. |
post_ |
Output only. The result of post scan actions. |
PostScanActionsResult
The result of post scan actions of DataProfileScan job.
Fields | |
---|---|
bigquery_ |
Output only. The result of BigQuery export post scan action. |
BigQueryExportResult
The result of BigQuery export post scan action.
Fields | |
---|---|
state |
Output only. Execution state for the BigQuery exporting. |
message |
Output only. Additional information about the BigQuery exporting. |
State
Execution state for the exporting.
Enums | |
---|---|
STATE_UNSPECIFIED |
The exporting state is unspecified. |
SUCCEEDED |
The exporting completed successfully. |
FAILED |
The exporting is no longer running due to an error. |
SKIPPED |
The exporting is skipped because there is no valid scan result to export (usually caused by a failed scan). |
Profile
Contains name, type, mode and field type specific profile information.
Fields | |
---|---|
fields[] |
List of fields with structural and profile information for each field. |
Field
A field within a table.
Fields | |
---|---|
name |
The name of the field. |
type |
The data type retrieved from the schema of the data source. For instance, for a BigQuery native table, it is the BigQuery Table Schema. For a Dataplex Entity, it is the Entity Schema. |
mode |
The mode of the field. Possible values include:
|
profile |
Profile information for the corresponding field. |
ProfileInfo
The profile information for each field type.
Fields | |
---|---|
null_ |
Ratio of rows with null value against total scanned rows. |
distinct_ |
Ratio of rows with distinct values against total scanned rows. Not available for complex non-groupable field type, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATABLE mode. |
top_ |
The list of top N non-null values, frequency and ratio with which they occur in the scanned data. N is 10 or equal to the number of distinct values in the field, whichever is smaller. Not available for complex non-groupable field type, including RECORD, ARRAY, GEOGRAPHY, and JSON, as well as fields with REPEATABLE mode. |
Union field field_info . Structural and profile information for specific field type. Not available, if mode is REPEATABLE. field_info can be only one of the following: |
|
string_ |
String type field information. |
integer_ |
Integer type field information. |
double_ |
Double type field information. |
DoubleFieldInfo
The profile information for a double type field.
Fields | |
---|---|
average |
Average of non-null values in the scanned data. NaN, if the field has a NaN. |
standard_ |
Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN. |
min |
Minimum of non-null values in the scanned data. NaN, if the field has a NaN. |
quartiles[] |
A quartile divides the number of data points into four parts, or quarters, of more-or-less equal size. Three main quartiles used are: The first quartile (Q1) splits off the lowest 25% of data from the highest 75%. It is also known as the lower or 25th empirical quartile, as 25% of the data is below this point. The second quartile (Q2) is the median of a data set. So, 50% of the data lies below this point. The third quartile (Q3) splits off the highest 25% of data from the lowest 75%. It is known as the upper or 75th empirical quartile, as 75% of the data lies below this point. Here, the quartiles are provided as an ordered list of quartile values for the scanned data, occurring in order Q1, median, Q3. |
max |
Maximum of non-null values in the scanned data. NaN, if the field has a NaN. |
IntegerFieldInfo
The profile information for an integer type field.
Fields | |
---|---|
average |
Average of non-null values in the scanned data. NaN, if the field has a NaN. |
standard_ |
Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN. |
min |
Minimum of non-null values in the scanned data. NaN, if the field has a NaN. |
quartiles[] |
A quartile divides the number of data points into four parts, or quarters, of more-or-less equal size. Three main quartiles used are: The first quartile (Q1) splits off the lowest 25% of data from the highest 75%. It is also known as the lower or 25th empirical quartile, as 25% of the data is below this point. The second quartile (Q2) is the median of a data set. So, 50% of the data lies below this point. The third quartile (Q3) splits off the highest 25% of data from the lowest 75%. It is known as the upper or 75th empirical quartile, as 75% of the data lies below this point. Here, the quartiles are provided as an ordered list of approximate quartile values for the scanned data, occurring in order Q1, median, Q3. |
max |
Maximum of non-null values in the scanned data. NaN, if the field has a NaN. |
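The ordered list Q1, median, Q3 described for quartiles[] can be reproduced locally with Python's statistics module. This is only an illustration of the output shape: the interpolation method the service uses is not stated here, so the default method of statistics.quantiles is an assumption, and the service reports approximate values for integer fields.

```python
import statistics
from typing import List, Optional, Sequence


def quartiles(values: Sequence[Optional[float]]) -> List[float]:
    """Return [Q1, median, Q3] over the non-null values (needs at least two)."""
    non_null = [v for v in values if v is not None]
    # n=4 yields the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(non_null, n=4)
    return [q1, median, q3]


# Null values are ignored, as in the other statistics of this message.
result = quartiles([8, None, 1, 2, 3, 4, 5, 6, 7])
```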
StringFieldInfo
The profile information for a string type field.
Fields | |
---|---|
min_ |
Minimum length of non-null values in the scanned data. |
max_ |
Maximum length of non-null values in the scanned data. |
average_ |
Average length of non-null values in the scanned data. |
TopNValue
Top N non-null values in the scanned data.
Fields | |
---|---|
value |
String value of a top N non-null value. |
count |
Count of the corresponding value in the scanned data. |
ratio |
Ratio of the corresponding value in the field against the total number of rows in the scanned data. |
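The null ratio, distinct ratio, and top N values can be approximated for a single column as shown below. This is a sketch under stated assumptions rather than the service's algorithm: the function name is hypothetical, and the distinct ratio is taken to be the number of distinct non-null values divided by the total number of scanned rows, which is one reading of the description above.

```python
from collections import Counter
from typing import Any, Dict, Optional, Sequence


def profile_column(values: Sequence[Optional[Any]], top_n: int = 10) -> Dict[str, Any]:
    """Compute null ratio, distinct ratio, and top N non-null values."""
    total = len(values)
    if total == 0:
        raise ValueError("no rows scanned")
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    # N is 10 or the number of distinct values, whichever is smaller.
    n = min(top_n, len(counts))
    top = [
        {"value": str(value), "count": count, "ratio": count / total}
        for value, count in counts.most_common(n)
    ]
    return {
        "null_ratio": (total - len(non_null)) / total,
        "distinct_ratio": len(counts) / total,
        "top_n_values": top,
    }


summary = profile_column(["a", "b", "a", None])
```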
DataProfileSpec
DataProfileScan related setting.
Fields | |
---|---|
sampling_ |
Optional. The percentage of the records to be selected from the dataset for DataScan.
|
row_ |
Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10 |
post_ |
Optional. Actions to take upon job completion. |
include_ |
Optional. The fields to include in data profile. If not specified, all fields at the time of profile scan job execution are included, except for ones listed in |
exclude_ |
Optional. The fields to exclude from data profile. If specified, the fields will be excluded from data profile, regardless of |
PostScanActions
The configuration of post scan actions of DataProfileScan job.
Fields | |
---|---|
bigquery_ |
Optional. If set, results will be exported to the provided BigQuery table. |
BigQueryExport
The configuration of BigQuery export post scan action.
Fields | |
---|---|
results_ |
Optional. The BigQuery table to export DataProfileScan results to. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID |
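The table reference format above can be assembled and taken apart with two small helpers. Both names are hypothetical; the sketch only handles the string layout and does not check that the project, dataset, or table exists.

```python
import re
from typing import Tuple

_TABLE = re.compile(
    r"//bigquery\.googleapis\.com/projects/([^/]+)/datasets/([^/]+)/tables/([^/]+)"
)


def results_table(project_id: str, dataset_id: str, table_id: str) -> str:
    """Build the export table reference in the documented format."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def parse_results_table(name: str) -> Tuple[str, str, str]:
    """Return (project_id, dataset_id, table_id) or raise ValueError."""
    match = _TABLE.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected table reference: {name!r}")
    return match.group(1), match.group(2), match.group(3)


# Placeholder identifiers, used only for illustration.
reference = results_table("my-project", "profiles", "scan_results")
```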
SelectedFields
The specification for fields to include or exclude in data profile scan.
Fields | |
---|---|
field_ |
Optional. Expected input is a list of fully qualified names of fields as in the schema. Only top-level field names for nested fields are supported. For instance, if 'x' is of nested field type, listing 'x' is supported but 'x.y.z' is not supported. Here 'y' and 'y.z' are nested fields of 'x'. |
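
To catch unsupported nested paths before submitting a scan, a client could pre-check the list. The sketch below assumes nested paths are dot-separated, as in the 'x.y.z' example above; `split_supported_fields` is a hypothetical helper.

```python
def split_supported_fields(field_names):
    """Separate top-level names from nested paths such as 'x.y.z'."""
    supported = [name for name in field_names if "." not in name]
    unsupported = [name for name in field_names if "." in name]
    return supported, unsupported


supported, unsupported = split_supported_fields(["x", "x.y.z", "price"])
```
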
DataQualityColumnResult
DataQualityColumnResult provides a more detailed, per-column view of the results.
Fields | |
---|---|
column |
Output only. The column specified in the DataQualityRule. |
score |
Output only. The column-level data quality score for this data scan job if and only if the 'column' field is set. The score ranges between [0, 100] (up to two decimal points). |
DataQualityDimension
A dimension captures data quality intent about a defined subset of the rules specified.
Fields | |
---|---|
name |
The dimension name a rule belongs to. Supported dimensions are ["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "FRESHNESS", "VOLUME"] |
DataQualityDimensionResult
DataQualityDimensionResult provides a more detailed, per-dimension view of the results.
Fields | |
---|---|
dimension |
Output only. The dimension config specified in the DataQualitySpec, as is. |
passed |
Whether the dimension passed or failed. |
score |
Output only. The dimension-level data quality score for this data scan job if and only if the 'dimension' field is set. The score ranges between [0, 100] (up to two decimal points). |
DataQualityResult
The output of a DataQualityScan.
Fields | |
---|---|
passed |
Overall data quality result -- |
dimensions[] |
A list of results at the dimension level. A dimension will have a corresponding |
columns[] |
Output only. A list of results at the column level. A column will have a corresponding |
rules[] |
A list of all the rules in a job, and their results. |
row_ |
The count of rows processed. |
scanned_ |
The data scanned for this result. |
post_ |
Output only. The result of post scan actions. |
score |
Output only. The overall data quality score. The score ranges between [0, 100] (up to two decimal points). |
PostScanActionsResult
The result of post scan actions of DataQualityScan job.
Fields | |
---|---|
bigquery_ |
Output only. The result of BigQuery export post scan action. |
BigQueryExportResult
The result of BigQuery export post scan action.
Fields | |
---|---|
state |
Output only. Execution state for the BigQuery exporting. |
message |
Output only. Additional information about the BigQuery exporting. |
State
Execution state for the exporting.
Enums | |
---|---|
STATE_UNSPECIFIED |
The exporting state is unspecified. |
SUCCEEDED |
The exporting completed successfully. |
FAILED |
The exporting is no longer running due to an error. |
SKIPPED |
The exporting is skipped because there is no valid scan result to export (usually because the scan failed). |
DataQualityRule
A rule captures data quality intent about a data source.
Fields | |
---|---|
column |
Optional. The unnested column which this rule is evaluated against. |
ignore_ |
Optional. Rows with This field is only valid for the following type of rules:
|
dimension |
Required. The dimension a rule belongs to. Results are also aggregated at the dimension level. Supported dimensions are ["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "FRESHNESS", "VOLUME"] |
threshold |
Optional. The minimum ratio of passing_rows / total_rows required to pass this rule, with a range of [0.0, 1.0]. 0 indicates default value (i.e. 1.0). This field is only valid for row-level type rules. |
name |
Optional. A mutable name for the rule.
|
description |
Optional. Description of the rule.
|
suspended |
Optional. Whether the Rule is active or suspended. Default is false. |
Union field rule_type. The rule-specific configuration. rule_type can be only one of the following: |
|
range_ |
Row-level rule which evaluates whether each column value lies between a specified range. |
non_ |
Row-level rule which evaluates whether each column value is non-null. |
set_ |
Row-level rule which evaluates whether each column value is contained by a specified set. |
regex_ |
Row-level rule which evaluates whether each column value matches a specified regex. |
uniqueness_ |
Row-level rule which evaluates whether each column value is unique. |
statistic_ |
Aggregate rule which evaluates whether the column aggregate statistic lies between a specified range. |
row_ |
Row-level rule which evaluates whether each row in a table passes the specified condition. |
table_ |
Aggregate rule which evaluates whether the provided expression is true for a table. |
sql_ |
Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails. |
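
The `threshold` semantics for row-level rules (a minimum ratio of passing_rows / total_rows, with 0 standing for the default of 1.0) can be sketched as follows. `row_level_rule_passes` is a hypothetical helper; the source does not say what happens when no rows are evaluated, so the sketch rejects that case rather than guess.

```python
def row_level_rule_passes(passed_count, evaluated_count, threshold=0.0):
    """Return True when the passing ratio meets the effective threshold."""
    if evaluated_count <= 0:
        # Not defined in this reference; treated as an error in this sketch.
        raise ValueError("evaluated_count must be positive")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be within [0.0, 1.0]")
    # A threshold of 0 indicates the default value, 1.0.
    effective_threshold = 1.0 if threshold == 0 else threshold
    return passed_count / evaluated_count >= effective_threshold


lenient = row_level_rule_passes(95, 100, threshold=0.9)
strict = row_level_rule_passes(95, 100)
```

With an explicit threshold of 0.9, 95 passing rows out of 100 is enough; with the default, every evaluated row has to pass.
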
NonNullExpectation
This type has no fields.
Evaluates whether each column value is non-null.
RangeExpectation
Evaluates whether each column value lies between a specified range.
Fields | |
---|---|
min_ |
Optional. The minimum column value allowed for a row to pass this validation. At least one of |
max_ |
Optional. The maximum column value allowed for a row to pass this validation. At least one of |
strict_ |
Optional. Whether each value needs to be strictly greater than ('>') the minimum, or if equality is allowed. Only relevant if a |
strict_ |
Optional. Whether each value needs to be strictly less than ('<') the maximum, or if equality is allowed. Only relevant if a |
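
How the bounds and the strictness flags interact can be sketched as below. The parameter names are hypothetical stand-ins for the truncated field names in the table, and requiring at least one bound is this sketch's reading of the "At least one of" note.

```python
def value_in_range(value, minimum=None, maximum=None,
                   strict_min=False, strict_max=False):
    """Check one column value against optional lower and upper bounds."""
    if minimum is None and maximum is None:
        raise ValueError("provide at least one of minimum or maximum")
    if minimum is not None:
        # With strict_min, equality with the minimum does not pass.
        if value < minimum or (strict_min and value == minimum):
            return False
    if maximum is not None:
        # With strict_max, equality with the maximum does not pass.
        if value > maximum or (strict_max and value == maximum):
            return False
    return True


on_boundary = value_in_range(10, minimum=0, maximum=10)
on_boundary_strict = value_in_range(10, minimum=0, maximum=10, strict_max=True)
```
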
RegexExpectation
Evaluates whether each column value matches a specified regex.
Fields | |
---|---|
regex |
Optional. A regular expression the column value is expected to match. |
RowConditionExpectation
Evaluates whether each row passes the specified condition.
The SQL expression needs to use BigQuery standard SQL syntax and should produce a boolean value per row as the result.
Example: col1 >= 0 AND col2 < 10
Fields | |
---|---|
sql_ |
Optional. The SQL expression. |
SetExpectation
Evaluates whether each column value is contained by a specified set.
Fields | |
---|---|
values[] |
Optional. Expected values for the column value. |
SqlAssertion
A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.
The SQL statement must use BigQuery standard SQL syntax, and must not contain any semicolons.
You can use the data reference parameter ${data()}
to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter.
Example: SELECT * FROM ${data()} WHERE price < 0
Fields | |
---|---|
sql_ |
Optional. The SQL statement. |
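
The pass/fail logic of a SQL assertion can be tried locally with a small sketch. It uses SQLite from the Python standard library rather than BigQuery, and it replaces `${data()}` with a bare table name instead of a filtered source, so it only approximates the behavior described above. `evaluate_sql_assertion` and the `orders` table are hypothetical.

```python
import sqlite3

DATA_REF = "${data()}"


def evaluate_sql_assertion(conn, statement, source_table):
    """Run the statement; the rule fails if any rows come back."""
    if ";" in statement:
        raise ValueError("SQL assertion must not contain semicolons")
    # Assumption: the data reference expands to the plain table name here.
    query = statement.replace(DATA_REF, source_table)
    rows = conn.execute(query).fetchall()
    return {"passed": not rows, "assertion_row_count": len(rows)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, price REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, -1.0)])
result = evaluate_sql_assertion(
    conn, "SELECT * FROM ${data()} WHERE price < 0", "orders"
)
```

Here one row has a negative price, so the statement returns a row and the assertion fails.
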
StatisticRangeExpectation
Evaluates whether the column aggregate statistic lies between a specified range.
Fields | |
---|---|
statistic |
Optional. The aggregate metric to evaluate. |
min_ |
Optional. The minimum column statistic value allowed for a row to pass this validation. At least one of |
max_ |
Optional. The maximum column statistic value allowed for a row to pass this validation. At least one of |
strict_ |
Optional. Whether column statistic needs to be strictly greater than ('>') the minimum, or if equality is allowed. Only relevant if a |
strict_ |
Optional. Whether column statistic needs to be strictly less than ('<') the maximum, or if equality is allowed. Only relevant if a |
ColumnStatistic
The list of aggregate metrics a rule can be evaluated against.
Enums | |
---|---|
STATISTIC_UNDEFINED |
Unspecified statistic type |
MEAN |
Evaluate the column mean |
MIN |
Evaluate the column min |
MAX |
Evaluate the column max |
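
An aggregate check over MEAN, MIN, or MAX can be sketched in a few lines. The helper and parameter names are hypothetical, and skipping null values before aggregating is an assumption of this sketch rather than documented behavior.

```python
import statistics

STATISTICS = {"MEAN": statistics.mean, "MIN": min, "MAX": max}


def statistic_in_range(values, statistic, minimum=None, maximum=None,
                       strict_min=False, strict_max=False):
    """Aggregate the column, then compare the result with the bounds."""
    non_null = [v for v in values if v is not None]  # assumption: nulls skipped
    observed = STATISTICS[statistic](non_null)
    above = minimum is None or (
        observed > minimum if strict_min else observed >= minimum
    )
    below = maximum is None or (
        observed < maximum if strict_max else observed <= maximum
    )
    return above and below


mean_ok = statistic_in_range([1, 2, 3, None], "MEAN", minimum=2, maximum=5)
```
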
TableConditionExpectation
Evaluates whether the provided expression is true.
The SQL expression needs to use BigQuery standard SQL syntax and should produce a scalar boolean result.
Example: MIN(col1) >= 0
Fields | |
---|---|
sql_ |
Optional. The SQL expression. |
UniquenessExpectation
This type has no fields.
Evaluates whether the column has duplicates.
DataQualityRuleResult
DataQualityRuleResult provides a more detailed, per-rule view of the results.
Fields | |
---|---|
rule |
The rule specified in the DataQualitySpec, as is. |
passed |
Whether the rule passed or failed. |
evaluated_ |
The number of rows a rule was evaluated against. This field is only valid for row-level type rules. Evaluated count can be configured to either
|
passed_ |
The number of rows which passed a rule evaluation. This field is only valid for row-level type rules. |
null_ |
The number of rows with null values in the specified column. |
pass_ |
The ratio of passed_count / evaluated_count. This field is only valid for row-level type rules. |
failing_ |
The query to find rows that did not pass this rule. This field is only valid for row-level type rules. |
assertion_ |
Output only. The number of rows returned by the SQL statement in a SQL assertion rule. This field is only valid for SQL assertion rules. |
DataQualityScanRuleResult
Information about the result of a data quality rule for data quality scan. The monitored resource is 'DataScan'.
Fields | |
---|---|
job_ |
Identifier of the specific data scan job this log entry is for. |
data_ |
The data source of the data scan (e.g. BigQuery table name). |
column |
The column which this rule is evaluated against. |
rule_ |
The name of the data quality rule. |
rule_ |
The type of the data quality rule. |
evalution_ |
The evaluation type of the data quality rule. |
rule_ |
The dimension of the data quality rule. |
threshold_ |
The passing threshold ([0.0, 100.0]) of the data quality rule. |
result |
The result of the data quality rule. |
evaluated_ |
The number of rows evaluated against the data quality rule. This field is only valid for rules of PER_ROW evaluation type. |
passed_ |
The number of rows which passed a rule evaluation. This field is only valid for rules of PER_ROW evaluation type. |
null_ |
The number of rows with null values in the specified column. |
assertion_ |
The number of rows returned by the SQL statement in a SQL assertion rule. This field is only valid for SQL assertion rules. |
EvaluationType
The evaluation type of the data quality rule.
Enums | |
---|---|
EVALUATION_TYPE_UNSPECIFIED |
An unspecified evaluation type. |
PER_ROW |
The rule evaluation is done at per row level. |
AGGREGATE |
The rule evaluation is done for an aggregate of rows. |
Result
Whether the data quality rule passed or failed.
Enums | |
---|---|
RESULT_UNSPECIFIED |
An unspecified result. |
PASSED |
The data quality rule passed. |
FAILED |
The data quality rule failed. |
RuleType
The type of the data quality rule.
Enums | |
---|---|
RULE_TYPE_UNSPECIFIED |
An unspecified rule type. |
NON_NULL_EXPECTATION |
See DataQualityRule.NonNullExpectation. |
RANGE_EXPECTATION |
See DataQualityRule.RangeExpectation. |
REGEX_EXPECTATION |
See DataQualityRule.RegexExpectation. |
ROW_CONDITION_EXPECTATION |
See DataQualityRule.RowConditionExpectation. |
SET_EXPECTATION |
See DataQualityRule.SetExpectation. |
STATISTIC_RANGE_EXPECTATION |
See DataQualityRule.StatisticRangeExpectation. |
TABLE_CONDITION_EXPECTATION |
See DataQualityRule.TableConditionExpectation. |
UNIQUENESS_EXPECTATION |
See DataQualityRule.UniquenessExpectation. |
SQL_ASSERTION |
See DataQualityRule.SqlAssertion. |
DataQualitySpec
DataQualityScan related setting.
Fields | |
---|---|
rules[] |
Required. The list of rules to evaluate against a data source. At least one rule is required. |
sampling_ |
Optional. The percentage of the records to be selected from the dataset for DataScan.
|
row_ |
Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10 |
post_ |
Optional. Actions to take upon job completion. |
PostScanActions
The configuration of post scan actions of DataQualityScan.
Fields | |
---|---|
bigquery_ |
Optional. If set, results will be exported to the provided BigQuery table. |
notification_ |
Optional. If set, results will be sent to the provided notification recipients upon triggers. |
BigQueryExport
The configuration of BigQuery export post scan action.
Fields | |
---|---|
results_ |
Optional. The BigQuery table to export DataQualityScan results to. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID |
JobEndTrigger
This type has no fields.
This trigger is triggered whenever a scan job run ends, regardless of the result.
JobFailureTrigger
This type has no fields.
This trigger is triggered when the scan job itself fails, regardless of the result.
NotificationReport
The configuration of notification report post scan action.
Fields | |
---|---|
recipients |
Required. The recipients who will receive the notification report. |
score_ |
Optional. If set, report will be sent when score threshold is met. |
job_ |
Optional. If set, report will be sent when a scan job fails. |
job_ |
Optional. If set, report will be sent when a scan job ends. |
Recipients
The individuals or groups who are designated to receive notifications upon triggers.
Fields | |
---|---|
emails[] |
Optional. The email recipients who will receive the DataQualityScan results report. |
ScoreThresholdTrigger
This trigger is triggered when the DQ score in the job result is less than a specified input score.
Fields | |
---|---|
score_ |
Optional. The score range is in [0,100]. |
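
The three notification triggers can be read as independent conditions. The sketch below assumes that a report is sent when any configured trigger fires; that combination rule, the function name, and the parameter names are assumptions for illustration only.

```python
def should_send_report(job_failed, score=None, score_threshold=None,
                       on_job_failure=False, on_job_end=False):
    """Decide whether a notification report would be sent for one job run."""
    if on_job_end:
        # Fires whenever a scan job run ends, regardless of the result.
        return True
    if on_job_failure and job_failed:
        # Fires when the scan job itself fails.
        return True
    if score_threshold is not None and score is not None:
        # Fires when the score is less than the configured threshold.
        return score < score_threshold
    return False


low_score = should_send_report(False, score=72.5, score_threshold=80)
```
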
DataScan
Represents a user-visible job which provides the insights for the related data source.
For example:
- Data Quality: generates queries based on the rules and runs against the data to get data quality check results.
- Data Profile: analyzes the data in table(s) and generates insights about the structure, content and relationships (such as null percent, cardinality, min/max/mean, etc).
Fields | |
---|---|
name |
Output only. Identifier. The relative resource name of the scan, of the form: |
uid |
Output only. System generated globally unique ID for the scan. This ID will be different if the scan is deleted and re-created with the same name. |
description |
Optional. Description of the scan.
|
display_ |
Optional. User friendly display name.
|
labels |
Optional. User-defined labels for the scan. |
state |
Output only. Current state of the DataScan. |
create_ |
Output only. The time when the scan was created. |
update_ |
Output only. The time when the scan was last updated. |
data |
Required. The data source for DataScan. |
execution_ |
Optional. DataScan execution settings. If not specified, the fields in it will use their default values. |
execution_ |
Output only. Status of the data scan execution. |
type |
Output only. The type of DataScan. |
Union field spec. Data scan related setting. The settings are required and immutable. After you configure the settings for one type of data scan, you can't change the data scan to a different type of data scan. spec can be only one of the following: |
|
data_ |
Settings for a data quality scan. |
data_ |
Settings for a data profile scan. |
data_ |
Settings for a data discovery scan. |
Union field result. The result of the data scan. result can be only one of the following: |
|
data_ |
Output only. The result of a data quality scan. |
data_ |
Output only. The result of a data profile scan. |
data_ |
Output only. The result of a data discovery scan. |
ExecutionSpec
DataScan execution settings.
Fields | |
---|---|
trigger |
Optional. Spec related to how often and when a scan should be triggered. If not specified, the default is |
Union field When an option is selected for incremental scan, it cannot be unset or changed. If not specified, a data scan will run for all data in the table. |
|
field |
Immutable. The unnested field (of type Date or Timestamp) that contains values which monotonically increase over time. If not specified, a data scan will run for all data in the table. |
ExecutionStatus
Status of the data scan execution.
Fields | |
---|---|
latest_ |
Optional. The time when the latest DataScanJob started. |
latest_ |
Optional. The time when the latest DataScanJob ended. |
latest_ |
Optional. The time when the DataScanJob execution was created. |
DataScanEvent
These messages contain information about the execution of a data scan. The monitored resource is 'DataScan'.
Fields | |
---|---|
data_ |
The data source of the data scan |
job_ |
The identifier of the specific data scan job this log entry is for. |
create_ |
The time when the data scan job was created. |
start_ |
The time when the data scan job started to run. |
end_ |
The time when the data scan job finished. |
type |
The type of the data scan. |
state |
The status of the data scan job. |
message |
The message describing the data scan job event. |
spec_ |
A version identifier of the spec which was used to execute this job. |
trigger |
The trigger type of the data scan job. |
scope |
The scope of the data scan (e.g. full, incremental). |
post_ |
The result of post scan actions. |
Union field result. The result of the data scan job. result can be only one of the following: |
|
data_ |
Data profile result for data profile type data scan. |
data_ |
Data quality result for data quality type data scan. |
Union field appliedConfigs. The applied configs in the data scan job. appliedConfigs can be only one of the following: |
|
data_ |
Applied configs for data profile type data scan. |
data_ |
Applied configs for data quality type data scan. |
DataProfileAppliedConfigs
Applied configs for data profile type data scan job.
Fields | |
---|---|
sampling_ |
The percentage of the records selected from the dataset for DataScan.
|
row_ |
Boolean indicating whether a row filter was applied in the DataScan job. |
column_ |
Boolean indicating whether a column filter was applied in the DataScan job. |
DataProfileResult
Data profile result for data scan job.
Fields | |
---|---|
row_ |
The count of rows processed in the data scan job. |
DataQualityAppliedConfigs
Applied configs for data quality type data scan job.
Fields | |
---|---|
sampling_ |
The percentage of the records selected from the dataset for DataScan.
|
row_ |
Boolean indicating whether a row filter was applied in the DataScan job. |
DataQualityResult
Data quality result for data scan job.
Fields | |
---|---|
row_ |
The count of rows processed in the data scan job. |
passed |
Whether the data quality result was |
dimension_ |
The result of each dimension for data quality result. The key of the map is the name of the dimension. The value is the bool value depicting whether the dimension result was |
score |
The table-level data quality score for the data scan job. The data quality score ranges between [0, 100] (up to two decimal points). |
dimension_ |
The score of each dimension for data quality result. The key of the map is the name of the dimension. The value is the data quality score for the dimension. The score ranges between [0, 100] (up to two decimal points). |
column_ |
The score of each column scanned in the data scan job. The key of the map is the name of the column. The value is the data quality score for the column. The score ranges between [0, 100] (up to two decimal points). |
PostScanActionsResult
Post scan actions result for data scan job.
Fields | |
---|---|
bigquery_ |
The result of BigQuery export post scan action. |
BigQueryExportResult
The result of BigQuery export post scan action.
Fields | |
---|---|
state |
Execution state for the BigQuery exporting. |
message |
Additional information about the BigQuery exporting. |
State
Execution state for the exporting.
Enums | |
---|---|
STATE_UNSPECIFIED |
The exporting state is unspecified. |
SUCCEEDED |
The exporting completed successfully. |
FAILED |
The exporting is no longer running due to an error. |
SKIPPED |
The exporting is skipped because there is no valid scan result to export (usually because the scan failed). |
ScanType
The type of the data scan.
Enums | |
---|---|
SCAN_TYPE_UNSPECIFIED |
An unspecified data scan type. |
DATA_PROFILE |
Data scan for data profile. |
DATA_QUALITY |
Data scan for data quality. |
DATA_DISCOVERY |
Data scan for data discovery. |
Scope
The scope of job for the data scan.
Enums | |
---|---|
SCOPE_UNSPECIFIED |
An unspecified scope type. |
FULL |
Data scan runs on all of the data. |
INCREMENTAL |
Data scan runs on incremental data. |
State
The job state of the data scan.
Enums | |
---|---|
STATE_UNSPECIFIED |
Unspecified job state. |
STARTED |
Data scan job started. |
SUCCEEDED |
Data scan job successfully completed. |
FAILED |
Data scan job was unsuccessful. |
CANCELLED |
Data scan job was cancelled. |
CREATED |
Data scan job was created. |
Trigger
The trigger type for the data scan.
Enums | |
---|---|
TRIGGER_UNSPECIFIED |
An unspecified trigger type. |
ON_DEMAND |
Data scan triggers on demand. |
SCHEDULE |
Data scan triggers as per schedule. |
DataScanJob
A DataScanJob represents an instance of DataScan execution.
Fields | |
---|---|
name |
Output only. Identifier. The relative resource name of the DataScanJob, of the form: |
uid |
Output only. System generated globally unique ID for the DataScanJob. |
create_ |
Output only. The time when the DataScanJob was created. |
start_ |
Output only. The time when the DataScanJob was started. |
end_ |
Output only. The time when the DataScanJob ended. |
state |
Output only. Execution state for the DataScanJob. |
message |
Output only. Additional information about the current state. |
type |
Output only. The type of the parent DataScan. |
Union field spec. Data scan related setting. spec can be only one of the following: |
|
data_ |
Output only. Settings for a data quality scan. |
data_ |
Output only. Settings for a data profile scan. |
data_ |
Output only. Settings for a data discovery scan. |
Union field result. The result of the data scan. result can be only one of the following: |
|
data_ |
Output only. The result of a data quality scan. |
data_ |
Output only. The result of a data profile scan. |
data_ |
Output only. The result of a data discovery scan. |
State
Execution state for the DataScanJob.
Enums | |
---|---|
STATE_UNSPECIFIED |
The DataScanJob state is unspecified. |
RUNNING |
The DataScanJob is running. |
CANCELING |
The DataScanJob is canceling. |
CANCELLED |
The DataScanJob cancellation was successful. |
SUCCEEDED |
The DataScanJob completed successfully. |
FAILED |
The DataScanJob is no longer running due to an error. |
PENDING |
The DataScanJob has been created but not started to run yet. |
DataScanType
The type of data scan.
Enums | |
---|---|
DATA_SCAN_TYPE_UNSPECIFIED |
The data scan type is unspecified. |
DATA_QUALITY |
Data quality scan. |
DATA_PROFILE |
Data profile scan. |
DATA_DISCOVERY |
Data discovery scan. |
DataSource
The data source for DataScan.
Fields | |
---|---|
Union field source. The source is required and immutable. Once it is set, it cannot be changed. source can be only one of the following: |
|
entity |
Immutable. The Dataplex entity that represents the data source (e.g. BigQuery table) for DataScan, of the form: |
resource |
Immutable. The service-qualified full resource name of the cloud resource for a DataScan job to scan against. The field could be: BigQuery table of type "TABLE" for DataProfileScan/DataQualityScan Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID |
DataTaxonomy
DataTaxonomy represents a set of hierarchical DataAttributes resources, grouped with a common theme. For example, 'SensitiveDataTaxonomy' can have attributes to manage PII data. It is defined at project level.
Fields | |
---|---|
name |
Output only. The relative resource name of the DataTaxonomy, of the form: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id}. |
uid |
Output only. System generated globally unique ID for the dataTaxonomy. This ID will be different if the DataTaxonomy is deleted and re-created with the same name. |
create_ |
Output only. The time when the DataTaxonomy was created. |
update_ |
Output only. The time when the DataTaxonomy was last updated. |
description |
Optional. Description of the DataTaxonomy. |
display_ |
Optional. User friendly display name. |
labels |
Optional. User-defined labels for the DataTaxonomy. |
attribute_ |
Output only. The number of attributes in the DataTaxonomy. |
etag |
This checksum is computed by the server based on the value of other fields, and may be sent on update and delete requests to ensure the client has an up-to-date value before proceeding. |
class_ |
Output only. The number of classes in the DataTaxonomy. |
DeleteAspectTypeRequest
Delete AspectType Request.
Fields | |
---|---|
name |
Required. The resource name of the AspectType: Authorization requires the following IAM permission on the specified resource
|
etag |
Optional. If the client provided etag value does not match the current etag value, the DeleteAspectTypeRequest method returns an ABORTED error response. |
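
The etag check on delete requests is a form of optimistic concurrency control. The in-memory sketch below illustrates the idea only: the class, the hash-based etag, and `AbortedError` are hypothetical stand-ins and do not reflect how the service computes etags or reports errors.

```python
import hashlib
import json


class AbortedError(Exception):
    """Stands in for an ABORTED error response."""


class InMemoryStore:
    def __init__(self):
        self._resources = {}

    @staticmethod
    def _etag(fields):
        # Assumption: the etag is a checksum over the resource's fields.
        payload = json.dumps(fields, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:16]

    def put(self, name, fields):
        """Create or overwrite a resource and return its current etag."""
        self._resources[name] = dict(fields)
        return self._etag(self._resources[name])

    def delete(self, name, etag=None):
        """Delete the resource unless the supplied etag is stale."""
        current = self._etag(self._resources[name])
        if etag is not None and etag != current:
            raise AbortedError("etag does not match the current value")
        del self._resources[name]


store = InMemoryStore()
stale = store.put("example-aspect-type", {"description": "v1"})
fresh = store.put("example-aspect-type", {"description": "v2"})
```

Deleting with `stale` raises `AbortedError` because the resource changed after that etag was read; deleting with `fresh`, or with no etag at all, succeeds.
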
DeleteAssetRequest
Delete asset request.
Fields | |
---|---|
name |
Required. The resource name of the asset: Authorization requires the following IAM permission on the specified resource
|
DeleteContentRequest
Delete content request.
Fields | |
---|---|
name |
Required. The resource name of the content: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/content/{content_id} Authorization requires the following IAM permission on the specified resource
|
DeleteDataAttributeBindingRequest
Delete DataAttributeBinding request.
Fields | |
---|---|
name |
Required. The resource name of the DataAttributeBinding: projects/{project_number}/locations/{location_id}/dataAttributeBindings/{data_attribute_binding_id} Authorization requires the following IAM permission on the specified resource
|
etag |
Required. If the client provided etag value does not match the current etag value, the DeleteDataAttributeBindingRequest method returns an ABORTED error response. Etags must be used when calling DeleteDataAttributeBinding. |
DeleteDataAttributeRequest
Delete DataAttribute request.
Fields | |
---|---|
name |
Required. The resource name of the DataAttribute: projects/{project_number}/locations/{location_id}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id} Authorization requires the following IAM permission on the specified resource
|
etag |
Optional. If the client provided etag value does not match the current etag value, the DeleteDataAttribute method returns an ABORTED error response. |
DeleteDataScanRequest
Delete dataScan request.
Fields | |
---|---|
name |
Required. The resource name of the dataScan: Authorization requires the following IAM permission on the specified resource
|
force |
Optional. If set to true, any child resources of this data scan will also be deleted. (Otherwise, the request will only work if the data scan has no child resources.) |
DeleteDataTaxonomyRequest
Delete DataTaxonomy request.
Fields | |
---|---|
name |
Required. The resource name of the DataTaxonomy: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id} Authorization requires the following IAM permission on the specified resource
|
etag |
Optional. If the client provided etag value does not match the current etag value, the DeleteDataTaxonomy method returns an ABORTED error. |
DeleteEntityRequest
Delete a metadata entity request.
Fields | |
---|---|
name |
Required. The resource name of the entity: Authorization requires the following IAM permission on the specified resource
|
etag |
Required. The etag associated with the entity, which can be retrieved with a GetEntity request. |
DeleteEntryGroupRequest
Delete EntryGroup Request.
Fields | |
---|---|
name |
Required. The resource name of the EntryGroup: Authorization requires the following IAM permission on the specified resource
|
etag |
Optional. If the client provided etag value does not match the current etag value, the DeleteEntryGroupRequest method returns an ABORTED error response. |
DeleteEntryRequest
Delete Entry request.
Fields | |
---|---|
name |
Required. The resource name of the Entry: |
DeleteEntryTypeRequest
Delete EntryType Request.
Fields | |
---|---|
name |
Required. The resource name of the EntryType: Authorization requires the following IAM permission on the specified resource
|
etag |
Optional. If the client provided etag value does not match the current etag value, the DeleteEntryTypeRequest method returns an ABORTED error response. |
DeleteEnvironmentRequest
Delete environment request.
Fields | |
---|---|
name |
Required. The resource name of the environment: Authorization requires the following IAM permission on the specified resource
|
DeleteLakeRequest
Delete lake request.
Fields | |
---|---|
name |
Required. The resource name of the lake: Authorization requires the following IAM permission on the specified resource
|
DeletePartitionRequest
Delete metadata partition request.
Fields | |
---|---|
name |
Required. The resource name of the partition. format: Authorization requires the following IAM permission on the specified resource
|
etag |
Optional. The etag associated with the partition. |
DeleteTaskRequest
Delete task request.
Fields | |
---|---|
name |
Required. The resource name of the task: Authorization requires the following IAM permission on the specified resource
|
DeleteZoneRequest
Delete zone request.
Fields | |
---|---|
name |
Required. The resource name of the zone: Authorization requires the following IAM permission on the specified resource
|
DiscoveryEvent
The payload associated with Discovery data processing.
Fields | |
---|---|
message |
The log message. |
lake_ |
The id of the associated lake. |
zone_ |
The id of the associated zone. |
asset_ |
The id of the associated asset. |
data_ |
The data location associated with the event. |
datascan_ |
The id of the associated datascan for standalone discovery. |
type |
The type of the event being logged. |
Union field details. Additional details about the event. details can be only one of the following: |
|
config |
Details about discovery configuration in effect. |
entity |
Details about the entity associated with the event. |
partition |
Details about the partition associated with the event. |
action |
Details about the action associated with the event. |
table |
Details about the BigQuery table publishing associated with the event. |
ActionDetails
Details about the action.
Fields | |
---|---|
type |
The type of action, for example IncompatibleDataSchema or InvalidDataFormat. |
issue |
The human readable issue associated with the action. |
ConfigDetails
Details about configuration events.
Fields | |
---|---|
parameters |
A list of discovery configuration parameters in effect. The keys are the field paths within DiscoverySpec, for example includePatterns, excludePatterns, and csvOptions.disableTypeInference. |
EntityDetails
Details about the entity.
Fields | |
---|---|
entity |
The name of the entity resource. The name is the fully-qualified resource name. |
type |
The type of the entity resource. |
EntityType
The type of the entity.
Enums | |
---|---|
ENTITY_TYPE_UNSPECIFIED |
An unspecified entity type. |
TABLE |
Entities representing structured data. |
FILESET |
Entities representing unstructured data. |
EventType
The type of the event.
Enums | |
---|---|
EVENT_TYPE_UNSPECIFIED |
An unspecified event type. |
CONFIG |
An event representing discovery configuration in effect. |
ENTITY_CREATED |
An event representing a metadata entity being created. |
ENTITY_UPDATED |
An event representing a metadata entity being updated. |
ENTITY_DELETED |
An event representing a metadata entity being deleted. |
PARTITION_CREATED |
An event representing a partition being created. |
PARTITION_UPDATED |
An event representing a partition being updated. |
PARTITION_DELETED |
An event representing a partition being deleted. |
TABLE_PUBLISHED |
An event representing a table being published. |
TABLE_UPDATED |
An event representing a table being updated. |
TABLE_IGNORED |
An event representing a table being skipped in publishing. |
TABLE_DELETED |
An event representing a table being deleted. |
PartitionDetails
Details about the partition.
Fields | |
---|---|
partition |
The name of the partition resource. The name is the fully-qualified resource name. |
entity |
The name of the containing entity resource. The name is the fully-qualified resource name. |
type |
The type of the containing entity resource. |
sampled_ |
The locations of the data items (for example, Cloud Storage objects) sampled for metadata inference. |
TableDetails
Details about the published table.
Fields | |
---|---|
table |
The fully-qualified resource name of the table resource. |
type |
The type of the table resource. |
TableType
The type of the published table.
Enums | |
---|---|
TABLE_TYPE_UNSPECIFIED |
An unspecified table type. |
EXTERNAL_TABLE |
External table type. |
BIGLAKE_TABLE |
BigLake table type. |
OBJECT_TABLE |
Object table type for unstructured data. |
Entity
Represents tables and fileset metadata contained within a zone.
Fields | |
---|---|
name |
Output only. The resource name of the entity, of the form: |
display_ |
Optional. Display name must be shorter than or equal to 256 characters. |
description |
Optional. A longer, user-friendly description. Must be shorter than or equal to 1024 characters. |
create_ |
Output only. The time when the entity was created. |
update_ |
Output only. The time when the entity was last updated. |
id |
Required. A user-provided entity ID. It is mutable, and will be used as the published table name. Specifying a new ID in an update entity request will override the existing value. The ID must contain only letters (a-z, A-Z), numbers (0-9), and underscores, and consist of 256 or fewer characters. |
etag |
Optional. The etag associated with the entity, which can be retrieved with a [GetEntity][] request. Required for update and delete requests. |
type |
Required. Immutable. The type of entity. |
asset |
Required. Immutable. The ID of the asset associated with the storage location containing the entity data. The entity must be within the same zone as the asset. |
data_ |
Required. Immutable. The storage path of the entity data. For Cloud Storage data, this is the fully-qualified path to the entity, such as |
data_ |
Optional. The set of items within the data path constituting the data in the entity, represented as a glob path. Example: |
catalog_ |
Output only. The name of the associated Data Catalog entry. |
system |
Required. Immutable. Identifies the storage system of the entity data. |
format |
Required. Identifies the storage format of the entity data. It does not apply to entities with data stored in BigQuery. |
compatibility |
Output only. Metadata stores that the entity is compatible with. |
access |
Output only. Identifies the access mechanism to the entity. Not user-settable. |
uid |
Output only. System-generated unique ID for the Entity. This ID will be different if the Entity is deleted and re-created with the same name. |
schema |
Required. The description of the data structure and layout. The schema is not included in list responses. It is only included in |
CompatibilityStatus
Provides compatibility information for various metadata stores.
Fields | |
---|---|
hive_ |
Output only. Whether this entity is compatible with Hive Metastore. |
bigquery |
Output only. Whether this entity is compatible with BigQuery. |
Compatibility
Provides compatibility information for a specific metadata store.
Fields | |
---|---|
compatible |
Output only. Whether the entity is compatible and can be represented in the metadata store. |
reason |
Output only. Provides additional detail if the entity is incompatible with the metadata store. |
Type
The type of entity.
Enums | |
---|---|
TYPE_UNSPECIFIED |
Type unspecified. |
TABLE |
Structured and semi-structured data. |
FILESET |
Unstructured data. |
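The length and character limits stated for the Entity fields above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `validate_entity_fields` and its parameter names are hypothetical helpers, not part of any Dataplex client library, and treating an empty ID as invalid is inferred from the field being marked Required.

```python
import re

# Limits quoted from the Entity field descriptions.
MAX_ID_LENGTH = 256
MAX_DISPLAY_NAME_LENGTH = 256
MAX_DESCRIPTION_LENGTH = 1024

# Letters (a-z, A-Z), numbers (0-9), and underscores only.
# Requiring at least one character is an assumption (the field is Required).
_ENTITY_ID_PATTERN = re.compile(r"[A-Za-z0-9_]{1,%d}" % MAX_ID_LENGTH)


def validate_entity_fields(entity_id, display_name="", description=""):
    """Return a list of problems; an empty list means the values look valid.

    Hypothetical helper for illustration only.
    """
    problems = []
    if not _ENTITY_ID_PATTERN.fullmatch(entity_id):
        problems.append("id must be 1-256 letters, digits, or underscores")
    if len(display_name) > MAX_DISPLAY_NAME_LENGTH:
        problems.append("display name exceeds 256 characters")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description exceeds 1024 characters")
    return problems
```

Because the ID is used as the published table name, running a check like this before an update request may catch naming problems earlier than the service would report them.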
Entry
An entry is a representation of a data resource that can be described by various metadata.
Fields | |
---|---|
name |
Identifier. The relative resource name of the entry, in the format |
entry_ |
Required. Immutable. The relative resource name of the entry type that was used to create this entry, in the format |
create_ |
Output only. The time when the entry was created in Dataplex. |
update_ |
Output only. The time when the entry was last updated in Dataplex. |
aspects |
Optional. The aspects that are attached to the entry. Depending on how the aspect is attached to the entry, the format of the aspect key can be one of the following:
|
parent_ |
Optional. Immutable. The resource name of the parent entry. |
fully_ |
Optional. A name for the entry that can be referenced by an external system. For more information, see Fully qualified names. The maximum size of the field is 4000 characters. |
entry_ |
Optional. Information related to the source system of the data resource that is represented by the entry. |
EntryGroup
An Entry Group represents a logical grouping of one or more Entries.
Fields | |
---|---|
name |
Output only. The relative resource name of the EntryGroup, in the format projects/{project_id_or_number}/locations/{location_id}/entryGroups/{entry_group_id}. |
uid |
Output only. System-generated, globally unique ID for the EntryGroup. If you delete and recreate the EntryGroup with the same name, this ID will be different. |
create_ |
Output only. The time when the EntryGroup was created. |
update_ |
Output only. The time when the EntryGroup was last updated. |
description |
Optional. Description of the EntryGroup. |
display_ |
Optional. User-friendly display name. |
labels |
Optional. User-defined labels for the EntryGroup. |
etag |
This checksum is computed by the service, and might be sent on update and delete requests to ensure the client has an up-to-date value before proceeding. |
transfer_ |
Output only. Denotes the transfer status of the Entry Group. It is unspecified for Entry Groups created from the Dataplex API. |
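The EntryGroup name format quoted above, projects/{project_id_or_number}/locations/{location_id}/entryGroups/{entry_group_id}, can be built and parsed with a small helper. This is an illustrative sketch: the function names and the group names `project`, `location`, and `entry_group` are assumptions, and the pattern only checks that each segment is non-empty and contains no slash.

```python
import re

# Mirrors the documented format; each {placeholder} becomes a named group.
_ENTRY_GROUP_NAME = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/entryGroups/(?P<entry_group>[^/]+)"
)


def format_entry_group_name(project, location, entry_group):
    """Build a relative resource name from its three components."""
    return f"projects/{project}/locations/{location}/entryGroups/{entry_group}"


def parse_entry_group_name(name):
    """Split a relative resource name into a dict of its components."""
    match = _ENTRY_GROUP_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an EntryGroup resource name: {name!r}")
    return match.groupdict()
```

The same approach applies to the other resource name formats in this reference by swapping the collection segments.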
EntrySource
Information related to the source system of the data resource that is represented by the entry.
Fields | |
---|---|
resource |
The name of the resource in the source system. Maximum length is 4,000 characters. |
system |
The name of the source system. Maximum length is 64 characters. |
platform |
The platform containing the source system. Maximum length is 64 characters. |
display_ |
A user-friendly display name. Maximum length is 500 characters. |
description |
A description of the data resource. Maximum length is 2,000 characters. |
labels |
User-defined labels. The maximum size of keys and values is 128 characters each. |
ancestors[] |
Immutable. The entries representing the ancestors of the data resource in the source system. |
create_ |
The time when the resource was created in the source system. |
update_ |
The time when the resource was last updated in the source system. If the entry exists in the system and its |
location |
Output only. Location of the resource in the source system. You can search the entry by this location. By default, this should match the location of the entry group containing this entry. A different value allows capturing the source location for data external to Google Cloud. |
Ancestor
Information about individual items in the hierarchy that is associated with the data resource.
Fields | |
---|---|
name |
Optional. The name of the ancestor resource. |
type |
Optional. The type of the ancestor resource. |
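The maximum lengths listed for the EntrySource fields lend themselves to a table-driven check. The sketch below assumes the entry source is held in a plain dict; the helper name is hypothetical, and the key `display_name` is an assumed spelling for the field shown truncated as `display_` in the table above.

```python
# Maximum lengths quoted from the EntrySource field descriptions.
ENTRY_SOURCE_LIMITS = {
    "resource": 4000,
    "system": 64,
    "platform": 64,
    "display_name": 500,  # assumed spelling of the truncated "display_" field
    "description": 2000,
}
MAX_LABEL_LENGTH = 128  # applies to each label key and each label value


def check_entry_source(source):
    """Return the names of the fields in `source` (a dict) that are too long."""
    too_long = [
        field
        for field, limit in ENTRY_SOURCE_LIMITS.items()
        if len(source.get(field, "")) > limit
    ]
    for key, value in source.get("labels", {}).items():
        if len(key) > MAX_LABEL_LENGTH or len(value) > MAX_LABEL_LENGTH:
            too_long.append(f"labels[{key!r}]")
    return too_long
```

Keeping the limits in one dict means a changed limit needs only a one-line update.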
EntryType
Entry Type is a template for creating Entries.
Fields | |
---|---|
name |
Output only. The relative resource name of the EntryType, of the form: projects/{project_number}/locations/{location_id}/entryTypes/{entry_type_id}. |
uid |
Output only. System-generated, globally unique ID for the EntryType. This ID will be different if the EntryType is deleted and re-created with the same name. |
create_ |
Output only. The time when the EntryType was created. |
update_ |
Output only. The time when the EntryType was last updated. |
description |
Optional. Description of the EntryType. |
display_ |
Optional. User-friendly display name. |
labels |
Optional. User-defined labels for the EntryType. |
etag |
Optional. This checksum is computed by the service, and might be sent on update and delete requests to ensure the client has an up-to-date value before proceeding. |
type_ |
Optional. Indicates the classes this Entry Type belongs to, for example, TABLE, DATABASE, MODEL. |
platform |
Optional. The platform that Entries of this type belong to. |
system |
Optional. The system that Entries of this type belong to. Examples include CloudSQL and MariaDB. |
required_ |
AspectInfo for the entry type. |
authorization |
Immutable. Authorization defined for this type. |
AspectInfo
Fields | |
---|---|
type |
Required aspect type for the entry type. |
Authorization
Authorization for an Entry Type.
Fields | |
---|---|
alternate_ |
Immutable. The IAM permission grantable on the Entry Group to allow access to instantiate Entries of Dataplex-owned Entry Types. Only settable for Dataplex-owned Types. |
EntryView
View for controlling which parts of an entry are to be returned.
Enums | |
---|---|
ENTRY_VIEW_UNSPECIFIED |
Unspecified EntryView. Defaults to FULL. |
BASIC |
Returns entry only, without aspects. |
FULL |
Returns all required aspects as well as the keys of all non-required aspects. |
CUSTOM |
Returns aspects matching custom fields in GetEntryRequest. If the number of aspects exceeds 100, the first 100 will be returned. |
ALL |
Returns all aspects. If the number of aspects exceeds 100, the first 100 will be returned. |
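The effect of the BASIC, CUSTOM, and ALL views, including the 100-aspect cap, can be modeled in a few lines. This is a rough sketch, not the service's implementation: the representation of aspects as (aspect_type, payload) pairs is an assumption, the function name is hypothetical, and FULL is left out because it depends on which aspects the entry type marks as required.

```python
MAX_ASPECTS_RETURNED = 100  # cap quoted for the CUSTOM and ALL views


def aspects_for_view(view, aspects, wanted_types=()):
    """Model which aspects a view returns.

    `aspects` is a list of (aspect_type, payload) pairs (assumed shape).
    `wanted_types` stands in for the aspect types named in a CUSTOM request.
    """
    if view == "BASIC":
        return []  # entry only, without aspects
    if view == "CUSTOM":
        selected = [a for a in aspects if a[0] in wanted_types]
    elif view == "ALL":
        selected = list(aspects)
    else:
        raise ValueError(f"view not covered by this sketch: {view}")
    # Only the first 100 are returned when more match.
    return selected[:MAX_ASPECTS_RETURNED]
```

A caller that needs every aspect of an entry carrying more than 100 aspects cannot rely on a single ALL request under this model.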
Environment
Environment represents a user-visible compute infrastructure for analytics within a lake.
Fields | |
---|---|
name |
Output only. The relative resource name of the environment, of the form: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/environment/{environment_id} |
display_ |
Optional. User-friendly display name. |
uid |
Output only. System-generated, globally unique ID for the environment. This ID will be different if the environment is deleted and re-created with the same name. |
create_ |
Output only. Environment creation time. |
update_ |
Output only. The time when the environment was last updated. |
labels |
Optional. User-defined labels for the environment. |
description |
Optional. Description of the environment. |
state |
Output only. Current state of the environment. |
infrastructure_ |
Required. Infrastructure specification for the Environment. |
session_ |
Optional. Configuration for sessions created for this environment. |
session_ |
Output only. Status of sessions created for this environment. |
endpoints |
Output only. URI Endpoints to access sessions associated with the Environment. |
Endpoints
URI Endpoints to access sessions associated with the Environment.
Fields | |
---|---|
notebooks |
Output only. URI to serve notebook APIs. |
sql |
Output only. URI to serve SQL APIs. |
InfrastructureSpec
Configuration for the underlying infrastructure used to run workloads.
Fields | |
---|---|
Union field resources. Hardware config. resources can be only one of the following: |
|
compute |
Optional. Compute resources needed for analyze interactive workloads. |
Union field runtime. Software config. runtime can be only one of the following: |
|
os_ |
Required. Software Runtime Configuration for analyze interactive workloads. |
ComputeResources
Compute resources associated with the analyze interactive workloads.
Fields | |
---|---|
disk_ |
Optional. Size in GB of the disk. Default is 100 GB. |
node_ |
Optional. Total number of nodes in the sessions created for this environment. |
max_ |
Optional. Max configurable nodes. If max_node_count > node_count, then auto-scaling is enabled. |
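The auto-scaling rule (max_node_count > node_count) and the 100 GB disk default described above reduce to two small functions. A minimal sketch; the function names and the `requested_gb` parameter are hypothetical, and `None` is used here to mean "not set".

```python
DEFAULT_DISK_SIZE_GB = 100  # default quoted in the disk size description


def autoscaling_enabled(node_count, max_node_count):
    """Auto-scaling is enabled when max_node_count exceeds node_count."""
    return max_node_count > node_count


def effective_disk_size_gb(requested_gb=None):
    """Fall back to the documented default when no size is given."""
    return DEFAULT_DISK_SIZE_GB if requested_gb is None else requested_gb
```

For example, `autoscaling_enabled(2, 5)` is true, while equal counts leave the session size fixed.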
OsImageRuntime
Software Runtime Configuration to run Analyze.
Fields | |
---|---|
image_ |
Required. Dataplex Image version. |
java_ |
Optional. List of Java jars to be included in the runtime environment. Valid input includes Cloud Storage URIs to Jar binaries. For example, gs://bucket-name/my/path/to/file.jar |
python_ |
Optional. A list of Python packages to be installed. Valid formats include Cloud Storage URI to a PIP installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz |
properties |
Optional. Spark properties to provide configuration for use in sessions created for this environment. The properties to set on daemon config files. Property keys are specified in |
SessionSpec
Configuration for sessions created for this environment.
Fields | |
---|---|
max_ |
Optional. The idle time configuration of the session. The session will be auto-terminated at the end of this period. |
enable_ |
Optional. If True, this causes sessions to be pre-created and available for faster startup to enable interactive exploration use-cases. This defaults to False to avoid additional billed charges. These can only be set to True for the environment with name set to "default", and with default configuration. |
SessionStatus
Status of sessions created for this environment.
Fields | |
---|---|
active |
Output only. Queries over sessions to mark whether the environment is currently active or not. |
GenerateDataQualityRulesRequest
Request details for generating data quality rule recommendations.
Fields | |
---|---|
name |
Required. The name must be one of the following:
|
GenerateDataQualityRulesResponse
Response details for data quality rule recommendations.
Fields | |
---|---|
rule[] |
The data quality rules that Dataplex generates based on the results of a data profiling scan. |
GetAspectTypeRequest
Get AspectType request.
Fields | |
---|---|
name |
Required. The resource name of the AspectType: Authorization requires the following IAM permission on the specified resource
|
GetAssetRequest
Get asset request.
Fields | |
---|---|
name |
Required. The resource name of the asset: Authorization requires the following IAM permission on the specified resource
|
GetContentRequest
Get content request.
Fields | |
---|---|
name |
Required. The resource name of the content: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/content/{content_id} Authorization requires the following IAM permission on the specified resource
|
view |
Optional. Specify content view to make a partial request. |
ContentView
Specifies whether the request should return the full or the partial representation.
Enums | |
---|---|
CONTENT_VIEW_UNSPECIFIED |
Content view not specified. The API defaults to the BASIC view. |
BASIC |
Will not return the data_text field. |
FULL |
Returns the complete proto. |
GetDataAttributeBindingRequest
Get DataAttributeBinding request.
Fields | |
---|---|
name |
Required. The resource name of the DataAttributeBinding: projects/{project_number}/locations/{location_id}/dataAttributeBindings/{data_attribute_binding_id} Authorization requires the following IAM permission on the specified resource
|
GetDataAttributeRequest
Get DataAttribute request.
Fields | |
---|---|
name |
Required. The resource name of the dataAttribute: projects/{project_number}/locations/{location_id}/dataTaxonomies/{dataTaxonomy}/attributes/{data_attribute_id} Authorization requires the following IAM permission on the specified resource
|
GetDataScanJobRequest
Get DataScanJob request.
Fields | |
---|---|
name |
Required. The resource name of the DataScanJob: Authorization requires the following IAM permission on the specified resource
|
view |
Optional. Select the DataScanJob view to return. Defaults to |
DataScanJobView
DataScanJob view options.
Enums | |
---|---|
DATA_SCAN_JOB_VIEW_UNSPECIFIED |
The API will default to the BASIC view. |
BASIC |
Basic view that does not include spec and result. |
FULL |
Include everything. |
GetDataScanRequest
Get dataScan request.
Fields | |
---|---|
name |
Required. The resource name of the dataScan: Authorization requires the following IAM permission on the specified resource
|
view |
Optional. Select the DataScan view to return. Defaults to |
DataScanView
DataScan view options.
Enums | |
---|---|
DATA_SCAN_VIEW_UNSPECIFIED |
The API will default to the BASIC view. |
BASIC |
Basic view that does not include spec and result. |
FULL |
Include everything. |
GetDataTaxonomyRequest
Get DataTaxonomy request.
Fields | |
---|---|
name |
Required. The resource name of the DataTaxonomy: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id} Authorization requires the following IAM permission on the specified resource
|
GetEntityRequest
Get metadata entity request.
Fields | |
---|---|
name |
Required. The resource name of the entity: Authorization requires the following IAM permission on the specified resource
|
view |
Optional. Used to select the subset of entity information to return. Defaults to |
EntityView
Entity views for get entity partial result.
Enums | |
---|---|
ENTITY_VIEW_UNSPECIFIED |
The API will default to the BASIC view. |
BASIC |
Minimal view that does not include the schema. |
SCHEMA |
Include basic information and schema. |
FULL |
Include everything. Currently, this is the same as the SCHEMA view. |
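The EntityView semantics above (unspecified falls back to BASIC, and FULL is currently the same as SCHEMA) can be captured as follows. This is only a sketch with hypothetical function names, using the enum value names as plain strings.

```python
def resolve_entity_view(view="ENTITY_VIEW_UNSPECIFIED"):
    """Map the unspecified value to BASIC, the documented default."""
    return "BASIC" if view == "ENTITY_VIEW_UNSPECIFIED" else view


def includes_schema(view="ENTITY_VIEW_UNSPECIFIED"):
    """Report whether a response for this view carries the schema."""
    # FULL is currently described as identical to SCHEMA.
    return resolve_entity_view(view) in ("SCHEMA", "FULL")
```

Code that needs the schema should therefore request SCHEMA or FULL explicitly rather than rely on the default.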
GetEntryGroupRequest
Get EntryGroup request.
Fields | |
---|---|
name |
Required. The resource name of the EntryGroup: Authorization requires the following IAM permission on the specified resource
|
GetEntryRequest
Get Entry request.
Fields | |
---|---|
name |
Required. The resource name of the Entry: |
view |
Optional. View to control which parts of an entry the service should return. |
aspect_ |
Optional. Limits the aspects returned to the provided aspect types. It only works for CUSTOM view. |
paths[] |
Optional. Limits the aspects returned to those associated with the provided paths within the Entry. It only works for CUSTOM view. |
GetEntryTypeRequest
Get EntryType request.
Fields | |
---|---|
name |
Required. The resource name of the EntryType: Authorization requires the following IAM permission on the specified resource
|
GetEnvironmentRequest
Get environment request.
Fields | |
---|---|
name |
Required. The resource name of the environment: Authorization requires the following IAM permission on the specified resource
|
GetJobRequest
Get job request.
Fields | |
---|---|
name |
Required. The resource name of the job: Authorization requires the following IAM permission on the specified resource
|
GetLakeRequest
Get lake request.
Fields | |
---|---|
name |
Required. The resource name of the lake: Authorization requires the following IAM permission on the specified resource
|
GetMetadataJobRequest
Get metadata job request.
Fields | |
---|---|
name |
Required. The resource name of the metadata job, in the format Authorization requires the following IAM permission on the specified resource
|
GetPartitionRequest
Get metadata partition request.
Fields | |
---|---|
name |
Required. The resource name of the partition: Authorization requires the following IAM permission on the specified resource
|
GetTaskRequest
Get task request.
Fields | |
---|---|
name |
Required. The resource name of the task: Authorization requires the following IAM permission on the specified resource
|
GetZoneRequest
Get zone request.
Fields | |
---|---|
name |
Required. The resource name of the zone: Authorization requires the following IAM permission on the specified resource
|
GovernanceEvent
Payload associated with Governance related log events.
Fields | |
---|---|
message |
The log message. |
event_ |
The type of the event. |
entity |
Entity resource information if the log event is associated with a specific entity. |
Entity
Information about Entity resource that the log event is associated with.
Fields | |
---|---|
entity |
The Entity resource the log event is associated with. Format: |
entity_ |
Type of entity. |
EntityType
Type of entity.
Enums | |
---|---|
ENTITY_TYPE_UNSPECIFIED |
An unspecified Entity type. |
TABLE |
Table entity type. |
FILESET |
Fileset entity type. |
EventType
Type of governance log event.
Enums | |
---|---|
EVENT_TYPE_UNSPECIFIED |
An unspecified event type. |
RESOURCE_IAM_POLICY_UPDATE |
Resource IAM policy update event. |
BIGQUERY_TABLE_CREATE |
BigQuery table create event. |
BIGQUERY_TABLE_UPDATE |
BigQuery table update event. |
BIGQUERY_TABLE_DELETE |
BigQuery table delete event. |
BIGQUERY_CONNECTION_CREATE |
BigQuery connection create event. |
BIGQUERY_CONNECTION_UPDATE |
BigQuery connection update event. |
BIGQUERY_CONNECTION_DELETE |
BigQuery connection delete event. |
BIGQUERY_TAXONOMY_CREATE |
BigQuery taxonomy created. |
BIGQUERY_POLICY_TAG_CREATE |
BigQuery policy tag created. |
BIGQUERY_POLICY_TAG_DELETE |
BigQuery policy tag deleted. |
BIGQUERY_POLICY_TAG_SET_IAM_POLICY |
BigQuery set IAM policy for policy tag. |
ACCESS_POLICY_UPDATE |
Access policy update event. |
GOVERNANCE_RULE_MATCHED_RESOURCES |
Number of resources matched with a particular query. |
GOVERNANCE_RULE_SEARCH_LIMIT_EXCEEDS |
Rule processing exceeds the allowed limit. |
GOVERNANCE_RULE_ERRORS |
Rule processing errors. |
GOVERNANCE_RULE_PROCESSING |
Governance rule processing event. |
ImportItem
An object that describes the values that you want to set for an entry and its attached aspects when you import metadata. Used when you run a metadata import job. See CreateMetadataJob.
You provide a collection of import items in a metadata import file. For more information about how to create a metadata import file, see Metadata import file.
Fields | |
---|---|
entry |
Information about an entry and its attached aspects. |
update_ |
The fields to update, in paths that are relative to the In The Dataplex also determines which entries and aspects to modify by comparing the values and timestamps that you provide in the metadata import file with the values and timestamps that exist in your project. For more information, see Comparison logic. |
aspect_ |
The aspects to modify. Supports the following syntaxes:
If you leave this field empty, it is treated as specifying exactly those aspects that are present within the specified entry. In |
Job
A job represents an instance of a task.
Fields | |
---|---|
name |
Output only. The relative resource name of the job, of the form: |
uid |
Output only. System-generated, globally unique ID for the job. |
start_ |
Output only. The time when the job was started. |
end_ |
Output only. The time when the job ended. |
state |
Output only. Execution state for the job. |
retry_ |
Output only. The number of times the job has been retried (excluding the initial attempt). |
service |
Output only. The underlying service running a job. |
service_ |
Output only. The full resource name for the job run under a particular service. |
message |
Output only. Additional information about the current state. |
labels |
Output only. User-defined labels for the task. |
trigger |
Output only. Job execution trigger. |
execution_ |
Output only. Spec related to how a task is executed. |
Service
Enums | |
---|---|
SERVICE_UNSPECIFIED |
Service used to run the job is unspecified. |
DATAPROC |
Dataproc service is used to run this job. |
State
Enums | |
---|---|
STATE_UNSPECIFIED |
The job state is unknown. |
RUNNING |
The job is running. |
CANCELLING |
The job is cancelling. |
CANCELLED |
The job cancellation was successful. |
SUCCEEDED |
The job completed successfully. |
FAILED |
The job is no longer running due to an error. |
ABORTED |
The job was cancelled outside of Dataplex. |
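When polling a job, a client typically needs to know which of the states above mean the job has stopped. The sketch below treats CANCELLED, SUCCEEDED, FAILED, and ABORTED as terminal; that grouping is an inference from the state descriptions, not something the reference states explicitly. `get_state` is a hypothetical callable that returns the current state name as a string.

```python
import time

# Inferred from the state descriptions: these states describe a finished job.
TERMINAL_STATES = frozenset({"CANCELLED", "SUCCEEDED", "FAILED", "ABORTED"})


def is_terminal(state):
    """True when the job will not change state again (assumed grouping)."""
    return state in TERMINAL_STATES


def wait_for_job(get_state, interval_s=5.0, max_polls=60):
    """Poll `get_state()` until a terminal state is seen, then return it."""
    for attempt in range(max_polls):
        state = get_state()
        if is_terminal(state):
            return state
        if attempt + 1 < max_polls:
            time.sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"job still not finished after {max_polls} polls")
```

RUNNING and CANCELLING are deliberately not terminal here, so a job that is being cancelled keeps being polled until it reaches CANCELLED.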
Trigger
Job execution trigger.
Enums | |
---|---|
TRIGGER_UNSPECIFIED |
The trigger is unspecified. |
TASK_CONFIG |
The job was triggered by Dataplex based on trigger spec from task definition. |
RUN_REQUEST |
The job was triggered by the explicit call of Task API. |
JobEvent
The payload associated with Job logs that contains events describing jobs that have run within a Lake.
Fields | |
---|---|
message |
The log message. |
job_ |
The unique id identifying the job. |
start_ |
The time when the job started running. |
end_ |
The time when the job finished running. |
state |
The job state on completion. |
retries |
The number of retries. |
type |
The type of the job. |
service |
The service used to execute the job. |
service_ |
The reference to the job within the service. |
execution_ |
Job execution trigger. |
ExecutionTrigger
Job Execution trigger.
Enums | |
---|---|
EXECUTION_TRIGGER_UNSPECIFIED |
The job execution trigger is unspecified. |
TASK_CONFIG |
The job was triggered by Dataplex based on trigger spec from task definition. |
RUN_REQUEST |
The job was triggered by the explicit call of Task API. |
Service
The service used to execute the job.
Enums | |
---|---|
SERVICE_UNSPECIFIED |
Unspecified service. |
DATAPROC |
Cloud Dataproc. |
State
The completion status of the job.
Enums | |
---|---|
STATE_UNSPECIFIED |
Unspecified job state. |
SUCCEEDED |
Job successfully completed. |
FAILED |
Job was unsuccessful. |
CANCELLED |
Job was cancelled by the user. |
ABORTED |
Job was cancelled or aborted via the service executing the job. |
Type
The type of the job.
Enums | |
---|---|
TYPE_UNSPECIFIED |
Unspecified job type. |
SPARK |
Spark jobs. |
NOTEBOOK |
Notebook jobs. |
Lake
A lake is a centralized repository for managing enterprise data across the organization distributed across many cloud projects, and stored in a variety of storage services such as Google Cloud Storage and BigQuery. The resources attached to a lake are referred to as managed resources. Data within these managed resources can be structured or unstructured. A lake provides data admins with tools to organize, secure and manage their data at scale, and provides data scientists and data engineers an integrated experience to easily search, discover, analyze and transform data and associated metadata.
Fields | |
---|---|
name |
Output only. The relative resource name of the lake, of the form: |
display_ |
Optional. User-friendly display name. |
uid |
Output only. System-generated, globally unique ID for the lake. This ID will be different if the lake is deleted and re-created with the same name. |
create_ |
Output only. The time when the lake was created. |
update_ |
Output only. The time when the lake was last updated. |
labels |
Optional. User-defined labels for the lake. |
description |
Optional. Description of the lake. |
state |
Output only. Current state of the lake. |
service_ |
Output only. Service account associated with this lake. This service account must be authorized to access or operate on resources managed by the lake. |
metastore |
Optional. Settings to manage lake and Dataproc Metastore service instance association. |
asset_ |
Output only. Aggregated status of the underlying assets of the lake. |
metastore_ |
Output only. Metastore status of the lake. |
Metastore
Settings to manage association of Dataproc Metastore with a lake.
Fields | |
---|---|
service |
Optional. A relative reference to the Dataproc Metastore (https://cloud.google.com/dataproc-metastore/docs) service associated with the lake: |
MetastoreStatus
Status of Lake and Dataproc Metastore service instance association.
Fields | |
---|---|
state |
Current state of association. |
message |
Additional information about the current status. |
update_ |
Last update time of the metastore status of the lake. |
endpoint |
The URI of the endpoint used to access the Metastore service. |
State
Current state of association.
Enums | |
---|---|
STATE_UNSPECIFIED |
Unspecified. |
NONE |
A Metastore service instance is not associated with the lake. |
READY |
A Metastore service instance is attached to the lake. |
UPDATING |
Attach/detach is in progress. |
ERROR |
Attach/detach could not be done due to errors. |
ListActionsResponse
List actions response.
Fields | |
---|---|
actions[] |
Actions under the given parent lake/zone/asset. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListAspectTypesRequest
List AspectTypes request.
Fields | |
---|---|
parent |
Required. The resource name of the AspectType location, of the form: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of AspectTypes to return. The service may return fewer than this value. If unspecified, the service returns at most 10 AspectTypes. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. Filters are case-sensitive. The service supports the following formats:
These restrictions can be conjoined with AND, OR, and NOT conjunctions. |
order_ |
Optional. Orders the result by |
ListAspectTypesResponse
List AspectTypes response.
Fields | |
---|---|
aspect_ |
AspectTypes under the given parent location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that the service couldn't reach. |
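The paging contract described above (at most 10 results when the size is unspecified, values above 1000 coerced to 1000, and an empty next-page token marking the end) is shared by the list requests in this reference. The sketch below is client-side illustration only: `fetch_page` is a hypothetical callable taking a page size and page token and returning a pair of items and next token, and treating 0 as "unspecified" is an assumption.

```python
DEFAULT_PAGE_SIZE = 10  # used when the page size is unspecified
MAX_PAGE_SIZE = 1000    # larger values are coerced down to this


def effective_page_size(requested=None):
    """Return the page size the service would apply to a request."""
    if not requested:  # None or 0 is treated as unspecified (assumption)
        return DEFAULT_PAGE_SIZE
    return min(requested, MAX_PAGE_SIZE)


def iterate_all(fetch_page, page_size=None):
    """Yield every item by following page tokens until the token is empty.

    fetch_page(page_size, page_token) -> (items, next_page_token)
    """
    token = ""
    while True:
        items, token = fetch_page(effective_page_size(page_size), token)
        yield from items
        if not token:
            return
```

Since the service may return fewer items than the page size, the loop ends only on an empty token, never on a short page.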
ListAssetActionsRequest
List asset actions request.
Fields | |
---|---|
parent |
Required. The resource name of the parent asset: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of actions to return. The service may return fewer than this value. If unspecified, at most 10 actions will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
ListAssetsRequest
List assets request.
Fields | |
---|---|
parent |
Required. The resource name of the parent zone: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of assets to return. The service may return fewer than this value. If unspecified, at most 10 assets will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListAssetsResponse
List assets response.
Fields | |
---|---|
assets[] |
Asset under the given parent zone. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListContentRequest
List content request. Returns the BASIC Content view.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: projects/{project_id}/locations/{location_id}/lakes/{lake_id} Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of content items to return. The service may return fewer than this value. If unspecified, at most 10 content items will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. Filters are case-sensitive. The following formats are supported: labels.key1 = "value1"; labels:key1; type = "NOTEBOOK"; type = "SQL_SCRIPT". These restrictions can be conjoined with AND, OR, and NOT conjunctions. |
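The filter formats listed for ListContentRequest can be assembled from small string helpers rather than written by hand. This sketch only builds strings; the helper names are hypothetical, values are not escaped, and reading `labels:key1` as "the label key1 is present" is an assumption, since the reference lists the format without describing it.

```python
def label_equals(key, value):
    """Restriction of the form: labels.key1 = "value1"."""
    return f'labels.{key} = "{value}"'


def label_has(key):
    """Restriction of the form: labels:key1 (assumed to test presence)."""
    return f"labels:{key}"


def type_equals(content_type):
    """Restriction of the form: type = "NOTEBOOK" or type = "SQL_SCRIPT"."""
    return f'type = "{content_type}"'


def all_of(*restrictions):
    """Conjoin restrictions with AND."""
    return " AND ".join(restrictions)
```

For instance, `all_of(type_equals("NOTEBOOK"), label_equals("env", "prod"))` produces `type = "NOTEBOOK" AND labels.env = "prod"`. Filters are case-sensitive, so the type values must be spelled exactly as shown.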
ListContentResponse
List content response.
Fields | |
---|---|
content[] |
Content under the given parent lake. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListDataAttributeBindingsRequest
List DataAttributeBindings request.
Fields | |
---|---|
parent |
Required. The resource name of the Location: projects/{project_number}/locations/{location_id} Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of DataAttributeBindings to return. The service may return fewer than this value. If unspecified, at most 10 DataAttributeBindings will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. Filter using resource: filter=resource:"resource-name". Filter using attribute: filter=attributes:"attribute-name". Filter using attribute in paths list: filter=paths.attributes:"attribute-name". |
order_ |
Optional. Order by fields for the result. |
ListDataAttributeBindingsResponse
List DataAttributeBindings response.
Fields | |
---|---|
data_ |
DataAttributeBindings under the given parent Location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that could not be reached. |
ListDataAttributesRequest
List DataAttributes request.
Fields | |
---|---|
parent |
Required. The resource name of the DataTaxonomy: projects/{project_number}/locations/{location_id}/dataTaxonomies/{data_taxonomy_id} Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of DataAttributes to return. The service may return fewer than this value. If unspecified, at most 10 DataAttributes will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListDataAttributesResponse
List DataAttributes response.
Fields | |
---|---|
data_ |
DataAttributes under the given parent DataTaxonomy. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that could not be reached. |
ListDataScanJobsRequest
List DataScanJobs request.
Fields | |
---|---|
parent |
Required. The resource name of the parent environment: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of DataScanJobs to return. The service may return fewer than this value. If unspecified, at most 10 DataScanJobs will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. An expression for filtering the results of the ListDataScanJobs request. If unspecified, all datascan jobs will be returned. Multiple filters can be applied (with Allowed fields are:
For instance, 'start_time > 2018-10-08T00:00:00.123456789Z AND end_time < 2018-10-09T00:00:00.123456789Z' limits results to DataScanJobs between specified start and end times. |
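A time-window filter like the one in the example above can be assembled from timezone-aware datetimes. This is a minimal sketch: the function names are hypothetical, and it assumes the service accepts RFC 3339 timestamps without fractional seconds, which the source example does not confirm.

```python
from datetime import datetime, timezone


def to_rfc3339(ts: datetime) -> str:
    """Format a timezone-aware datetime as a UTC RFC 3339 timestamp."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def jobs_between(start: datetime, end: datetime) -> str:
    """Build a filter limiting results to jobs between start and end."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return f"start_time > {to_rfc3339(start)} AND end_time < {to_rfc3339(end)}"


if __name__ == "__main__":
    window_start = datetime(2018, 10, 8, tzinfo=timezone.utc)
    window_end = datetime(2018, 10, 9, tzinfo=timezone.utc)
    print(jobs_between(window_start, window_end))
```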
ListDataScanJobsResponse
List DataScanJobs response.
Fields | |
---|---|
data_ |
DataScanJobs ( |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListDataScansRequest
List dataScans request.
Fields | |
---|---|
parent |
Required. The resource name of the parent location: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of dataScans to return. The service may return fewer than this value. If unspecified, at most 500 scans will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields ( |
ListDataScansResponse
List dataScans response.
Fields | |
---|---|
data_ |
DataScans ( |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable[] |
Locations that could not be reached. |
ListDataTaxonomiesRequest
List DataTaxonomies request.
Fields | |
---|---|
parent |
Required. The resource name of the DataTaxonomy location, of the form: projects/{project_number}/locations/{location_id} where Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of DataTaxonomies to return. The service may return fewer than this value. If unspecified, at most 10 DataTaxonomies will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListDataTaxonomiesResponse
List DataTaxonomies response.
Fields | |
---|---|
data_ |
DataTaxonomies under the given parent location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that could not be reached. |
ListEntitiesRequest
List metadata entities request.
Fields | |
---|---|
parent |
Required. The resource name of the parent zone: Authorization requires the following IAM permission on the specified resource
|
view |
Required. Specify the entity view to make a partial list request. |
page_ |
Optional. Maximum number of entities to return. The service may return fewer than this value. If unspecified, 100 entities will be returned by default. The maximum value is 500; larger values will be truncated to 500. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. The following filter parameters can be added to the URL to limit the entities returned by the API:
|
EntityView
Entity views.
Enums | |
---|---|
ENTITY_VIEW_UNSPECIFIED |
The default unset value. Return both table and fileset entities if unspecified. |
TABLES |
Only list table entities. |
FILESETS |
Only list fileset entities. |
ListEntitiesResponse
List metadata entities response.
Fields | |
---|---|
entities[] |
Entities in the specified parent zone. |
next_ |
Token to retrieve the next page of results, or empty if there are no remaining results in the list. |
ListEntriesRequest
List Entries request.
Fields | |
---|---|
parent |
Required. The resource name of the parent Entry Group: |
page_ |
Optional. Number of items to return per page. If there are remaining results, the service returns a next_page_token. If unspecified, the service returns at most 10 Entries. The maximum value is 100; values above 100 will be coerced to 100. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. A filter on the entries to return. Filters are case-sensitive. You can filter the request by the following fields:
The comparison operators are =, !=, <, >, <=, >=. The service compares strings according to lexical order. You can use the logical operators AND, OR, and NOT in the filter. You can use the wildcard "*", but for entry_type you need to provide the full project ID or number. Example filter expressions:
|
ListEntriesResponse
List Entries response.
Fields | |
---|---|
entries[] |
The list of entries under the given parent location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListEntryGroupsRequest
List entryGroups request.
Fields | |
---|---|
parent |
Required. The resource name of the entryGroup location, of the form: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of EntryGroups to return. The service may return fewer than this value. If unspecified, the service returns at most 10 EntryGroups. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListEntryGroupsResponse
List entry groups response.
Fields | |
---|---|
entry_ |
Entry groups under the given parent location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that the service couldn't reach. |
ListEntryTypesRequest
List EntryTypes request.
Fields | |
---|---|
parent |
Required. The resource name of the EntryType location, of the form: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of EntryTypes to return. The service may return fewer than this value. If unspecified, the service returns at most 10 EntryTypes. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. Filters are case-sensitive. The service supports the following formats:
These restrictions can be conjoined with AND, OR, and NOT conjunctions. |
order_ |
Optional. Orders the result by |
ListEntryTypesResponse
List EntryTypes response.
Fields | |
---|---|
entry_ |
EntryTypes under the given parent location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that the service couldn't reach. |
ListEnvironmentsRequest
List environments request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of environments to return. The service may return fewer than this value. If unspecified, at most 10 environments will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListEnvironmentsResponse
List environments response.
Fields | |
---|---|
environments[] |
Environments under the given parent lake. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListJobsRequest
List jobs request.
Fields | |
---|---|
parent |
Required. The resource name of the parent environment: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of jobs to return. The service may return fewer than this value. If unspecified, at most 10 jobs will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
ListJobsResponse
List jobs response.
Fields | |
---|---|
jobs[] |
Jobs under a given task. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListLakeActionsRequest
List lake actions request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of actions to return. The service may return fewer than this value. If unspecified, at most 10 actions will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
ListLakesRequest
List lakes request.
Fields | |
---|---|
parent |
Required. The resource name of the lake location, of the form: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of Lakes to return. The service may return fewer than this value. If unspecified, at most 10 lakes will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListLakesResponse
List lakes response.
Fields | |
---|---|
lakes[] |
Lakes under the given parent location. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that could not be reached. |
ListMetadataJobsRequest
List metadata jobs request.
Fields | |
---|---|
parent |
Required. The resource name of the parent location, in the format Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. The maximum number of metadata jobs to return. The service might return fewer jobs than this value. If unspecified, at most 10 jobs are returned. The maximum value is 1,000. |
page_ |
Optional. The page token received from a previous |
filter |
Optional. Filter request. Filters are case-sensitive. The service supports the following formats:
You can combine filters with |
order_ |
Optional. The field to sort the results by, either |
ListMetadataJobsResponse
List metadata jobs response.
Fields | |
---|---|
metadata_ |
Metadata jobs under the specified parent location. |
next_ |
A token to retrieve the next page of results. If there are no more results in the list, the value is empty. |
unreachable_ |
Locations that the service couldn't reach. |
ListPartitionsRequest
List metadata partitions request.
Fields | |
---|---|
parent |
Required. The resource name of the parent entity: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of partitions to return. The service may return fewer than this value. If unspecified, 100 partitions will be returned by default. The maximum page size is 500; larger values will be truncated to 500. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter the partitions returned to the caller using a key value pair expression. Supported operators and syntax:
Sample filter expression: ?filter="key1 < value1 OR key2 > value2". Notes:
|
ListPartitionsResponse
List metadata partitions response.
Fields | |
---|---|
partitions[] |
Partitions under the specified parent entity. |
next_ |
Token to retrieve the next page of results, or empty if there are no remaining results in the list. |
ListSessionsRequest
List sessions request.
Fields | |
---|---|
parent |
Required. The resource name of the parent environment: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of sessions to return. The service may return fewer than this value. If unspecified, at most 10 sessions will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. The following mode = ADMIN | USER |
ListSessionsResponse
List sessions response.
Fields | |
---|---|
sessions[] |
Sessions under a given environment. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
ListTasksRequest
List tasks request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of tasks to return. The service may return fewer than this value. If unspecified, at most 10 tasks will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListTasksResponse
List tasks response.
Fields | |
---|---|
tasks[] |
Tasks under the given parent lake. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable_ |
Locations that could not be reached. |
ListZoneActionsRequest
List zone actions request.
Fields | |
---|---|
parent |
Required. The resource name of the parent zone: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of actions to return. The service may return fewer than this value. If unspecified, at most 10 actions will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
ListZonesRequest
List zones request.
Fields | |
---|---|
parent |
Required. The resource name of the parent lake: Authorization requires the following IAM permission on the specified resource
|
page_ |
Optional. Maximum number of zones to return. The service may return fewer than this value. If unspecified, at most 10 zones will be returned. The maximum value is 1000; values above 1000 will be coerced to 1000. |
page_ |
Optional. Page token received from a previous |
filter |
Optional. Filter request. |
order_ |
Optional. Order by fields for the result. |
ListZonesResponse
List zones response.
Fields | |
---|---|
zones[] |
Zones under the given parent lake. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
LookupEntryRequest
Lookup Entry request using permissions in the source system.
Fields | |
---|---|
name |
Required. The project to which the request should be attributed in the following form: |
view |
Optional. View to control which parts of an entry the service should return. |
aspect_ |
Optional. Limits the aspects returned to the provided aspect types. It only works for CUSTOM view. |
paths[] |
Optional. Limits the aspects returned to those associated with the provided paths within the Entry. It only works for CUSTOM view. |
entry |
Required. The resource name of the Entry: |
MetadataJob
A metadata job resource.
Fields | |
---|---|
name |
Output only. Identifier. The name of the resource that the configuration is applied to, in the format |
uid |
Output only. A system-generated, globally unique ID for the metadata job. If the metadata job is deleted and then re-created with the same name, this ID is different. |
create_ |
Output only. The time when the metadata job was created. |
update_ |
Output only. The time when the metadata job was updated. |
labels |
Optional. User-defined labels. |
type |
Required. Metadata job type. |
status |
Output only. Metadata job status. |
Union field
|
|
import_ |
Import job specification. |
Union field
|
|
import_ |
Output only. Import job result. |
ImportJobResult
Results from a metadata import job.
Fields | |
---|---|
deleted_ |
Output only. The total number of entries that were deleted. |
updated_ |
Output only. The total number of entries that were updated. |
created_ |
Output only. The total number of entries that were created. |
unchanged_ |
Output only. The total number of entries that were unchanged. |
recreated_ |
Output only. The total number of entries that were recreated. |
update_ |
Output only. The time when the status was updated. |
ImportJobSpec
Job specification for a metadata import job.
Fields | |
---|---|
source_ |
Optional. The URI of a Cloud Storage bucket or folder (beginning with A metadata import file defines the values to set for each of the entries and aspects in a metadata job. For more information about how to create a metadata import file and the file requirements, see Metadata import file. You can provide multiple metadata import files in the same metadata job. The bucket or folder must contain at least one metadata import file, in JSON Lines format (either In Caution: If the metadata import file contains no data, all entries and aspects that belong to the job's scope are deleted. |
source_ |
Optional. The time when the process that created the metadata import files began. |
scope |
Required. A boundary on the scope of impact that the metadata import job can have. |
entry_ |
Required. The sync mode for entries. Only |
aspect_ |
Required. The sync mode for aspects. Only |
log_ |
Optional. The level of logs to write to Cloud Logging for this job. Debug-level logs provide highly-detailed information for troubleshooting, but their increased verbosity could incur additional costs that might not be merited for all jobs. If unspecified, defaults to |
ImportJobScope
A boundary on the scope of impact that the metadata import job can have.
Fields | |
---|---|
entry_ |
Required. The entry group that is in scope for the import job, specified as a relative resource name in the format Must contain exactly one element. The entry group and the job must be in the same location. |
entry_ |
Required. The entry types that are in scope for the import job, specified as relative resource names in the format If the metadata import file attempts to modify an entry whose type isn't included in this list, the import job is halted before modifying any entries or aspects. The location of an entry type must either match the location of the job, or the entry type must be global. |
aspect_ |
Optional. The aspect types that are in scope for the import job, specified as relative resource names in the format If the metadata import file attempts to modify an aspect whose type isn't included in this list, the import job is halted before modifying any entries or aspects. The location of an aspect type must either match the location of the job, or the aspect type must be global. |
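The scope rule above (halt before modifying anything if the import file references a type outside the scope) can be pictured as an up-front check. The sketch below is illustrative only: the function names are hypothetical, the type names are short placeholders rather than real relative resource names, and the service's actual validation is not documented here.

```python
from typing import Iterable, List


def out_of_scope(referenced: Iterable[str], in_scope: Iterable[str]) -> List[str]:
    """Return the referenced types that are missing from the scope, sorted."""
    return sorted(set(referenced) - set(in_scope))


def check_import_scope(
    entry_types_in_file: Iterable[str],
    aspect_types_in_file: Iterable[str],
    scope_entry_types: Iterable[str],
    scope_aspect_types: Iterable[str],
) -> None:
    """Raise before any modification if the file reaches outside the job scope."""
    bad_entries = out_of_scope(entry_types_in_file, scope_entry_types)
    bad_aspects = out_of_scope(aspect_types_in_file, scope_aspect_types)
    if bad_entries or bad_aspects:
        raise ValueError(
            f"out-of-scope entry types: {bad_entries}; aspect types: {bad_aspects}"
        )


if __name__ == "__main__":
    # Placeholder type names, for illustration only.
    check_import_scope(["type-a"], ["aspect-x"], ["type-a", "type-b"], ["aspect-x"])
    print("import file stays within scope")
```

Running the check once over the whole file, rather than per item, matches the documented behavior that the job halts before modifying any entries or aspects.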
LogLevel
The level of logs to write to Cloud Logging for this job.
Enums | |
---|---|
LOG_LEVEL_UNSPECIFIED |
Log level unspecified. |
DEBUG |
Debug-level logging. Captures detailed logs for each import item. Use debug-level logging to troubleshoot issues with specific import items. For example, use debug-level logging to identify resources that are missing from the job scope, entries or aspects that don't conform to the associated entry type or aspect type, or other misconfigurations with the metadata import file. Depending on the size of your metadata job and the number of logs that are generated, debug-level logging might incur additional costs. |
INFO |
Info-level logging. Captures logs at the overall job level. Includes aggregate logs about import items, but doesn't specify which import item has an error. |
SyncMode
Specifies how the entries and aspects in a metadata job are updated.
Enums | |
---|---|
SYNC_MODE_UNSPECIFIED |
Sync mode unspecified. |
FULL |
All resources in the job's scope are modified. If a resource exists in Dataplex but isn't included in the metadata import file, the resource is deleted when you run the metadata job. Use this mode to perform a full sync of the set of entries in the job scope. |
INCREMENTAL |
Only the entries and aspects that are explicitly included in the metadata import file are modified. Use this mode to modify a subset of resources while leaving unreferenced resources unchanged. |
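The difference between the two sync modes can be illustrated with a small in-memory model, where a dictionary stands in for the resources in the job's scope. This is a sketch of the documented semantics, not the service's implementation; `apply_import` and the sample data are hypothetical.

```python
from enum import Enum
from typing import Dict


class SyncMode(Enum):
    FULL = "FULL"
    INCREMENTAL = "INCREMENTAL"


def apply_import(
    existing: Dict[str, str], import_file: Dict[str, str], mode: SyncMode
) -> Dict[str, str]:
    """Return the resources in scope after applying an import file."""
    if mode is SyncMode.FULL:
        # Resources absent from the import file are deleted.
        return dict(import_file)
    # INCREMENTAL: only resources named in the import file are modified.
    merged = dict(existing)
    merged.update(import_file)
    return merged


if __name__ == "__main__":
    current = {"entry-1": "v1", "entry-2": "v1"}
    incoming = {"entry-2": "v2", "entry-3": "v1"}
    print(apply_import(current, incoming, SyncMode.FULL))
    print(apply_import(current, incoming, SyncMode.INCREMENTAL))
```

In this model, an empty import file under FULL leaves nothing in scope, while under INCREMENTAL it changes nothing.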
Status
Metadata job status.
Fields | |
---|---|
state |
Output only. State of the metadata job. |
message |
Output only. Message relating to the progression of a metadata job. |
completion_ |
Output only. Progress tracking. |
update_ |
Output only. The time when the status was updated. |
State
State of a metadata job.
Enums | |
---|---|
STATE_UNSPECIFIED |
State unspecified. |
QUEUED |
The job is queued. |
RUNNING |
The job is running. |
CANCELING |
The job is being canceled. |
CANCELED |
The job is canceled. |
SUCCEEDED |
The job succeeded. |
FAILED |
The job failed. |
SUCCEEDED_WITH_ERRORS |
The job completed with some errors. |
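When polling a metadata job, a client typically needs to know which states are final. The sketch below assumes that CANCELED, SUCCEEDED, FAILED, and SUCCEEDED_WITH_ERRORS are terminal; that grouping is inferred from the state descriptions, not stated by the reference. `wait_until_done` and the `get_state` callback are hypothetical names.

```python
from typing import Callable

# Assumption: these states are final; QUEUED, RUNNING, and CANCELING are not.
TERMINAL_STATES = frozenset(
    {"CANCELED", "SUCCEEDED", "FAILED", "SUCCEEDED_WITH_ERRORS"}
)


def is_terminal(state: str) -> bool:
    """Return True if the job will not change state again (under the assumption above)."""
    return state in TERMINAL_STATES


def wait_until_done(get_state: Callable[[], str], max_polls: int) -> str:
    """Poll get_state until a terminal state is seen or max_polls is exhausted."""
    for _ in range(max_polls):
        state = get_state()
        if is_terminal(state):
            return state
    raise TimeoutError(f"job not finished after {max_polls} polls")


if __name__ == "__main__":
    observed = iter(["QUEUED", "RUNNING", "SUCCEEDED"])
    print(wait_until_done(lambda: next(observed), max_polls=5))
```

A real polling loop would sleep between calls, ideally with backoff; the delay is omitted here to keep the sketch short.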
Type
Metadata job type.
Enums | |
---|---|
TYPE_UNSPECIFIED |
Unspecified. |
IMPORT |
Import job. |
OperationMetadata
Represents the metadata of a long-running operation.
Fields | |
---|---|
create_ |
Output only. The time the operation was created. |
end_ |
Output only. The time the operation finished running. |
target |
Output only. Server-defined resource path for the target of the operation. |
verb |
Output only. Name of the verb executed by the operation. |
status_ |
Output only. Human-readable status of the operation, if any. |
requested_ |
Output only. Identifies whether the user has requested cancellation of the operation. Operations that have successfully been cancelled have an Operation.error value with a
api_ |
Output only. API version used to start the operation. |
Partition
Represents partition metadata contained within entity instances.
Fields | |
---|---|
name |
Output only. Partition values used in the HTTP URL must be double encoded. For example, |
values[] |
Required. Immutable. The set of values representing the partition, which correspond to the partition schema defined in the parent entity. |
location |
Required. Immutable. The location of the entity data within the partition, for example, |
etag |
Optional. The etag for this partition. |
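The double-encoding requirement for partition values in URLs can be sketched with the standard library by percent-encoding the value twice. The sample value below is hypothetical, and `safe=""` is chosen so that `/` is also encoded, which is an assumption about how the value should be treated as a single path segment.

```python
from urllib.parse import quote, unquote


def double_encode(value: str) -> str:
    """Percent-encode a partition value twice for use in an HTTP URL."""
    once = quote(value, safe="")
    return quote(once, safe="")


def double_decode(encoded: str) -> str:
    """Reverse double_encode."""
    return unquote(unquote(encoded))


if __name__ == "__main__":
    # Hypothetical partition value containing reserved characters.
    raw = "US:CA/CA#Sunnyvale"
    print(double_encode(raw))
```

The second pass encodes the `%` characters produced by the first pass, so the value survives one round of URL decoding with its reserved characters still escaped.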
ResourceAccessSpec
ResourceAccessSpec holds the access control configuration to be enforced on the resources, for example, Cloud Storage bucket, BigQuery dataset, BigQuery table.
Fields | |
---|---|
readers[] |
Optional. The set of principals to be granted the reader role on the resource. Each string follows the format IAM uses in bindings: user:{email}, serviceAccount:{email}, group:{email}. |
writers[] |
Optional. The set of principals to be granted writer role on the resource. |
owners[] |
Optional. The set of principals to be granted owner role on the resource. |
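Principal strings in the `user:{email}`, `serviceAccount:{email}`, and `group:{email}` forms can be sanity-checked before they are sent. The check below is deliberately loose and hypothetical: it only verifies the prefix and a rough email shape, and it is not IAM's full member grammar.

```python
import re

# Loose shape check: a known prefix, then something that looks like local@domain.
_PRINCIPAL_RE = re.compile(r"(user|serviceAccount|group):[^\s@:]+@[^\s@:]+")


def looks_like_principal(value: str) -> bool:
    """Return True if value matches one of the three documented prefixes."""
    return _PRINCIPAL_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["user:alice@example.com", "alice@example.com"]:
        print(candidate, looks_like_principal(candidate))
```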
RunDataScanRequest
Run DataScan request.
Fields | |
---|---|
name |
Required. The resource name of the DataScan: Only OnDemand data scans are allowed. Authorization requires the following IAM permission on the specified resource
|
RunDataScanResponse
Run DataScan Response.
Fields | |
---|---|
job |
DataScanJob created by RunDataScan request. |
RunTaskRequest
Fields | |
---|---|
name |
Required. The resource name of the task: Authorization requires the following IAM permission on the specified resource
|
labels |
Optional. User-defined labels for the task. If the map is empty, the task runs with the existing labels from the task definition. An entry with a new key is added to the existing set of labels. An entry whose key already exists in the task definition replaces that label's value for this run. To clear an existing label, set its value explicitly to a hyphen "-". The label value cannot be empty. |
args |
Optional. Execution spec arguments. If the map is empty, the task runs with the existing execution spec args from the task definition. An entry with a new key is added to the existing set of args. An entry whose key already exists in the task definition replaces that arg's value for this run. To clear an existing arg, set its value explicitly to a hyphen "-". The arg value cannot be empty. |
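The override rules for `labels` and `args` are the same, so they can be modeled with one merge function. This is a sketch of the documented semantics only; `merge_overrides` is a hypothetical name, and rejecting empty values client-side is an assumption about where that rule would be enforced.

```python
from typing import Dict

CLEAR_MARKER = "-"  # a value of "-" clears the existing key


def merge_overrides(defined: Dict[str, str], overrides: Dict[str, str]) -> Dict[str, str]:
    """Combine the task definition's map with the per-run overrides."""
    result = dict(defined)  # an empty overrides map leaves the definition untouched
    for key, value in overrides.items():
        if value == "":
            raise ValueError(f"value for {key!r} cannot be empty")
        if value == CLEAR_MARKER:
            result.pop(key, None)  # clear an existing entry
        else:
            result[key] = value  # add a new key or replace an existing value
    return result


if __name__ == "__main__":
    task_labels = {"team": "data", "env": "dev"}
    print(merge_overrides(task_labels, {"env": "prod", "owner": "me", "team": "-"}))
```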
RunTaskResponse
Fields | |
---|---|
job |
Jobs created by RunTask API. |
ScannedData
The data scanned during processing (for example, in an incremental DataScan).
Fields | |
---|---|
Union field data_range. The range of scanned data. data_range can be only one of the following: |
|
incremental_ |
The range denoted by values of an incremental field. |
IncrementalField
A data range denoted by a pair of start/end values of a field.
Fields | |
---|---|
field |
The field that contains values that monotonically increase over time (for example, a timestamp column). |
start |
Value that marks the start of the range. |
end |
Value that marks the end of the range. |
Schema
Schema information describing the structure and layout of the data.
Fields | |
---|---|
user_ |
Required. Set to
|
fields[] |
Optional. The sequence of fields describing data in table entities. Note: BigQuery SchemaFields are immutable. |
partition_ |
Optional. The sequence of fields describing the partition structure in entities. If this field is empty, there are no partitions within the data. |
partition_ |
Optional. The structure of paths containing partition data within the entity. |
Mode
Additional qualifiers to define field semantics.
Enums | |
---|---|
MODE_UNSPECIFIED |
Mode unspecified. |
REQUIRED |
The field has required semantics. |
NULLABLE |
The field has optional semantics, and may be null. |
REPEATED |
The field has repeated (0 or more) semantics, and is a list of values. |
PartitionField
Represents a key field within the entity's partition structure. An entity can have up to 20 partition fields, but for performance reasons only the first 10 can be used for filtering. Note: Partition fields are immutable.
Fields | |
---|---|
name |
Required. The partition field name. Must consist of letters, numbers, and underscores only, with a maximum length of 256 characters, and must begin with a letter or underscore. |
type |
Required. Immutable. The type of field. |
PartitionStyle
The structure of paths within the entity, which represent partitions.
Enums | |
---|---|
PARTITION_STYLE_UNSPECIFIED |
PartitionStyle unspecified. |
HIVE_COMPATIBLE |
Partitions are hive-compatible. Examples: gs://bucket/path/to/table/dt=2019-10-31/lang=en, gs://bucket/path/to/table/dt=2019-10-31/lang=en/late. |
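Hive-compatible paths encode partition keys and values as `key=value` path segments, so they can be recovered by splitting the path below the table root. The sketch below is an illustration only: `parse_hive_partitions` is a hypothetical helper, and skipping segments without `=` (such as the trailing `late`) is an assumption about how such segments should be treated.

```python
from typing import Dict


def parse_hive_partitions(table_root: str, object_path: str) -> Dict[str, str]:
    """Extract key=value partition segments that follow the table root."""
    root = table_root.rstrip("/")
    if not object_path.startswith(root + "/"):
        raise ValueError("object_path is not under table_root")
    relative = object_path[len(root):].strip("/")
    partitions: Dict[str, str] = {}
    for segment in relative.split("/"):
        key, separator, value = segment.partition("=")
        if separator and key:  # skip segments that are not key=value
            partitions[key] = value
    return partitions


if __name__ == "__main__":
    root = "gs://bucket/path/to/table"
    print(parse_hive_partitions(root, root + "/dt=2019-10-31/lang=en"))
    print(parse_hive_partitions(root, root + "/dt=2019-10-31/lang=en/late"))
```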
SchemaField
Represents a column field within a table schema.
Fields | |
---|---|
name |
Required. The name of the field. Must contain only letters, numbers and underscores, with a maximum length of 767 characters, and must begin with a letter or underscore. |
description |
Optional. User friendly field description. Must be less than or equal to 1024 characters. |
type |
Required. The type of field. |
mode |
Required. Additional field semantics. |
fields[] |
Optional. Any nested field for complex types. |
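The naming rules for schema fields (letters, numbers, and underscores; starts with a letter or underscore; at most 767 characters) and the similar 256-character rule for partition fields can be checked with one regular expression. This is a sketch: the function name is hypothetical, and restricting "letters" to ASCII is an assumption the reference does not spell out.

```python
import re

# Assumption: "letters" means ASCII letters.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

SCHEMA_FIELD_MAX_LENGTH = 767
PARTITION_FIELD_MAX_LENGTH = 256
DESCRIPTION_MAX_LENGTH = 1024


def is_valid_field_name(name: str, max_length: int = SCHEMA_FIELD_MAX_LENGTH) -> bool:
    """Check the character, first-character, and length rules for a field name."""
    return len(name) <= max_length and _NAME_RE.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Check the 1024-character limit on field descriptions."""
    return len(description) <= DESCRIPTION_MAX_LENGTH


if __name__ == "__main__":
    print(is_valid_field_name("_order_id"))
    print(is_valid_field_name("1st_column"))
    print(is_valid_field_name("dt", max_length=PARTITION_FIELD_MAX_LENGTH))
```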
Type
Type information for fields in schemas and partition schemas.
Enums | |
---|---|
TYPE_UNSPECIFIED |
SchemaType unspecified. |
BOOLEAN |
Boolean field. |
BYTE |
Single byte numeric field. |
INT16 |
16-bit numeric field. |
INT32 |
32-bit numeric field. |
INT64 |
64-bit numeric field. |
FLOAT |
Floating point numeric field. |
DOUBLE |
Double precision numeric field. |
DECIMAL |
Real value numeric field. |
STRING |
Sequence of characters field. |
BINARY |
Sequence of bytes field. |
TIMESTAMP |
Date and time field. |
DATE |
Date field. |
TIME |
Time field. |
RECORD |
Structured field. Nested fields that define the structure of the map. If all nested fields are nullable, this field represents a union. |
NULL |
Null field that does not have values. |
SearchEntriesRequest
Fields | |
---|---|
name |
Required. The project to which the request should be attributed in the following form: Authorization requires the following IAM permission on the specified resource
|
query |
Required. The query against which entries in scope should be matched. The query syntax is defined in Search syntax for Dataplex Catalog. |
page_ |
Optional. Number of results in the search page. If the value is less than or equal to 0, it defaults to 10. The maximum page_size is 1000; a value above 1000 results in an invalid argument error. |
page_ |
Optional. Page token received from a previous |
order_ |
Optional. Specifies the ordering of results. Supported values are: * |
scope |
Optional. The scope under which the search should be operating. It must either be |
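Unlike the List requests, which coerce oversized page sizes, the search request rejects them. The sketch below models that documented behavior client-side. The function name is hypothetical, and `ValueError` stands in for the service's invalid argument error.

```python
DEFAULT_SEARCH_PAGE_SIZE = 10
MAX_SEARCH_PAGE_SIZE = 1000


def effective_search_page_size(page_size: int) -> int:
    """Return the page size the search would use, or raise if it is too large."""
    if page_size > MAX_SEARCH_PAGE_SIZE:
        # Stand-in for the invalid argument error described for page_size > 1000.
        raise ValueError(f"page_size must not exceed {MAX_SEARCH_PAGE_SIZE}")
    if page_size <= 0:
        return DEFAULT_SEARCH_PAGE_SIZE
    return page_size


if __name__ == "__main__":
    print(effective_search_page_size(0))
    print(effective_search_page_size(250))
```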
SearchEntriesResponse
Fields | |
---|---|
results[] |
The results matching the search query. |
total_ |
The estimated total number of matching entries. This number isn't guaranteed to be accurate. |
next_ |
Token to retrieve the next page of results, or empty if there are no more results in the list. |
unreachable[] |
Locations that the service couldn't reach. Search results don't include data from these locations. |
SearchEntriesResult
A single result of a SearchEntries request.
Fields | |
---|---|
linked_resource |
Linked resource name. |
dataplex_ |
|
snippets |
Snippets. |
Snippets
Snippets for the entry. Contains HTML-style highlighting for matched tokens, intended for use in the UI.
Fields | |
---|---|
dataplex_entry |
Entry |
Session
Represents an active analyze session running for a user.
Fields | |
---|---|
name |
Output only. The relative resource name of the session, of the form: projects/{project_id}/locations/{location_id}/lakes/{lake_id}/environment/{environment_id}/sessions/{session_id} |
user_ |
Output only. Email of user running the session. |
create_ |
Output only. Session start time. |
state |
Output only. State of the session. |
SessionEvent
These messages contain information about sessions within an environment. The monitored resource is 'Environment'.
Fields | |
---|---|
message |
The log message. |
user_ |
The information about the user that created the session. It will be the email address of the user. |
session_ |
Unique identifier for the session. |
type |
The type of the event. |
event_ |
The status of the event. |
fast_ |
Whether the session is associated with an environment that has fast startup enabled and was created before being assigned to a user. |
unassigned_ |
The idle duration of a warm pooled session before it is assigned to a user. |
Union field detail. Additional information about the query metadata. detail can be only one of the following: |
|
query |
The execution details of the query. |
EventType
The type of the event.
Enums | |
---|---|
EVENT_TYPE_UNSPECIFIED |
An unspecified event type. |
START |
Event when the session is assigned to a user. |
STOP |
Event for stop of a session. |
QUERY |
Query events in the session. |
CREATE |
Event for creation of a cluster. The cluster is not yet assigned to a user. This event comes before START in the sequence. |
QueryDetail
Execution details of the query.
Fields | |
---|---|
query_ |
The unique Query id identifying the query. |
query_ |
The query text executed. |
engine |
Query Execution engine. |
duration |
Time taken for execution of the query. |
result_ |
The size of results the query produced. |
data_ |
The data processed by the query. |
Engine
Query Execution engine.
Enums | |
---|---|
ENGINE_UNSPECIFIED |
An unspecified Engine type. |
SPARK_SQL |
Spark-sql engine is specified in Query. |
BIGQUERY |
BigQuery engine is specified in Query. |
State
State of a resource.
Enums | |
---|---|
STATE_UNSPECIFIED |
State is not specified. |
ACTIVE |
Resource is active, i.e., ready to use. |
CREATING |
Resource is under creation. |
DELETING |
Resource is under deletion. |
ACTION_REQUIRED |
Resource is active but has unresolved actions. |
StorageAccess
Describes the access mechanism of the data within its storage location.
Fields | |
---|---|
read |
Output only. Describes the read access mechanism of the data. Not user settable. |
AccessMode
Access Mode determines how data stored within the Entity is read.
Enums | |
---|---|
ACCESS_MODE_UNSPECIFIED |
Access mode unspecified. |
DIRECT |
Default. Data is accessed directly using storage APIs. |
MANAGED |
Data is accessed through a managed interface using BigQuery APIs. |
StorageFormat
Describes the format of the data within its storage location.
Fields | |
---|---|
format |
Output only. The data format associated with the stored data, which represents content type values. The value is inferred from mime type. |
compression_ |
Optional. The compression type associated with the stored data. If unspecified, the data is uncompressed. |
mime_ |
Required. The mime type descriptor for the data. Must match the pattern {type}/{subtype}. Supported values:
|
Union field options . Additional format-specific options. options can be only one of the following: |
|
csv |
Optional. Additional information about CSV formatted data. |
json |
Optional. Additional information about JSON formatted data. |
iceberg |
Optional. Additional information about Iceberg tables. |
CompressionFormat
The specific compressed file format of the data.
Enums | |
---|---|
COMPRESSION_FORMAT_UNSPECIFIED |
CompressionFormat unspecified. Implies uncompressed data. |
GZIP |
GZip compressed set of files. |
BZIP2 |
BZip2 compressed set of files. |
CsvOptions
Describes CSV and similar semi-structured data formats.
Fields | |
---|---|
encoding |
Optional. The character encoding of the data. Accepts "US-ASCII", "UTF-8", and "ISO-8859-1". Defaults to UTF-8 if unspecified. |
header_ |
Optional. The number of rows to interpret as header rows that should be skipped when reading data rows. Defaults to 0. |
delimiter |
Optional. The delimiter used to separate values. Defaults to ','. |
quote |
Optional. The character used to quote column values. Accepts '"' (double quotation mark) or ''' (single quotation mark). Defaults to '"' (double quotation mark) if unspecified. |
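The CsvOptions fields above map fairly directly onto a CSV reader's settings. The following is a minimal sketch using Python's standard `csv` module, not the service's parser; the function name `read_rows` and its parameter names are hypothetical, and the defaults mirror the ones documented above.

```python
import csv
import io
from typing import List


def read_rows(raw: bytes, encoding: str = "UTF-8", header_rows: int = 0,
              delimiter: str = ",", quote: str = '"') -> List[List[str]]:
    """Decode raw bytes and return data rows, skipping the header rows."""
    text = raw.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=quote)
    rows = list(reader)
    # Header rows are read but dropped, so only data rows are returned.
    return rows[header_rows:]


# Pipe-delimited data with one header row and single-quote quoting.
sample = b"name|note\nalice|'a|b'\n"
data_rows = read_rows(sample, header_rows=1, delimiter="|", quote="'")
# data_rows == [["alice", "a|b"]]
```

The quoted field keeps its embedded delimiter, which is the reason the quote character has to be configured together with the delimiter.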
Format
The specific file format of the data.
Enums | |
---|---|
FORMAT_UNSPECIFIED |
Format unspecified. |
PARQUET |
Parquet-formatted structured data. |
AVRO |
Avro-formatted structured data. |
ORC |
ORC-formatted structured data. |
CSV |
CSV-formatted semi-structured data. |
JSON |
JSON-formatted semi-structured data. |
IMAGE |
Image data formats (such as jpg and png). |
AUDIO |
Audio data formats (such as mp3 and wav). |
VIDEO |
Video data formats (such as mp4 and mpg). |
TEXT |
Textual data formats (such as txt and xml). |
TFRECORD |
TensorFlow record format. |
OTHER |
Data that doesn't match a specific format. |
UNKNOWN |
Data of an unknown format. |
IcebergOptions
Describes Iceberg data format.
Fields | |
---|---|
metadata_ |
Optional. The location of the Iceberg metadata. It must be within the table path. |
JsonOptions
Describes JSON data format.
Fields | |
---|---|
encoding |
Optional. The character encoding of the data. Accepts "US-ASCII", "UTF-8" and "ISO-8859-1". Defaults to UTF-8 if not specified. |
StorageSystem
Identifies the cloud system that manages the data storage.
Enums | |
---|---|
STORAGE_SYSTEM_UNSPECIFIED |
Storage system unspecified. |
CLOUD_STORAGE |
The entity data is contained within a Cloud Storage bucket. |
BIGQUERY |
The entity data is contained within a BigQuery dataset. |
Task
A task represents a user-visible job.
Fields | |
---|---|
name |
Output only. The relative resource name of the task, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/ tasks/{task_id}. |
uid |
Output only. System generated globally unique ID for the task. This ID will be different if the task is deleted and re-created with the same name. |
create_ |
Output only. The time when the task was created. |
update_ |
Output only. The time when the task was last updated. |
description |
Optional. Description of the task. |
display_ |
Optional. User friendly display name. |
state |
Output only. Current state of the task. |
labels |
Optional. User-defined labels for the task. |
trigger_ |
Required. Spec related to how often and when a task should be triggered. |
execution_ |
Required. Spec related to how a task is executed. |
execution_ |
Output only. Status of the latest task executions. |
Union field config . Task template specific user-specified config. config can be only one of the following: |
|
spark |
Config related to running custom Spark tasks. |
notebook |
Config related to running scheduled Notebooks. |
ExecutionSpec
Execution related settings, like retry and service_account.
Fields | |
---|---|
args |
Optional. The arguments to pass to the task. The args can use placeholders of the format ${placeholder} as part of key/value string. These will be interpolated before passing the args to the driver. Currently supported placeholders: - ${task_id} - ${job_time} To pass positional args, set the key as TASK_ARGS. The value should be a comma-separated string of all the positional arguments. To use a delimiter other than comma, refer to https://cloud.google.com/sdk/gcloud/reference/topic/escaping. In case of other keys being present in the args, then TASK_ARGS will be passed as the last argument. |
service_ |
Required. Service account to use to execute a task. If not provided, the default Compute service account for the project is used. |
project |
Optional. The project in which jobs are run. By default, the project containing the Lake is used. If a project is provided, the |
max_ |
Optional. The maximum duration after which the job execution is expired. |
kms_ |
Optional. The Cloud KMS key to use for encryption, of the form: |
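The placeholder and `TASK_ARGS` rules described for the `args` field can be illustrated with a small resolver. This is a sketch of the documented behavior, not the service's implementation: the function name `resolve_args` is hypothetical, the timestamp format used for `job_time` is an assumption, and how the named arguments are finally rendered for the driver is deliberately left out.

```python
from string import Template
from typing import Dict, List, Tuple


def resolve_args(args: Dict[str, str], task_id: str,
                 job_time: str) -> Tuple[Dict[str, str], List[str]]:
    """Interpolate ${task_id} and ${job_time}, then split out TASK_ARGS."""
    values = {"task_id": task_id, "job_time": job_time}
    # safe_substitute leaves unknown ${...} placeholders untouched
    # instead of raising.
    resolved = {
        Template(key).safe_substitute(values): Template(value).safe_substitute(values)
        for key, value in args.items()
    }
    # TASK_ARGS carries the positional arguments as a comma-separated string.
    positional_raw = resolved.pop("TASK_ARGS", "")
    positional = positional_raw.split(",") if positional_raw else []
    return resolved, positional


named, positional = resolve_args(
    {"output": "gs://bucket-name/${task_id}/${job_time}",
     "TASK_ARGS": "in.csv,out.csv"},
    task_id="my-task",
    job_time="2024-01-01T00:00:00Z",  # assumed format, for illustration only
)
# named == {"output": "gs://bucket-name/my-task/2024-01-01T00:00:00Z"}
# positional == ["in.csv", "out.csv"]
```

Returning the positional list separately makes it easy for a caller to append it after the named arguments, matching the rule that `TASK_ARGS` is passed last.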
ExecutionStatus
Status of the task execution (e.g. Jobs).
Fields | |
---|---|
update_ |
Output only. Last update time of the status. |
latest_ |
Output only. Latest job execution. |
InfrastructureSpec
Configuration for the underlying infrastructure used to run workloads.
Fields | |
---|---|
Union field resources . Hardware config. resources can be only one of the following: |
|
batch |
Compute resources needed for a Task when using Dataproc Serverless. |
Union field runtime . Software config. runtime can be only one of the following: |
|
container_ |
Container Image Runtime Configuration. |
Union field network . Networking config. network can be only one of the following: |
|
vpc_ |
Vpc network. |
BatchComputeResources
Batch compute resources associated with the task.
Fields | |
---|---|
executors_ |
Optional. Total number of job executors. Executor Count should be between 2 and 100. [Default=2] |
max_ |
Optional. Max configurable executors. If max_executors_count > executors_count, then auto-scaling is enabled. Max Executor Count should be between 2 and 1000. [Default=1000] |
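The range limits and the auto-scaling rule above can be expressed as a short check. This is a hedged sketch for client-side validation; the function name `check_executors` is hypothetical, and the service remains the authority on what it accepts.

```python
def check_executors(executors_count: int = 2,
                    max_executors_count: int = 1000) -> bool:
    """Validate the documented ranges and report whether auto-scaling is on."""
    if not 2 <= executors_count <= 100:
        raise ValueError("executors_count must be between 2 and 100")
    if not 2 <= max_executors_count <= 1000:
        raise ValueError("max_executors_count must be between 2 and 1000")
    # Auto-scaling is enabled only when the maximum exceeds the base count.
    return max_executors_count > executors_count


autoscaling_with_defaults = check_executors()   # True: 1000 > 2
autoscaling_when_equal = check_executors(10, 10)  # False: no headroom
```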
ContainerImageRuntime
Container Image Runtime Configuration used with Batch execution.
Fields | |
---|---|
image |
Optional. Container image to use. |
java_ |
Optional. A list of Java JARS to add to the classpath. Valid input includes Cloud Storage URIs to Jar binaries. For example, gs://bucket-name/my/path/to/file.jar |
python_ |
Optional. A list of python packages to be installed. Valid formats include Cloud Storage URI to a PIP installable library. For example, gs://bucket-name/my/path/to/lib.tar.gz |
properties |
Optional. Override to common configuration of open source components installed on the Dataproc cluster. The properties to set on daemon config files. Property keys are specified in |
VpcNetwork
Cloud VPC Network used to run the infrastructure.
Fields | |
---|---|
network_ |
Optional. List of network tags to apply to the job. |
Union field network_name . The Cloud VPC network identifier. network_name can be only one of the following: |
|
network |
Optional. The Cloud VPC network in which the job is run. By default, the Cloud VPC network named Default within the project is used. |
sub_ |
Optional. The Cloud VPC sub-network in which the job is run. |
NotebookTaskConfig
Config for running scheduled notebooks.
Fields | |
---|---|
notebook |
Required. Path to input notebook. This can be the Cloud Storage URI of the notebook file or the path to a Notebook Content. The execution args are accessible as environment variables ( |
infrastructure_ |
Optional. Infrastructure specification for the execution. |
file_ |
Optional. Cloud Storage URIs of files to be placed in the working directory of each executor. |
archive_ |
Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip. |
SparkTaskConfig
User-specified config for running a Spark task.
Fields | |
---|---|
file_ |
Optional. Cloud Storage URIs of files to be placed in the working directory of each executor. |
archive_ |
Optional. Cloud Storage URIs of archives to be extracted into the working directory of each executor. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip. |
infrastructure_ |
Optional. Infrastructure specification for the execution. |
Union field driver . Required. The specification of the main method to call to drive the job. Specify either the jar file that contains the main class or the main class name. driver can be only one of the following: |
|
main_ |
The Cloud Storage URI of the jar file that contains the main class. The execution args are passed in as a sequence of named process arguments ( |
main_ |
The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in |
python_ |
The Cloud Storage URI of the main Python file to use as the driver. Must be a .py file. The execution args are passed in as a sequence of named process arguments (
sql_ |
A reference to a query file. This should be the Cloud Storage URI of the query file. The execution args are used to declare a set of script variables ( |
sql_ |
The query text. The execution args are used to declare a set of script variables ( |
TriggerSpec
Task scheduling and trigger settings.
Fields | |
---|---|
type |
Required. Immutable. Trigger type of the user-specified Task. |
start_ |
Optional. The first run of the task will be after this time. If not specified, the task will run shortly after being submitted if ON_DEMAND and based on the schedule if RECURRING. |
disabled |
Optional. Prevent the task from executing. This does not cancel already running tasks. It is intended to temporarily disable RECURRING tasks. |
max_ |
Optional. Number of retry attempts before aborting. Set to zero to never attempt to retry a failed task. |
Union field trigger . Trigger only applies for RECURRING tasks. trigger can be only one of the following: |
|
schedule |
Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running tasks periodically. To explicitly set a timezone to the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from IANA time zone database. For example, |
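The time zone prefix convention for the schedule string can be illustrated by separating the prefix from the cron expression. This is a minimal sketch: `split_schedule` is a hypothetical helper, the schedule strings are illustrative values chosen for this example, and the zone name is not checked against the IANA database.

```python
from typing import Optional, Tuple

_TZ_PREFIXES = ("CRON_TZ=", "TZ=")


def split_schedule(schedule: str) -> Tuple[Optional[str], str]:
    """Return (time_zone or None, cron_expression) for a schedule string."""
    cleaned = schedule.strip()
    first, _, rest = cleaned.partition(" ")
    for prefix in _TZ_PREFIXES:
        if first.startswith(prefix):
            # The prefix is the first space-separated token of the string.
            return first[len(prefix):], rest.strip()
    return None, cleaned


with_zone = split_schedule("CRON_TZ=America/New_York 0 * * * *")
without_zone = split_schedule("*/15 * * * *")
# with_zone == ("America/New_York", "0 * * * *")
# without_zone == (None, "*/15 * * * *")
```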
Type
Determines how often and when the job will run.
Enums | |
---|---|
TYPE_UNSPECIFIED |
Unspecified trigger type. |
ON_DEMAND |
The task runs once, shortly after task creation. |
RECURRING |
The task is scheduled to run periodically. |
TransferStatus
Denotes the transfer status of a resource. It is unspecified for resources created from Dataplex API.
Enums | |
---|---|
TRANSFER_STATUS_UNSPECIFIED |
The default value. It is set for resources that were not subject to migration from the Data Catalog service. |
TRANSFER_STATUS_MIGRATED |
Indicates that a resource was migrated from Data Catalog service but it hasn't been transferred yet. In particular the resource cannot be updated from Dataplex API. |
TRANSFER_STATUS_TRANSFERRED |
Indicates that a resource was transferred from Data Catalog service. The resource can only be updated from Dataplex API. |
Trigger
DataScan scheduling and trigger settings.
Fields | |
---|---|
Union field If not specified, the default is |
|
on_ |
The scan runs once via |
schedule |
The scan is scheduled to run periodically. |
OnDemand
This type has no fields.
The scan runs once via the RunDataScan API.
Schedule
The scan is scheduled to run periodically.
Fields | |
---|---|
cron |
Required. Cron schedule for running scans periodically. To explicitly set a timezone in the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from IANA time zone database (wikipedia). For example, This field is required for Schedule scans. |
UpdateAspectTypeRequest
Update AspectType Request
Fields | |
---|---|
aspect_ |
Required. AspectType Resource Authorization requires the following IAM permission on the specified resource
|
update_ |
Required. Mask of fields to update. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateAssetRequest
Update asset request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
asset |
Required. Update description. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateContentRequest
Update content request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
content |
Required. Update description. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateDataAttributeBindingRequest
Update DataAttributeBinding request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
data_ |
Required. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateDataAttributeRequest
Update DataAttribute request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
data_ |
Required. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateDataScanRequest
Update dataScan request.
Fields | |
---|---|
data_ |
Required. DataScan resource to be updated. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
update_ |
Optional. Mask of fields to update. |
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is |
UpdateDataTaxonomyRequest
Update DataTaxonomy request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
data_ |
Required. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateEntityRequest
Update a metadata entity request. The existing entity will be fully replaced by the entity in the request. The entity ID is mutable. To modify the ID, use the current entity ID in the request URL and specify the new ID in the request body.
Fields | |
---|---|
entity |
Required. Update description. Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateEntryGroupRequest
Update EntryGroup Request.
Fields | |
---|---|
entry_ |
Required. EntryGroup Resource. Authorization requires the following IAM permission on the specified resource
|
update_ |
Required. Mask of fields to update. |
validate_ |
Optional. The service validates the request, without performing any mutations. The default is false. |
UpdateEntryRequest
Update Entry request.
Fields | |
---|---|
entry |
Required. Entry resource. |
update_ |
Optional. Mask of fields to update. To update Aspects, the update_mask must contain the value "aspects". If the update_mask is empty, the service will update all modifiable fields present in the request. |
allow_ |
Optional. If set to true and the entry doesn't exist, the service will create it. |
delete_ |
Optional. If set to true and the aspect_keys specify aspect ranges, the service deletes any existing aspects from that range that weren't provided in the request. |
aspect_ |
Optional. The map keys of the Aspects which the service should modify. It supports the following syntaxes:
The service will not remove existing aspects matching the syntax unless If this field is left empty, the service treats it as specifying exactly those Aspects present in the request. |
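The update mask rule for UpdateEntryRequest (an empty mask means every field present in the request is updated) can be sketched on plain dictionaries. This only illustrates the rule as stated above: `apply_update` is a hypothetical helper, the field name `note` is made up for the example, and the sketch does not model which fields are modifiable or how aspect keys select aspects.

```python
from typing import Dict, Iterable


def apply_update(stored: Dict[str, object], requested: Dict[str, object],
                 update_mask: Iterable[str] = ()) -> Dict[str, object]:
    """Return a copy of `stored` with the masked fields taken from `requested`."""
    mask = list(update_mask)
    # An empty mask means: update every field present in the request.
    fields = mask if mask else list(requested)
    updated = dict(stored)
    for field in fields:
        if field in requested:
            updated[field] = requested[field]
    return updated


stored_entry = {"aspects": {"a": 1}, "note": "old"}
requested_entry = {"aspects": {"a": 2}, "note": "new"}

only_aspects = apply_update(stored_entry, requested_entry, ["aspects"])
everything = apply_update(stored_entry, requested_entry)
# only_aspects == {"aspects": {"a": 2}, "note": "old"}
# everything == {"aspects": {"a": 2}, "note": "new"}
```

As a simplification, a masked field that is absent from the request is left unchanged here.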
UpdateEntryTypeRequest
Update EntryType Request.
Fields | |
---|---|
entry_ |
Required. EntryType Resource. Authorization requires the following IAM permission on the specified resource
|
update_ |
Required. Mask of fields to update. |
validate_ |
Optional. The service validates the request without performing any mutations. The default is false. |
UpdateEnvironmentRequest
Update environment request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
environment |
Required. Update description. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateLakeRequest
Update lake request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
lake |
Required. Update description. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateTaskRequest
Update task request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
task |
Required. Update description. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
UpdateZoneRequest
Update zone request.
Fields | |
---|---|
update_ |
Required. Mask of fields to update. |
zone |
Required. Update description. Only fields specified in Authorization requires the following IAM permission on the specified resource
|
validate_ |
Optional. Only validate the request, but do not perform mutations. The default is false. |
Zone
A zone represents a logical group of related assets within a lake. A zone can be used to map to organizational structure or represent stages of data readiness from raw to curated. It provides managing behavior that is shared or inherited by all contained assets.
Fields | |
---|---|
name |
Output only. The relative resource name of the zone, of the form: |
display_ |
Optional. User friendly display name. |
uid |
Output only. System generated globally unique ID for the zone. This ID will be different if the zone is deleted and re-created with the same name. |
create_ |
Output only. The time when the zone was created. |
update_ |
Output only. The time when the zone was last updated. |
labels |
Optional. User defined labels for the zone. |
description |
Optional. Description of the zone. |
state |
Output only. Current state of the zone. |
type |
Required. Immutable. The type of the zone. |
discovery_ |
Optional. Specification of the discovery feature applied to data in this zone. |
resource_ |
Required. Specification of the resources that are referenced by the assets within this zone. |
asset_ |
Output only. Aggregated status of the underlying assets of the zone. |
DiscoverySpec
Settings to manage the metadata discovery and publishing in a zone.
Fields | |
---|---|
enabled |
Required. Whether discovery is enabled. |
include_ |
Optional. The list of patterns to apply for selecting data to include during discovery if only a subset of the data should be considered. For Cloud Storage bucket assets, these are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these are interpreted as patterns to match table names. |
exclude_ |
Optional. The list of patterns to apply for selecting data to exclude during discovery. For Cloud Storage bucket assets, these are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these are interpreted as patterns to match table names. |
csv_ |
Optional. Configuration for CSV data. |
json_ |
Optional. Configuration for JSON data. |
Union field trigger . Determines when discovery is triggered. trigger can be only one of the following: |
|
schedule |
Optional. Cron schedule (https://en.wikipedia.org/wiki/Cron) for running discovery periodically. Successive discovery runs must be scheduled at least 60 minutes apart. The default value is to run discovery every 60 minutes. To explicitly set a timezone to the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from IANA time zone database. For example, |
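The include and exclude patterns described for DiscoverySpec can be illustrated with glob matching over object names. This is a rough sketch using Python's `fnmatch`, whose glob semantics may differ from the service's. The helper `select_for_discovery` and its parameter names are hypothetical, and letting an exclude match override an include match is an assumption of this sketch.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_for_discovery(names: Iterable[str],
                         include_patterns: Iterable[str] = (),
                         exclude_patterns: Iterable[str] = ()) -> List[str]:
    """Keep names matching an include glob (if any are given) and no exclude glob."""
    include = list(include_patterns)
    exclude = list(exclude_patterns)
    selected = []
    for name in names:
        # With no include patterns, every name is a candidate.
        if include and not any(fnmatchcase(name, p) for p in include):
            continue
        # Assumption: an exclude match always wins over an include match.
        if any(fnmatchcase(name, p) for p in exclude):
            continue
        selected.append(name)
    return selected


objects = ["sales/2024/a.csv", "sales/tmp/b.csv", "logs/c.json"]
chosen = select_for_discovery(objects, ["sales/*"], ["*/tmp/*"])
# chosen == ["sales/2024/a.csv"]
```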
CsvOptions
Describes CSV and similar semi-structured data formats.
Fields | |
---|---|
header_ |
Optional. The number of rows to interpret as header rows that should be skipped when reading data rows. |
delimiter |
Optional. The delimiter being used to separate values. This defaults to ','. |
encoding |
Optional. The character encoding of the data. The default is UTF-8. |
disable_ |
Optional. Whether to disable the inference of data type for CSV data. If true, all columns will be registered as strings. |
JsonOptions
Describes the JSON data format.
Fields | |
---|---|
encoding |
Optional. The character encoding of the data. The default is UTF-8. |
disable_ |
Optional. Whether to disable the inference of data type for JSON data. If true, all columns will be registered as their primitive types (string, number, or boolean). |
ResourceSpec
Settings for resources attached as assets within a zone.
Fields | |
---|---|
location_ |
Required. Immutable. The location type of the resources that are allowed to be attached to the assets within this zone. |
LocationType
Location type of the resources attached to a zone.
Enums | |
---|---|
LOCATION_TYPE_UNSPECIFIED |
Unspecified location type. |
SINGLE_REGION |
Resources that are associated with a single region. |
MULTI_REGION |
Resources that are associated with a multi-region location. |
Type
Type of zone.
Enums | |
---|---|
TYPE_UNSPECIFIED |
Zone type not specified. |
RAW |
A zone that contains data that needs further processing before it is considered generally ready for consumption and analytics workloads. |
CURATED |
A zone that contains data that is considered to be ready for broader consumption and analytics workloads. Curated structured data stored in Cloud Storage must conform to certain file formats (Parquet, Avro, and ORC) and be organized in a Hive-compatible directory layout. |