REST Resource: projects.locations.entryGroups.entries

Resource: Entry

Entry Metadata. A Data Catalog Entry resource represents another resource in Google Cloud Platform (such as a BigQuery dataset or a Pub/Sub topic) or outside of Google Cloud Platform. Clients can use the linkedResource field in the Entry resource to refer to the original resource ID of the source system.

An Entry resource contains resource details, such as its schema. An Entry can also be used to attach flexible metadata, such as a Tag.

JSON representation
{
  "name": string,
  "linkedResource": string,
  "displayName": string,
  "description": string,
  "schema": {
    object (Schema)
  },
  "sourceSystemTimestamps": {
    object (SystemTimestamps)
  },
  "usageSignal": {
    object (UsageSignal)
  },
  "iamPolicyNamespace": enum (IamNamespace),
  "labels": {
    string: string,
    ...
  },
  "relationshipsInfo": {
    object (RelationshipsInfo)
  },
  "lakeInfo": {
    object (LakeInfo)
  },
  "acl": {
    object (Acl)
  },

  // Union field entry_type can be only one of the following:
  "type": enum (EntryType),
  "userSpecifiedType": string
  // End of list of possible types for union field entry_type.

  // Union field system can be only one of the following:
  "integratedSystem": enum (IntegratedSystem),
  "userSpecifiedSystem": string
  // End of list of possible types for union field system.

  // Union field type_spec can be only one of the following:
  "gcsFilesetSpec": {
    object (GcsFilesetSpec)
  },
  "bigqueryTableSpec": {
    object (BigQueryTableSpec)
  },
  "bigqueryDateShardedSpec": {
    object (BigQueryDateShardedSpec)
  }
  // End of list of possible types for union field type_spec.

  // Union field spec can be only one of the following:
  "dataStreamSpec": {
    object (DataStreamSpec)
  },
  "clusterSpec": {
    object (ClusterSpec)
  }
  // End of list of possible types for union field spec.
}
Fields
name

string

The Data Catalog resource name of the entry in URL format. Example:

  • projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}

Note that this Entry and its child resources may not actually be stored in the location in this name.
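
As an illustration of the name format above, the following minimal sketch splits an entry name into its components. The helper name parse_entry_name and the assumption that no segment contains a slash are not part of the API; they are choices made for this example.

```python
import re

# Regex for the documented name format. Treating each segment as
# "anything except a slash" is an assumption made for this sketch.
ENTRY_NAME_RE = re.compile(
    r"projects/(?P<project_id>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/entryGroups/(?P<entry_group_id>[^/]+)"
    r"/entries/(?P<entry_id>[^/]+)"
)


def parse_entry_name(name: str) -> dict:
    """Split an entry resource name into its path components."""
    match = ENTRY_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not an entry resource name: {name!r}")
    return match.groupdict()


parts = parse_entry_name(
    "projects/my-project/locations/us-central1/entryGroups/my_group/entries/my_entry"
)
print(parts["entry_group_id"], parts["entry_id"])
```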

linkedResource

string

The resource this metadata entry refers to.

For Google Cloud Platform resources, linkedResource is the full name of the resource. For example, the linkedResource for a table resource from BigQuery is:

  • //bigquery.googleapis.com/projects/projectId/datasets/datasetId/tables/tableId

Output only when the entry is of a type listed in the EntryType enum. For entries with userSpecifiedType, this field is optional and defaults to an empty string.
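
A small sketch of how a client might assemble the BigQuery linkedResource shown above. The function name is hypothetical, and the sketch does no validation of the identifiers it is given.

```python
def bigquery_table_linked_resource(project_id: str, dataset_id: str, table_id: str) -> str:
    """Build the full resource name of a BigQuery table in the documented form."""
    return (
        "//bigquery.googleapis.com"
        f"/projects/{project_id}/datasets/{dataset_id}/tables/{table_id}"
    )


print(bigquery_table_linked_resource("my-project", "my_dataset", "my_table"))
```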

displayName

string

Display name. A short name to identify the entry, for example, "Analytics Data - Jan 2011". Default value is an empty string.

description

string

Entry description, which can consist of several sentences or paragraphs that describe entry contents. Default value is an empty string.

schema

object (Schema)

Schema of the entry. An entry might not have any schema attached to it.

sourceSystemTimestamps

object (SystemTimestamps)

Timestamps about the underlying resource, not about this Data Catalog entry. Output only when the entry is of a type listed in the EntryType enum. For entries with userSpecifiedType, this field is optional and defaults to an empty timestamp.

usageSignal

object (UsageSignal)

Output only. Statistics on the usage level of the resource.

iamPolicyNamespace

enum (IamNamespace)

Output only. IAM Namespace used for entry's policy. Intended to be used by Pantheon IAM panels.

labels

map (key: string, value: string)

Cloud labels on this entry. For synced entries, these labels are synced from the source system and cannot be modified in Data Catalog. For native entries, labels can be created/modified in Data Catalog.

relationshipsInfo

object (RelationshipsInfo)

Output only. Relationships information and statistics related to the entry.

lakeInfo

object (LakeInfo)

Output only. Lake information related to the entry.

acl

object (Acl)

Policy name for the entry from the synced system.

Union field entry_type. Required. Entry type. entry_type can be only one of the following:
type

enum (EntryType)

The type of the entry. Only used for Entries with types in the EntryType enum.

userSpecifiedType

string

Entry type if it does not fit any of the input-allowed values listed in the EntryType enum. When creating an entry, first check the enum values; if none matches the entry to be created, provide a custom value, for example "my_special_type". userSpecifiedType strings must begin with a letter or underscore; can contain only letters, numbers, and underscores; are case insensitive; and must be at least 1 character and at most 64 characters long.

Currently, only FILESET enum value is allowed. All other entries created through Data Catalog must use userSpecifiedType.

Union field system. The source system of the entry. system can be only one of the following:
integratedSystem

enum (IntegratedSystem)

Output only. This field indicates the entry's source system that Data Catalog integrates with, such as BigQuery or Pub/Sub.

userSpecifiedSystem

string

This field indicates the entry's source system that Data Catalog does not integrate with. userSpecifiedSystem strings must begin with a letter or underscore; can contain only letters, numbers, and underscores; are case insensitive; and must be at least 1 character and at most 64 characters long.
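
userSpecifiedType and userSpecifiedSystem share the same string rules, so a client can check both with one helper before sending a request. The sketch below is a client-side pre-check only. It assumes "letters" and "numbers" mean ASCII characters, which the text above does not state, and the helper names are hypothetical.

```python
import re

# First character: letter or underscore. Remaining 0-63 characters:
# letters, digits, or underscores. ASCII-only is an assumption.
_USER_SPECIFIED_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")


def is_valid_user_specified(value: str) -> bool:
    """Return True if value satisfies the documented naming rules."""
    return _USER_SPECIFIED_RE.fullmatch(value) is not None


def same_user_specified(a: str, b: str) -> bool:
    """Compare two values the way a case-insensitive field would."""
    return a.casefold() == b.casefold()


print(is_valid_user_specified("my_special_type"))
print(is_valid_user_specified("9lives"))
```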

Union field type_spec. Type specification information. type_spec can be only one of the following:
gcsFilesetSpec

object (GcsFilesetSpec)

Specification that applies to a Cloud Storage fileset. This is only valid on entries of type FILESET.

bigqueryTableSpec

object (BigQueryTableSpec)

Specification that applies to a BigQuery table. This is only valid on entries of type TABLE.

bigqueryDateShardedSpec

object (BigQueryDateShardedSpec)

Specification for a group of BigQuery tables with name pattern [prefix]YYYYMMDD. Context: https://cloud.google.com/bigquery/docs/partitioned-tables#partitioning_versus_sharding.

Union field spec. Type- and system-specific information. Specifications for types contain fields common to all entries of a given type, and sub-specs with fields specific to a given source system. When extending the API with new types and systems, use this field instead of the legacy type_spec field. spec can be only one of the following:
dataStreamSpec

object (DataStreamSpec)

Additional specification of a data stream. Present on entries representing non-pubsub data streams.

clusterSpec

object (ClusterSpec)

Additional specification of a cluster. Present on entries representing clusters.
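
The four union fields described above (entry_type, system, type_spec, and spec) each accept at most one member, and entry_type is required. The following is a minimal client-side sanity check over an Entry expressed as a Python dict. It is a sketch, not the service's own validation, and the function name union_violations is hypothetical.

```python
# Union field name -> JSON member names, copied from the field list above.
UNION_FIELDS = {
    "entry_type": ("type", "userSpecifiedType"),
    "system": ("integratedSystem", "userSpecifiedSystem"),
    "type_spec": ("gcsFilesetSpec", "bigqueryTableSpec", "bigqueryDateShardedSpec"),
    "spec": ("dataStreamSpec", "clusterSpec"),
}


def union_violations(entry: dict) -> list[str]:
    """Return human-readable problems with the union fields of an entry."""
    problems = []
    for union, members in UNION_FIELDS.items():
        present = [m for m in members if m in entry]
        if len(present) > 1:
            problems.append(f"{union}: more than one member set: {present}")
    if not any(m in entry for m in UNION_FIELDS["entry_type"]):
        problems.append("entry_type: required, but no member is set")
    return problems


entry = {
    "displayName": "Analytics Data - Jan 2011",
    "type": "FILESET",
    "gcsFilesetSpec": {"filePatterns": ["gs://bucket_name/dir/*"]},
}
print(union_violations(entry))  # → []
```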

EntryType

Entry resources in Data Catalog can be of different types e.g. a BigQuery Table entry is of type TABLE. This enum describes all the possible types Data Catalog contains.

Enums
ENTRY_TYPE_UNSPECIFIED Default unknown type.
DATASET Output only. An entry that serves as a namespace; it is the parent of other entries in the resource/ACL hierarchy.
TABLE Output only. The type of entry that has a GoogleSQL schema, including logical views.
MODEL Output only. The type of models. Examples include https://cloud.google.com/bigquery-ml/docs/bigqueryml-intro
DATA_STREAM An entry type which is used for streaming entries. Example: Pub/Sub topic.
FILESET An entry type which is a set of files or objects. Example: Cloud Storage fileset.
CLUSTER A group of servers that work together. Example: Kafka cluster.

GcsFilesetSpec

Describes a Cloud Storage fileset entry.

JSON representation
{
  "filePatterns": [
    string
  ],
  "sampleGcsFileSpecs": [
    {
      object (GcsFileSpec)
    }
  ],
  "totalFileCount": string,
  "totalFileSizeBytes": string,

  // Union field FileFormat can be only one of the following:
  "jsonFormat": {
    object (JsonFormat)
  },
  "csvFormat": {
    object (CsvFormat)
  }
  // End of list of possible types for union field FileFormat.
}
Fields
filePatterns[]

string

Required. Patterns to identify a set of files in Google Cloud Storage. See Cloud Storage documentation for more information. Note that bucket wildcards are currently not supported.

Examples of valid filePatterns:

  • gs://bucket_name/dir/*: matches all files within bucket_name/dir directory.
  • gs://bucket_name/dir/**: matches all files in bucket_name/dir spanning all subdirectories.
  • gs://bucket_name/file*: matches files prefixed by file in bucket_name
  • gs://bucket_name/??.txt: matches files with two characters followed by .txt in bucket_name
  • gs://bucket_name/[aeiou].txt: matches files that contain a single vowel character followed by .txt in bucket_name
  • gs://bucket_name/[a-m].txt: matches files that contain a, b, ... or m followed by .txt in bucket_name
  • gs://bucket_name/a/*/b: matches all files in bucket_name that match a/*/b pattern, such as a/c/b, a/d/b
  • gs://another_bucket/a.txt: matches gs://another_bucket/a.txt

You can combine wildcards to provide more powerful matches, for example:

  • gs://bucket_name/[a-m]??.j*g
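
To make the wildcard semantics in the examples above concrete, here is a rough local matcher that converts a file pattern into a regular expression. The translation rules are inferred from the examples only (`*` stays within one directory level, `**` spans subdirectories, `?` is one character, `[...]` is a character class). The service's actual matcher may differ, and the function names are hypothetical.

```python
import re


def file_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a gs:// file pattern into a compiled regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**", i):
            out.append(".*")            # any characters, including "/"
            i += 2
        elif ch == "*":
            out.append("[^/]*")         # any characters except "/"
            i += 1
        elif ch == "?":
            out.append("[^/]")          # exactly one character except "/"
            i += 1
        elif ch == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            out.append(pattern[i:end + 1])  # reuse the class as written
            i = end + 1
        else:
            out.append(re.escape(ch))   # literal character
            i += 1
    return re.compile("".join(out))


def matches(pattern: str, path: str) -> bool:
    """Return True if the whole path matches the file pattern."""
    return file_pattern_to_regex(pattern).fullmatch(path) is not None


print(matches("gs://bucket_name/dir/*", "gs://bucket_name/dir/sub/a.txt"))   # → False
print(matches("gs://bucket_name/dir/**", "gs://bucket_name/dir/sub/a.txt"))  # → True
```
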
sampleGcsFileSpecs[]

object (GcsFileSpec)

Output only. Sample files contained in this fileset. Not all files contained in this fileset are represented here.

totalFileCount

string (int64 format)

Output only. Total number of files in this fileset.

totalFileSizeBytes

string (int64 format)

Output only. Total size of all of the files in this fileset in bytes.
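
Fields marked "string (int64 format)", such as totalFileCount and totalFileSizeBytes, arrive as JSON strings rather than numbers, so clients convert them explicitly. A minimal sketch follows. Treating a missing field as 0 is an assumption of this example, and fileset_totals is a hypothetical helper name.

```python
def fileset_totals(spec: dict) -> tuple[int, int]:
    """Return (file count, total size in bytes) from a GcsFilesetSpec dict."""
    # int64 values are encoded as decimal strings; absent fields are
    # treated as 0 here (an assumption of this sketch).
    count = int(spec.get("totalFileCount", "0"))
    size = int(spec.get("totalFileSizeBytes", "0"))
    return count, size


spec = {
    "filePatterns": ["gs://bucket_name/dir/*"],
    "totalFileCount": "3",
    "totalFileSizeBytes": "9007199254740993",
}
print(fileset_totals(spec))  # → (3, 9007199254740993)
```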

Union field FileFormat. Format specification of Cloud Storage files. FileFormat can be only one of the following:
jsonFormat

object (JsonFormat)

This field indicates that the Cloud Storage files are in JSON format.

csvFormat

object (CsvFormat)

This field indicates that the Cloud Storage files are in CSV format.

JsonFormat

Format specification for JSON files.

CsvFormat

Format specification for CSV files.

JSON representation
{
  "leadingNonDataRowCount": string,
  "delimiter": string
}
Fields
leadingNonDataRowCount

string (int64 format)

Number of leading rows that do not contain data (e.g. header row).

delimiter

string

Delimiter of the CSV file. Must be exactly one character. If empty, a comma (,) is used.
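
The two CsvFormat fields map naturally onto a CSV reader: skip the leading non-data rows, then split on the delimiter. The sketch below uses Python's standard csv module on in-memory text. The function name read_csv_rows is hypothetical, and reading from Cloud Storage is out of scope here.

```python
import csv
import io


def read_csv_rows(text: str, csv_format: dict) -> list[list[str]]:
    """Parse CSV text according to a CsvFormat dict and return the data rows."""
    # An empty or missing delimiter falls back to a comma, as documented.
    delimiter = csv_format.get("delimiter") or ","
    if len(delimiter) != 1:
        raise ValueError("delimiter must be exactly one character")
    # leadingNonDataRowCount is an int64 encoded as a string.
    skip = int(csv_format.get("leadingNonDataRowCount", "0"))
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for index, row in enumerate(reader) if index >= skip]


rows = read_csv_rows(
    "id;name\n1;alpha\n2;beta\n",
    {"leadingNonDataRowCount": "1", "delimiter": ";"},
)
print(rows)  # → [['1', 'alpha'], ['2', 'beta']]
```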

GcsFileSpec

Specifications of a single file in Cloud Storage.

JSON representation
{
  "filePath": string,
  "gcsTimestamps": {
    object (SystemTimestamps)
  },
  "sizeBytes": string
}
Fields
filePath

string

Required. The full file path. Example: gs://bucket_name/a/b.txt.

gcsTimestamps

object (SystemTimestamps)

Output only. Timestamps about the Cloud Storage file.

sizeBytes

string (int64 format)

Output only. The size of the file, in bytes.

BigQueryTableSpec

Describes a BigQuery table.

JSON representation
{
  "tableSourceType": enum (TableSourceType),

  // Union field type_spec can be only one of the following:
  "viewSpec": {
    object (ViewSpec)
  },
  "tableSpec": {
    object (TableSpec)
  }
  // End of list of possible types for union field type_spec.
}
Fields
tableSourceType

enum (TableSourceType)

Output only. The table source type.

Union field type_spec. Output only. type_spec can be only one of the following:
viewSpec

object (ViewSpec)

Table view specification. This field should only be populated if tableSourceType is BIGQUERY_VIEW.

tableSpec

object (TableSpec)

Spec of a BigQuery table. This field should only be populated if tableSourceType is BIGQUERY_TABLE.
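
Because the populated type_spec member depends on tableSourceType, a reader of this message can select the member from the source type. The following sketch does that for a BigQueryTableSpec expressed as a dict. The mapping comes from the two field descriptions above, and the function name is hypothetical.

```python
from typing import Optional

# Which type_spec member is expected for each source type, per the
# field descriptions. Other source types have no member listed here.
MEMBER_FOR_SOURCE_TYPE = {
    "BIGQUERY_VIEW": "viewSpec",
    "BIGQUERY_TABLE": "tableSpec",
}


def active_type_spec(spec: dict) -> Optional[dict]:
    """Return the type_spec member that matches tableSourceType, if any."""
    member = MEMBER_FOR_SOURCE_TYPE.get(spec.get("tableSourceType", ""))
    if member is None:
        return None
    return spec.get(member)


view = {"tableSourceType": "BIGQUERY_VIEW", "viewSpec": {"viewQuery": "SELECT 1"}}
print(active_type_spec(view))  # → {'viewQuery': 'SELECT 1'}
```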

TableSourceType

Table source type.

Enums
TABLE_SOURCE_TYPE_UNSPECIFIED Default unknown type.
BIGQUERY_VIEW Table view.
BIGQUERY_TABLE BigQuery native table.
BIGQUERY_MATERIALIZED_VIEW BigQuery materialized view.

ViewSpec

Table view specification.

JSON representation
{
  "viewQuery": string
}
Fields
viewQuery

string

Output only. The query that defines the table view.

TableSpec

Normal BigQuery table spec.

JSON representation
{
  "groupedEntry": string,
  "externalTableSourceUris": [
    string
  ]
}
Fields
groupedEntry

string

Output only. If the table is a dated shard, that is, its name follows the pattern [prefix]YYYYMMDD, groupedEntry is the Data Catalog resource name of the date-sharded grouped entry, for example, projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}. Otherwise, groupedEntry is empty.

externalTableSourceUris[]

string

Output only. URIs of sources used by the BigQuery external table.

BigQueryDateShardedSpec

Spec for a group of BigQuery tables with name pattern [prefix]YYYYMMDD. Context: https://cloud.google.com/bigquery/docs/partitioned-tables#partitioning_versus_sharding

JSON representation
{
  "dataset": string,
  "tablePrefix": string,
  "shardCount": string,
  "latestShardResource": string
}
Fields
dataset

string

Output only. The Data Catalog resource name of the dataset entry the current table belongs to, for example, projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}.

tablePrefix

string

Output only. The table name prefix of the shards. The name of any given shard is [tablePrefix]YYYYMMDD, for example, for shard MyTable20180101, the tablePrefix is MyTable.
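
The [tablePrefix]YYYYMMDD convention can be reversed on the client to recover the prefix and shard date from a table name. A minimal sketch, with a hypothetical helper name:

```python
import re
from datetime import date, datetime

# Shortest possible prefix followed by exactly eight trailing digits.
_SHARD_RE = re.compile(r"(?P<prefix>.*?)(?P<day>\d{8})")


def split_shard_name(table_name: str) -> tuple[str, date]:
    """Split a shard name into (tablePrefix, shard date)."""
    match = _SHARD_RE.fullmatch(table_name)
    if match is None:
        raise ValueError(f"not a date-sharded table name: {table_name!r}")
    # strptime raises ValueError for impossible dates such as 20181399.
    shard_date = datetime.strptime(match["day"], "%Y%m%d").date()
    return match["prefix"], shard_date


print(split_shard_name("MyTable20180101"))  # → ('MyTable', datetime.date(2018, 1, 1))
```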

shardCount

string (int64 format)

Output only. Total number of shards.

latestShardResource

string

Output only. BigQuery resource name for latest shard.

DataStreamSpec

Additional specification of a data stream.

JSON representation
{
  "kafkaTopic": {
    object (KafkaTopicSpec)
  }
}
Fields
kafkaTopic

object (KafkaTopicSpec)

Fields specific to a Kafka topic. Present only on entries representing Kafka topics.

KafkaTopicSpec

Entry spec describing a Kafka topic.

JSON representation
{
  "topic": string,
  "clusterEntry": string
}
Fields
topic

string

Required. Name of the Kafka topic this Entry represents. Example: 'my_topic'.

clusterEntry

string

Required. Name of the entry representing Kafka cluster this topic is a part of. Example: 'projects/my_project/locations/us/entryGroups/kafka/entries/my_cluster'.

ClusterSpec

Additional specification of a cluster.

JSON representation
{
  "kafkaCluster": {
    object (KafkaClusterSpec)
  }
}
Fields
kafkaCluster

object (KafkaClusterSpec)

Fields specific to a Kafka cluster. Present only on entries representing Kafka clusters.

KafkaClusterSpec

Entry spec describing a Kafka cluster.

JSON representation
{
  "bootstrapServers": string,
  "propertiesGcsUri": string
}
Fields
bootstrapServers

string

Required. A comma-separated list of host and port pairs that are the addresses of the Kafka brokers in a "bootstrap" Kafka cluster that a Kafka client connects to initially to bootstrap itself. Format is the same that is used for bootstrap.servers configuration property for Kafka clients. Example: host1:port1,host2:port2,host3:port3. See https://kafka.apache.org/documentation/#bootstrap.servers
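
As a sketch of the bootstrapServers format, the helper below splits the comma-separated value into (host, port) pairs. The function name is hypothetical, and the check is deliberately loose: it only requires a non-empty host and a numeric port.

```python
def parse_bootstrap_servers(value: str) -> list[tuple[str, int]]:
    """Split 'host1:port1,host2:port2' into a list of (host, port) pairs."""
    servers = []
    for item in value.split(","):
        # rpartition splits on the last colon, so the port is always
        # whatever follows it.
        host, sep, port = item.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        servers.append((host, int(port)))
    return servers


print(parse_bootstrap_servers("host1:9092,host2:9092,host3:9093"))
# → [('host1', 9092), ('host2', 9092), ('host3', 9093)]
```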

propertiesGcsUri

string

URI to Google Cloud Storage properties file with additional properties needed to connect to the cluster represented by this Entry. Properties stored in the file are passed to Kafka Consumer by query engines wishing to connect to the cluster. The full range of properties supported by query engines is dependent on the engine. The properties file should be a valid Kafka consumer config file, see https://kafka.apache.org/documentation/#consumerconfigs. For SSL authorization, if supported by the query engine, keystore and truststore paths will be interpreted relative to the properties file. Example: gs://my_bucket/kafka/consumer.properties

Schema

Represents a schema (e.g. BigQuery, GoogleSQL, Avro schema).

JSON representation
{
  "columns": [
    {
      object (ColumnSchema)
    }
  ],
  "physicalSchema": {
    object (PhysicalSchema)
  }
}
Fields
columns[]

object (ColumnSchema)

The unified GoogleSQL-like schema of columns. A maximum of 10,000 columns and sub-columns can be specified.

physicalSchema

object (PhysicalSchema)

Physical Schema is the native schema used to encode the data represented by this entry.

ColumnSchema

Representation of a column within a schema. Columns could be nested inside other columns.

JSON representation
{
  "column": string,
  "type": string,
  "description": string,
  "mode": string,
  "subcolumns": [
    {
      object (ColumnSchema)
    }
  ]
}
Fields
column

string

Required. Name of the column.

type

string

Required. Type of the column.

description

string

Optional. Description of the column. Default value is an empty string.

mode

string

Optional. A column's mode indicates whether the values in this column are required, nullable, etc. Only NULLABLE, REQUIRED and REPEATED are supported. Default mode is NULLABLE.

subcolumns[]

object (ColumnSchema)

Optional. Schema of sub-columns. A column can have zero or more sub-columns.
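
Since ColumnSchema is recursive and Schema.columns is limited to 10,000 columns and sub-columns in total, a client may want to count nested columns before sending a schema. The sketch below builds a small nested schema as a dict and counts it. The column names and the type strings ("INT64", "RECORD", "STRING") are illustrative assumptions, as is the helper name.

```python
MAX_COLUMNS = 10_000  # limit stated for Schema.columns


def count_columns(columns: list[dict]) -> int:
    """Count columns and all nested sub-columns."""
    return sum(1 + count_columns(col.get("subcolumns", [])) for col in columns)


schema = {
    "columns": [
        {"column": "id", "type": "INT64", "mode": "REQUIRED"},
        {
            "column": "address",
            "type": "RECORD",
            "mode": "NULLABLE",
            "subcolumns": [
                {"column": "city", "type": "STRING"},
                {"column": "zip", "type": "STRING"},
            ],
        },
    ]
}

total = count_columns(schema["columns"])
print(total, total <= MAX_COLUMNS)  # → 4 True
```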

PhysicalSchema

Native schema used by a resource represented as an entry. Used by query engines for deserializing and parsing source data.

JSON representation
{

  // Union field schema can be only one of the following:
  "avro": {
    object (AvroSchema)
  },
  "thrift": {
    object (ThriftSchema)
  },
  "protobuf": {
    object (ProtobufSchema)
  },
  "parquet": {
    object (ParquetSchema)
  },
  "orc": {
    object (OrcSchema)
  }
  // End of list of possible types for union field schema.
}
Fields

Union field schema.

schema can be only one of the following:

avro

object (AvroSchema)

Schema in Avro JSON format.

thrift

object (ThriftSchema)

Schema in Thrift format.

protobuf

object (ProtobufSchema)

Schema in Protobuf format.

parquet

object (ParquetSchema)

Marks a Parquet-encoded data source.

orc

object (OrcSchema)

Marks an ORC-encoded data source.

AvroSchema

Schema in Avro JSON format.

JSON representation
{
  "text": string
}
Fields
text

string

Required. JSON source of the Avro schema.

ThriftSchema

Schema in Thrift format.

JSON representation
{
  "text": string
}
Fields
text

string

Required. Thrift IDL source of the schema.

ProtobufSchema

Schema in Protobuf format.

JSON representation
{
  "text": string
}
Fields
text

string

Required. Protocol buffer source of the schema.

ParquetSchema

Marks a Parquet-encoded data source.

OrcSchema

Marks an ORC-encoded data source.

UsageSignal

The set of all usage signals that we store in Data Catalog.

JSON representation
{
  "updateTime": string,
  "usageWithinTimeRange": {
    string: {
      object (UsageStats)
    },
    ...
  }
}
Fields
updateTime

string (Timestamp format)

The timestamp of the end of the usage statistics duration.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
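
Timestamps in this format can carry up to nine fractional digits, which is more precision than Python's datetime stores. One way to handle that, sketched below, is to parse the whole seconds into a datetime and keep the nanoseconds as a separate integer. The function name and the (datetime, nanos) return shape are choices made for this example.

```python
import re
from datetime import datetime, timezone

# Whole seconds, an optional fraction of 1-9 digits, and a literal "Z".
_ZULU_RE = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z")


def parse_zulu_timestamp(value: str) -> tuple[datetime, int]:
    """Return (UTC datetime at second resolution, nanoseconds)."""
    match = _ZULU_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 Zulu timestamp: {value!r}")
    seconds = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    # Right-pad the fraction to nine digits so ".5" becomes 500000000 ns.
    nanos = int((match.group(2) or "").ljust(9, "0"))
    return seconds.replace(tzinfo=timezone.utc), nanos


print(parse_zulu_timestamp("2014-10-02T15:01:23.045123456Z"))
```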

usageWithinTimeRange

map (key: string, value: object (UsageStats))

Usage statistics over each of the pre-defined time ranges. Supported strings for time ranges are {"24H", "7D", "30D"}.

RelationshipsInfo

Information of relationships related to the current entry.

JSON representation
{
  "updateTime": string,
  "relationshipStatsWithinTimeRange": {
    string: {
      object (RelationshipStats)
    },
    ...
  }
}
Fields
updateTime

string (Timestamp format)

The timestamp of the end of the relationship statistics duration.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

relationshipStatsWithinTimeRange

map (key: string, value: object (RelationshipStats))

Relationship statistics over each of the pre-defined time ranges. Note that each RelationshipStats only includes relationships that were established or re-established during this time period (e.g., the script to derive one table from another must have run within the given time period). Supported strings for time ranges are {"24H", "7D", "30D"}.

LakeInfo

Information about Lake and Zone association.

JSON representation
{
  "lakeId": string,
  "zoneId": string,
  "zoneType": string
}
Fields
lakeId

string

Required. Lake which this Entry belongs to. Example format: "//dataplex.googleapis.com/projects/{projectId}/locations/{location}/lakes/{lakeId}"

zoneId

string

Zone which this Entry belongs to. Example format: "//dataplex.googleapis.com/projects/{projectId}/locations/{location}/lakes/{lakeId}/zones/{zoneId}"

zoneType

string

Type of the lake zone. Possible values: Landing, Refined, Raw.

Methods

create

Creates an entry.

delete

Deletes an existing entry.

get

Gets an entry.

getIamPolicy

Gets the access control policy for a resource.

list

Lists entries.

patch

Updates an existing entry.

testIamPermissions

Returns the caller's permissions on a resource.