You can call the Data Catalog APIs to create and manage entries for custom data resource types. In this document, an entry for a custom data resource type is called a "custom entry".
Create entry groups and custom entries
Custom entries must be placed inside a user-created entry group. Create the entry group first, then create the custom entry inside it.
After creating an entry, you can set IAM policies on the entry group to define who has access to the entry group and the entries inside it.
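As a rough illustration of setting an IAM policy on an entry group, the sketch below only builds the URL and JSON body for a setIamPolicy call and does not send anything. The helper name, the endpoint shape, the role `roles/datacatalog.viewer`, and the member address are assumptions chosen for the example, so check them against the Data Catalog and IAM references before relying on them.

```python
import json

# Assumed base URL for the Data Catalog REST API (same host as the examples
# in this document).
BASE_URL = "https://datacatalog.googleapis.com/v1"


def build_set_iam_policy_request(entry_group_name, role, members):
    """Return (url, body) for a setIamPolicy call on an entry group.

    Hypothetical helper for illustration only. entry_group_name is the full
    resource name, for example
    projects/my-project/locations/us-central1/entryGroups/my_entry_group.
    """
    url = f"{BASE_URL}/{entry_group_name}:setIamPolicy"
    # Standard IAM policy shape: a list of role-to-members bindings.
    body = {"policy": {"bindings": [{"role": role, "members": sorted(members)}]}}
    return url, body


url, body = build_set_iam_policy_request(
    "projects/my-project/locations/us-central1/entryGroups/my_entry_group",
    "roles/datacatalog.viewer",    # assumed role; choose one that fits your case
    ["user:analyst@example.com"],  # placeholder member
)
print(url)
print(json.dumps(body, indent=2))
```

A setIamPolicy call replaces the existing policy on the resource, so in practice you would usually read the current policy first and add the binding to it.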
REST & COMMAND LINE
See the following examples and the documentation for entryGroups.create and entryGroups.entries.create in the Data Catalog REST API.
1. Create an entry group
Before using any of the request data below, make the following replacements:
- project-id: Your GCP project ID.
- entryGroupId: The ID must begin with a letter or underscore, contain only English letters, numbers, and underscores, and be at most 64 characters long.
- displayName: The display name of the entry group.
HTTP method and URL:
POST https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/entryGroups?entryGroupId=entryGroupId
Request JSON body:
{
  "displayName": "Entry Group display name"
}
You should receive a JSON response similar to the following:
{
  "name": "projects/my_projectid/locations/us-central1/entryGroups/my_entry_group",
  "displayName": "Entry Group display name",
  "dataCatalogTimestamps": {
    "createTime": "2019-10-19T16:35:50.135Z",
    "updateTime": "2019-10-19T16:35:50.135Z"
  }
}
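The ID rule stated for entryGroupId (a leading letter or underscore, then only English letters, digits, and underscores, 64 characters at most) can be checked locally before a request is sent. The following is a minimal sketch under that reading of the rule. The function names are hypothetical, and the service remains the authority on what it accepts.

```python
import re
from urllib.parse import urlencode

# Pattern derived from the stated rule: one letter or underscore, followed by
# up to 63 ASCII letters, digits, or underscores (64 characters in total).
_ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")


def is_valid_id(candidate):
    """Return True if candidate satisfies the ID rule described above."""
    return _ID_PATTERN.fullmatch(candidate) is not None


def build_create_entry_group_request(project_id, entry_group_id, display_name,
                                     location="us-central1"):
    """Return (url, body) for entryGroups.create; hypothetical helper."""
    if not is_valid_id(entry_group_id):
        raise ValueError(f"invalid entry group ID: {entry_group_id!r}")
    url = (
        "https://datacatalog.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}/entryGroups?"
        + urlencode({"entryGroupId": entry_group_id})
    )
    return url, {"displayName": display_name}


url, body = build_create_entry_group_request(
    "my-project", "my_entry_group", "Entry Group display name"
)
print(url)
print(body)
```

The same check applies to any other field that follows this naming rule, such as an entry ID.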
2. Create a custom entry inside the entry group
Before using any of the request data below, make the following replacements:
- project_id: Your GCP project ID.
- entryGroupId: The ID of an existing entry group. The entry will be created in this entry group.
- entryId: The ID of the new entry. The ID must begin with a letter or underscore, contain only English letters, numbers, and underscores, and be at most 64 characters long.
- description: An optional description of the entry.
- displayName: An optional display name for the entry.
- userSpecifiedType: A custom type name. The type name must begin with a letter or underscore, contain only letters, numbers, and underscores, and be at most 64 characters long.
- userSpecifiedSystem: The entry's non-GCP source system, which is not integrated with Data Catalog. The source system name must begin with a letter or underscore, contain only letters, numbers, and underscores, and be at most 64 characters long.
- linkedResource: The full name of the resource that the entry refers to.
- schema: An optional data schema.
Example JSON schema:
{
  ...
  "schema": {
    "columns": [
      {
        "column": "first_name",
        "description": "First name",
        "mode": "REQUIRED",
        "type": "STRING"
      },
      {
        "column": "last_name",
        "description": "Last name",
        "mode": "REQUIRED",
        "type": "STRING"
      },
      {
        "column": "address",
        "description": "Address",
        "mode": "REPEATED",
        "subcolumns": [
          {
            "column": "city",
            "description": "City",
            "mode": "NULLABLE",
            "type": "STRING"
          },
          {
            "column": "state",
            "description": "State",
            "mode": "NULLABLE",
            "type": "STRING"
          }
        ],
        "type": "RECORD"
      }
    ]
  }
  ...
}
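For schemas with nested columns, it can be easier to build the JSON in code than by hand. The sketch below builds the same example schema as plain dictionaries and lists every column path, including subcolumns. The helpers `column` and `column_paths` are hypothetical names introduced for this illustration and are not part of any Data Catalog library.

```python
import json


def column(name, col_type, description="", mode="NULLABLE", subcolumns=None):
    """Build one column dict; nested columns go under the "subcolumns" key."""
    col = {
        "column": name,
        "description": description,
        "mode": mode,
        "type": col_type,
    }
    if subcolumns:
        col["subcolumns"] = list(subcolumns)
    return col


def column_paths(columns, prefix=""):
    """Yield a dotted path (for example address.city) for every column."""
    for col in columns:
        path = prefix + col["column"]
        yield path
        yield from column_paths(col.get("subcolumns", []), path + ".")


schema = {
    "columns": [
        column("first_name", "STRING", "First name", "REQUIRED"),
        column("last_name", "STRING", "Last name", "REQUIRED"),
        column("address", "RECORD", "Address", "REPEATED", subcolumns=[
            column("city", "STRING", "City"),
            column("state", "STRING", "State"),
        ]),
    ]
}

print(json.dumps(schema, indent=2))
print(list(column_paths(schema["columns"])))
```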
HTTP method and URL:
POST https://datacatalog.googleapis.com/v1/projects/project_id/locations/us-central1/entryGroups/entryGroupId/entries?entryId=entryId
Request JSON body:
{
  "description": "Description",
  "displayName": "Display name",
  "userSpecifiedType": "my_type",
  "userSpecifiedSystem": "my_system",
  "linkedResource": "abc.com/def",
  "schema": { schema }
}
You should receive a JSON response similar to the following:
{
  "name": "projects/my_project_id/locations/us-central1/entryGroups/my_entryGroup_id/entries/my_entry_id",
  "userSpecifiedType": "my_type",
  "userSpecifiedSystem": "my_system",
  "displayName": "On-prem entry",
  "description": "My entry description.",
  "schema": {
    "columns": [
      {
        "type": "STRING",
        "description": "First name",
        "mode": "REQUIRED",
        "column": "first_name"
      },
      {
        "type": "STRING",
        "description": "Last name",
        "mode": "REQUIRED",
        "column": "last_name"
      },
      {
        "type": "RECORD",
        "description": "Address",
        "mode": "REPEATED",
        "column": "address",
        "subcolumns": [
          {
            "type": "STRING",
            "description": "City",
            "mode": "NULLABLE",
            "column": "city"
          },
          {
            "type": "STRING",
            "description": "State",
            "mode": "NULLABLE",
            "column": "state"
          }
        ]
      }
    ]
  },
  "sourceSystemTimestamps": {
    "createTime": "2019-10-23T23:11:26.326Z",
    "updateTime": "2019-10-23T23:11:26.326Z"
  },
  "linkedResource": "abc.com/def"
}
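One way to issue the entry-creation call without a client library is to assemble the HTTP request with Python's standard library. The sketch below only constructs the request object and does not send it. The helper name is hypothetical, and it assumes a bearer access token obtained elsewhere (for example from `gcloud auth print-access-token`).

```python
import json
import urllib.request


def build_create_entry_request(project_id, entry_group_id, entry_id, entry,
                               access_token, location="us-central1"):
    """Return an unsent urllib Request for entryGroups.entries.create."""
    url = (
        "https://datacatalog.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/entryGroups/{entry_group_id}/entries?entryId={entry_id}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(entry).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


entry = {
    "description": "Description",
    "displayName": "Display name",
    "userSpecifiedType": "my_type",
    "userSpecifiedSystem": "my_system",
    "linkedResource": "abc.com/def",
}

request = build_create_entry_request(
    "my_project_id", "my_entryGroup_id", "my_entry_id", entry,
    access_token="ACCESS_TOKEN",  # placeholder, not a real token
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` once a real token is in place, and parse the response body as JSON.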
Python
- Install the client library.
- Set up Application Default Credentials.
- Run the code.
"""
This application demonstrates how to perform core operations with the
Data Catalog API.

For more information, see the README.md and the official documentation at
https://cloud.google.com/data-catalog/docs.
"""

# -------------------------------
# Import required modules.
# Note: the location_path helper and the enums module used below belong to
# pre-2.0 releases of google-cloud-datacatalog; newer releases may name
# them differently.
# -------------------------------
from google.api_core.exceptions import NotFound, PermissionDenied
from google.cloud import datacatalog_v1

# -------------------------------
# Currently, Data Catalog stores metadata in the
# us-central1 region.
# -------------------------------
location = 'us-central1'

# -------------------------------
# TODO: Set these values before running the sample.
# -------------------------------
project_id = 'my-project'
entry_group_id = 'onprem_entry_group'
entry_id = 'onprem_entry_id'
tag_template_id = 'onprem_tag_template'

# -------------------------------
# Use Application Default Credentials to create a new
# Data Catalog client. GOOGLE_APPLICATION_CREDENTIALS
# environment variable must be set with the location
# of a service account key file.
# -------------------------------
datacatalog = datacatalog_v1.DataCatalogClient()

# -------------------------------
# 1. Environment cleanup: delete pre-existing data.
# -------------------------------
# Delete any pre-existing Entry with the same name
# that will be used in step 3.
expected_entry_name = datacatalog_v1.DataCatalogClient \
    .entry_path(project_id, location, entry_group_id, entry_id)

try:
    datacatalog.delete_entry(name=expected_entry_name)
except (NotFound, PermissionDenied):
    pass

# Delete any pre-existing Entry Group with the same name
# that will be used in step 2.
expected_entry_group_name = datacatalog_v1.DataCatalogClient \
    .entry_group_path(project_id, location, entry_group_id)

try:
    datacatalog.delete_entry_group(name=expected_entry_group_name)
except (NotFound, PermissionDenied):
    pass

# Delete any pre-existing Template with the same name
# that will be used in step 4.
expected_template_name = datacatalog_v1.DataCatalogClient \
    .tag_template_path(project_id, location, tag_template_id)

try:
    datacatalog.delete_tag_template(name=expected_template_name, force=True)
except (NotFound, PermissionDenied):
    pass

# -------------------------------
# 2. Create an Entry Group.
# -------------------------------
entry_group_obj = datacatalog_v1.types.EntryGroup()
entry_group_obj.display_name = 'My awesome Entry Group'
entry_group_obj.description = 'This Entry Group represents an external system'

entry_group = datacatalog.create_entry_group(
    parent=datacatalog_v1.DataCatalogClient.location_path(project_id, location),
    entry_group_id=entry_group_id,
    entry_group=entry_group_obj)
print('Created entry group: {}'.format(entry_group.name))

# -------------------------------
# 3. Create an Entry.
# -------------------------------
entry = datacatalog_v1.types.Entry()
entry.user_specified_system = 'onprem_data_system'
entry.user_specified_type = 'onprem_data_asset'
entry.display_name = 'My awesome data asset'
entry.description = 'This data asset is managed by an external system.'
entry.linked_resource = '//my-onprem-server.com/dataAssets/my-awesome-data-asset'

# Create the Schema, this is optional.
columns = []
columns.append(datacatalog_v1.types.ColumnSchema(
    column='first_column',
    type='STRING',
    description='This column consists of ....',
    mode=None))

columns.append(datacatalog_v1.types.ColumnSchema(
    column='second_column',
    type='DOUBLE',
    description='This column consists of ....',
    mode=None))

entry.schema.columns.extend(columns)

entry = datacatalog.create_entry(
    parent=entry_group.name,
    entry_id=entry_id,
    entry=entry)
print('Created entry: {}'.format(entry.name))

# -------------------------------
# 4. Create a Tag Template.
# For more field types, including ENUM, please refer to
# https://cloud.google.com/data-catalog/docs/quickstarts/quickstart-search-tag#data-catalog-quickstart-python.
# -------------------------------
tag_template = datacatalog_v1.types.TagTemplate()
tag_template.display_name = 'On-premises Tag Template'
tag_template.fields['source'].display_name = 'Source of the data asset'
tag_template.fields['source'].type.primitive_type = \
    datacatalog_v1.enums.FieldType.PrimitiveType.STRING.value

tag_template = datacatalog.create_tag_template(
    parent=datacatalog_v1.DataCatalogClient.location_path(project_id, location),
    tag_template_id=tag_template_id,
    tag_template=tag_template)
print('Created template: {}'.format(tag_template.name))

# -------------------------------
# 5. Attach a Tag to the custom Entry.
# -------------------------------
tag = datacatalog_v1.types.Tag()
tag.template = tag_template.name
tag.fields['source'].string_value = 'On-premises system name'
tag = datacatalog.create_tag(parent=entry.name, tag=tag)
print('Created tag: {}'.format(tag.name))
Java
- Install the client library.
- Set up Application Default Credentials.
- Run the code.
/*
This application demonstrates how to perform core operations with the
Data Catalog API.

For more information, see the README.md and the official documentation at
https://cloud.google.com/data-catalog/docs.
*/

package com.example.datacatalog;

import com.google.api.gax.rpc.AlreadyExistsException;
import com.google.api.gax.rpc.NotFoundException;
import com.google.api.gax.rpc.PermissionDeniedException;
import com.google.cloud.datacatalog.v1.ColumnSchema;
import com.google.cloud.datacatalog.v1.CreateEntryGroupRequest;
import com.google.cloud.datacatalog.v1.CreateEntryRequest;
import com.google.cloud.datacatalog.v1.CreateTagRequest;
import com.google.cloud.datacatalog.v1.CreateTagTemplateRequest;
import com.google.cloud.datacatalog.v1.DataCatalogClient;
import com.google.cloud.datacatalog.v1.DeleteTagTemplateRequest;
import com.google.cloud.datacatalog.v1.Entry;
import com.google.cloud.datacatalog.v1.EntryGroup;
import com.google.cloud.datacatalog.v1.EntryGroupName;
import com.google.cloud.datacatalog.v1.EntryName;
import com.google.cloud.datacatalog.v1.FieldType;
import com.google.cloud.datacatalog.v1.LocationName;
import com.google.cloud.datacatalog.v1.Schema;
import com.google.cloud.datacatalog.v1.Tag;
import com.google.cloud.datacatalog.v1.TagField;
import com.google.cloud.datacatalog.v1.TagTemplate;
import com.google.cloud.datacatalog.v1.TagTemplateField;
import com.google.cloud.datacatalog.v1.TagTemplateName;
import java.io.IOException;

public class CreateCustomType {

  public static void createCustomType() {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "my-project";
    String entryGroupId = "onprem_entry_group";
    String entryId = "onprem_entry_id";
    String tagTemplateId = "onprem_tag_template";
    createCustomType(projectId, entryGroupId, entryId, tagTemplateId);
  }

  public static void createCustomType(
      String projectId, String entryGroupId, String entryId, String tagTemplateId) {
    // Currently, Data Catalog stores metadata in the us-central1 region.
    String location = "us-central1";

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DataCatalogClient dataCatalogClient = DataCatalogClient.create()) {

      // 1. Environment cleanup: delete pre-existing data.
      // Delete any pre-existing Entry with the same name
      // that will be used in step 3.
      try {
        String entryName = EntryName.of(projectId, location, entryGroupId, entryId).toString();
        dataCatalogClient.deleteEntry(entryName);
        System.out.printf("\nDeleted Entry: %s", entryName);
      } catch (PermissionDeniedException | NotFoundException e) {
        // PermissionDeniedException or NotFoundException are thrown if
        // Entry does not exist.
        System.out.println("Entry does not exist.");
      }

      // Delete any pre-existing Entry Group with the same name
      // that will be used in step 2.
      try {
        String entryGroupName = EntryGroupName.of(projectId, location, entryGroupId).toString();
        dataCatalogClient.deleteEntryGroup(entryGroupName);
        System.out.printf("\nDeleted Entry Group: %s", entryGroupName);
      } catch (PermissionDeniedException | NotFoundException e) {
        // PermissionDeniedException or NotFoundException are thrown if
        // Entry Group does not exist.
        System.out.println("Entry Group does not exist.");
      }

      String tagTemplateName =
          TagTemplateName.newBuilder()
              .setProject(projectId)
              .setLocation(location)
              .setTagTemplate(tagTemplateId)
              .build()
              .toString();

      // Delete any pre-existing Template with the same name
      // that will be used in step 4.
      try {
        dataCatalogClient.deleteTagTemplate(
            DeleteTagTemplateRequest.newBuilder()
                .setName(tagTemplateName)
                .setForce(true)
                .build());
        System.out.printf("\nDeleted template: %s", tagTemplateName);
      } catch (Exception e) {
        System.out.printf("\nCannot delete template: %s", tagTemplateName);
      }

      // 2. Create an Entry Group.
      // Construct the EntryGroup for the EntryGroup request.
      EntryGroup entryGroup =
          EntryGroup.newBuilder()
              .setDisplayName("My awesome Entry Group")
              .setDescription("This Entry Group represents an external system")
              .build();

      // Construct the EntryGroup request to be sent by the client.
      CreateEntryGroupRequest entryGroupRequest =
          CreateEntryGroupRequest.newBuilder()
              .setParent(LocationName.of(projectId, location).toString())
              .setEntryGroupId(entryGroupId)
              .setEntryGroup(entryGroup)
              .build();

      // Use the client to send the API request.
      EntryGroup createdEntryGroup = dataCatalogClient.createEntryGroup(entryGroupRequest);
      System.out.printf("\nEntry Group created with name: %s", createdEntryGroup.getName());

      // 3. Create an Entry.
      // Construct the Entry for the Entry request.
      Entry entry =
          Entry.newBuilder()
              .setUserSpecifiedSystem("onprem_data_system")
              .setUserSpecifiedType("onprem_data_asset")
              .setDisplayName("My awesome data asset")
              .setDescription("This data asset is managed by an external system.")
              .setLinkedResource("//my-onprem-server.com/dataAssets/my-awesome-data-asset")
              .setSchema(
                  Schema.newBuilder()
                      .addColumns(
                          ColumnSchema.newBuilder()
                              .setColumn("first_column")
                              .setDescription("This column consists of ....")
                              .setMode("NULLABLE")
                              .setType("DOUBLE")
                              .build())
                      .addColumns(
                          ColumnSchema.newBuilder()
                              .setColumn("second_column")
                              .setDescription("This column consists of ....")
                              .setMode("REQUIRED")
                              .setType("STRING")
                              .build())
                      .build())
              .build();

      // Construct the Entry request to be sent by the client.
      CreateEntryRequest entryRequest =
          CreateEntryRequest.newBuilder()
              .setParent(createdEntryGroup.getName())
              .setEntryId(entryId)
              .setEntry(entry)
              .build();

      // Use the client to send the API request.
      Entry createdEntry = dataCatalogClient.createEntry(entryRequest);
      System.out.printf("\nEntry created with name: %s", createdEntry.getName());

      // 4. Create a Tag Template.
      // For more field types, including ENUM, please refer to
      // https://cloud.google.com/data-catalog/docs/quickstarts/quickstart-search-tag#data-catalog-quickstart-java.
      TagTemplateField sourceField =
          TagTemplateField.newBuilder()
              .setDisplayName("Source of data asset")
              .setType(
                  FieldType.newBuilder()
                      .setPrimitiveType(FieldType.PrimitiveType.STRING)
                      .build())
              .build();

      TagTemplate tagTemplate =
          TagTemplate.newBuilder()
              .setDisplayName("Demo Tag Template")
              .putFields("source", sourceField)
              .build();

      CreateTagTemplateRequest createTagTemplateRequest =
          CreateTagTemplateRequest.newBuilder()
              .setParent(
                  LocationName.newBuilder()
                      .setProject(projectId)
                      .setLocation(location)
                      .build()
                      .toString())
              .setTagTemplateId(tagTemplateId)
              .setTagTemplate(tagTemplate)
              .build();

      TagTemplate createdTagTemplate = dataCatalogClient.createTagTemplate(createTagTemplateRequest);
      System.out.printf("\nTemplate created with name: %s", createdTagTemplate.getName());

      // 5. Attach a Tag to the custom Entry.
      TagField sourceValue =
          TagField.newBuilder().setStringValue("On-premises system name").build();

      Tag tag =
          Tag.newBuilder()
              .setTemplate(createdTagTemplate.getName())
              .putFields("source", sourceValue)
              .build();

      CreateTagRequest createTagRequest =
          CreateTagRequest.newBuilder().setParent(createdEntry.getName()).setTag(tag).build();

      Tag createdTag = dataCatalogClient.createTag(createTagRequest);
      System.out.printf("\nCreated tag: %s", createdTag.getName());

    } catch (AlreadyExistsException | IOException e) {
      // AlreadyExistsException is thrown if the EntryGroup or Entry already exists.
      // IOException is thrown when unable to create the DataCatalogClient,
      // for example an invalid Service Account path.
      System.out.println("Error creating entry:\n" + e.toString());
    }
  }
}
Node.js
- Install the client library.
- Set up Application Default Credentials.
- Run the code.
/**
 * This application demonstrates how to perform core operations with the
 * Data Catalog API.
 *
 * For more information, see the README.md and the official documentation at
 * https://cloud.google.com/data-catalog/docs.
 */
const main = async (
  projectId = process.env.GCLOUD_PROJECT,
  entryGroupId,
  entryId,
  tagTemplateId
) => {
  // -------------------------------
  // Import required modules.
  // -------------------------------
  const { DataCatalogClient } = require('@google-cloud/datacatalog').v1;
  const datacatalog = new DataCatalogClient();

  // -------------------------------
  // Currently, Data Catalog stores metadata in the
  // us-central1 region.
  // -------------------------------
  const location = "us-central1";

  // -------------------------------
  // 1. Environment cleanup: delete pre-existing data.
  // -------------------------------
  // Delete any pre-existing Entry with the same name
  // that will be used in step 3.
  try {
    const entryName = datacatalog.entryPath(projectId, location, entryGroupId, entryId);
    await datacatalog.deleteEntry({ name: entryName });
    console.log(`Deleted Entry: ${entryName}`);
  } catch (err) {
    console.log('Entry does not exist.');
  }

  // Delete any pre-existing Entry Group with the same name
  // that will be used in step 2.
  try {
    const entryGroupName = datacatalog.entryGroupPath(projectId, location, entryGroupId);
    await datacatalog.deleteEntryGroup({ name: entryGroupName });
    console.log(`Deleted Entry Group: ${entryGroupName}`);
  } catch (err) {
    console.log('Entry Group does not exist.');
  }

  // Delete any pre-existing Template with the same name
  // that will be used in step 4.
  const tagTemplateName = datacatalog.tagTemplatePath(
    projectId,
    location,
    tagTemplateId,
  );

  try {
    const deleteTagTemplateRequest = {
      name: tagTemplateName,
      force: true,
    };
    await datacatalog.deleteTagTemplate(deleteTagTemplateRequest);
    console.log(`Deleted template: ${tagTemplateName}`);
  } catch (error) {
    console.log(`Cannot delete template: ${tagTemplateName}`);
  }

  // -------------------------------
  // 2. Create an Entry Group.
  // -------------------------------
  // Construct the EntryGroup for the EntryGroup request.
  const entryGroup = {
    displayName: 'My awesome Entry Group',
    description: 'This Entry Group represents an external system',
  };

  // Construct the EntryGroup request to be sent by the client.
  const entryGroupRequest = {
    parent: datacatalog.locationPath(projectId, location),
    entryGroupId: entryGroupId,
    entryGroup: entryGroup,
  };

  // Use the client to send the API request.
  const [createdEntryGroup] = await datacatalog.createEntryGroup(entryGroupRequest);
  console.log(`Created entry group: ${createdEntryGroup.name}`);

  // -------------------------------
  // 3. Create an Entry.
  // -------------------------------
  // Construct the Entry for the Entry request.
  const entry = {
    userSpecifiedSystem: 'onprem_data_system',
    userSpecifiedType: 'onprem_data_asset',
    displayName: 'My awesome data asset',
    description: 'This data asset is managed by an external system.',
    linkedResource: '//my-onprem-server.com/dataAssets/my-awesome-data-asset',
    schema: {
      columns: [
        {
          column: 'first_column',
          description: 'This column consists of ....',
          mode: 'NULLABLE',
          type: 'STRING',
        },
        {
          column: 'second_column',
          description: 'This column consists of ....',
          mode: 'NULLABLE',
          type: 'DOUBLE',
        }
      ],
    },
  };

  // Construct the Entry request to be sent by the client.
  const entryRequest = {
    parent: datacatalog.entryGroupPath(projectId, location, entryGroupId),
    entryId: entryId,
    entry: entry,
  };

  // Use the client to send the API request.
  const [createdEntry] = await datacatalog.createEntry(entryRequest);
  console.log(`Created entry: ${createdEntry.name}`);

  // -------------------------------
  // 4. Create a Tag Template.
  // For more field types, including ENUM, please refer to
  // https://cloud.google.com/data-catalog/docs/quickstarts/quickstart-search-tag#data-catalog-quickstart-nodejs.
  // -------------------------------
  const fieldSource = {
    displayName: 'Source of data asset',
    type: {
      primitiveType: 'STRING',
    },
  };

  const tagTemplate = {
    displayName: 'Demo Tag Template',
    fields: {
      source: fieldSource,
    },
  };

  // Declared with const; the original assigned to an undeclared variable here.
  const tagTemplateRequest = {
    parent: datacatalog.locationPath(projectId, location),
    tagTemplateId: tagTemplateId,
    tagTemplate: tagTemplate,
  };

  // Use the client to send the API request.
  const [createdTagTemplate] = await datacatalog.createTagTemplate(tagTemplateRequest);
  console.log(`Created template: ${createdTagTemplate.name}`);

  // -------------------------------
  // 5. Attach a Tag to the custom Entry.
  // -------------------------------
  const tag = {
    template: createdTagTemplate.name,
    fields: {
      source: {
        stringValue: 'On-premises system name',
      },
    },
  };

  const tagRequest = {
    parent: createdEntry.name,
    tag: tag,
  };

  // Use the client to send the API request.
  const [createdTag] = await datacatalog.createTag(tagRequest);
  console.log(`Created tag: ${createdTag.name}`);
};

// TODO: Change these values before running the sample
// node createCustomType.js my-project onprem_entry_group onprem_entry_id onprem_tag_template
main(...process.argv.slice(2));