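Each HTTP connection that your application makes requires a certain amount of overhead. The Data Catalog API supports batching, which lets you combine several API calls into a single HTTP request. If you have many small requests to make and want to minimize HTTP request overhead, consider using HTTP batching.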
Note: Batching reduces overhead, but requests within a batch still count as multiple requests for API quota purposes.
For more information about using HTTP batch with Google Cloud, see the Google API Python client documentation: https://github.com/googleapis/google-api-python-client/blob/master/docs/batch.md
Note: You're limited to 1000 calls in a single batch request. If you need to make more calls than that, use multiple batch requests.
Creating HTTP batch requests in Python
To use batch requests to create or manipulate entries in Data Catalog, first search for the entries that you want to change by using catalog.search() or entries.lookup(). Next, follow these steps to build an HTTP batch request with the Google API Python client:
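1. Create a BatchHttpRequest object by calling new_batch_http_request() or by using the BatchHttpRequest() constructor. You can pass in a callback, which is called in response to each request.
2. Call add() on the BatchHttpRequest object for each request that you want to execute. If you passed a callback when you created the BatchHttpRequest object, each add() call can include parameters to pass to the callback.
3. After you add the requests, call execute() on the BatchHttpRequest object to run them. The execute() function blocks until all callbacks have been called.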
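The steps above can be sketched roughly as follows. This is a minimal sketch rather than canonical library usage: `run_batch` and `collect` are hypothetical names, `service` is assumed to be a Discovery service object built with `googleapiclient.discovery.build`, and `requests` is assumed to map caller-chosen request IDs to unexecuted request objects such as `service.catalog().search(body=...)`. The batch created by `new_batch_http_request()` is expected to use the batch endpoint derived from the service definition; when a specific endpoint is required, the `BatchHttpRequest()` constructor accepts a `batch_uri` argument, as in the sample application.

```python
def run_batch(service, requests):
    """Run every request in one HTTP batch and return (responses, errors).

    Assumptions: `service` comes from googleapiclient.discovery.build(), and
    `requests` maps a request ID (str) to an unexecuted request object.
    """
    responses, errors = {}, {}

    # The client library calls this once for each request in the batch.
    def collect(request_id, response, exception):
        if exception is not None:
            errors[request_id] = exception
        else:
            responses[request_id] = response

    # Step 1: create the batch object with a batch-level callback.
    batch = service.new_batch_http_request(callback=collect)

    # Step 2: queue each request under its own ID.
    for request_id, request in requests.items():
        batch.add(request, request_id=request_id)

    # Step 3: send the batch; this blocks until every callback has run.
    batch.execute()
    return responses, errors
```

Returning errors separately from responses lets the caller decide whether a partial failure should abort the rest of the work.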
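Requests in a BatchHttpRequest might be executed in parallel rather than in the order in which they were added. This means that requests in the same batch must not depend on each other. For example, don't create an EntryGroup and an Entry that belongs to it in the same batch request, because the Entry creation might execute before the EntryGroup creation and cause the request to fail.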
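One way to respect this constraint is to place dependent operations in separate batches and execute the batches one after another. The following is a minimal sketch, assuming a hypothetical `make_batch` factory that returns a fresh `BatchHttpRequest`-like object (anything with `add()` and `execute()`); `execute_in_phases` is an illustrative name, not part of the client library.

```python
def execute_in_phases(make_batch, phases):
    """Execute each phase as its own batch, strictly in order.

    `phases` is a list of dicts mapping request IDs to unexecuted requests.
    Requests inside one phase may run in any order; a later phase starts only
    after the earlier batch's execute() call has returned.
    """
    for phase in phases:
        if not phase:
            continue  # Skip empty phases rather than send an empty batch.
        batch = make_batch()
        for request_id, request in phase.items():
            batch.add(request, request_id=request_id)
        batch.execute()
```

For example, the first phase could hold the EntryGroup creation and the second phase the Entry creations that depend on it. A fuller version would also inspect the callback results of one phase before starting the next.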
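Batch requests with regional endpoints
When you use HTTP batch requests with Data Catalog regional API endpoints, all API requests in a batch must belong to the same region. When you execute the batch, you must call the correct regional endpoint. For example, if your resources are in us-central1, call https://us-central1-datacatalog.googleapis.com/batch.
Note: Here "region" refers to standard regions as well as multi-regions like "Global", "European Union", and "United States".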
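Because a batch must stay within one region, it can help to group resource names by location before building any batch. The following is a sketch under stated assumptions: the helper names are hypothetical, resource names are assumed to follow the `projects/{project}/locations/{location}/...` pattern, and the URI pattern is inferred from the us-central1 example above.

```python
import re
from collections import defaultdict

# Matches the location segment of names such as
# projects/p/locations/us-central1/entryGroups/g/entries/e
_LOCATION_RE = re.compile(r"^projects/[^/]+/locations/([^/]+)(?:/|$)")


def regional_batch_uri(location):
    """Build the batch URI for one location, for example 'us-central1'."""
    return "https://{}-datacatalog.googleapis.com/batch".format(location)


def group_by_location(resource_names):
    """Group resource names by location so each group can form one batch."""
    groups = defaultdict(list)
    for name in resource_names:
        match = _LOCATION_RE.match(name)
        if match is None:
            raise ValueError("no location segment in: {}".format(name))
        groups[match.group(1)].append(name)
    return dict(groups)
```

Each resulting group could then be sent as its own batch to `regional_batch_uri(location)`.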
Region-independent APIs
Region-independent APIs, such as catalog.search and entries.lookup, can be grouped with each other, but must not be grouped with region-dependent APIs.
For region-independent APIs, use the https://datacatalog.googleapis.com/batch endpoint.
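To keep the two kinds of calls apart, pending calls can be partitioned before any batch is built. A minimal sketch: the method labels are caller-supplied strings, `split_by_region_dependence` is a hypothetical helper, and `REGION_INDEPENDENT_METHODS` lists only the `catalog.search` and `entries.lookup` methods of the Data Catalog REST API. Whether other methods are region-independent is not covered here.

```python
GLOBAL_BATCH_URI = "https://datacatalog.googleapis.com/batch"

# Assumption: only these two methods are treated as region-independent here.
REGION_INDEPENDENT_METHODS = frozenset({"catalog.search", "entries.lookup"})


def split_by_region_dependence(calls):
    """Split (method, request) pairs into two lists.

    Returns (independent, dependent). The first list can go into one batch
    sent to GLOBAL_BATCH_URI; the second needs regional batches.
    """
    independent, dependent = [], []
    for method, request in calls:
        if method in REGION_INDEPENDENT_METHODS:
            independent.append((method, request))
        else:
            dependent.append((method, request))
    return independent, dependent
```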
Example
This sample Python application demonstrates how to use HTTP batch requests to create multiple tags from a tag template with the Data Catalog API.
```python
from googleapiclient.discovery import build
from googleapiclient.http import BatchHttpRequest
from oauth2client.service_account import ServiceAccountCredentials
import uuid

#-------------------------------------------------------------#
# 0. Helper and initialization logic
#-------------------------------------------------------------#

# Set the environment configuration.
service_key_file_location = '[SA_PATH]'

project_id = '[MY_PROJECT_ID]'


# Helper container to store results.
class DataContainer:
    def __init__(self):
        self.data = {}

    def callback(self, request_id, response, exception):
        if exception is not None:
            print('request_id: {}, exception: {}'.format(request_id, str(exception)))
        else:
            print(request_id)
            self.data[request_id] = response


# Helper function to build the Discovery Service config.
def get_service(api_name, api_version, scopes, key_file_location):
    """
    Get a service that communicates to a Google API.

    Args:
        api_name: The name of the API to connect to.
        api_version: The API version to connect to.
        scopes: A list of auth scopes to authorize for the application.
        key_file_location: The path to a valid service account JSON key file.

    Returns:
        A service that is connected to the specified API.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        key_file_location, scopes=scopes)

    # Build the service object.
    service = build(api_name, api_version, credentials=credentials)

    return service


# Helper function to create a UUID for each request.
def generated_uui():
    return str(uuid.uuid4())


def create_batch_request(callback):
    # For more info on supported regions
    # check: https://cloud.google.com/data-catalog/docs/concepts/regions

    region = 'us-datacatalog.googleapis.com'

    return BatchHttpRequest(batch_uri='https://{}/batch'.format(region), callback=callback)


container = DataContainer()

# Scope to set up the Discovery Service config.
scope = 'https://www.googleapis.com/auth/cloud-platform'

# Create service.
service = get_service(
    api_name='datacatalog',
    api_version='v1',
    scopes=[scope],
    key_file_location=service_key_file_location)

# Create the batch request config.
batch = create_batch_request(container.callback)

#-------------------------------------------------------------#
# 1. Start by fetching a list of entries using search call
#-------------------------------------------------------------#

# Create the search request body.
# This example searches for all BigQuery tables in a project.
search_request_body = {
    'query': 'type=TABLE system=BIGQUERY',
    'scope': {'includeProjectIds': [project_id]}
}

# Generate a unique ID for the request.
request_id = generated_uui()

# Add the request to the batch client.
batch.add(service.catalog().search(body=search_request_body), request_id=request_id)

# Execute the batch request.
batch.execute()

# Uncomment to verify the full response from search.
# print(container.data)

response = container.data[request_id]

results = response['results']

first_table = results[0]

# Verify that a first table is present.
print(first_table)

second_table = results[1]

# Verify that a second table is present.
print(second_table)

#-------------------------------------------------------------------#
# 2. Send the batch request to attach tags over the entire result set
#-------------------------------------------------------------------#

# Create a new container.
container = DataContainer()

# Create a new batch request.
batch = create_batch_request(container.callback)

# Set the template name config.
template_name = 'projects/[MY_PROJECT_ID]/locations/[MY-LOCATION]/tagTemplates/[MY-TEMPLATE-NAME]'

for result in results:
    # Generate a unique ID for the request.
    request_id = generated_uui()

    # Add the entry name as the tag parent.
    parent = result['relativeResourceName']

    # Create the tag request body.
    create_tag_request_body = {
        'template': template_name,
        # CHANGE for your template field values.
        'fields': {'etl_score': {'doubleValue': 0.5}}
    }

    # Add requests to the batch client.
    batch.add(service.projects().locations().
              entryGroups().entries().tags().
              create(body=create_tag_request_body,
                     parent=parent),
              request_id=request_id)

# Execute the batch request.

# Because the batch client works with regions,
# if you receive HttpError 400 errors:
# 1. Verify the region you used to create the batch client.
# 2. Verify the region where the Entry is located.
# 3. Verify the region of the parent tag template used by the tag.

batch.execute()

# Uncomment to verify the full response from tag creation.
# print(container.data)
```
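A single batch request is limited to 1000 calls, so a large search result set has to be split across several batches. The following is a minimal sketch of that chunking; `chunked` and `MAX_CALLS_PER_BATCH` are hypothetical names, and the batch objects themselves would be created as in the sample above.

```python
MAX_CALLS_PER_BATCH = 1000  # Limit on calls in one batch request.


def chunked(items, size=MAX_CALLS_PER_BATCH):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each chunk of search results would then get its own batch: create a new batch for the chunk, add one tag creation request per result, and call `execute()` before moving on to the next chunk.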