Import documents from Cloud Storage
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
- This content provides a Python code sample for importing documents from Google Cloud Storage into a Vertex AI Agent Builder data store.
- The process involves setting up Application Default Credentials for authentication and configuring client options based on the data store's location.
- The code sample shows how to import both unstructured documents and documents with metadata using file formats such as PDF, JSONL, and CSV, and how to select the right data schema.
- The sample uses `ImportDocumentsRequest` with `GcsSource` to specify the location of the files in Cloud Storage and the type of the data, then triggers the import operation in either `FULL` or `INCREMENTAL` reconciliation mode.
- The documentation also points to further resources: the Google Cloud sample browser for more code samples, and the Vertex AI Agent Builder Python API documentation.

Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Create a custom recommendations data store](/generative-ai-app-builder/docs/create-data-store-recommendations)
- [Create a search data store](/generative-ai-app-builder/docs/create-data-store-es)
- [Refresh structured and unstructured data](/agentspace/docs/refresh-data)
- [Refresh structured and unstructured data](/generative-ai-app-builder/docs/refresh-data)

Code sample
-----------

### Python

For more information, see the
[AI Applications Python API reference documentation](/python/docs/reference/discoveryengine/latest).

To authenticate to AI Applications, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.api_core.client_options import ClientOptions
    from google.cloud import discoveryengine

    # TODO(developer): Uncomment these variables before running the sample.
    # project_id = "YOUR_PROJECT_ID"
    # location = "YOUR_LOCATION" # Values: "global"
    # data_store_id = "YOUR_DATA_STORE_ID"

    # Examples:
    # - Unstructured documents
    #   - `gs://bucket/directory/file.pdf`
    #   - `gs://bucket/directory/*.pdf`
    # - Unstructured documents with JSONL Metadata
    #   - `gs://bucket/directory/file.json`
    # - Unstructured documents with CSV Metadata
    #   - `gs://bucket/directory/file.csv`
    # gcs_uri = "YOUR_GCS_PATH"

    # For more information, refer to:
    # https://cloud.google.com/generative-ai-app-builder/docs/locations#specify_a_multi-region_for_your_data_store
    client_options = (
        ClientOptions(api_endpoint=f"{location}-discoveryengine.googleapis.com")
        if location != "global"
        else None
    )

    # Create a client
    client = discoveryengine.DocumentServiceClient(client_options=client_options)

    # The full resource name of the search engine branch.
    # e.g. projects/{project}/locations/{location}/dataStores/{data_store_id}/branches/{branch}
    parent = client.branch_path(
        project=project_id,
        location=location,
        data_store=data_store_id,
        branch="default_branch",
    )

    request = discoveryengine.ImportDocumentsRequest(
        parent=parent,
        gcs_source=discoveryengine.GcsSource(
            # Multiple URIs are supported
            input_uris=[gcs_uri],
            # Options:
            # - `content` - Unstructured documents (PDF, HTML, DOC, TXT, PPTX)
            # - `custom` - Unstructured documents with custom JSONL metadata
            # - `document` - Structured documents in the discoveryengine.Document format.
            # - `csv` - Unstructured documents with CSV metadata
            data_schema="content",
        ),
        # Options: `FULL`, `INCREMENTAL`
        reconciliation_mode=discoveryengine.ImportDocumentsRequest.ReconciliationMode.INCREMENTAL,
    )

    # Make the request
    operation = client.import_documents(request=request)

    print(f"Waiting for operation to complete: {operation.operation.name}")
    response = operation.result()

    # After the operation is complete,
    # get information from operation metadata
    metadata = discoveryengine.ImportDocumentsMetadata(operation.metadata)

    # Handle the response
    print(response)
    print(metadata)

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=genappbuilder).
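The sample above makes three small decisions before it calls the API: which endpoint to use for the data store's location, which `data_schema` value matches the input files, and how the branch resource name is assembled. The following is a minimal sketch of those decisions as plain Python helpers, with no dependency on the client library. The helper names (`api_endpoint_for`, `guess_data_schema`, `branch_resource_name`) are hypothetical and not part of the `discoveryengine` package, and choosing the schema from the file extension is an assumption made for illustration: a JSON file may instead hold structured documents, in which case `document` would be the appropriate value.

```python
from typing import Optional


def api_endpoint_for(location: str) -> Optional[str]:
    """Return the regional endpoint, or None so the client uses its default for "global"."""
    if location == "global":
        return None
    return f"{location}-discoveryengine.googleapis.com"


def guess_data_schema(gcs_uri: str) -> str:
    """Guess a data_schema value from the file extension (hypothetical heuristic)."""
    lowered = gcs_uri.lower()
    if lowered.endswith(".csv"):
        # Unstructured documents with CSV metadata
        return "csv"
    if lowered.endswith((".json", ".jsonl")):
        # Assumption: the JSON file carries custom metadata for unstructured
        # documents. Structured Document-format files need "document" instead.
        return "custom"
    # Anything else (PDF, HTML, DOC, TXT, PPTX) is treated as raw content.
    return "content"


def branch_resource_name(
    project_id: str,
    location: str,
    data_store_id: str,
    branch: str = "default_branch",
) -> str:
    """Build the branch resource name in the format shown in the sample's comment."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/dataStores/{data_store_id}/branches/{branch}"
    )
```

With helpers like these, the value returned by `guess_data_schema(gcs_uri)` could be passed as `data_schema` when building `GcsSource`, and `api_endpoint_for(location)` could decide whether a `ClientOptions` object is needed at all.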