Import documents from Cloud Storage
- This content provides a Python code sample for importing documents from Google Cloud Storage into a Vertex AI Agent Builder data store.
- The process involves setting up Application Default Credentials for authentication and configuring client options based on the data store's location.
- The code sample shows how to import both unstructured documents and documents with metadata using different file formats like PDF, JSONL, and CSV, and how to select the right data schema.
- The sample uses the `ImportDocumentsRequest` with `GcsSource` to specify the location of the files in Cloud Storage and the type of the data, then triggers the import operation with either the `FULL` or the `INCREMENTAL` reconciliation mode.
- The documentation also includes instructions for further actions, such as searching for more code samples using the Google Cloud sample browser, and links to the Vertex AI Agent Builder Python API documentation.

Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Create a custom recommendations data store](/generative-ai-app-builder/docs/create-data-store-recommendations)
- [Create a search data store](/generative-ai-app-builder/docs/create-data-store-es)
- [Refresh structured and unstructured data](/agentspace/docs/refresh-data)
- [Refresh structured and unstructured data](/generative-ai-app-builder/docs/refresh-data)

Code sample
-----------

### Python

For more information, see the
[AI Applications Python API reference documentation](/python/docs/reference/discoveryengine/latest).

To authenticate to AI Applications, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.api_core.client_options import ClientOptions
    from google.cloud import discoveryengine

    # TODO(developer): Uncomment these variables before running the sample.
    # project_id = "YOUR_PROJECT_ID"
    # location = "YOUR_LOCATION" # Values: "global"
    # data_store_id = "YOUR_DATA_STORE_ID"

    # Examples:
    # - Unstructured documents
    #   - `gs://bucket/directory/file.pdf`
    #   - `gs://bucket/directory/*.pdf`
    # - Unstructured documents with JSONL Metadata
    #   - `gs://bucket/directory/file.json`
    # - Unstructured documents with CSV Metadata
    #   - `gs://bucket/directory/file.csv`
    # gcs_uri = "YOUR_GCS_PATH"

    # For more information, refer to:
    # https://cloud.google.com/generative-ai-app-builder/docs/locations#specify_a_multi-region_for_your_data_store
    client_options = (
        ClientOptions(api_endpoint=f"{location}-discoveryengine.googleapis.com")
        if location != "global"
        else None
    )

    # Create a client
    client = discoveryengine.DocumentServiceClient(client_options=client_options)

    # The full resource name of the search engine branch.
    # e.g. projects/{project}/locations/{location}/dataStores/{data_store_id}/branches/{branch}
    parent = client.branch_path(
        project=project_id,
        location=location,
        data_store=data_store_id,
        branch="default_branch",
    )

    request = discoveryengine.ImportDocumentsRequest(
        parent=parent,
        gcs_source=discoveryengine.GcsSource(
            # Multiple URIs are supported
            input_uris=[gcs_uri],
            # Options:
            # - `content` - Unstructured documents (PDF, HTML, DOC, TXT, PPTX)
            # - `custom` - Unstructured documents with custom JSONL metadata
            # - `document` - Structured documents in the discoveryengine.Document format.
            # - `csv` - Unstructured documents with CSV metadata
            data_schema="content",
        ),
        # Options: `FULL`, `INCREMENTAL`
        reconciliation_mode=discoveryengine.ImportDocumentsRequest.ReconciliationMode.INCREMENTAL,
    )

    # Make the request
    operation = client.import_documents(request=request)

    print(f"Waiting for operation to complete: {operation.operation.name}")
    response = operation.result()

    # After the operation is complete,
    # get information from operation metadata
    metadata = discoveryengine.ImportDocumentsMetadata(operation.metadata)

    # Handle the response
    print(response)
    print(metadata)

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=genappbuilder).
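The sample above hard-codes two decisions that depend on its inputs: which API endpoint to use for a given `location`, and which `data_schema` value matches the files at `gcs_uri`. The following is a minimal sketch of how those two choices could be factored into pure helper functions. The names `api_endpoint_for` and `guess_data_schema` are hypothetical and not part of the `discoveryengine` library, and the extension-to-schema mapping is only an assumption drawn from the example URIs in the sample's comments. In particular, a JSONL file may also hold structured documents (`document` schema), which cannot be detected from the extension alone.

```python
import posixpath
from typing import Optional


def api_endpoint_for(location: str) -> Optional[str]:
    """Return the regional endpoint host, or None to use the default
    endpoint when the data store lives in the "global" location."""
    if location == "global":
        return None
    return f"{location}-discoveryengine.googleapis.com"


# Assumed mapping, based only on the example URIs in the sample's comments.
_SCHEMA_BY_EXTENSION = {
    ".json": "custom",   # unstructured documents with JSONL metadata
    ".jsonl": "custom",
    ".csv": "csv",       # unstructured documents with CSV metadata
}


def guess_data_schema(gcs_uri: str) -> str:
    """Guess a data_schema value from the file extension of a gs:// URI.

    Anything that is not a known metadata format is treated as raw
    unstructured content (PDF, HTML, DOC, TXT, PPTX).
    """
    if not gcs_uri.startswith("gs://"):
        raise ValueError(f"Expected a gs:// URI, got: {gcs_uri!r}")
    extension = posixpath.splitext(gcs_uri)[1].lower()
    return _SCHEMA_BY_EXTENSION.get(extension, "content")
```

With helpers like these, the result of `api_endpoint_for(location)` could decide whether to construct `ClientOptions` at all, and `guess_data_schema(gcs_uri)` could supply the `data_schema` argument of `GcsSource`. When the files are structured documents, pass `"document"` explicitly instead of relying on the guess.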