Toolbox - Export entities to BigQuery
Export entities from a processed document (or document shards) to a BigQuery table.
Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Document AI Toolbox client libraries](/document-ai/docs/toolbox)
- [Handle processing response](/document-ai/docs/handle-response)

Code sample
-----------

### Python

For more information, see the
[Document AI Python API reference documentation](/python/docs/reference/documentai/latest).

To authenticate to Document AI, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

```python
from google.cloud.documentai_toolbox import document

# TODO(developer): Uncomment these variables before running the sample.
# Given a document.proto or sharded document.proto in path gs://bucket/path/to/folder
# gcs_bucket_name = "bucket"
# gcs_prefix = "path/to/folder"
# dataset_name = "test_dataset"
# table_name = "test_table"
# project_id = "YOUR_PROJECT_ID"


def entities_to_bigquery_sample(
    gcs_bucket_name: str,
    gcs_prefix: str,
    dataset_name: str,
    table_name: str,
    project_id: str,
) -> None:
    # Load the processed document (or its shards) from Cloud Storage.
    wrapped_document = document.Document.from_gcs(
        gcs_bucket_name=gcs_bucket_name, gcs_prefix=gcs_prefix
    )

    # Export the document's entities to the target BigQuery table.
    # The call returns the BigQuery job used for the export.
    job = wrapped_document.entities_to_bigquery(
        dataset_name=dataset_name, table_name=table_name, project_id=project_id
    )

    # Also supported:
    # job = wrapped_document.form_fields_to_bigquery(
    #     dataset_name=dataset_name, table_name=table_name, project_id=project_id
    # )

    print("Document entities loaded into BigQuery")
    print(f"Job ID: {job.job_id}")
    print(f"Table: {job.destination.path}")
```

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=documentai).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
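
Sketch: deriving the sample's arguments
---------------------------------------

The sample's TODO comment describes the input as a path of the form `gs://bucket/path/to/folder`, while `entities_to_bigquery_sample` takes the bucket name and prefix as separate arguments. The following is a minimal sketch, using only the Python standard library, of how those values might be derived from a single URI. The helpers `split_gcs_uri` and `table_id` are hypothetical names introduced for illustration and are not part of the Document AI Toolbox. The dot-separated `project.dataset.table` form is assumed here as the identifier you would use to refer to the destination table in later BigQuery queries.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_gcs_uri(gcs_uri: str) -> Tuple[str, str]:
    """Split a gs://bucket/prefix URI into (bucket_name, prefix).

    Hypothetical helper for illustration; not a Document AI Toolbox function.
    """
    parsed = urlparse(gcs_uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Expected a gs://bucket/prefix URI, got: {gcs_uri!r}")
    # The URL path starts with "/", which is not part of the object prefix.
    return parsed.netloc, parsed.path.lstrip("/")


def table_id(project_id: str, dataset_name: str, table_name: str) -> str:
    """Build a dot-separated table identifier (hypothetical helper)."""
    return f"{project_id}.{dataset_name}.{table_name}"


# Placeholder values taken from the sample's TODO comment.
gcs_bucket_name, gcs_prefix = split_gcs_uri("gs://bucket/path/to/folder")
print(gcs_bucket_name)
print(gcs_prefix)
print(table_id("YOUR_PROJECT_ID", "test_dataset", "test_table"))
```

The resulting `gcs_bucket_name` and `gcs_prefix` can then be passed, together with the dataset, table, and project values, to `entities_to_bigquery_sample`.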