# Toolbox - Convert external annotations to Document format
Convert external annotations to the [`Document`](/document-ai/docs/reference/rest/v1/Document) format used by Document AI Workbench for training.
## Explore further
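For detailed documentation that includes this code sample, see the following:

- [Document AI Toolbox client libraries](/document-ai/docs/toolbox)
- [Handle processing response](/document-ai/docs/handle-response)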
## Code sample
### Python

For more information, see the [Document AI Python API reference documentation](/python/docs/reference/documentai/latest).

To authenticate to Document AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud.documentai_toolbox import converter

    # This sample converts external annotations to the Document.json format
    # used by Document AI Workbench for training.
    #
    # At a minimum, each external annotation must provide:
    #       1) Type
    #       2) Text
    #       3) Bounding box (in one of the three supported layouts below)
    #
    # For better accuracy, also provide:
    #       1) Document width and height
    #
    # Bounding box types:
    #   Type 1:
    #       bounding_box:[{"x":1,"y":2},{"x":2,"y":2},{"x":2,"y":3},{"x":1,"y":3}]
    #   Type 2:
    #       bounding_box:{ "Width": 1, "Height": 1, "Left": 1, "Top": 1}
    #   Type 3:
    #       bounding_box: [1,2,2,2,2,3,1,3]
    #
    #   Note: If these types are not sufficient, you can file a feature request
    #   or contribute the new type and its conversion functionality.
    #
    # Given a folder in gcs_input_path with the following structure:
    #
    # gs://path/to/input/folder
    #   ├──test_annotations.json
    #   ├──test_config.json
    #   └──test.pdf
    #
    # An example of the config is in sample-converter-configs/Azure/form-config.json
    #
    # TODO(developer): Uncomment these variables before running the sample.
    # All values are placeholders.
    # project_id = "my_project_id"
    # location = "us"
    # processor_id = "my_processor_id"
    # gcs_input_path = "gs://path/to/input/folder"
    # gcs_output_path = "gs://path/to/output/folder"


    def convert_external_annotations_sample(
        location: str,
        processor_id: str,
        project_id: str,
        gcs_input_path: str,
        gcs_output_path: str,
    ) -> None:
        # Read the config and annotations under gcs_input_path and write the
        # converted Document files to gcs_output_path.
        converter.convert_from_config(
            project_id=project_id,
            location=location,
            processor_id=processor_id,
            gcs_input_path=gcs_input_path,
            gcs_output_path=gcs_output_path,
        )

## What's next

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=documentai).

Unless otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
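The three bounding box layouts listed in the sample's comments describe the same kind of rectangle in different shapes. The sketch below is an illustration only and is not the Toolbox's own conversion code: `to_vertices` and `normalize` are hypothetical helper names, and it assumes that `Left`/`Top` mark the top-left corner with y growing downward, and that the flat list in Type 3 alternates x and y values. It shows how each layout could be reduced to four `{"x", "y"}` vertices, and why the document width and height matter if the coordinates need to be scaled to a 0 to 1 range.

```python
from typing import Dict, List, Sequence, Union

Vertex = Dict[str, float]
RawBox = Union[Sequence[Vertex], Dict[str, float], Sequence[float]]


def to_vertices(bounding_box: RawBox) -> List[Vertex]:
    """Reduce one of the three bounding box layouts to a list of vertices."""
    # Type 2: a dict with Width, Height, Left, and Top keys.
    # Assumption: (Left, Top) is the top-left corner and y grows downward.
    if isinstance(bounding_box, dict):
        left, top = bounding_box["Left"], bounding_box["Top"]
        right = left + bounding_box["Width"]
        bottom = top + bounding_box["Height"]
        return [
            {"x": left, "y": top},
            {"x": right, "y": top},
            {"x": right, "y": bottom},
            {"x": left, "y": bottom},
        ]
    # Type 1: already a list of {"x", "y"} vertices; copy it as-is.
    if bounding_box and isinstance(bounding_box[0], dict):
        return [{"x": v["x"], "y": v["y"]} for v in bounding_box]
    # Type 3: a flat list, assumed to alternate x and y values.
    if len(bounding_box) % 2 != 0:
        raise ValueError("a flat bounding box needs an even number of values")
    return [
        {"x": bounding_box[i], "y": bounding_box[i + 1]}
        for i in range(0, len(bounding_box), 2)
    ]


def normalize(vertices: List[Vertex], width: float, height: float) -> List[Vertex]:
    """Scale vertices into the 0..1 range using the document width and height."""
    return [{"x": v["x"] / width, "y": v["y"] / height} for v in vertices]
```

With these helpers, the Type 1 and Type 3 examples from the comments reduce to the same four vertices, and the Type 2 example becomes the unit square whose top-left corner is at (1, 1).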