Toolbox - Split a PDF
Split a PDF file based on output from a Splitter/Classifier processor.
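To make "output from a Splitter/Classifier processor" concrete, here is a minimal sketch of how split points could be read from such output. It is not the toolbox's implementation. It assumes the processor result is a Document in JSON form whose entities each carry a `type` and a `pageAnchor.pageRefs` list with 0-based page indexes, and that a missing `page` key means page 0. `SAMPLE_DOCUMENT` and `page_ranges` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical, trimmed-down stand-in for a splitter/classifier Document in JSON form.
# Assumption: each entity has a `type` and a `pageAnchor.pageRefs` list, page indexes
# are 0-based, and a missing `page` key is read as page 0.
SAMPLE_DOCUMENT: Dict[str, Any] = {
    "entities": [
        {"type": "invoice", "pageAnchor": {"pageRefs": [{}, {"page": "1"}]}},
        {"type": "receipt", "pageAnchor": {"pageRefs": [{"page": "2"}]}},
    ]
}


def page_ranges(doc: Dict[str, Any]) -> List[Tuple[str, int, int]]:
    """Return (type, first_page, last_page) per entity, 0-based and inclusive."""
    ranges = []
    for entity in doc.get("entities", []):
        refs = entity.get("pageAnchor", {}).get("pageRefs", [])
        if not refs:
            continue  # no page information, so nothing to split on
        pages = [int(ref.get("page", 0)) for ref in refs]
        ranges.append((entity.get("type", "unknown"), min(pages), max(pages)))
    return ranges


for doc_type, first, last in page_ranges(SAMPLE_DOCUMENT):
    # Print 1-based page numbers, matching how PDF viewers number pages.
    print(f"{doc_type}: pages {first + 1}-{last + 1}")
```

Each tuple describes one sub-document: its predicted type and the page span a splitting step would copy into a separate PDF.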
Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Document AI Toolbox client libraries](/document-ai/docs/toolbox)
- [Document splitters behavior](/document-ai/docs/splitters)
- [Handle processing response](/document-ai/docs/handle-response)
- [Splitters behavior](/document-ai/docs/splitters-behavior)

Code sample
-----------

### Python

For more information, see the
[Document AI Python API reference documentation](/python/docs/reference/documentai/latest).

To authenticate to Document AI, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud.documentai_toolbox import document

    # TODO(developer): Uncomment these variables before running the sample.
    # Given a local document.proto or sharded document.proto from a splitter/classifier in path
    # document_path = "path/to/local/document.json"
    # pdf_path = "path/to/local/document.pdf"
    # output_path = "resources/output/"


    def split_pdf_sample(document_path: str, pdf_path: str, output_path: str) -> None:
        wrapped_document = document.Document.from_document_path(document_path=document_path)

        output_files = wrapped_document.split_pdf(
            pdf_path=pdf_path, output_path=output_path
        )

        print("Document Successfully Split")
        for output_file in output_files:
            print(output_file)

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=documentai).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.