Toolbox - Output table to DataFrame or CSV
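Export tables from a processed document (or document shards) to a Pandas DataFrame or a CSV file. The sample also writes each table to HTML and Markdown files.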
Explore further

For detailed documentation that includes this code sample, see the following:

- [Document AI Toolbox client libraries](/document-ai/docs/toolbox)
- [Handle processing response](/document-ai/docs/handle-response)

Code sample

Python

For more information, see the [Document AI Python API reference documentation](/python/docs/reference/documentai/latest).

To authenticate to Document AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud.documentai_toolbox import document

    # TODO(developer): Uncomment these variables before running the sample.
    # Given a local document.proto or sharded document.proto in path
    # document_path = "path/to/local/document.json"
    # output_file_prefix = "output/table"


    def table_sample(document_path: str, output_file_prefix: str) -> None:
        # Load the processed document (or its shards) from the local path
        wrapped_document = document.Document.from_document_path(document_path=document_path)

        print("Tables in Document")
        for page in wrapped_document.pages:
            for table_index, table in enumerate(page.tables):
                # Convert table to Pandas DataFrame
                # Refer to https://pandas.pydata.org/docs/reference/frame.html for all supported methods
                df = table.to_dataframe()
                print(df)

                # One file name per table: <prefix>-<page number>-<table index>
                output_filename = f"{output_file_prefix}-{page.page_number}-{table_index}"

                # Write DataFrame to CSV file
                df.to_csv(f"{output_filename}.csv", index=False)

                # Write DataFrame to HTML file
                df.to_html(f"{output_filename}.html", index=False)

                # Write DataFrame to Markdown file (pandas needs the optional tabulate package for this)
                df.to_markdown(f"{output_filename}.md", index=False)

What's next

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=documentai).
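The code sample names each output file `{output_file_prefix}-{page_number}-{table_index}` and passes that name straight to the pandas writers. Those writers generally do not create missing parent folders, so a prefix such as `output/table` is likely to fail if `output/` does not exist yet. The following is a minimal sketch of one way to handle this, using only the Python standard library. The helper name `table_output_path` and its parameters are hypothetical and are not part of Document AI Toolbox.

```python
from pathlib import Path


def table_output_path(
    output_file_prefix: str, page_number: int, table_index: int, suffix: str
) -> Path:
    """Build the output path for one table and make sure its folder exists.

    Hypothetical helper for illustration; not part of Document AI Toolbox.
    """
    # Same naming scheme as the sample: <prefix>-<page number>-<table index>
    path = Path(f"{output_file_prefix}-{page_number}-{table_index}{suffix}")
    # Create the parent folder (for example "output/") if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Inside the sample's loop, a call such as `df.to_csv(table_output_path(output_file_prefix, page.page_number, table_index, ".csv"), index=False)` could then replace the hand-built file name. The same helper works for the `.html` and `.md` outputs.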