Build a document summarizer in the Google Cloud console
You can create a summarizer processor using Document AI to summarize the content of documents. The output can be customized based on length and format.
Here is some sample JSON output from the resulting entity:
{
"type": "summary",
"mentionText": " Superconductivity is a phenomenon in which a material conducts
electricity with no resistance. It was discovered in 1911 by Dutch physicist Heike
Kamerlingh Onnes. In 1986, a new class of materials was discovered that can superconduct
at much higher temperatures. These materials are called high-temperature superconductors.
They have the potential to revolutionize the way we use electricity. However,
high-temperature superconductors are still very expensive to produce. Scientists
are working on ways to make them more affordable.",
"normalizedValue": {
"text": " Superconductivity is a phenomenon in which a material conducts
electricity with no resistance. It was discovered in 1911 by Dutch physicist
Heike Kamerlingh Onnes. In 1986, a new class of materials was discovered that
can superconduct at much higher temperatures. These materials are called
high-temperature superconductors. They have the potential to revolutionize
the way we use electricity. However, high-temperature superconductors are
still very expensive to produce. Scientists are working on ways to make
them more affordable."
}
}
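As a rough illustration of how a client might read this output, the following Python sketch pulls the summary text out of a response that has already been parsed into a dictionary. It assumes the entity shown above appears in the document's `entities` list. `SAMPLE_DOCUMENT` and `extract_summaries` are hypothetical names introduced here, not part of Document AI, and the sample text is shortened to the first sentence of the summary above.

```python
# Hypothetical, trimmed-down response: only the fields shown in the sample
# above, with the summary shortened to its first sentence.
SAMPLE_DOCUMENT = {
    "entities": [
        {
            "type": "summary",
            "mentionText": " Superconductivity is a phenomenon in which a "
            "material conducts electricity with no resistance.",
            "normalizedValue": {
                "text": " Superconductivity is a phenomenon in which a "
                "material conducts electricity with no resistance."
            },
        }
    ]
}


def extract_summaries(document: dict) -> list[str]:
    """Return the text of every entity whose type is "summary"."""
    summaries = []
    for entity in document.get("entities", []):
        if entity.get("type") != "summary":
            continue
        # Prefer normalizedValue.text; fall back to mentionText if it is absent.
        normalized = entity.get("normalizedValue", {}).get("text")
        text = normalized if normalized is not None else entity.get("mentionText", "")
        # The sample output carries a leading space, so strip whitespace.
        summaries.append(text.strip())
    return summaries


if __name__ == "__main__":
    for summary in extract_summaries(SAMPLE_DOCUMENT):
        print(summary)
```

Falling back to `mentionText` is a defensive choice for this sketch. In the sample above both fields hold the same text.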
Procedure
In this quickstart, you create a document summarizer processor, upload a sample document for processing, and create a custom processor version to adjust the summary structure.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the Document AI and Cloud Storage APIs.
Create a summarizer processor
Use the Google Cloud console to create a summarizer processor. See creating and managing processors for more information.
In the Google Cloud console, in the Document AI section, go to the Workbench page.
For Summarizer, select Create processor.
In the Create processor menu, enter a name for your processor, such as quickstart-summarizer.
Select the region closest to you.
Select Create.
Your processor has now been created.
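The same processor could in principle be created outside the console. The sketch below only builds the REST request (URL and JSON body) for creating a processor and does not send it. Sending it would also need an OAuth access token. The function name, `my-project`, and the `us` location are hypothetical, and the processor type string `SUMMARY_PROCESSOR` is an assumption that should be confirmed against the list of processor types available to your project.

```python
import json


def build_create_processor_request(
    project_id: str, location: str, display_name: str, processor_type: str
) -> tuple[str, str]:
    """Build the URL and JSON body for a create-processor REST call.

    Nothing is sent; the caller would POST the body to the URL with an
    Authorization header.
    """
    parent = f"projects/{project_id}/locations/{location}"
    # Document AI uses regional endpoints of the form {location}-documentai.googleapis.com.
    url = f"https://{location}-documentai.googleapis.com/v1/{parent}/processors"
    body = {"displayName": display_name, "type": processor_type}
    return url, json.dumps(body)


if __name__ == "__main__":
    # "SUMMARY_PROCESSOR" is an assumed type name; verify it before use.
    url, body = build_create_processor_request(
        "my-project", "us", "quickstart-summarizer", "SUMMARY_PROCESSOR"
    )
    print(url)
    print(body)
```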
Test processor
You are on the Processor overview page of the processor you just created.
Select the Customize & build tab to experiment with the processor.
Download the sample document. It is a PDF file containing the Wikipedia page for Superconductivity.
Select Upload Test Document and select the document you just downloaded.
You are now on the Summary page. You can view the OCR-detected text and the document summary.
Set Length to Moderate and Format to Bulleted, then select Rewrite and observe the results.
Go back to the Customize & build page.
Deploy processor version
If you want to use specific summarization settings when processing documents with the API, create a processor version for those settings.
The Summarization settings are set to the last values you used on the previous page.
Select Create New Version to create a processor version with the specified Summarization settings.
Enter a name for the processor version, such as quickstart-moderate-bulleted, and select Create Version.
Go to the Deploy & Use tab to view the deployment status. Deployment takes a few minutes.
When the version is deployed, you can set it as the Default version, or you can provide the version ID when processing documents with the API.
To use the Document AI API:
- Follow the code samples in send a processing request to use online and batch processing.
- Refer to Quotas and limits for the number of pages supported for online and batch processing.
- Follow the code samples in Handle the processing response to get the summarization response from the processor.
You have successfully used Document AI to extract text from a document and summarize it.
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, use the Google Cloud console to delete your processor and project if you no longer need them.