Before you begin
Follow the quick start guide to set up the necessary APIs, grant permissions, and provision the instances for your project.
To get access to the web application, join the DW-UI-preview group.
Configure the project
Replace the project number in the following URL and open it in a browser:
https://documentwarehouse.cloud.google.com/provision/<YOUR_PROJECT_NUMBER>
Optionally, enter the Project Display Name. It appears on the website.
Choose your Location. This must be the same location you chose in the provisioning and initialization process.
Select ACL mode. This must be the same mode you chose in the provisioning and initialization process.
If your project is using document-level ACL control:
Enter the Service Account Email and Private Key that you obtained in the provisioning step.
In the next step, set yourself as Document Admin. This step is necessary to save the configuration. Otherwise, you may be prompted to redo the web application provisioning steps.
Schema creation and rule creation are optional. You can do both later in the Document AI Warehouse admin console, after the provisioning setup is complete.
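The URL substitution in the first step above can be sketched as a small helper. This is only an illustration: the function name `provisioning_url` and the input check are assumptions added here, not part of Document AI Warehouse. The check relies on Google Cloud project numbers being purely numeric, which helps catch the common mistake of pasting a project ID instead.

```python
# Minimal sketch: build the web application provisioning URL for a project.
# The helper name and validation are illustrative assumptions.

PROVISION_URL_TEMPLATE = (
    "https://documentwarehouse.cloud.google.com/provision/{project_number}"
)


def provisioning_url(project_number) -> str:
    """Return the provisioning URL for a numeric Google Cloud project number."""
    number = str(project_number).strip()
    # A project number contains only digits; a project ID (for example
    # "my-project") does not, so reject it early.
    if not (number.isascii() and number.isdigit()):
        raise ValueError(
            f"expected a numeric project number, got {project_number!r}"
        )
    return PROVISION_URL_TEMPLATE.format(project_number=number)


if __name__ == "__main__":
    # "123456789012" is a placeholder; substitute your own project number.
    print(provisioning_url("123456789012"))
```

Open the printed URL in a browser to continue with the remaining steps.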
Change the configurations of the project
In the admin console, go to Project Configuration.
Optionally, enter the Project Display Name. It appears on the website.
Choose your Location. This must be the same location you chose in the provisioning and initialization process.
Select ACL mode. This must be the same mode you chose in the provisioning and initialization process.
If your project is using document-level ACL control, enter the Service Account Email and Private Key that you obtained in the provisioning step.
Set up project-level access controls
Grant project-level access controls (ACLs) to users. There are four Document AI Warehouse roles: Document Creator, Document Viewer, Document Editor, and Document Admin. The following information about the roles is important:
The Document Creator role is typically granted to all users, which lets the users create documents.
- We recommend that you assign this role to a group of users, such as Doc Owners, who are expected to create documents in Document AI Warehouse and manage the group.
- By default, the creator is automatically granted the document-level Document Admin role on the documents they create.
The Document Viewer, Document Editor, and Document Admin roles must be used with caution and granted only to select administrators. Users with these roles have permission to view, edit, share, or delete all documents in the project, so we recommend granting them to operators only temporarily, for cleanup or audit needs. Document-level ACLs can be granted later by the Document Admin of each document.
Configure schemas
Configure the schema for documents and folders. A document schema is used to define the document structure in Document AI Warehouse. For more information about document schemas, see Manage document schemas.
(Optional) Process documents using Document AI
You can map Document AI processors to a Document AI Warehouse schema and use those processors to extract text and data. When a document is uploaded with a mapped schema, Document AI Warehouse uses the corresponding Document AI processor to extract the document properties based on the mapping. The following requirements and recommendations apply:
The Document AI processors must be in the same project.
To make uploaded documents full-text searchable in Document AI Warehouse, we recommend using the Document AI OCR processor for PDF document types that don't have specialized processors.
Multiple processors can be mapped to a schema. The user can specify which processor to use for extraction when uploading documents.
The throughput quota for Document AI processors (around 10 QPS) is lower than the ingest throughput supported by the Document AI Warehouse Create API. Therefore, batch pipelines and scenarios with many concurrent user uploads run slower than typical ingests.
If you need to use custom models for classification and extraction, convert the extracted data into a Document AI Warehouse API JSON format and ingest the data using the Create API.
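Because the processor quota mentioned above is around 10 QPS, a batch pipeline may want to pace its own uploads rather than run into quota errors. The following is a minimal client-side sketch, not a Document AI Warehouse feature: the `throttle` helper, its parameters, and the file names are hypothetical, and the default rate simply reuses the approximate figure from the text. Check your project's actual quota before relying on it.

```python
# Minimal sketch: pace a stream of uploads to stay under a requests-per-second
# budget. All names here are illustrative assumptions, not a product API.
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttle(
    items: Iterable[T],
    max_per_second: float = 10.0,  # approximate quota from the text above
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than max_per_second, sleeping between them."""
    if max_per_second <= 0:
        raise ValueError("max_per_second must be positive")
    min_interval = 1.0 / max_per_second
    next_slot = clock()  # earliest time the next item may be released
    for item in items:
        wait = next_slot - clock()
        if wait > 0:
            sleep(wait)
        # Schedule the following item one interval after this one is released.
        next_slot = max(clock(), next_slot) + min_interval
        yield item


if __name__ == "__main__":
    # Placeholder file names; replace the print call with your own upload
    # call, which is not shown here.
    for path in throttle(["invoice-1.pdf", "invoice-2.pdf", "invoice-3.pdf"]):
        print("uploading", path)
```

The clock and sleep functions are passed in as parameters so the pacing logic can be exercised in tests without real delays. A single-process throttle like this does not coordinate across multiple concurrent uploaders, which would need a shared limiter.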
Troubleshooting
To request access to the Document AI Warehouse web application, join the DW-UI-preview group.
If you see messages like "you do not have access to Partner Dash", follow this step to resolve the issue.
Next steps
For more information, learn how to upload documents.