Add schemas

In addition to the default schema added during initialization, you can add more schemas on the schemas page.

  1. Make sure you are a Document AI Warehouse admin (roles/contentwarehouse.admin) in Cloud Identity.

  2. On the left navigation panel, go to schemas.

  3. Add a new schema. This option appears only if you are a Document AI Warehouse admin.

  4. Enter a schema display name, select a schema type, and create the schema.

    • Select Document schema type if it is a document schema.
    • Select Folder schema type and a folder type if it is a folder schema.

  5. After the schema is created, the page redirects to the new schema, where you can add a property.

  6. In the Create new property panel, enter a property name. The property name must begin with a letter and contain no white space. Use the Document AI entity name as the property name if possible. Doing so automatically maps the extracted entities to the property and indexes the entity values so that they are filterable and searchable. The mapping is case sensitive.

    If you prefer a separate display name, uncheck Use property name as display name and enter a display name.

  7. Select a property type.

    For the enum type, enter each possible value by adding an item in the possible enum values section that appears.

  8. Choose whether the property has multiple values or is required.

  9. Map Document AI entities to properties:

    • If your property name matches the Document AI entity name, the entity is directly mapped to the property, and the property value is indexed to be filterable and searchable. The match is case sensitive.
    • If the property name differs from the Document AI entity name, specify in the Mapping section how the Document AI extraction results map to the Document AI Warehouse property. See the documentation for more details.

    Some common processor display names and the corresponding processor types:

    Display Name                 Processor Type
    Invoice Parser               INVOICE_PROCESSOR
    W2 Parser                    FORM_W2_PROCESSOR
    W9 Parser                    FORM_W9_PROCESSOR
    1040 Parser                  FORM_1040_PROCESSOR
    US Driver License Parser     US_DRIVER_LICENSE_PROCESSOR
    US Passport Parser           US_PASSPORT_PROCESSOR
    Custom Document Extractor    CUSTOM_EXTRACTION_PROCESSOR

    To get a full list of processor types, use fetchProcessorTypes. See the fields detected documentation for the entities that each processor can extract.

  10. Expand the advanced section to configure whether the property is searchable and whether it is filterable.

After creation, the new property is added to the schema.
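
The naming rule in step 6 and the case-sensitive matching in step 9 can be checked locally before you enter names in the console. The following is a minimal sketch, not part of the product: the helper names is_valid_property_name and split_entities_by_mapping are hypothetical, and it assumes that "letter" means an ASCII letter and that the two stated rules are the only constraints. The service may enforce more.

```python
import re

# Assumption: a valid name is one ASCII letter followed by any number of
# non-whitespace characters. The service may apply stricter rules.
_PROPERTY_NAME = re.compile(r"[A-Za-z]\S*")


def is_valid_property_name(name: str) -> bool:
    """Return True if the name begins with a letter and has no white space."""
    return _PROPERTY_NAME.fullmatch(name) is not None


def split_entities_by_mapping(entity_types, property_names):
    """Split entity types into directly mapped and needing explicit mapping.

    The comparison is an exact, case-sensitive string match, mirroring the
    rule that the property name must match the entity name exactly.
    """
    names = set(property_names)
    direct = [e for e in entity_types if e in names]
    needs_mapping = [e for e in entity_types if e not in names]
    return direct, needs_mapping


if __name__ == "__main__":
    # Illustrative values only; substitute the entity types your processor returns.
    entities = ["invoice_id", "Total_Amount"]
    properties = ["invoice_id", "total_amount"]
    direct, needs_mapping = split_entities_by_mapping(entities, properties)
    print("Mapped directly:", direct)
    print("Needs an entry in the Mapping section:", needs_mapping)
```

In this sketch, an entity type that differs from a property name only by letter case lands in the second list, which corresponds to the case where you specify the mapping in the Mapping section.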