Limits

This document lists the system limits that apply to Document AI. Unlike quotas, system limits can't be changed.

Content limits

The following content limits apply to all Document AI processors.

Content limit Value
Maximum image resolution (limit does not apply to PDF files) 40 megapixels (per page if the image contains multiple pages)
Maximum file size for online processing requests 20 MB
Maximum file size for batch processing requests 1 GB
Files per batch processing request 5,000 files
Human-in-the-Loop pages per document 10 pages
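
A client can check the content limits above before it sends a request. The following is a minimal sketch, not part of any Document AI client library: the function names are hypothetical, and treating 1 MB as 1,000,000 bytes and 1 GB as 1,000,000,000 bytes is an assumption, because the table does not say whether decimal or binary units are meant.

```python
# Hypothetical pre-flight checks; these names are illustrative, not a Document AI API.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).
MAX_ONLINE_BYTES = 20 * 10**6      # online processing: 20 MB per file
MAX_BATCH_BYTES = 10**9            # batch processing: 1 GB per file
MAX_FILES_PER_BATCH = 5000         # files per batch processing request
MAX_IMAGE_PIXELS = 40 * 10**6      # 40 megapixels per page (not applied to PDFs)


def check_online_file(size_bytes):
    """Return a list of content-limit problems for one online request."""
    problems = []
    if size_bytes > MAX_ONLINE_BYTES:
        problems.append("file exceeds the 20 MB online limit")
    return problems


def check_batch(file_sizes):
    """Return a list of content-limit problems for one batch request."""
    problems = []
    if len(file_sizes) > MAX_FILES_PER_BATCH:
        problems.append("batch exceeds 5,000 files")
    for index, size_bytes in enumerate(file_sizes):
        if size_bytes > MAX_BATCH_BYTES:
            problems.append(f"file {index} exceeds the 1 GB batch limit")
    return problems


def image_page_within_resolution(width_px, height_px):
    """True if one image page is at most 40 megapixels."""
    return width_px * height_px <= MAX_IMAGE_PIXELS
```

An empty list from either check means that no content limit in the table was exceeded. The page limits for each processor still apply separately.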

Processor limits

Page limits for each processor are defined in the following lists.

Extraction processors

Processor Limits
Custom Extractor
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 200
Form Parser
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 100
Layout Parser
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 500

Classification processors

Processor Limits
Custom Classifier
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 200
Custom Splitter
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 1000

Digitize processors

Processor Limits
Enterprise Document OCR (Optical Character Recognition)
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 500

Pretrained processors

Processor Limits
Bank Statement Parser
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 30
W2 Parser
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 15
US Passport Parser
Maximum pages (online/synchronous requests): 2
Maximum pages (batch/offline/asynchronous requests): 2
Utility Parser
Maximum pages (online/synchronous requests): 10
Maximum pages (batch/offline/asynchronous requests): 200
Identity Document Proofing Parser
Maximum pages (online/synchronous requests): 2
Maximum pages (batch/offline/asynchronous requests): 2
Pay Slip Parser
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 50
US Driver License Parser
Maximum pages (online/synchronous requests): 2
Maximum pages (batch/offline/asynchronous requests): 2
Expense Parser
Maximum pages (online/synchronous requests): 10
Maximum pages (batch/offline/asynchronous requests): 10
Invoice Parser
Maximum pages (online/synchronous requests): 15
Maximum pages (batch/offline/asynchronous requests): 200
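
The page limits above can be kept in a lookup table so that a client can decide whether a document fits an online request, needs a batch request, or is too long for the processor. This is a sketch under stated assumptions: the dictionary keys are the display names used in this document (not API processor type identifiers), and `choose_mode` is a hypothetical helper that looks only at page count, not at file size.

```python
# (online_max_pages, batch_max_pages), copied from the lists above.
# Keys are display names from this document, not API processor type IDs.
PAGE_LIMITS = {
    "Custom Extractor": (15, 200),
    "Form Parser": (15, 100),
    "Layout Parser": (15, 500),
    "Custom Classifier": (15, 200),
    "Custom Splitter": (15, 1000),
    "Enterprise Document OCR": (15, 500),
    "Bank Statement Parser": (15, 30),
    "W2 Parser": (15, 15),
    "US Passport Parser": (2, 2),
    "Utility Parser": (10, 200),
    "Identity Document Proofing Parser": (2, 2),
    "Pay Slip Parser": (15, 50),
    "US Driver License Parser": (2, 2),
    "Expense Parser": (10, 10),
    "Invoice Parser": (15, 200),
}


def choose_mode(processor, page_count):
    """Return "online", "batch", or None if the document has too many pages.

    Hypothetical helper: it prefers online processing whenever the
    page count allows it and falls back to batch processing otherwise.
    """
    online_max, batch_max = PAGE_LIMITS[processor]
    if page_count <= online_max:
        return "online"
    if page_count <= batch_max:
        return "batch"
    return None
```

For example, `choose_mode("Form Parser", 16)` returns `"batch"`, and `choose_mode("US Passport Parser", 3)` returns `None` because both limits for that processor are 2 pages.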

Limitations for Document AI

Document AI has the following limitations.

Criteria Stable release July 2023
Dataset
  • Maximum of 30,000 documents total
  • Maximum of 250,000 pages total
Document import
  • Maximum of 5,000 documents per import
  • Maximum of 200 pages per document
Limits to train a Custom Document Extractor (CDE)

Model-based training (GA)
  • Training dataset maximums: 25,000 documents; 100,000 pages
  • Training dataset minimum: every label on at least 10 documents
  • Test dataset maximums: 2,000 documents; 8,000 pages
  • Test dataset minimum: every label on at least 10 documents
  • Maximum of 200 pages per document

Template-based training (GA)
  • Training dataset maximums: 300 documents; 300 pages
  • Training dataset minimum: every label on at least 3 documents
  • Test dataset maximums: 2,000 documents; 8,000 pages
  • Test dataset minimum: every label on at least 3 documents
  • Maximum of 20 pages per document
Limits to train a Custom Document Classifier (CDC) or a Custom Document Splitter (CDS)
  • Training dataset maximums: 30,000 documents; 100,000 pages
  • Training dataset minimum: every label on at least 10 documents
  • Test dataset maximums: 2,000 documents; 8,000 pages
  • Test dataset minimum: every label on at least 2 documents
  • Maximum of 200 pages per document
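
The dataset maximums and the "every label on at least N documents" minimums above can be checked before training starts. The sketch below is illustrative only: both function names, the representation of a dataset as a list of page counts, and the representation of a document as a collection of label names are assumptions, not Document AI data structures.

```python
from collections import Counter


def within_dataset_maximums(page_counts, max_docs, max_pages, max_pages_per_doc):
    """Check a dataset, given one page count per document, against its maximums."""
    return (
        len(page_counts) <= max_docs
        and sum(page_counts) <= max_pages
        and all(pages <= max_pages_per_doc for pages in page_counts)
    )


def labels_below_minimum(schema_labels, documents, min_docs):
    """Return {label: document_count} for labels found on fewer than min_docs documents.

    documents is an iterable with one collection of label names per document.
    A label counts once per document, however often it occurs there.
    """
    counts = Counter()
    for labels in documents:
        counts.update(set(labels))
    return {
        label: counts[label]
        for label in schema_labels
        if counts[label] < min_docs
    }
```

With template-based training, for instance, `min_docs` would be 3 for both the training and the test dataset, and an empty result means that every label in the schema meets the minimum.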
Labeling
  • To get started, ensure document labels meet defined minimum training and evaluation thresholds.
  • To begin evaluating model performance for documents with layout variation, label at least 100 documents. Specifically, ensure that each label exists on 50 documents in training and 50 in evaluation.
  • Maximum allowed labels (fields): 150
  • Label size limits (characters): Long items aren't well supported, but there's no explicit limit. Chunk documents into 800- or 1,000-token pieces, with 100 to 200 tokens overlapping between chunks. (Items longer than the overlapping area might run into quality issues.)
  • Label occurrences in a document: No limit
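
The chunking advice in the label size bullet can be sketched as a sliding window over a token list. This is a minimal illustration, not the method Document AI uses internally: `chunk_tokens` is a hypothetical helper, and how the text is tokenized is left to the caller because this document does not specify a tokenizer. The defaults of 1,000 tokens per chunk and 200 tokens of overlap are taken from the ranges given above.

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=200):
    """Split a token list into chunks where neighbors share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not tokens:
        return []
    step = chunk_size - overlap  # distance between consecutive chunk starts
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the document
        start += step
    return chunks
```

Each chunk starts `chunk_size - overlap` tokens after the previous one, so a labeled item no longer than the overlap always falls entirely inside at least one chunk. Longer items can be split across chunks, which matches the quality caveat in the bullet above.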
Geographic coverage
  • Regions generally supported: US, EU (multiregion)
  • Regions with limited accessibility: Germany, Singapore, UK, Canada, India, Australia