Limits
This document lists the limits that apply to Document AI.
These limits are unrelated to the quota system. Limits cannot be changed unless
otherwise stated.
Content limits
The following content limits apply to all Document AI processors.
Content limit: Value
Maximum image resolution (limit does not apply to PDF files): 40 megapixels (per page if image contains multiple pages)
Maximum pages (batch/offline/asynchronous requests): 200
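As a rough illustration, the two content limits above could be checked on the client before a request is sent. This is a minimal sketch, not part of the Document AI API: the function and constant names are hypothetical, and it assumes one megapixel means 1,000,000 pixels.

```python
# Hypothetical pre-flight checks for the content limits listed above.
MAX_IMAGE_MEGAPIXELS = 40  # per page; does not apply to PDF files
MAX_BATCH_PAGES = 200      # batch/offline/asynchronous requests


def image_page_within_limit(width_px: int, height_px: int) -> bool:
    """Return True if one image page is at or under the resolution limit.

    Assumes 1 megapixel = 1,000,000 pixels.
    """
    return (width_px * height_px) / 1_000_000 <= MAX_IMAGE_MEGAPIXELS


def batch_request_within_limit(page_count: int) -> bool:
    """Return True if a batch request's page count is at or under the limit."""
    return 0 < page_count <= MAX_BATCH_PAGES
```

For a multi-page image, the resolution check would be applied to each page separately, since the limit is stated per page.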
Limitations for Document AI
Document AI currently has the following limitations.
Criteria: Stable release July 2023
Dataset
Maximum of 30,000 documents total
Maximum of 250,000 pages total
Document import
Maximum of 5,000 documents per import
Maximum of 200 pages per document
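The dataset and import limits above suggest splitting a large collection into several imports. The following is a sketch under stated assumptions: `plan_imports` is a hypothetical helper, each document is represented as a `(name, page_count)` tuple, and the target dataset is assumed to be empty before the import.

```python
# Hypothetical planner for the dataset and import limits listed above.
MAX_DATASET_DOCS = 30_000
MAX_DATASET_PAGES = 250_000
MAX_DOCS_PER_IMPORT = 5_000
MAX_PAGES_PER_DOCUMENT = 200


def plan_imports(docs):
    """Split (name, page_count) tuples into import batches.

    Returns (batches, rejected): batches hold at most 5,000 documents each,
    and rejected holds documents that exceed the per-document page limit.
    Raises ValueError if the eligible documents exceed the dataset totals.
    """
    eligible = [d for d in docs if d[1] <= MAX_PAGES_PER_DOCUMENT]
    rejected = [d for d in docs if d[1] > MAX_PAGES_PER_DOCUMENT]

    total_pages = sum(pages for _, pages in eligible)
    if len(eligible) > MAX_DATASET_DOCS or total_pages > MAX_DATASET_PAGES:
        raise ValueError("documents exceed the dataset totals")

    batches = [
        eligible[i:i + MAX_DOCS_PER_IMPORT]
        for i in range(0, len(eligible), MAX_DOCS_PER_IMPORT)
    ]
    return batches, rejected
```

For example, 12,000 single-page documents would be planned as three imports of 5,000, 5,000, and 2,000 documents.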
Limits to train a Custom Document Extractor (CDE)
Model-based training (GA)
Training dataset maximums: 25,000 documents; 100,000 pages
Training dataset minimum: each label must appear at least once per 10 documents
Test dataset maximums: 2,000 documents; 8,000 pages
Test dataset minimum: every label on at least 10 documents
Maximum of 200 pages per document
Template-based training (experimental)
Training dataset maximums: 300 documents; 300 pages
Training dataset minimum: every label on at least 3 documents
Test dataset maximums: 2,000 documents; 8,000 pages
Test dataset minimum: every label on at least 3 documents
Maximum of 20 pages per document
Limits to train a Custom Document Classifier (CDC) or a Custom Document Splitter (CDS)
Training dataset maximums: 30,000 documents; 100,000 pages
Training dataset minimum: every label on at least 10 documents
Test dataset maximums: 2,000 documents; 8,000 pages
Test dataset minimum: every label on at least 2 documents
Maximum of 200 pages per document
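A dataset split can be compared against these numbers before training starts. The sketch below encodes only the CDC/CDS values listed above. `SplitLimits` and `check_split` are hypothetical names, and the per-label document counts are assumed to come from your own labeling data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitLimits:
    max_documents: int
    max_pages: int
    min_docs_per_label: int


# Values copied from the CDC/CDS limits listed above.
CDC_CDS_TRAIN = SplitLimits(max_documents=30_000, max_pages=100_000, min_docs_per_label=10)
CDC_CDS_TEST = SplitLimits(max_documents=2_000, max_pages=8_000, min_docs_per_label=2)


def check_split(doc_count, page_count, docs_per_label, limits):
    """Return a list of human-readable problems; an empty list means the split fits."""
    problems = []
    if doc_count > limits.max_documents:
        problems.append(
            f"{doc_count} documents exceeds the maximum of {limits.max_documents}"
        )
    if page_count > limits.max_pages:
        problems.append(
            f"{page_count} pages exceeds the maximum of {limits.max_pages}"
        )
    # docs_per_label maps each label name to the number of documents it appears on.
    for label, count in sorted(docs_per_label.items()):
        if count < limits.min_docs_per_label:
            problems.append(
                f"label '{label}' is on {count} documents; "
                f"needs at least {limits.min_docs_per_label}"
            )
    return problems
```

The same function could be reused for the CDE limits by constructing further `SplitLimits` values from the numbers in that section.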
Labeling
To get started, ensure that document labels meet the defined minimum training and evaluation thresholds.
To begin evaluating model performance for documents with layout variation, label at least 100 documents. Specifically, ensure that each label exists on 50 documents in training and 50 in evaluation.
Maximum allowed labels (fields): 150
Label size limits (characters): There is no explicit limit, but long items aren't well supported. Chunk documents into 800- or 1,000-token pieces, with 100 to 200 tokens overlapping between chunks. (Items longer than the overlapping area might run into quality issues.)
Label occurrences in a document: No limit
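The chunking advice under label size limits can be sketched as a sliding window over a token sequence. This is an illustration only: `chunk_tokens` is a hypothetical helper, and the source does not say which tokenizer defines a "token", so the function accepts any pre-tokenized sequence.

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=200):
    """Split a token sequence into chunks that share `overlap` tokens.

    Each chunk starts `chunk_size - overlap` tokens after the previous one,
    so consecutive chunks overlap by `overlap` tokens.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk has reached the end of the sequence.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With the defaults, a 2,000-token document yields three chunks that start at tokens 0, 800, and 1,600. An item that spans more than 200 tokens cannot fit entirely inside the overlap, which is the case the quality warning above describes.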
Geographic coverage
Regions generally supported: US, EU (multiregion)
Regions with limited accessibility: Germany, Singapore, UK, Canada, India, Australia