Vertex AI Translation and Optical Character Recognition (OCR) services combine to provide a document processing feature called Document Vision Service (DVS).
DVS directly translates formatted documents such as PDF files. Compared to plain text translations, the feature preserves the original formatting and layout in your translated documents, helping you retain much of the original context, like paragraph breaks.
DVS supports document translations inline, from storage buckets, and in batch.
This page guides you through an interactive experience using the document processing feature on Google Distributed Cloud (GDC) air-gapped to translate documents while preserving their format.
Supported formats
DVS supports the following input file types and their associated output file types:
Inputs | Document MIME type | Output
---|---|---
PDF | application/pdf | PDF, DOCX
DOC | application/msword | DOC, DOCX
DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | DOCX
PPT | application/vnd.ms-powerpoint | PPT, PPTX
PPTX | application/vnd.openxmlformats-officedocument.presentationml.presentation | PPTX
XLS | application/vnd.ms-excel | XLS, XLSX
XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | XLSX
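If you build requests programmatically, it can help to keep the table above as a lookup. The following is a minimal sketch, not part of any DVS client library: the names SUPPORTED_FORMATS and mime_type_for are hypothetical, and the mapping is copied from the table.

```python
from pathlib import PurePosixPath

# Input extension -> (document MIME type, possible output formats).
# Values are copied from the supported-formats table.
SUPPORTED_FORMATS = {
    "pdf": ("application/pdf", ("PDF", "DOCX")),
    "doc": ("application/msword", ("DOC", "DOCX")),
    "docx": (
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        ("DOCX",),
    ),
    "ppt": ("application/vnd.ms-powerpoint", ("PPT", "PPTX")),
    "pptx": (
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        ("PPTX",),
    ),
    "xls": ("application/vnd.ms-excel", ("XLS", "XLSX")),
    "xlsx": (
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        ("XLSX",),
    ),
}


def mime_type_for(path: str) -> str:
    """Return the MIME type for a supported input file, based on its extension."""
    ext = PurePosixPath(path).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported input file type: {path!r}")
    return SUPPORTED_FORMATS[ext][0]
```

A helper like this is mainly useful for inline translations, where the MIME type is required in the request.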
Original and scanned PDF document translations
DVS supports original and scanned PDF files, including translations to or from right-to-left languages. Also, DVS preserves hyperlinks, font size, and font color from files.
Before you begin
Before you can start using the document processing feature, you must have a project named dvs-project. The custom resource of the project must look like the following example:
apiVersion: resourcemanager.gdc.goog/v1
kind: Project
metadata:
  labels:
    atat.config.google.com/clin-number: CLIN_NUMBER
    atat.config.google.com/task-order-number: TASK_ORDER_NUMBER
  name: dvs-project
  namespace: platform
Furthermore, you must enable both the Vertex AI Translation and OCR pre-trained APIs and have the appropriate credentials. Consider installing the Vertex AI Translation and OCR client libraries to facilitate API calls. For more information about prerequisites, see Set up a translation project.
Translate a document from a storage bucket
To translate a document that is stored in a bucket, you use the Vertex AI Translation API.
This section describes how to translate a document from a bucket and store the result to another output bucket path. The response also returns a byte stream. You can specify the MIME type; if you don't, DVS determines it by using the input file's extension.
DVS supports language auto-detection for documents stored in buckets. If you don't specify a source language code, DVS detects the language for you. The detected language is included in the output in the detectedLanguageCode field.
Follow these steps to translate a document from a storage bucket:
- Configure the gdcloud CLI for object storage.
- Create a storage bucket in the dvs-project namespace. Use a Standard storage class. You can create the storage bucket by deploying a Bucket resource in the dvs-project namespace:
apiVersion: object.gdc.goog/v1
kind: Bucket
metadata:
  name: dvs-bucket
  namespace: dvs-project
spec:
  description: bucket for document vision service
  storageClass: Standard
  bucketPolicy:
    lockingPolicy:
      defaultObjectRetentionDays: 90
- Grant read and write permissions on the bucket to the service account (ai-translation-system-sa) used by the Vertex AI Translation service. You can follow these steps to create the role and role binding using custom resources:
  - Create the role by deploying a Role resource in the dvs-project namespace:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dvs-reader-writer
  namespace: dvs-project
rules:
  - apiGroups:
      - object.gdc.goog
    resources:
      - buckets
    verbs:
      - read-object
      - write-object
  - Create the role binding by deploying a RoleBinding resource in the dvs-project namespace:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dvs-reader-writer-rolebinding
  namespace: dvs-project
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dvs-reader-writer
subjects:
  - kind: ServiceAccount
    name: ai-translation-system-sa
    namespace: ai-translation-system
- Upload your document to the storage bucket you created. For more information, see Upload and download storage objects in projects.
- Make a request to the Vertex AI Translation pre-trained API. Follow these steps to make a curl request:
  - Save the following request.json file:
cat <<- EOF > request.json
{
  "parent": "projects/PROJECT_ID",
  "source_language_code": "SOURCE_LANGUAGE",
  "target_language_code": "TARGET_LANGUAGE",
  "document_input_config": {
    "mime_type": "application/pdf",
    "s3_source": {
      "input_uri": "s3://INPUT_FILE_PATH"
    }
  },
  "document_output_config": {
    "mime_type": "application/pdf"
  },
  "enable_rotation_correction": "true"
}
EOF
    Replace the following:
    - PROJECT_ID: your project ID.
    - SOURCE_LANGUAGE: the language in which your document is written. See the list of supported languages and their respective language codes.
    - TARGET_LANGUAGE: the language or languages into which you want to translate your document. See the list of supported languages and their respective language codes.
    - INPUT_FILE_PATH: the path of your document file in the storage bucket.
    Modify the mime_type value according to your document.
  - Make the request:
curl -vv --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" https://ENDPOINT:443/v3/projects/PROJECT_ID:translateDocument < request.json
    Replace the following:
    - TOKEN: the authentication token you obtained.
    - ENDPOINT: the Vertex AI Translation endpoint that you use for your organization. For more information, view service status and endpoints.
    - PROJECT_ID: your project ID.
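The same request can be assembled in Python with only the standard library. This is a minimal sketch under stated assumptions, not the Vertex AI Translation client library: build_translate_request is a hypothetical helper, the output MIME type is assumed to match the input MIME type as in the curl example, and the function only builds the request without sending it.

```python
import json
import urllib.request


def build_translate_request(endpoint, project_id, token, input_uri,
                            target_language, source_language=None,
                            mime_type=None):
    """Build (but do not send) a translateDocument request for a bucket object."""
    input_config = {"s3_source": {"input_uri": input_uri}}
    body = {
        "parent": f"projects/{project_id}",
        "target_language_code": target_language,
        "document_input_config": input_config,
        "enable_rotation_correction": "true",
    }
    if mime_type is not None:
        # When omitted, DVS determines the type from the file extension.
        input_config["mime_type"] = mime_type
        # Assumption: request the same format for the translated output.
        body["document_output_config"] = {"mime_type": mime_type}
    if source_language is not None:
        # When omitted, DVS auto-detects the source language.
        body["source_language_code"] = source_language
    return urllib.request.Request(
        url=f"https://{endpoint}:443/v3/projects/{project_id}:translateDocument",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Depending on your environment, you might also need to supply an SSL context that trusts your organization's certificate authority.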
Translate a document inline
This section describes how to send a document inline as part of the API request. You must include the MIME type for inline document translations.
DVS supports language auto-detection for inline text translations. If you don't specify a source language code, DVS detects the language for you. The detected language is included in the output in the detectedLanguageCode field.
Make a request to the Vertex AI Translation pre-trained API. Follow these steps to make a curl request:
- Make the request:
echo '{
  "parent": "projects/PROJECT_ID",
  "source_language_code": "SOURCE_LANGUAGE",
  "target_language_code": "TARGET_LANGUAGE",
  "document_input_config": {
    "mime_type": "application/pdf",
    "content": "'$(base64 -w 0 INPUT_FILE_PATH)'"
  },
  "document_output_config": {
    "mime_type": "application/pdf"
  },
  "enable_rotation_correction": "true"
}' | curl --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" https://ENDPOINT/v3/projects/PROJECT_ID:translateDocument
  Replace the following:
  - PROJECT_ID: your project ID.
  - SOURCE_LANGUAGE: the language in which your document is written. See the list of supported languages and their respective language codes.
  - TARGET_LANGUAGE: the language or languages into which you want to translate your document. See the list of supported languages and their respective language codes.
  - INPUT_FILE_PATH: the local path of your document file.
  - TOKEN: the authentication token you obtained.
  - ENDPOINT: the Vertex AI Translation endpoint that you use for your organization. For more information, view service status and endpoints.
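The inline request body can also be built in Python, which avoids shell quoting around the base64 payload. The following is a minimal sketch: build_inline_body is a hypothetical helper, and the output MIME type is assumed to match the input MIME type as in the curl example.

```python
import base64
import json


def build_inline_body(project_id, document_bytes, mime_type,
                      target_language, source_language=None):
    """Return the JSON body for an inline translateDocument request."""
    if not mime_type:
        # Inline translations require an explicit MIME type.
        raise ValueError("mime_type is required for inline translations")
    body = {
        "parent": f"projects/{project_id}",
        "target_language_code": target_language,
        "document_input_config": {
            "mime_type": mime_type,
            # Equivalent to `base64 -w 0 INPUT_FILE_PATH` in the curl example.
            "content": base64.b64encode(document_bytes).decode("ascii"),
        },
        # Assumption: request the same format for the translated output.
        "document_output_config": {"mime_type": mime_type},
        "enable_rotation_correction": "true",
    }
    if source_language is not None:
        # When omitted, DVS auto-detects the source language.
        body["source_language_code"] = source_language
    return body
```

For example, json.dumps(build_inline_body("dvs-project", open("report.pdf", "rb").read(), "application/pdf", "es")) produces a payload that you can send with the same headers as the curl request.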
Translate documents in batch
Batch translation lets you translate multiple files into multiple languages in a single request. For each request, you can send up to 100 files with a total content size of up to 1 GB or 100 million Unicode codepoints, whichever limit is hit first. You can specify a particular translation model for each language.
For more information, see batchTranslateDocument.
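If you have more files than a single request allows, you can group them before you build the requests. The following sketch greedily splits a list of files into batches that respect the file-count and total-size limits. It is illustrative only: split_into_batches is a hypothetical helper, "1 GB" is assumed to mean 10^9 bytes, and the 100 million codepoint limit is not checked because that requires reading the document contents.

```python
MAX_FILES_PER_REQUEST = 100
MAX_BYTES_PER_REQUEST = 10**9  # Assumption: "1 GB" read as 10^9 bytes.


def split_into_batches(files):
    """Group (uri, size_in_bytes) pairs into lists of URIs, one list per request."""
    batches = []
    current, current_bytes = [], 0
    for uri, size in files:
        if size > MAX_BYTES_PER_REQUEST:
            raise ValueError(f"{uri} alone exceeds the per-request size limit")
        too_many = len(current) >= MAX_FILES_PER_REQUEST
        too_large = current_bytes + size > MAX_BYTES_PER_REQUEST
        if current and (too_many or too_large):
            # Close the current batch and start a new one.
            batches.append(current)
            current, current_bytes = [], 0
        current.append(uri)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Greedy grouping keeps the original file order and is simple to reason about, although it doesn't always produce the smallest possible number of requests.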
Translate multiple documents
The following example includes multiple input configurations. Each input configuration is a pointer to a file in a storage bucket.
Make a request to the Vertex AI Translation pre-trained API. Follow these steps to make a curl request:
- Save the following request body in a file named request.json:
{
  "source_language_code": "SOURCE_LANGUAGE",
  "target_language_codes": ["TARGET_LANGUAGE", ...],
  "input_configs": [
    {
      "s3_source": {
        "input_uri": "s3://INPUT_FILE_PATH_1"
      }
    },
    {
      "s3_source": {
        "input_uri": "s3://INPUT_FILE_PATH_2"
      }
    },
    ...
  ],
  "output_config": {
    "s3_destination": {
      "output_uri_prefix": "s3://OUTPUT_FILE_PREFIX"
    }
  }
}
  Replace the following:
  - SOURCE_LANGUAGE: the language code of the input documents. See the list of supported languages and their respective language codes.
  - TARGET_LANGUAGE: the target language or languages to translate the input documents to. See the list of supported languages and their respective language codes.
  - INPUT_FILE_PATH_1, INPUT_FILE_PATH_2: the storage bucket location and filename of each input document.
  - OUTPUT_FILE_PREFIX: the storage bucket location where all output documents are stored.
- Make the request:
curl -X POST \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://ENDPOINT:443/v3/projects/PROJECT_ID:batchTranslateDocument"
  Replace the following:
  - TOKEN: the authentication token you obtained.
  - ENDPOINT: the Vertex AI Translation endpoint that you use for your organization. For more information, view service status and endpoints.
  - PROJECT_ID: your project ID.
The response contains the ID for a long-running operation:
{
  "name": "projects/PROJECT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.translation.v3.BatchTranslateDocumentMetadata",
    "state": "RUNNING"
  }
}
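When the list of input files is long, generating request.json from a list of URIs is less error-prone than editing it by hand. The following is a minimal sketch with a hypothetical helper name, build_batch_body. The field names are taken from the request body in this section.

```python
import json


def build_batch_body(source_language, target_languages, input_uris,
                     output_uri_prefix):
    """Return the JSON body for a batchTranslateDocument request."""
    if not input_uris:
        raise ValueError("at least one input document is required")
    return {
        "source_language_code": source_language,
        "target_language_codes": list(target_languages),
        # One input configuration per file in the storage bucket.
        "input_configs": [
            {"s3_source": {"input_uri": uri}} for uri in input_uris
        ],
        "output_config": {
            "s3_destination": {"output_uri_prefix": output_uri_prefix}
        },
    }


def write_request_file(body, path="request.json"):
    """Write the body to a file that `curl -d @request.json` can send."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(body, handle, indent=2)
```

The bucket paths and language codes you pass in are your own values; the sketch doesn't validate them against the list of supported languages.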
Translate and convert an original PDF file
The following example translates and converts an original PDF file to a DOCX file. You can specify multiple inputs of various file types; they don't all have to be original PDF files. However, you can't include scanned PDF files in a request that specifies a conversion. If you do, the request is rejected and no translations are performed. Only original PDF files are translated and converted to DOCX files. Other file types keep their format: for example, if you include PPTX files, they are translated and returned as PPTX files.
If you regularly translate a mix of scanned and original PDF files, we recommend that you organize them into separate buckets. That way, when you request a batch translation and conversion, you can exclude the bucket that contains scanned PDF files instead of having to exclude individual files.
Make a request to the Vertex AI Translation pre-trained API. Follow these steps to make a curl request:
- Save the following request body in a file named request.json:
{
  "source_language_code": "SOURCE_LANGUAGE",
  "target_language_codes": ["TARGET_LANGUAGE", ...],
  "input_configs": [
    {
      "s3_source": {
        "input_uri": "s3://INPUT_FILE_PATH_1"
      }
    },
    {
      "s3_source": {
        "input_uri": "s3://INPUT_FILE_PATH_2"
      }
    },
    ...
  ],
  "output_config": {
    "s3_destination": {
      "output_uri_prefix": "s3://OUTPUT_FILE_PREFIX"
    }
  },
  "format_conversions": {
    "application/pdf": "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
  }
}
  Replace the following:
  - SOURCE_LANGUAGE: the language code of the input documents. See the list of supported languages and their respective language codes.
  - TARGET_LANGUAGE: the target language or languages to translate the input documents to. See the list of supported languages and their respective language codes.
  - INPUT_FILE_PATH_1, INPUT_FILE_PATH_2: the storage bucket location and filename of each input document.
  - OUTPUT_FILE_PREFIX: the storage bucket location where all output documents are stored.
- Make the request:
curl -X POST \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://ENDPOINT:443/v3/projects/PROJECT_ID:batchTranslateDocument"
  Replace the following:
  - TOKEN: the authentication token you obtained.
  - ENDPOINT: the Vertex AI Translation endpoint that you use for your organization. For more information, view service status and endpoints.
  - PROJECT_ID: your project ID.
The response contains the ID for a long-running operation:
{
  "name": "projects/PROJECT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.translation.v3.BatchTranslateDocumentMetadata",
    "state": "RUNNING"
  }
}
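If you keep scanned PDF files under their own bucket path, as recommended earlier in this section, you can filter them out before adding the conversion. The following is a minimal sketch: build_conversion_body and the scanned_prefix parameter are hypothetical, and the filter relies only on how you organized your buckets, because the sketch can't tell a scanned PDF from an original one.

```python
PDF = "application/pdf"
DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def build_conversion_body(source_language, target_languages, input_uris,
                          output_uri_prefix, scanned_prefix):
    """Return a batch request body that converts original PDF files to DOCX.

    Inputs whose URI starts with `scanned_prefix` are left out, because a
    request that combines scanned PDF files with a conversion is rejected.
    """
    kept = [uri for uri in input_uris if not uri.startswith(scanned_prefix)]
    if not kept:
        raise ValueError("no input documents left after excluding scanned PDFs")
    return {
        "source_language_code": source_language,
        "target_language_codes": list(target_languages),
        "input_configs": [{"s3_source": {"input_uri": uri}} for uri in kept],
        "output_config": {
            "s3_destination": {"output_uri_prefix": output_uri_prefix}
        },
        # Original PDF inputs come back as DOCX; other types keep their format.
        "format_conversions": {PDF: DOCX},
    }
```

You can translate the excluded scanned files in a separate batch request that omits format_conversions.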
Use a glossary
You can include a glossary to handle domain-specific terminology. If you specify a glossary, you must specify the source language. The following example uses a glossary. You can specify up to 10 target languages with their own glossary.
If you specify a glossary for some target languages, the system doesn't use any glossary for the unspecified languages.
Make a request to the Vertex AI Translation pre-trained API. Follow these steps to make a curl request:
- Save the following request body in a file named request.json:
{
  "source_language_code": "SOURCE_LANGUAGE",
  "target_language_codes": ["TARGET_LANGUAGE", ...],
  "input_configs": [
    {
      "s3_source": {
        "input_uri": "s3://INPUT_FILE_PATH"
      }
    }
  ],
  "output_config": {
    "s3_destination": {
      "output_uri_prefix": "s3://OUTPUT_FILE_PREFIX"
    }
  },
  "glossaries": {
    "TARGET_LANGUAGE": {
      "glossary": "projects/GLOSSARY_PROJECT_ID"
    },
    ...
  }
}
  Replace the following:
  - SOURCE_LANGUAGE: the language code of the input documents. See the list of supported languages and their respective language codes.
  - TARGET_LANGUAGE: the target language or languages to translate the input documents to. See the list of supported languages and their respective language codes.
  - INPUT_FILE_PATH: the storage bucket location and filename of one or more input documents.
  - OUTPUT_FILE_PREFIX: the storage bucket location where all output documents are stored.
  - GLOSSARY_PROJECT_ID: the project ID where the glossary is located.
- Make the request:
curl -X POST \
  -H "Authorization: Bearer TOKEN" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://ENDPOINT:443/v3/projects/PROJECT_ID:batchTranslateDocument"
  Replace the following:
  - TOKEN: the authentication token you obtained.
  - ENDPOINT: the Vertex AI Translation endpoint that you use for your organization. For more information, view service status and endpoints.
  - PROJECT_ID: your project ID.
The response contains the ID for a long-running operation:
{
  "name": "projects/PROJECT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.translation.v3.BatchTranslateDocumentMetadata",
    "state": "RUNNING"
  }
}
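The glossary rules in this section can be checked before you send a request. The following is a minimal sketch: add_glossaries is a hypothetical helper, glossary names are treated as opaque strings in whatever form your deployment expects, and the check that every glossary key is also a requested target language is an assumption rather than documented behavior.

```python
MAX_GLOSSARY_LANGUAGES = 10


def add_glossaries(body, glossaries):
    """Return a copy of a batch request body with per-language glossaries.

    `glossaries` maps a target language code to a glossary resource name.
    Target languages without an entry are translated without a glossary.
    """
    if not body.get("source_language_code"):
        raise ValueError("a source language is required when using a glossary")
    if len(glossaries) > MAX_GLOSSARY_LANGUAGES:
        raise ValueError("at most 10 target languages can have a glossary")
    # Assumption: a glossary only makes sense for a requested target language.
    unknown = set(glossaries) - set(body.get("target_language_codes", []))
    if unknown:
        raise ValueError(f"glossary given for non-target languages: {sorted(unknown)}")
    result = dict(body)
    result["glossaries"] = {
        language: {"glossary": name} for language, name in glossaries.items()
    }
    return result
```

The function returns a new dictionary and leaves the original body unchanged, so you can reuse one base body for requests with and without glossaries.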