Try Document Vision Service (DVS)

The Document Vision Service (DVS) combines the Translation and OCR services into a document processing feature. It uses the Translate Document API to translate formatted documents, such as PDF files, directly. Compared to plain text translation, DVS preserves the original formatting and layout in the translated document, which helps retain context such as paragraph breaks. DVS supports translating documents both inline and from storage buckets.

This quickstart guides the Application Operator (AO) through the process of using the Vertex AI Translate Document pre-trained API on Google Distributed Cloud (GDC) air-gapped.

Supported formats

DVS supports the following input file types and their associated output file types.

Input   Document MIME type   Output
PDF     application/pdf      PDF

Original and scanned PDF document translations

DVS supports both original and scanned PDF files, including translations to or from right-to-left languages. DVS also preserves hyperlinks, font size, and font color from the source files.

Before you begin

Follow these steps before trying DVS:

  1. Create the dvs-project project. For information about creating and using projects, see Create a project.

    Alternatively, you can create the project using a custom resource (CR):

    apiVersion: resourcemanager.gdc.goog/v1
    kind: Project
    metadata:
      labels:
        atat.config.google.com/clin-number: CLIN_NUMBER
        atat.config.google.com/task-order-number: TASK_ORDER_NUMBER
      name: dvs-project
      namespace: platform
    
  2. Ask your Project IAM Admin to grant you the AI Translation Developer (ai-translation-developer) role in the dvs-project project namespace. For more information, see Grant access to project resources.

  3. Enable both the Translation and OCR pre-trained APIs.

  4. Download the gdcloud command-line interface (CLI).

  5. Install Vertex AI client libraries. You must download the Vision and Translation client libraries according to your operating system.

Set up your service account

Set up your service account with a name, project ID, and service key.

  ${HOME}/gdcloud init  # set URI and project

  ${HOME}/gdcloud auth login

  ${HOME}/gdcloud iam service-accounts create SERVICE_ACCOUNT  --project=PROJECT_ID

  ${HOME}/gdcloud iam service-accounts keys create "SERVICE_KEY".json --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT

Replace the following:

  • SERVICE_ACCOUNT: the name you want to give to your service account.
  • PROJECT_ID: the ID of your project.
  • SERVICE_KEY: the name of the JSON file for the service key.

Grant access to project resources

Grant access to the Translation API service account by providing your project ID, name of your service account, and the role ai-translation-developer.

  ${HOME}/gdcloud iam service-accounts add-iam-policy-binding --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT --role=role/ai-translation-developer

Authenticate the gdcloud CLI

You must get a token to authenticate the gdcloud CLI before sending requests to the Translation pre-trained services. Follow these steps:

  1. Install the google-auth client library.

    pip install google-auth
    
  2. Save the following code to a Python script.

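    import google.auth
    from google.auth.transport import requests
    
    # Audience for the token: the Translation endpoint of your organization.
    api_endpoint = "https://ENDPOINT.GDC_URL"
    
    # Load the default credentials and bind them to the audience.
    creds, project_id = google.auth.default()
    creds = creds.with_gdch_audience(api_endpoint)
    
    def test_get_token():
      # Refresh the credentials to fetch a token, then print it.
      req = requests.Request()
      creds.refresh(req)
      print(creds.token)
    
    if __name__ == "__main__":
      test_get_token()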
    

    Replace the following:

    • ENDPOINT: the Translation endpoint.
    • GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.

    For more information, see View service statuses and endpoints.

  3. Run the script to fetch the token.
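The script calls google.auth.default(), which, among other sources, reads the path in the GOOGLE_APPLICATION_CREDENTIALS environment variable. The following minimal sketch checks that variable before you run the token script. It assumes that you point the variable at the service key JSON file that you created earlier; the helper name check_service_key is hypothetical and is not part of any client library.

```python
import json
import os


def check_service_key(env=os.environ):
    """Return the parsed key file named by GOOGLE_APPLICATION_CREDENTIALS.

    Raises ValueError with a readable message if the variable is unset
    or the file cannot be read as JSON.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise ValueError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError) as err:
        raise ValueError(f"cannot read service key {path}: {err}") from err
```

Calling check_service_key() with no arguments inspects the current environment. The check only confirms that the file exists and parses; it does not validate the key itself.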

For any grpcurl or curl request, you must replace TOKEN with the fetched token in the header as in the following example:

-H "Authorization: Bearer TOKEN"

Translate documents

DVS in Distributed Cloud provides two types of document translation: translating a document from a storage bucket and translating a document inline.

Translate a document from a storage bucket

To translate a document that is stored in a bucket, follow these steps:

Prepare your environment

Before using the Translation API to translate a document from a storage bucket, follow these steps:

  1. Create a storage bucket in the dvs-project project, using the Standard class.
  2. Grant read and write permissions on the bucket to the Vertex AI Translation system service account (ai-translation-system-sa) used by the Translation service.

Alternatively, you can follow these steps to create the storage bucket, role, and role binding using custom resources (CR):

  1. Create the storage bucket by deploying a Bucket CR in the dvs-project namespace:

    apiVersion: object.gdc.goog/v1
    kind: Bucket
    metadata:
      name: dvs-bucket
      namespace: dvs-project
    spec:
      description: bucket for document vision service
      storageClass: Standard
      bucketPolicy:
        lockingPolicy:
          defaultObjectRetentionDays: 90
    
  2. Create the role by deploying a Role CR in the dvs-project namespace:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: dvs-reader-writer
      namespace: dvs-project
    rules:
      -
        apiGroups:
          - object.gdc.goog
        resources:
          - buckets
        verbs:
          - read-object
          - write-object
    
  3. Create the role binding by deploying a RoleBinding CR in the dvs-project namespace:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: dvs-reader-writer-rolebinding
      namespace: dvs-project
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dvs-reader-writer
    subjects:
      -
        kind: ServiceAccount
        name: ai-translation-system-sa
        namespace: ai-translation-system
    

Upload files to the storage bucket

You must upload your documents to the storage bucket to let the Translation service process the files.

To upload files to the storage bucket, follow these steps:

  1. Configure the gdcloud CLI storage by following the instructions from Configure the gdcloud CLI for object storage.
  2. Upload your document to the storage bucket you created. For more information about how to upload objects to storage buckets, see Upload and download storage objects in projects.

The following example translates a file from a bucket and outputs the result to another bucket path. The response also returns a byte stream. You can specify the MIME type; if you don't, DVS determines it by using the input file's extension.

If you don't specify a source language code, DVS detects the language for you. The detected language is included in the output in the detectedLanguageCode field.
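The optional fields described above can be sketched as a small Python helper that builds the request body for a document in a bucket. This is an illustration only: the function name build_bucket_request and the sample bucket path are hypothetical, and the sketch covers just the fields discussed here. Other fields from the request examples in this section can be added the same way.

```python
import json


def build_bucket_request(project_id, input_uri, target_language,
                         source_language=None, mime_type=None):
    """Build a translateDocument request body for a document in a bucket."""
    input_config = {"s3_source": {"input_uri": input_uri}}
    if mime_type:
        # Optional for bucket sources: when omitted, DVS uses the
        # input file's extension to determine the MIME type.
        input_config["mime_type"] = mime_type
    body = {
        "parent": f"projects/{project_id}",
        "target_language_code": target_language,
        "document_input_config": input_config,
    }
    if source_language:
        # Omit this field to let DVS detect the source language.
        body["source_language_code"] = source_language
    return body


# Hypothetical bucket path, shown only to illustrate the call.
print(json.dumps(
    build_bucket_request("dvs-project", "s3://dvs-bucket/sample.pdf", "es"),
    indent=2))
```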

HTTP

The following example uses HTTP to call DVS with an input PDF document in a storage bucket.

  1. Save the following request.json file:

    cat <<- EOF > request.json
    {
      "parent": "projects/PROJECT_ID",
      "source_language_code": "SOURCE_LANGUAGE",
      "target_language_code": "TARGET_LANGUAGE",
      "document_input_config": {
        "mime_type": "application/pdf",
        "s3_source": {
          "input_uri": "s3://INPUT_FILE_PATH"
        }
      },
      "document_output_config": {
        "mime_type": "application/pdf"
      },
      "enable_rotation_correction": "true"
    }
    EOF
    

    Replace the following:

    • PROJECT_ID: the ID of the project that you want to use.
    • SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
    • TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
    • INPUT_FILE_PATH: the path of your document file in the storage bucket.

  2. Use the curl tool to call the endpoint, reading the request from the request.json file:

    curl --cacert CACERT --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" https://ENDPOINT.GDC_URL:443/v3/projects/PROJECT_ID:translateDocument < request.json
    

    Replace the following:

    • CACERT: the path to the CA certificate.
    • TOKEN: the token you obtained when you authenticated the gdcloud CLI.
    • ENDPOINT: the Translation endpoint that you use for your organization.
    • GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
    • PROJECT_ID: the ID of the project that you want to use.

After you run the command, DVS returns the output.
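For readers who prefer a script over curl, the same HTTP call can be sketched with the Python standard library. This is a minimal sketch, not an official client: the function names build_translate_request and send are hypothetical, and the URL layout is copied from the curl command in this section.

```python
import json
import ssl
import urllib.request


def build_translate_request(endpoint_host, project_id, token, body):
    """Build the POST request that the curl command sends.

    endpoint_host stands for ENDPOINT.GDC_URL, and body is the parsed
    content of request.json.
    """
    url = (f"https://{endpoint_host}:443/v3/projects/"
           f"{project_id}:translateDocument")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request, cacert_path):
    """Send the request, trusting the CA certificate at cacert_path."""
    context = ssl.create_default_context(cafile=cacert_path)
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Building the request separately from sending it lets you inspect the URL and headers before anything goes over the network.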

gRPC

If you don't have grpcurl installed, download and install it from a resource outside of Distributed Cloud (https://github.com/fullstorydev/grpcurl#from-source).

The following example uses gRPC to call DVS with an input PDF document in a storage bucket.

  1. Save the following request.json file:

    cat <<- EOF > request.json
    {
      "parent": "projects/PROJECT_ID",
      "source_language_code": "SOURCE_LANGUAGE",
      "target_language_code": "TARGET_LANGUAGE",
      "document_input_config": {
        "mime_type": "application/pdf",
        "s3_source": {
          "input_uri": "s3://INPUT_FILE_PATH"
        }
      },
      "document_output_config": {
        "mime_type": "application/pdf"
      },
      "enable_rotation_correction": "true"
    }
    EOF
    

    Replace the following:

    • PROJECT_ID: the ID of the project that you want to use.
    • SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
    • TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
    • INPUT_FILE_PATH: the path of your document file in the storage bucket.

  2. Use the grpcurl tool to call the endpoint, reading the request from the request.json file:

    grpcurl --cacert CACERT -authority ENDPOINT.GDC_URL -max-msg-sz 50000000 -d @ -H "Authorization: Bearer TOKEN" ENDPOINT.GDC_URL:443 google.cloud.translation.v3.TranslationService/TranslateDocument < request.json
    

    Replace the following:

    • CACERT: the path to the CA certificate.
    • ENDPOINT: the Translation endpoint that you use for your organization.
    • GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
    • TOKEN: the token you obtained when you authenticated the gdcloud CLI.

After you run the command, DVS returns the output.

Translate a document inline

The following example sends a document inline as part of the request. You must include the MIME type for inline document translations.

If you don't specify a source language code, DVS detects the language for you. The detected language is included in the output in the detectedLanguageCode field.
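As an illustration, an inline request body can be assembled in Python instead of with shell quoting. This is a sketch under stated assumptions: the function name build_inline_request is hypothetical, and the field names are the ones used in the request examples in this section.

```python
import base64


def build_inline_request(project_id, pdf_bytes, target_language,
                         source_language=None):
    """Build a translateDocument request body with the document embedded."""
    body = {
        "parent": f"projects/{project_id}",
        "target_language_code": target_language,
        "document_input_config": {
            # The MIME type is required for inline documents.
            "mime_type": "application/pdf",
            # Standard base64 on a single line, like `base64 -w 0`.
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
        },
        "document_output_config": {"mime_type": "application/pdf"},
    }
    if source_language:
        # Omit this field to let DVS detect the source language.
        body["source_language_code"] = source_language
    return body
```

To use the helper, read the document with open(path, "rb").read() and pass the resulting bytes as pdf_bytes.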

HTTP

The following example uses HTTP to call DVS with an inline PDF document.

echo '{"parent": "projects/PROJECT_ID","source_language_code": "SOURCE_LANGUAGE", "target_language_code": "TARGET_LANGUAGE", "document_input_config": { "mime_type": "application/pdf", "content": "'$(base64 -w 0 INPUT_FILE_PATH)'" }, "document_output_config": { "mime_type": "application/pdf" }, "enable_rotation_correction": "true"}' | curl --cacert CACERT --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" https://ENDPOINT.GDC_URL/v3/projects/PROJECT_ID:translateDocument

Replace the following:

  • PROJECT_ID: the ID of the project that you want to use.
  • SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
  • TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
  • INPUT_FILE_PATH: the local path of your document file.
  • CACERT: the path to the CA certificate.
  • ENDPOINT: the Translation endpoint that you use for your organization.
  • GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
  • TOKEN: the token you obtained when you authenticated the gdcloud CLI.

After you run the command, DVS returns the output.

gRPC

If you don't have grpcurl installed, download and install it from a resource outside of Distributed Cloud (https://github.com/fullstorydev/grpcurl#from-source).

The following example uses gRPC to call DVS with an inline PDF document.

echo '{"parent": "projects/PROJECT_ID","source_language_code": "SOURCE_LANGUAGE", "target_language_code": "TARGET_LANGUAGE", "document_input_config": { "mime_type": "application/pdf", "content": "'$(base64 -w 0 INPUT_FILE_PATH)'" }, "document_output_config": { "mime_type": "application/pdf" }, "enable_rotation_correction": "true"}' | grpcurl --cacert CACERT -authority ENDPOINT.GDC_URL -max-msg-sz 50000000 -d @  -H "Authorization: Bearer TOKEN" ENDPOINT.GDC_URL:443 google.cloud.translation.v3.TranslationService/TranslateDocument

Replace the following:

  • PROJECT_ID: the ID of the project that you want to use.
  • SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
  • TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
  • INPUT_FILE_PATH: the local path of your document file.
  • CACERT: the path to the CA certificate.
  • ENDPOINT: the Translation endpoint that you use for your organization.
  • GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
  • TOKEN: the token you obtained when you authenticated the gdcloud CLI.

After you run the command, DVS returns the output.
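The returned byte stream can be turned back into a PDF file with a few lines of Python. The following sketch assumes that the JSON response follows the Cloud Translation v3 TranslateDocumentResponse shape, that is, a documentTranslation object that holds base64-encoded byteStreamOutputs next to the detectedLanguageCode field. Verify these field names against the actual output of your deployment; the function name extract_translation is hypothetical.

```python
import base64


def extract_translation(response):
    """Return (pdf_bytes, detected_language) from a parsed JSON response.

    Assumed shape (not confirmed by this guide):
    {"documentTranslation": {"byteStreamOutputs": ["<base64>"],
                             "detectedLanguageCode": "..."}}
    """
    translation = response["documentTranslation"]
    outputs = translation.get("byteStreamOutputs", [])
    if not outputs:
        raise ValueError("response contains no translated document")
    # JSON carries bytes fields as base64 text; decode the first output.
    pdf_bytes = base64.b64decode(outputs[0])
    # The language code is None when the field is absent.
    return pdf_bytes, translation.get("detectedLanguageCode")
```

To save the result, write the returned bytes to a file opened in binary mode, for example with open("translated.pdf", "wb").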