Run a batch translation using the Cloud Translation connector

This tutorial shows you how to create a workflow that uses the Cloud Translation API connector to translate files to other languages in asynchronous batch mode. This provides real-time output as the inputs are being processed.

Objectives

In this tutorial you will:

  1. Create an input Cloud Storage bucket.
  2. Create two files in English and upload them to the input bucket.
  3. Create a workflow that uses the Cloud Translation API connector to translate the two files into French and Spanish, saving the results in an output bucket.
  4. Deploy and execute the workflow to orchestrate the entire process.

Costs

This tutorial uses the following billable components of Google Cloud:

  * Cloud Storage
  * Cloud Translation
  * Workflows

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

Before you begin

Some of the steps in this document might not work correctly if your organization applies constraints to your Google Cloud environment. In that case, you might not be able to complete tasks like creating public IP addresses or service account keys. If you make a request that returns an error about constraints, see how to Develop applications in a constrained Google Cloud environment.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  4. Enable the Cloud Storage, Cloud Translation, and Workflows APIs.

    Enable the APIs

  5. Install and initialize the Google Cloud CLI.
  6. Update gcloud components:
    gcloud components update
  7. Log in using your account:
    gcloud auth login
  8. Set the default location used in this tutorial:
    gcloud config set workflows/location us-central1
    

    Since this tutorial uses the default AutoML Translation model which resides in us-central1, you must set the location to us-central1.

    If using an AutoML Translation model or glossary other than the default, ensure that it resides in the same location as the call to the connector; otherwise, an INVALID_ARGUMENT (400) error is returned. For details, see the batchTranslateText method.
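The location requirement above can be expressed as a small fail-fast guard. This is a sketch, not part of the tutorial's setup: `LOCATION` is a placeholder variable standing in for the value you set with `gcloud config set workflows/location`.

```shell
# Sketch: fail fast if the configured location can't host the default model.
# LOCATION is a placeholder; in practice you might read it with
# `gcloud config get-value workflows/location`.
LOCATION="us-central1"
if [ "$LOCATION" != "us-central1" ]; then
  echo "error: default AutoML Translation model resides in us-central1, got ${LOCATION}" >&2
  exit 1
fi
echo "location ok: ${LOCATION}"
```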

Create an input Cloud Storage bucket and files

You can use Cloud Storage to store objects. Objects are immutable pieces of data consisting of a file of any format, and are stored in containers called buckets.

  1. Create a Cloud Storage bucket to hold the files to translate:

    BUCKET_INPUT=${GOOGLE_CLOUD_PROJECT}-input-files
    gsutil mb gs://${BUCKET_INPUT}
  2. Create two files in English and upload them to the input bucket:

    echo "Hello World!" > file1.txt
    gsutil cp file1.txt gs://${BUCKET_INPUT}
    echo "Workflows connectors simplify calling services." > file2.txt
    gsutil cp file2.txt gs://${BUCKET_INPUT}

Deploy and execute the workflow

A workflow is made up of a series of steps described using the Workflows syntax, which can be written in either YAML or JSON format. This is the workflow's definition. After creating a workflow, you deploy it to make it available for execution.

  1. Create a text file with the filename workflow.yaml and with the following content:

    main:
      steps:
      - init:
          assign:
          - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - location: ${sys.get_env("GOOGLE_CLOUD_LOCATION")}
          - inputBucketName: ${projectId + "-input-files"}
          - outputBucketName: ${projectId + "-output-files-" + string(int(sys.now()))}
      - createOutputBucket:
            call: googleapis.storage.v1.buckets.insert
            args:
              query:
                project: ${projectId}
              body:
                name: ${outputBucketName}
      - batchTranslateText:
          call: googleapis.translate.v3beta1.projects.locations.batchTranslateText
          args:
              parent: ${"projects/" + projectId + "/locations/" + location}
              body:
                  inputConfigs:
                    gcsSource:
                      inputUri: ${"gs://" + inputBucketName + "/*"}
                  outputConfig:
                      gcsDestination:
                        outputUriPrefix: ${"gs://" + outputBucketName + "/"}
                  sourceLanguageCode: "en"
                  targetLanguageCodes: ["es", "fr"]
          result: batchTranslateTextResult

    The workflow assigns variables, creates an output bucket, and initiates the translation of the files, saving the results to the output bucket.

  2. After creating the workflow, deploy it:

    gcloud workflows deploy batch-translation --source=workflow.yaml
  3. Execute the workflow:

    gcloud workflows execute batch-translation
  4. To view the workflow status, you can run the returned command. For example:

    gcloud workflows executions describe eb4a6239-cffa-4672-81d8-d4caef7d8424 \
      --workflow batch-translation \
      --location us-central1

    The workflow's state is ACTIVE while it runs and SUCCEEDED when it finishes. After a few minutes, the translated files (in French and Spanish) are uploaded to the output bucket.
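The bucket names used by the workflow's init step can be reproduced in the shell. This is a sketch with a placeholder project ID; `date +%s` plays the role of the workflow's `${string(int(sys.now()))}` timestamp suffix.

```shell
# Placeholder project ID; the workflow reads GOOGLE_CLOUD_PROJECT_ID instead.
projectId="my-project"
inputBucketName="${projectId}-input-files"
# sys.now() returns seconds since the Unix epoch; date +%s is the shell analogue.
outputBucketName="${projectId}-output-files-$(date +%s)"
echo "${inputBucketName}"
echo "${outputBucketName}"
```

Because the suffix is a timestamp, each execution of the workflow creates a distinct output bucket.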

List objects in the output bucket

You can confirm that the workflow worked as expected by listing the objects in your output bucket.

  1. Retrieve your output bucket name:

    gsutil ls

    The output is similar to the following:

    gs://PROJECT_ID-input-files/
    gs://PROJECT_ID-output-files-TIMESTAMP/

  2. List the objects in your output bucket:

    gsutil ls -r gs://PROJECT_ID-output-files-TIMESTAMP/**

    After a few minutes, the translated files are listed: two in French and two in Spanish.
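To sanity-check a listing like the one above, you can count the translated files per language. The listing below is hypothetical: the exact filenames that batch translation produces (alongside an index.csv manifest) depend on your bucket and input names.

```shell
# Hypothetical gsutil output; real filenames depend on bucket and input names.
listing='gs://my-project-output-files-1700000000/index.csv
gs://my-project-output-files-1700000000/my-project-input-files_file1_es_translations.txt
gs://my-project-output-files-1700000000/my-project-input-files_file1_fr_translations.txt
gs://my-project-output-files-1700000000/my-project-input-files_file2_es_translations.txt
gs://my-project-output-files-1700000000/my-project-input-files_file2_fr_translations.txt'
# Expect two Spanish and two French translations for the two input files.
es_count=$(printf '%s\n' "$listing" | grep -c '_es_')
fr_count=$(printf '%s\n' "$listing" | grep -c '_fr_')
echo "es=${es_count} fr=${fr_count}"
```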

Clean up

If you created a new project for this tutorial, delete the project. If you used an existing project and wish to keep it without the changes added in this tutorial, delete resources created for the tutorial.

Delete the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete tutorial resources

  1. Remove the gcloud default configuration you added during the tutorial setup:

    gcloud config unset workflows/location
    
  2. Delete the workflow created in this tutorial:

    gcloud workflows delete WORKFLOW_NAME
    
  3. Delete the buckets created in this tutorial:

    gsutil rm -r gs://BUCKET_NAME

    Where BUCKET_NAME is the name of the bucket to delete. For example, my-bucket.

    The response is similar to the following:

    Removing gs://my-bucket/...

What's next