Batch requests (v3beta1)

Batch translation lets you translate large amounts of text offline in a single request: up to 1,000 files per batch, into up to 10 different target languages. The total content size should be <= 100M Unicode codepoints, and the content must use UTF-8 encoding.
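The limits above can be checked locally before a batch is submitted. The following is a minimal Python sketch, not part of the Translation API: the helper names check_batch_limits and count_codepoints are hypothetical, and the constants simply restate the numbers given in this section.

```python
# Limits restated from this page; the service remains the source of truth.
MAX_FILES_PER_BATCH = 1000
MAX_TARGET_LANGUAGES = 10
MAX_TOTAL_CODEPOINTS = 100_000_000


def count_codepoints(data: bytes) -> int:
    """Count Unicode codepoints in UTF-8 bytes.

    Decoding raises UnicodeDecodeError if the input is not valid UTF-8,
    and len() of a Python str counts codepoints.
    """
    return len(data.decode("utf-8"))


def check_batch_limits(file_count, target_languages, total_codepoints):
    """Return a list of limit violations (an empty list means none found)."""
    problems = []
    if file_count > MAX_FILES_PER_BATCH:
        problems.append(
            f"{file_count} files exceeds the limit of {MAX_FILES_PER_BATCH}")
    if len(target_languages) > MAX_TARGET_LANGUAGES:
        problems.append(
            f"{len(target_languages)} target languages exceeds the limit "
            f"of {MAX_TARGET_LANGUAGES}")
    if total_codepoints > MAX_TOTAL_CODEPOINTS:
        problems.append(
            f"{total_codepoints} codepoints exceeds the limit "
            f"of {MAX_TOTAL_CODEPOINTS}")
    return problems
```

A caller would typically sum count_codepoints over every input file and pass the total to check_batch_limits.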

Input file

Only two MIME types are supported: text/html (HTML) and text/plain (.tsv and .txt).

.tsv

If the file extension is .tsv, the file can contain either one or two columns. The first column (optional) is the ID of the text request. If the first column is missing, Google uses the row number (0-based) from the input file as the ID in the output file. The second column is the actual text to be translated. We recommend that each row be <= 10K Unicode codepoints; otherwise an error may be returned.
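To make the column rules concrete, here is a small Python sketch that reads .tsv content the way this section describes it. It is an illustration only: parse_tsv and oversized_rows are hypothetical helpers, and measuring the 10K recommendation on the text column (rather than the whole row) is an assumption.

```python
MAX_ROW_CODEPOINTS = 10_000  # recommendation restated from this page


def parse_tsv(content: str):
    """Return a list of (row_id, text) pairs from .tsv content.

    One-column rows get their 0-based row number as the ID;
    two-column rows use the first column as the ID.
    """
    rows = []
    for row_number, line in enumerate(content.splitlines()):
        columns = line.split("\t")
        if len(columns) == 1:
            row_id, text = str(row_number), columns[0]
        elif len(columns) == 2:
            row_id, text = columns
        else:
            raise ValueError(
                f"row {row_number} has {len(columns)} columns; expected 1 or 2")
        rows.append((row_id, text))
    return rows


def oversized_rows(rows):
    """Return the IDs of rows whose text is longer than the recommendation."""
    return [row_id for row_id, text in rows if len(text) > MAX_ROW_CODEPOINTS]
```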

.txt/.html

The other supported file extensions are .txt and .html. A file with either extension is treated as a single large chunk of text.
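Because the MIME type follows from the file extension, the mapping can be written down directly. The sketch below only restates the pairs listed in this section; mime_type_for is a hypothetical helper, not an API call.

```python
import os

# Extension-to-MIME-type pairs as listed on this page.
MIME_TYPES = {
    ".tsv": "text/plain",
    ".txt": "text/plain",
    ".html": "text/html",
}


def mime_type_for(uri: str) -> str:
    """Return the MIME type for an input file URI, based on its extension."""
    extension = os.path.splitext(uri)[1].lower()
    if extension not in MIME_TYPES:
        raise ValueError(f"unsupported file extension: {extension!r}")
    return MIME_TYPES[extension]
```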

Batch request

With a batch translation request, you provide the path to an input configuration file (InputConfig) containing the content you want translated and provide a path to an output location (OutputConfig) for the final translation. You need at least two different Google Cloud Storage buckets. The source bucket contains content to be translated, and the destination bucket will contain the resulting translated file(s). The destination bucket must be empty before the translation process begins.

As the request is processed, results are written to the output location in real time. Even if you cancel the request partway through, partial output at the input-file level is still produced in the output Cloud Storage location. You are therefore still charged for the characters that were translated.
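A request body of the shape described above can also be assembled programmatically. The following Python sketch uses only the JSON field names that appear in the REST example on this page; build_batch_request is a hypothetical helper, and the bucket check reflects one reading of the requirement that source and destination buckets differ.

```python
import json
from urllib.parse import urlparse


def build_batch_request(source_lang, target_langs, input_uris, output_uri_prefix):
    """Build a batchTranslateText request body as a plain dict."""
    # For a gs:// URI, urlparse puts the bucket name in netloc.
    source_buckets = {urlparse(uri).netloc for uri in input_uris}
    if urlparse(output_uri_prefix).netloc in source_buckets:
        raise ValueError(
            "the destination bucket must differ from the source bucket(s)")
    return {
        "sourceLanguageCode": source_lang,
        "targetLanguageCodes": list(target_langs),
        "inputConfigs": [
            {"gcsSource": {"inputUri": uri}} for uri in input_uris
        ],
        "outputConfig": {
            "gcsDestination": {"outputUriPrefix": output_uri_prefix}
        },
    }


if __name__ == "__main__":
    body = build_batch_request(
        "en",
        ["es", "fr"],
        ["gs://bucket-name-source/input-file-name",
         "gs://bucket-name-source/input-file-name2"],
        "gs://bucket-name-destination/",
    )
    # Write the body to request.json for use with curl or PowerShell.
    with open("request.json", "w", encoding="utf-8") as handle:
        json.dump(body, handle, indent=2)
```

The sketch does not check that the destination bucket is empty; that would require a Cloud Storage call and is left out here.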

REST & CMD LINE

This example shows two input files sent for translating.

Before using any of the request data below, make the following replacements:

  • project-number: your GCP project number

HTTP method and URL:

POST https://translation.googleapis.com/v3beta1/projects/project-number/locations/us-central1:batchTranslateText

Request JSON body:

{
  "sourceLanguageCode": "en",
  "targetLanguageCodes": ["es", "fr"],
  "inputConfigs": [
   {
      "gcsSource": {
        "inputUri": "gs://bucket-name-source/input-file-name"
      }
    },
    {
      "gcsSource": {
        "inputUri": "gs://bucket-name-source/input-file-name2"
      }
    }
  ],
  "outputConfig": {
      "gcsDestination": {
        "outputUriPrefix": "gs://bucket-name-destination/"
      }
   }
}

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://translation.googleapis.com/v3beta1/projects/project-number/locations/us-central1:batchTranslateText

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://translation.googleapis.com/v3beta1/projects/project-number/locations/us-central1:batchTranslateText" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/us-central1/operations/20190725-08251564068323-5d3895ce-0000-2067-864c-001a1136fb06",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.translation.v3beta1.BatchTranslateMetadata",
    "state": "RUNNING"
  }
}

The response contains the ID for a long-running operation.
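The operation ID is the last path segment of the name field in the response. As an illustration, the Python sketch below pulls it out of a response like the one shown; summarize_operation is a hypothetical helper, and it assumes the response has the shape of the sample on this page.

```python
import json


def summarize_operation(response_text: str):
    """Return (operation_id, state) from a batchTranslateText response."""
    response = json.loads(response_text)
    # "name" looks like projects/.../locations/.../operations/<operation-id>.
    operation_id = response["name"].rsplit("/", 1)[-1]
    # "state" may be absent, so fall back to None instead of failing.
    state = response.get("metadata", {}).get("state")
    return operation_id, state
```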

Java

Before trying this sample, follow the Java setup instructions in the Translation Quickstart Using Client Libraries. For more information, see the Translation Java API reference documentation.

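import com.google.cloud.translate.v3beta1.BatchTranslateResponse;
import com.google.cloud.translate.v3beta1.BatchTranslateTextRequest;
import com.google.cloud.translate.v3beta1.GcsDestination;
import com.google.cloud.translate.v3beta1.GcsSource;
import com.google.cloud.translate.v3beta1.InputConfig;
import com.google.cloud.translate.v3beta1.LocationName;
import com.google.cloud.translate.v3beta1.OutputConfig;
import com.google.cloud.translate.v3beta1.TranslationServiceClient;
import java.util.concurrent.TimeUnit;

static BatchTranslateResponse batchTranslateText(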
    String projectId, String location, String sourceUri, String destinationUri) {
  try (TranslationServiceClient translationServiceClient = TranslationServiceClient.create()) {

    LocationName locationName =
        LocationName.newBuilder().setProject(projectId).setLocation(location).build();
    GcsSource gcsSource = GcsSource.newBuilder().setInputUri(sourceUri).build();
    InputConfig inputConfig =
        InputConfig.newBuilder().setGcsSource(gcsSource).setMimeType("text/plain").build();
    GcsDestination gcsDestination =
        GcsDestination.newBuilder().setOutputUriPrefix(destinationUri).build();
    OutputConfig outputConfig =
        OutputConfig.newBuilder().setGcsDestination(gcsDestination).build();
    BatchTranslateTextRequest batchTranslateTextRequest =
        BatchTranslateTextRequest.newBuilder()
            .setParent(locationName.toString())
            .setSourceLanguageCode("en")
            .addTargetLanguageCodes("sr")
            .addInputConfigs(inputConfig)
            .setOutputConfig(outputConfig)
            .build();

    // Call the API
    BatchTranslateResponse response =
        translationServiceClient
            .batchTranslateTextAsync(batchTranslateTextRequest)
            .get(300, TimeUnit.SECONDS);

    System.out.printf("Total Characters: %d\n", response.getTotalCharacters());
    System.out.printf("Translated Characters: %d\n", response.getTranslatedCharacters());
    return response;

  } catch (Exception e) {
    // Covers client creation, request errors, and the 300-second timeout.
    throw new RuntimeException("Batch translation failed.", e);
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Translation Quickstart Using Client Libraries. For more information, see the Translation Node.js API reference documentation.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const inputUri = 'gs://cloud-samples-data/translation/text.txt';
// const outputUri = 'gs://YOUR_BUCKET_ID/path_to_store_results/';

// Imports the Google Cloud Translation library
const {TranslationServiceClient} = require('@google-cloud/translate').v3beta1;

// Instantiates a client
const translationClient = new TranslationServiceClient();
async function batchTranslateText() {
  // Construct request
  const request = {
    parent: translationClient.locationPath(projectId, location),
    sourceLanguageCode: 'en-US',
    targetLanguageCodes: ['sr-Latn'],
    inputConfigs: [
      {
        mimeType: 'text/plain', // mime types: text/plain, text/html
        gcsSource: {
          inputUri: inputUri,
        },
      },
    ],
    outputConfig: {
      gcsDestination: {
        outputUriPrefix: outputUri,
      },
    },
  };

  // Batch translate text using a long-running operation.
  // You can wait for now, or get results later.
  const [operation] = await translationClient.batchTranslateText(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();

  console.log(`Total Characters: ${response.totalCharacters}`);
  console.log(`Translated Characters: ${response.translatedCharacters}`);
}

batchTranslateText();

Python

Before trying this sample, follow the Python setup instructions in the Translation Quickstart Using Client Libraries. For more information, see the Translation Python API reference documentation.

from google.cloud import translate_v3beta1 as translate
client = translate.TranslationServiceClient()

# TODO(developer): Uncomment these variables before running the sample.
# project_id = 'YOUR_PROJECT_ID'
# input_uri = 'gs://cloud-samples-data/translation/text.txt'
# output_uri = 'gs://YOUR_BUCKET_ID/path_to_store_results/'
location = 'us-central1'

parent = client.location_path(project_id, location)

gcs_source = translate.types.GcsSource(input_uri=input_uri)

input_config = translate.types.InputConfig(
    mime_type='text/plain',  # mime types: text/plain, text/html
    gcs_source=gcs_source)

gcs_destination = translate.types.GcsDestination(
    output_uri_prefix=output_uri)

output_config = translate.types.OutputConfig(
    gcs_destination=gcs_destination)

operation = client.batch_translate_text(
    parent=parent,
    source_language_code='en-US',
    target_language_codes=['sr-Latn'],
    input_configs=[input_config],
    output_config=output_config)

result = operation.result(90)

print('Total Characters: {}'.format(result.total_characters))
print('Translated Characters: {}'.format(result.translated_characters))

Making a batch request using an AutoML model

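You can use a custom AutoML model for batch requests. When a request has multiple target languages, you can specify a model for one of them, for each of them, or for only some of them, as described in the following sections.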

Specifying an AutoML model for target language

REST & CMD LINE

This example shows how to specify a custom model for the target language.

Before using any of the request data below, make the following replacements:

  • project-id: your GCP project ID
  • model-id: the ID of your AutoML model

HTTP method and URL:

POST https://translation.googleapis.com/v3beta1/projects/project-id/locations/us-central1:batchTranslateText

Request JSON body:

{
  "models":{"es":"projects/project-id/locations/us-central1/models/model-id"},
  "sourceLanguageCode": "en",
  "targetLanguageCodes": ["es"],
  "inputConfigs": [
   {
      "gcsSource": {
        "inputUri": "gs://bucket-name-source/input-file-name"
      }
    },
    {
      "gcsSource": {
        "inputUri": "gs://bucket-name-source/input-file-name2"
      }
    }
  ],
  "outputConfig": {
      "gcsDestination": {
        "outputUriPrefix": "gs://bucket-name-destination/"
      }
   }
}

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://translation.googleapis.com/v3beta1/projects/project-id/locations/us-central1:batchTranslateText

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://translation.googleapis.com/v3beta1/projects/project-id/locations/us-central1:batchTranslateText" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/us-central1/operations/20190725-08251564068323-5d3895ce-0000-2067-864c-001a1136fb06",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.translation.v3beta1.BatchTranslateMetadata",
    "state": "RUNNING"
  }
}

The response contains the ID for a long-running operation.

Specifying AutoML models for multiple target languages

REST & CMD LINE

When you have multiple target languages, you can specify a custom model for each target language.

Before using any of the request data below, make the following replacements:

  • project-id: your GCP project ID
  • model-id, model-id2: the IDs of your AutoML models

HTTP method and URL:

POST https://translation.googleapis.com/v3beta1/projects/project-id/locations/us-central1:batchTranslateText

Request JSON body:

{
  "models":{"es":"projects/project-id/locations/us-central1/models/model-id", "fr":"projects/project-id/locations/us-central1/models/model-id2"},
  "sourceLanguageCode": "en",
  "targetLanguageCodes": ["es", "fr"],
  "inputConfigs": [
   {
      "gcsSource": {
        "inputUri": "gs://bucket-name-source/input-file-name"
      }
    },
    {
      "gcsSource": {
        "inputUri": "gs://bucket-name-source/input-file-name2"
      }
    }
  ],
  "outputConfig": {
      "gcsDestination": {
        "outputUriPrefix": "gs://bucket-name-destination/"
      }
   }
}

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://translation.googleapis.com/v3beta1/projects/project-id/locations/us-central1:batchTranslateText

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://translation.googleapis.com/v3beta1/projects/project-id/locations/us-central1:batchTranslateText" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/us-central1/operations/20190725-08251564068323-5d3895ce-0000-2067-864c-001a1136fb06",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.translation.v3beta1.BatchTranslateMetadata",
    "state": "RUNNING"
  }
}

The response contains the ID for a long-running operation.

Specifying an AutoML model for a target language and not others

You can specify a custom model for one target language without specifying a model for the others. Starting from the request in Specifying AutoML models for multiple target languages, modify the models field so that it names a model only for the intended target language (es in this example) and leaves fr unspecified:

  • "models": {"es":"projects/project-id/locations/us-central1/models/model-id"},
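The same partial mapping can be added to a request body in code. In this minimal Python sketch, with_models is a hypothetical helper, and rejecting model keys that are not among the target languages is an assumption about sensible input, not documented service behavior.

```python
def with_models(request_body: dict, models: dict) -> dict:
    """Return a copy of request_body with a "models" field added.

    models maps a target language code to a model resource name.
    Target languages without an entry are left without a custom model.
    """
    unknown = set(models) - set(request_body["targetLanguageCodes"])
    if unknown:
        raise ValueError(
            f"models given for non-target languages: {sorted(unknown)}")
    updated = dict(request_body)
    updated["models"] = dict(models)
    return updated
```

For example, passing {"es": "projects/project-id/locations/us-central1/models/model-id"} with target languages es and fr yields a body whose models field covers es only.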

Operation status

A batch request is a long-running operation, so it may take a substantial amount of time to complete. You can poll the status of this operation to see if it has completed, or you can cancel the operation.

For more information, see Long-running operations.
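Polling usually amounts to fetching the operation repeatedly until it reports completion. The Python sketch below shows that loop in isolation. It assumes the operation is available as a dict with the standard done field of a long-running operation; fetch_operation is a hypothetical callable you supply (for example, a function that issues a GET on the operation name), and the interval and timeout defaults are arbitrary.

```python
import time


def wait_for_operation(fetch_operation, poll_interval=10.0, timeout=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the operation reports done, or raise TimeoutError.

    fetch_operation: callable returning the operation as a dict.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        operation = fetch_operation()
        # "done" is omitted or false while the operation is still running.
        if operation.get("done"):
            return operation
        if clock() >= deadline:
            raise TimeoutError("operation did not complete before the timeout")
        sleep(poll_interval)
```

A completed operation can still carry an error, so callers would normally inspect the returned dict before reading results.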
