Package google.cloud.translation.v3

TranslationService

Provides natural language translation operations.

AdaptiveMtTranslate

rpc AdaptiveMtTranslate(AdaptiveMtTranslateRequest) returns (AdaptiveMtTranslateResponse)

Translate text using Adaptive MT.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

BatchTranslateDocument

rpc BatchTranslateDocument(BatchTranslateDocumentRequest) returns (Operation)

Translates a large volume of documents in asynchronous batch mode. This function provides real-time output as the inputs are being processed. If the caller cancels a request, the partial results (for an input file, it's all or nothing) may still be available at the specified output location.

This call returns immediately and you can use google.longrunning.Operation.name to poll the status of the call.
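
The polling described above can be sketched as follows. This is a minimal sketch, not client-library code: fetch_operation is a hypothetical callable that returns the current operation as a dict shaped like google.longrunning.Operation (name, done, and optionally error or response), and the interval and timeout values are illustrative assumptions.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    name: str,
    fetch_operation: Callable[[str], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll an operation by name until it reports done=True.

    fetch_operation is a hypothetical stand-in for whatever call returns
    the current Operation as a dict; it is not part of this API.
    """
    waited = 0.0
    while True:
        op = fetch_operation(name)
        if op.get("done"):
            # A finished operation carries either an error or a response.
            if "error" in op:
                raise RuntimeError(f"operation {name} failed: {op['error']}")
            return op
        if waited >= timeout_s:
            raise TimeoutError(f"operation {name} not done after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

The sleep parameter is injectable so that the loop can be exercised without real delays.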

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

BatchTranslateText

rpc BatchTranslateText(BatchTranslateTextRequest) returns (Operation)

Translates a large volume of text in asynchronous batch mode. This function provides real-time output as the inputs are being processed. If the caller cancels a request, the partial results (for an input file, it's all or nothing) may still be available at the specified output location.

This call returns immediately and you can use google.longrunning.Operation.name to poll the status of the call.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateAdaptiveMtDataset

rpc CreateAdaptiveMtDataset(CreateAdaptiveMtDatasetRequest) returns (AdaptiveMtDataset)

Creates an Adaptive MT dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateDataset

rpc CreateDataset(CreateDatasetRequest) returns (Operation)

Creates a Dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateGlossary

rpc CreateGlossary(CreateGlossaryRequest) returns (Operation)

Creates a glossary and returns the long-running operation. Returns NOT_FOUND if the project doesn't exist.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

CreateGlossaryEntry

rpc CreateGlossaryEntry(CreateGlossaryEntryRequest) returns (GlossaryEntry)

Creates a glossary entry.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateModel

rpc CreateModel(CreateModelRequest) returns (Operation)

Creates a Model.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteAdaptiveMtDataset

rpc DeleteAdaptiveMtDataset(DeleteAdaptiveMtDatasetRequest) returns (Empty)

Deletes an Adaptive MT dataset, including all its entries and associated metadata.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteAdaptiveMtFile

rpc DeleteAdaptiveMtFile(DeleteAdaptiveMtFileRequest) returns (Empty)

Deletes an AdaptiveMtFile along with its sentences.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteDataset

rpc DeleteDataset(DeleteDatasetRequest) returns (Operation)

Deletes a dataset and all of its contents.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteGlossary

rpc DeleteGlossary(DeleteGlossaryRequest) returns (Operation)

Deletes a glossary, or cancels glossary construction if the glossary isn't created yet. Returns NOT_FOUND if the glossary doesn't exist.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteGlossaryEntry

rpc DeleteGlossaryEntry(DeleteGlossaryEntryRequest) returns (Empty)

Deletes a single entry from the glossary.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteModel

rpc DeleteModel(DeleteModelRequest) returns (Operation)

Deletes a model.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DetectLanguage

rpc DetectLanguage(DetectLanguageRequest) returns (DetectLanguageResponse)

Detects the language of text within a request.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

ExportData

rpc ExportData(ExportDataRequest) returns (Operation)

Exports a dataset's data to the provided output location.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetAdaptiveMtDataset

rpc GetAdaptiveMtDataset(GetAdaptiveMtDatasetRequest) returns (AdaptiveMtDataset)

Gets the Adaptive MT dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetAdaptiveMtFile

rpc GetAdaptiveMtFile(GetAdaptiveMtFileRequest) returns (AdaptiveMtFile)

Gets an AdaptiveMtFile.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataset

rpc GetDataset(GetDatasetRequest) returns (Dataset)

Gets a Dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

GetGlossary

rpc GetGlossary(GetGlossaryRequest) returns (Glossary)

Gets a glossary. Returns NOT_FOUND if the glossary doesn't exist.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

GetGlossaryEntry

rpc GetGlossaryEntry(GetGlossaryEntryRequest) returns (GlossaryEntry)

Gets a single glossary entry by the given id.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

GetModel

rpc GetModel(GetModelRequest) returns (Model)

Gets a model.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetSupportedLanguages

rpc GetSupportedLanguages(GetSupportedLanguagesRequest) returns (SupportedLanguages)

Returns a list of supported languages for translation.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

ImportAdaptiveMtFile

rpc ImportAdaptiveMtFile(ImportAdaptiveMtFileRequest) returns (ImportAdaptiveMtFileResponse)

Imports an AdaptiveMtFile and adds all of its sentences into the AdaptiveMtDataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ImportData

rpc ImportData(ImportDataRequest) returns (Operation)

Imports sentence pairs into a translation Dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListAdaptiveMtDatasets

rpc ListAdaptiveMtDatasets(ListAdaptiveMtDatasetsRequest) returns (ListAdaptiveMtDatasetsResponse)

Lists all Adaptive MT datasets for which the caller has read permission.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListAdaptiveMtFiles

rpc ListAdaptiveMtFiles(ListAdaptiveMtFilesRequest) returns (ListAdaptiveMtFilesResponse)

Lists all AdaptiveMtFiles associated with an AdaptiveMtDataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListAdaptiveMtSentences

rpc ListAdaptiveMtSentences(ListAdaptiveMtSentencesRequest) returns (ListAdaptiveMtSentencesResponse)

Lists all AdaptiveMtSentences under a given file/dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDatasets

rpc ListDatasets(ListDatasetsRequest) returns (ListDatasetsResponse)

Lists datasets.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

ListExamples

rpc ListExamples(ListExamplesRequest) returns (ListExamplesResponse)

Lists sentence pairs in the dataset.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

ListGlossaries

rpc ListGlossaries(ListGlossariesRequest) returns (ListGlossariesResponse)

Lists glossaries in a project. Returns NOT_FOUND if the project doesn't exist.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

ListGlossaryEntries

rpc ListGlossaryEntries(ListGlossaryEntriesRequest) returns (ListGlossaryEntriesResponse)

Lists the entries for the glossary.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

ListModels

rpc ListModels(ListModelsRequest) returns (ListModelsResponse)

Lists models.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

RomanizeText

rpc RomanizeText(RomanizeTextRequest) returns (RomanizeTextResponse)

Romanizes input text written in non-Latin scripts to Latin text.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

TranslateDocument

rpc TranslateDocument(TranslateDocumentRequest) returns (TranslateDocumentResponse)

Translates documents in synchronous mode.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

TranslateText

rpc TranslateText(TranslateTextRequest) returns (TranslateTextResponse)

Translates input text and returns translated text.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-translation

For more information, see the Authentication Overview.

UpdateGlossary

rpc UpdateGlossary(UpdateGlossaryRequest) returns (Operation)

Updates a glossary. A long-running operation (LRO) is used because the update can be asynchronous if the glossary's entry file is updated.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateGlossaryEntry

rpc UpdateGlossaryEntry(UpdateGlossaryEntryRequest) returns (GlossaryEntry)

Updates a glossary entry.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-translation
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

AdaptiveMtDataset

An Adaptive MT Dataset.

Fields
name

string

Required. The resource name of the dataset, in the form projects/{project-number-or-id}/locations/{location_id}/adaptiveMtDatasets/{dataset_id}

display_name

string

The name of the dataset to show in the interface. The name can be up to 32 characters long and can consist only of ASCII Latin letters A-Z and a-z, underscores (_), and ASCII digits 0-9.
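
The display_name constraint above can be checked client-side before sending a request. A minimal sketch; the function name is hypothetical, and rejecting the empty string is an assumption, since the text only states an upper bound on length.

```python
import re

# Up to 32 characters; ASCII letters A-Z and a-z, digits 0-9, and underscores.
# Assumption: at least one character is required.
_DISPLAY_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")


def is_valid_display_name(name: str) -> bool:
    """Return True if name satisfies the documented display_name rules."""
    return _DISPLAY_NAME_RE.fullmatch(name) is not None
```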

source_language_code

string

The BCP-47 language code of the source language.

target_language_code

string

The BCP-47 language code of the target language.

example_count

int32

The number of examples in the dataset.

create_time

Timestamp

Output only. Timestamp when this dataset was created.

update_time

Timestamp

Output only. Timestamp when this dataset was last updated.

AdaptiveMtFile

An AdaptiveMtFile.

Fields
name

string

Required. The resource name of the file, in the form projects/{project-number-or-id}/locations/{location_id}/adaptiveMtDatasets/{dataset}/adaptiveMtFiles/{file}

display_name

string

The file's display name.

entry_count

int32

The number of entries that the file contains.

create_time

Timestamp

Output only. Timestamp when this file was created.

update_time

Timestamp

Output only. Timestamp when this file was last updated.

AdaptiveMtSentence

An AdaptiveMt sentence entry.

Fields
name

string

Required. The resource name of the sentence, in the form projects/{project-number-or-id}/locations/{location_id}/adaptiveMtDatasets/{dataset}/adaptiveMtFiles/{file}/adaptiveMtSentences/{sentence}
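
A resource name in this form can be split into its components with a regular expression. A minimal sketch; parse_sentence_name is a hypothetical helper, and it assumes that no path segment contains a slash.

```python
import re
from typing import Dict

# Mirrors the documented pattern; each {placeholder} becomes a named group.
_SENTENCE_NAME_RE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/adaptiveMtDatasets/(?P<dataset>[^/]+)"
    r"/adaptiveMtFiles/(?P<file>[^/]+)"
    r"/adaptiveMtSentences/(?P<sentence>[^/]+)"
)


def parse_sentence_name(name: str) -> Dict[str, str]:
    """Split an AdaptiveMtSentence resource name into its path components."""
    match = _SENTENCE_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not an AdaptiveMtSentence resource name: {name!r}")
    return match.groupdict()
```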

source_sentence

string

Required. The source sentence.

target_sentence

string

Required. The target sentence.

create_time

Timestamp

Output only. Timestamp when this sentence was created.

update_time

Timestamp

Output only. Timestamp when this sentence was last updated.

AdaptiveMtTranslateRequest

The request for sending an AdaptiveMt translation query.

Fields
parent

string

Required. Location to make a regional call.

Format: projects/{project-number-or-id}/locations/{location-id}.

dataset

string

Required. The resource name of the dataset to use for adaptive MT. Format: projects/{project}/locations/{location-id}/adaptiveMtDatasets/{dataset}

content[]

string

Required. The content of the input in string format. Currently, only one sentence per request is supported.
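
The three fields above fit together as follows. A minimal sketch that builds the request as a plain dict; the helper name and the argument values in the test are hypothetical, and only the field names and resource-name formats come from this reference.

```python
from typing import Any, Dict


def build_adaptive_mt_translate_request(
    project: str, location: str, dataset_id: str, sentence: str
) -> Dict[str, Any]:
    """Build an AdaptiveMtTranslateRequest-shaped dict for one sentence."""
    parent = f"projects/{project}/locations/{location}"
    return {
        "parent": parent,
        "dataset": f"{parent}/adaptiveMtDatasets/{dataset_id}",
        # content is a repeated field, but only one sentence is supported.
        "content": [sentence],
    }
```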

AdaptiveMtTranslateResponse

An AdaptiveMtTranslate response.

Fields
translations[]

AdaptiveMtTranslation

Output only. The translation.

language_code

string

Output only. The translation's language code.

AdaptiveMtTranslation

An AdaptiveMt translation.

Fields
translated_text

string

Output only. The translated text.

BatchDocumentInputConfig

Input configuration for BatchTranslateDocument request.

Fields
Union field source. Specify the input. source can be only one of the following:
gcs_source

GcsSource

Google Cloud Storage location for the source input. This can be a single file (for example, gs://translation-test/input.docx) or a wildcard (for example, gs://translation-test/*).

The file MIME type is determined from the file extension. Supported MIME types:

  • pdf: application/pdf
  • docx: application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • pptx: application/vnd.openxmlformats-officedocument.presentationml.presentation
  • xlsx: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

The maximum file size for .docx, .pptx, and .xlsx files is 100MB. The maximum file size for .pdf files is 1GB, with a limit of 1000 pages. The maximum file size for any input document is 1GB.
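
The extension and size rules above can be pre-checked before submitting a batch. A minimal sketch under stated assumptions: MB and GB are treated as decimal units (the text does not say which), extension matching is treated as case-insensitive, and check_input_document is a hypothetical helper. Wildcard URIs would need to be expanded to concrete files first.

```python
import posixpath
from typing import Optional

# Extension-to-MIME mapping as listed in this reference.
MIME_BY_EXTENSION = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}

# Assumption: decimal units. The reference does not specify binary vs decimal.
MB = 10**6
GB = 10**9
OFFICE_MAX_BYTES = 100 * MB
PDF_MAX_BYTES = 1 * GB
PDF_MAX_PAGES = 1000


def check_input_document(uri: str, size_bytes: int, pages: Optional[int] = None) -> str:
    """Return the MIME type for uri, or raise ValueError if a limit is exceeded."""
    # Assumption: extensions are matched case-insensitively.
    ext = posixpath.splitext(uri)[1].lower()
    mime = MIME_BY_EXTENSION.get(ext)
    if mime is None:
        raise ValueError(f"unsupported extension: {ext!r}")
    if ext == ".pdf":
        if size_bytes > PDF_MAX_BYTES:
            raise ValueError("PDF exceeds 1GB")
        if pages is not None and pages > PDF_MAX_PAGES:
            raise ValueError("PDF exceeds 1000 pages")
    elif size_bytes > OFFICE_MAX_BYTES:
        raise ValueError(f"{ext} file exceeds 100MB")
    return mime
```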

BatchDocumentOutputConfig

Output configuration for BatchTranslateDocument request.

Fields
Union field destination. The destination of output. The destination directory provided must exist and be empty. destination can be only one of the following:
gcs_destination

GcsDestination

Google Cloud Storage destination for output content. For every single input document (for example, gs://a/b/c.[extension]), we generate at most 2 * n output files, where n is the number of target_language_codes in the BatchTranslateDocumentRequest.

While the input documents are being processed, we write/update an index file index.csv under gcs_destination.output_uri_prefix (for example, gs://translation_output/index.csv). The index file is generated/updated as new files are being translated. The format is:

input_document,target_language_code,translation_output,error_output,glossary_translation_output,glossary_error_output

The columns are:

  • input_document is one file we matched using gcs_source.input_uri.
  • target_language_code is provided in the request.
  • translation_output contains the translations (details provided below).
  • error_output contains the error message from processing the file. Both translation_output and error_output can be empty strings if there is no content to output.
  • glossary_translation_output and glossary_error_output are the translated output and the errors when glossaries are applied. They can also be empty if there is no content to output.

Once a row is present in index.csv, the input/output matching never changes. Callers should also expect all the content in the input file to be processed and ready to be consumed (that is, no partial output file is written).

Because index.csv is updated continuously during processing, make sure that no custom retention policy that could prevent file updates is applied to the output bucket. (https://cloud.google.com/storage/docs/bucket-lock#retention-policy)

The naming format of translation output files is as follows (for target language code [trg]):

  • translation_output: gs://translation_output/a_b_c_[trg]_translation.[extension]
  • glossary_translation_output: gs://translation_test/a_b_c_[trg]_glossary_translation.[extension]

The output document maintains the same file format as the input document.

The naming format of error output files is as follows (for target language code [trg]):

  • error_output: gs://translation_test/a_b_c_[trg]_errors.txt
  • glossary_error_output: gs://translation_test/a_b_c_[trg]_glossary_translation.txt

The error output is a txt file containing error details.
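
The index.csv layout described above can be read back into named columns. A minimal sketch, assuming the file is plain comma-separated text with no header row and the six columns in the documented order; parse_index_csv is a hypothetical helper, and the sample row in the test is illustrative only.

```python
import csv
import io
from typing import Dict, List

# Column order as given in the index file format.
INDEX_COLUMNS = [
    "input_document",
    "target_language_code",
    "translation_output",
    "error_output",
    "glossary_translation_output",
    "glossary_error_output",
]


def parse_index_csv(text: str) -> List[Dict[str, str]]:
    """Parse index.csv content into one dict per row, keyed by column name."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        # Pad short rows so that missing trailing columns read as empty strings.
        padded = (record + [""] * len(INDEX_COLUMNS))[: len(INDEX_COLUMNS)]
        rows.append(dict(zip(INDEX_COLUMNS, padded)))
    return rows
```

An empty string in translation_output or error_output means that no such file was written for that row.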

BatchTransferResourcesMetadata

Metadata message for TranslationService.BatchTransferResources.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

BatchTransferResourcesResponse

Response message for BatchTransferResources.

Fields
responses[]

TransferResourceResponse

Responses of the transfer for individual resources.

TransferResourceResponse

Transfer response for a single resource.

Fields
source

string

Full name of the resource to transfer as specified in the request.

target

string

Full name of the new resource successfully transferred from the source hosted by Translation API. Target will be empty if the transfer failed.

error

Status

The error result in case of failure.

BatchTranslateDocumentMetadata

State metadata for the batch translation operation.

Fields
state

State

The state of the operation.

total_pages

int64

Total number of pages to translate in all documents so far. Documents without clear page definition (such as XLSX) are not counted.

translated_pages

int64

Number of successfully translated pages in all documents so far. Documents without clear page definition (such as XLSX) are not counted.

failed_pages

int64

Number of pages that failed to process in all documents so far. Documents without clear page definition (such as XLSX) are not counted.

total_billable_pages

int64

Number of billable pages in documents with clear page definition (such as PDF, DOCX, PPTX) so far.

total_characters

int64

Total number of characters (Unicode codepoints) in all documents so far.

translated_characters

int64

Number of successfully translated characters (Unicode codepoints) in all documents so far.

failed_characters

int64

Number of characters that have failed to process (Unicode codepoints) in all documents so far.

total_billable_characters

int64

Number of billable characters (Unicode codepoints) in documents without clear page definition (such as XLSX) so far.

submit_time

Timestamp

Time when the operation was submitted.

State

State of the job.

Enums
STATE_UNSPECIFIED Invalid.
RUNNING Request is being processed.
SUCCEEDED The batch is processed, and at least one item was successfully processed.
FAILED The batch is done and no item was successfully processed.
CANCELLING Request is in the process of being canceled after caller invoked longrunning.Operations.CancelOperation on the request id.
CANCELLED The batch is done after the user has called the longrunning.Operations.CancelOperation. Any records processed before the cancel command are output as specified in the request.
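
When polling, a caller typically needs to know which of these states are final. A minimal sketch; the enum uses string values because the numeric wire values are not given in this reference, and treating SUCCEEDED, FAILED, and CANCELLED as terminal is inferred from their descriptions.

```python
from enum import Enum


class State(Enum):
    # String values only; numeric values are not stated here, so none are assumed.
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLING = "CANCELLING"
    CANCELLED = "CANCELLED"


# States whose descriptions say the batch is processed or done.
_TERMINAL = frozenset({State.SUCCEEDED, State.FAILED, State.CANCELLED})


def is_terminal(state: State) -> bool:
    """Return True if no further progress is expected in this state."""
    return state in _TERMINAL
```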

BatchTranslateDocumentRequest

The BatchTranslateDocument request.

Fields
parent

string

Required. Location to make a regional call.

Format: projects/{project-number-or-id}/locations/{location-id}.

The global location is not supported for batch translation.

Only AutoML Translation models or glossaries within the same region (that is, with the same location-id) can be used; otherwise, an INVALID_ARGUMENT (400) error is returned.

source_language_code

string

Required. The ISO-639 language code of the input document if known, for example, "en-US" or "sr-Latn". Supported language codes are listed in Language Support.

target_language_codes[]

string

Required. The ISO-639 language code to use for translation of the input document. Specify up to 10 language codes here.

input_configs[]

BatchDocumentInputConfig

Required. Input configurations. The total number of files matched should be <= 100. The total content size to translate should be <= 100M Unicode codepoints. The files must use UTF-8 encoding.
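
The request limits stated for target_language_codes and input_configs can be checked before a request is sent. A minimal sketch; check_batch_limits is a hypothetical helper, and it assumes the caller has already counted the matched files and the total codepoints.

```python
from typing import List

MAX_TARGET_LANGUAGES = 10      # "Specify up to 10 language codes"
MAX_MATCHED_FILES = 100        # "total number of files matched should be <= 100"
MAX_CODEPOINTS = 100_000_000   # "<= 100M Unicode codepoints"


def check_batch_limits(
    target_language_codes: List[str], matched_files: int, total_codepoints: int
) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if not target_language_codes:
        problems.append("at least one target language code is required")
    if len(target_language_codes) > MAX_TARGET_LANGUAGES:
        problems.append(f"more than {MAX_TARGET_LANGUAGES} target language codes")
    if matched_files > MAX_MATCHED_FILES:
        problems.append(f"more than {MAX_MATCHED_FILES} matched files")
    if total_codepoints > MAX_CODEPOINTS:
        problems.append(f"more than {MAX_CODEPOINTS} Unicode codepoints")
    return problems
```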

output_config

BatchDocumentOutputConfig

Required. Output configuration. If two input configs match the same file (that is, the same input path), output is not generated for the duplicate inputs.

models

map<string, string>

Optional. The models to use for translation. Map's key is target language code. Map's value is the model name. Value can be a built-in general model, or an AutoML Translation model.

The value format depends on model type:

  • AutoML Translation models: projects/{project-number-or-id}/locations/{location-id}/models/{model-id}

  • General (built-in) models: projects/{project-number-or-id}/locations/{location-id}/models/general/nmt

If the map is empty or a specific model is not requested for a language pair, then the default Google model (nmt) is used.
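
The two value formats for the models map can be produced with small helpers. A minimal sketch; the helper names and the project, location, and model identifiers in the test are hypothetical.

```python
def automl_model_name(project: str, location: str, model_id: str) -> str:
    """Resource name of an AutoML Translation model."""
    return f"projects/{project}/locations/{location}/models/{model_id}"


def general_nmt_model_name(project: str, location: str) -> str:
    """Resource name of the general (built-in) NMT model."""
    return f"projects/{project}/locations/{location}/models/general/nmt"


def build_models_map(project: str, location: str, automl_by_language: dict) -> dict:
    """Map each target language code to an AutoML model name.

    Languages left out of the map fall back to the default model (nmt),
    as described in the field documentation.
    """
    return {
        lang: automl_model_name(project, location, model_id)
        for lang, model_id in automl_by_language.items()
    }
```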

glossaries

map<string, TranslateTextGlossaryConfig>

Optional. Glossaries to be applied. It's keyed by target language code.

format_conversions

map<string, string>

Optional. The file format conversion map that is applied to all input files. The map key is the original mime_type. The map value is the target mime_type of translated documents.

Supported file format conversions:

  • application/pdf to application/vnd.openxmlformats-officedocument.wordprocessingml.document

If nothing is specified, output files are in the same format as the original file.

customized_attribution

string

Optional. This flag supports user-customized attribution. If not provided, the default is Machine Translated by Google. Customized attribution should follow the rules in https://cloud.google.com/translate/attribution#attribution_and_logos

enable_shadow_removal_native_pdf

bool

Optional. If true, use the text removal server to remove the shadow text on the background image for native PDF translation. The shadow removal feature can only be enabled when both is_translate_native_pdf_only and pdf_native_only are false.

enable_rotation_correction

bool

Optional. If true, enable auto rotation correction in DVS.

BatchTranslateDocumentResponse

Stored in the google.longrunning.Operation.response field returned by BatchTranslateDocument if at least one document is translated successfully.

Fields
total_pages

int64

Total number of pages to translate in all documents. Documents without clear page definition (such as XLSX) are not counted.

translated_pages

int64

Number of successfully translated pages in all documents. Documents without clear page definition (such as XLSX) are not counted.

failed_pages

int64

Number of pages that failed to process in all documents. Documents without clear page definition (such as XLSX) are not counted.

total_billable_pages

int64

Number of billable pages in documents with clear page definition (such as PDF, DOCX, PPTX).

total_characters

int64

Total number of characters (Unicode codepoints) in all documents.

translated_characters

int64

Number of successfully translated characters (Unicode codepoints) in all documents.

failed_characters

int64

Number of characters that have failed to process (Unicode codepoints) in all documents.

total_billable_characters

int64

Number of billable characters (Unicode codepoints) in documents without clear page definition, such as XLSX.

submit_time

Timestamp

Time when the operation was submitted.

end_time

Timestamp

The time when the operation is finished and google.longrunning.Operation.done is set to true.

BatchTranslateMetadata

State metadata for the batch translation operation.

Fields
state

State

The state of the operation.

translated_characters

int64

Number of successfully translated characters so far (Unicode codepoints).

failed_characters

int64

Number of characters that have failed to process so far (Unicode codepoints).

total_characters

int64

Total number of characters (Unicode codepoints). This is the total number of codepoints from input files times the number of target languages and appears here shortly after the call is submitted.
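
The relationship stated above (codepoints in the input times the number of target languages) can be worked through directly. A minimal sketch; it relies on Python's len() counting Unicode codepoints for str values, and the function name and sample strings are hypothetical.

```python
from typing import List


def expected_total_characters(inputs: List[str], target_language_codes: List[str]) -> int:
    """Codepoints across all inputs, multiplied by the number of target languages."""
    # len() on a Python str counts Unicode codepoints, not bytes.
    codepoints = sum(len(text) for text in inputs)
    return codepoints * len(target_language_codes)
```

For example, two inputs of 5 and 3 codepoints translated into two target languages give (5 + 3) * 2 = 16.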

submit_time

Timestamp

Time when the operation was submitted.

State

State of the job.

Enums
STATE_UNSPECIFIED Invalid.
RUNNING Request is being processed.
SUCCEEDED The batch is processed, and at least one item was successfully processed.
FAILED The batch is done and no item was successfully processed.
CANCELLING Request is in the process of being canceled after caller invoked longrunning.Operations.CancelOperation on the request id.
CANCELLED The batch is done after the user has called the longrunning.Operations.CancelOperation. Any records processed before the cancel command are output as specified in the request.

BatchTranslateResponse

Stored in the google.longrunning.Operation.response field returned by BatchTranslateText if at least one sentence is translated successfully.

Fields
total_characters

int64

Total number of characters (Unicode codepoints).

translated_characters

int64

Number of successfully translated characters (Unicode codepoints).

failed_characters

int64

Number of characters that have failed to process (Unicode codepoints).

submit_time

Timestamp

Time when the operation was submitted.

end_time

Timestamp

The time when the operation is finished and google.longrunning.Operation.done is set to true.

BatchTranslateTextRequest

The batch translation request.

Fields
parent

string

Required. Location to make a call. Must refer to a caller's project.

Format: projects/{project-number-or-id}/locations/{location-id}.

The global location is not supported for batch translation.

Only AutoML Translation models or glossaries within the same region (that is, with the same location-id) can be used; otherwise, an INVALID_ARGUMENT (400) error is returned.

source_language_code

string

Required. Source language code.

target_language_codes[]

string

Required. Specify up to 10 language codes here.

models

map<string, string>

Optional. The models to use for translation. Map's key is target language code. Map's value is the model name. Value can be a built-in general model, or an AutoML Translation model.

The value format depends on model type:

  • AutoML Translation models: projects/{project-number-or-id}/locations/{location-id}/models/{model-id}

  • General (built-in) models: projects/{project-number-or-id}/locations/{location-id}/models/general/nmt

If the map is empty or a specific model is not requested for a language pair, then the default Google model (nmt) is used.

Authorization requires one or more of the following IAM permissions on the specified resource models:

  • cloudtranslate.generalModels.batchPredict
  • automl.models.predict
input_configs[]

InputConfig

Required. Input configurations. The total number of files matched should be <= 100. The total content size should be <= 100M Unicode codepoints. The files must use UTF-8 encoding.

output_config

OutputConfig

Required. Output configuration. If two input configs match the same file (that is, the same input path), output is not generated for the duplicate inputs.

glossaries

map<string, TranslateTextGlossaryConfig>

Optional. Glossaries to be applied for translation. It's keyed by target language code.

Authorization requires the following IAM permission on the specified resource glossaries:

  • cloudtranslate.glossaries.batchPredict
labels

map<string, string>

Optional. The labels with user-defined metadata for the request.

Label keys and values can be no longer than 63 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter.

See https://cloud.google.com/translate/docs/advanced/labels for more information.
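The label rules above can be checked on the client before sending a request. The following Python sketch is one reading of those rules, not the server's actual validation: it accepts letters from any script as long as they are not uppercase, and it assumes keys must be non-empty. Both function names are hypothetical.

```python
def is_valid_label_part(text: str) -> bool:
    """Check the length and character rules shared by keys and values."""
    if len(text) > 63:  # len() counts Unicode code points
        return False
    for ch in text:
        if ch in "_-" or ch.isdigit():
            continue
        # Accept letters from any script, but reject uppercase ones.
        if ch.isalpha() and not ch.isupper():
            continue
        return False
    return True


def validate_labels(labels: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the labels look valid."""
    problems = []
    for key, value in labels.items():
        if not key or not key[0].isalpha() or not is_valid_label_part(key):
            problems.append(f"invalid key: {key!r}")
        if not is_valid_label_part(value):  # empty values are allowed
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems
```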

CreateAdaptiveMtDatasetRequest

Request message for creating an AdaptiveMtDataset.

Fields
parent

string

Required. Name of the parent project. In form of projects/{project-number-or-id}/locations/{location-id}

adaptive_mt_dataset

AdaptiveMtDataset

Required. The AdaptiveMtDataset to be created.

CreateDatasetMetadata

Metadata of create dataset operation.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

CreateDatasetRequest

Request message for CreateDataset.

Fields
parent

string

Required. The project name.

dataset

Dataset

Required. The Dataset to create.

CreateGlossaryEntryRequest

Request message for CreateGlossaryEntry.

Fields
parent

string

Required. The resource name of the glossary to create the entry under.

glossary_entry

GlossaryEntry

Required. The glossary entry to create.

CreateGlossaryMetadata

Stored in the google.longrunning.Operation.metadata field returned by CreateGlossary.

Fields
name

string

The name of the glossary that is being created.

state

State

The current state of the glossary creation operation.

submit_time

Timestamp

The time when the operation was submitted to the server.

State

Enumerates the possible states that the creation request can be in.

Enums
STATE_UNSPECIFIED Invalid.
RUNNING Request is being processed.
SUCCEEDED The glossary was successfully created.
FAILED Failed to create the glossary.
CANCELLING Request is in the process of being canceled after caller invoked longrunning.Operations.CancelOperation on the request id.
CANCELLED The glossary creation request was successfully canceled.
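When polling the long-running operation, a client typically needs to know whether the state can still change. The sketch below mirrors the enum names above in a local Python Enum (string values are used so that no wire numbers are assumed). Treating SUCCEEDED, FAILED, and CANCELLED as final is an inference from the descriptions, not something this reference states.

```python
from enum import Enum


class State(Enum):
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLING = "CANCELLING"
    CANCELLED = "CANCELLED"


# Assumption: these three states are final and will not change again.
_FINAL_STATES = {State.SUCCEEDED, State.FAILED, State.CANCELLED}


def is_final(state: State) -> bool:
    """Return True when polling can stop for this state."""
    return state in _FINAL_STATES
```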

CreateGlossaryRequest

Request message for CreateGlossary.

Fields
parent

string

Required. The project name.

Authorization requires the following IAM permission on the specified resource parent:

  • cloudtranslate.glossaries.create
glossary

Glossary

Required. The glossary to create.

CreateModelMetadata

Metadata of create model operation.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

CreateModelRequest

Request message for CreateModel.

Fields
parent

string

Required. The project name, in form of projects/{project}/locations/{location}

model

Model

Required. The Model to create.

Dataset

A dataset that hosts the examples (sentence pairs) used for translation models.

Fields
name

string

The resource name of the dataset, in form of projects/{project-number-or-id}/locations/{location_id}/datasets/{dataset_id}

display_name

string

The name of the dataset to show in the interface. The name can be up to 32 characters long and can consist only of ASCII Latin letters A-Z and a-z, underscores (_), and ASCII digits 0-9.
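The display name constraint maps directly onto a regular expression. A minimal Python sketch, assuming an empty name is not acceptable (the text above only gives an upper bound); the function name is hypothetical:

```python
import re

# ASCII letters, digits, and underscores only; 1 to 32 characters.
_DISPLAY_NAME = re.compile(r"[A-Za-z0-9_]{1,32}")


def is_valid_display_name(name: str) -> bool:
    """Return True if the whole name satisfies the documented constraint."""
    return _DISPLAY_NAME.fullmatch(name) is not None
```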

source_language_code

string

The BCP-47 language code of the source language.

target_language_code

string

The BCP-47 language code of the target language.

example_count

int32

Output only. The number of examples in the dataset.

train_example_count

int32

Output only. Number of training examples (sentence pairs).

validate_example_count

int32

Output only. Number of validation examples (sentence pairs).

test_example_count

int32

Output only. Number of test examples (sentence pairs).

create_time

Timestamp

Output only. Timestamp when this dataset was created.

update_time

Timestamp

Output only. Timestamp when this dataset was last updated.

DatasetInputConfig

Input configuration for datasets.

Fields
input_files[]

InputFile

Files containing the sentence pairs to be imported to the dataset.

InputFile

An input file.

Fields
usage

string

Optional. Usage of the file contents. Options are TRAIN|VALIDATION|TEST, or UNASSIGNED (by default) for auto split.

Union field source. Source of the file containing sentence pairs. Supported formats are tab-separated values (.tsv) and Translation Memory eXchange (.tmx). source can be only one of the following:
gcs_source

GcsInputSource

Google Cloud Storage file source.

DatasetOutputConfig

Output configuration for datasets.

Fields
Union field destination. Required. Specify the output. destination can be only one of the following:
gcs_destination

GcsOutputDestination

Google Cloud Storage destination to write the output.

DeleteAdaptiveMtDatasetRequest

Request message for deleting an AdaptiveMtDataset.

Fields
name

string

Required. Name of the dataset. In the form of projects/{project-number-or-id}/locations/{location-id}/adaptiveMtDatasets/{adaptive-mt-dataset-id}

DeleteAdaptiveMtFileRequest

The request for deleting an AdaptiveMt file.

Fields
name

string

Required. The resource name of the file to delete, in form of projects/{project-number-or-id}/locations/{location_id}/adaptiveMtDatasets/{dataset}/adaptiveMtFiles/{file}

DeleteDatasetMetadata

Metadata of delete dataset operation.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

DeleteDatasetRequest

Request message for DeleteDataset.

Fields
name

string

Required. The name of the dataset to delete.

DeleteGlossaryEntryRequest

Request message for DeleteGlossaryEntry.

Fields
name

string

Required. The resource name of the glossary entry to delete.

DeleteGlossaryMetadata

Stored in the google.longrunning.Operation.metadata field returned by DeleteGlossary.

Fields
name

string

The name of the glossary that is being deleted.

state

State

The current state of the glossary deletion operation.

submit_time

Timestamp

The time when the operation was submitted to the server.

State

Enumerates the possible states that the deletion request can be in.

Enums
STATE_UNSPECIFIED Invalid.
RUNNING Request is being processed.
SUCCEEDED The glossary was successfully deleted.
FAILED Failed to delete the glossary.
CANCELLING Request is in the process of being canceled after caller invoked longrunning.Operations.CancelOperation on the request id.
CANCELLED The glossary deletion request was successfully canceled.

DeleteGlossaryRequest

Request message for DeleteGlossary.

Fields
name

string

Required. The name of the glossary to delete.

Authorization requires the following IAM permission on the specified resource name:

  • cloudtranslate.glossaries.delete

DeleteGlossaryResponse

Stored in the google.longrunning.Operation.response field returned by DeleteGlossary.

Fields
name

string

The name of the deleted glossary.

submit_time

Timestamp

The time when the operation was submitted to the server.

end_time

Timestamp

The time when the glossary deletion is finished and google.longrunning.Operation.done is set to true.

DeleteModelMetadata

Metadata of delete model operation.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

DeleteModelRequest

Request message for DeleteModel.

Fields
name

string

Required. The name of the model to delete.

DetectLanguageRequest

The request message for language detection.

Fields
parent

string

Required. Project or location to make a call. Must refer to a caller's project.

Format: projects/{project-number-or-id}/locations/{location-id} or projects/{project-number-or-id}.

For global calls, use projects/{project-number-or-id}/locations/global or projects/{project-number-or-id}.

Only models within the same region (that have the same location-id) can be used. Otherwise, an INVALID_ARGUMENT (400) error is returned.

model

string

Optional. The language detection model to be used.

Format: projects/{project-number-or-id}/locations/{location-id}/models/language-detection/{model-id}

Only one language detection model is currently supported: projects/{project-number-or-id}/locations/{location-id}/models/language-detection/default.

If not specified, the default model is used.

Authorization requires the following IAM permission on the specified resource model:

  • cloudtranslate.languageDetectionModels.predict
mime_type

string

Optional. The format of the source text, for example, "text/html", "text/plain". If left blank, the MIME type defaults to "text/html".

labels

map<string, string>

Optional. The labels with user-defined metadata for the request.

Label keys and values can be no longer than 63 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter.

See https://cloud.google.com/translate/docs/advanced/labels for more information.

Union field source. Required. The source of the document from which to detect the language. source can be only one of the following:
content

string

The content of the input stored as a string.

DetectLanguageResponse

The response message for language detection.

Fields
languages[]

DetectedLanguage

The most probable language detected by the Translation API. For each request, the Translation API will always return only one result.

DetectedLanguage

The response message for language detection.

Fields
language_code

string

The ISO-639 language code of the source content in the request, detected automatically.

confidence

float

The confidence of the detection result for this language.
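A caller will often want to act on a detection only when the confidence is high enough. The Python sketch below models DetectedLanguage as a local dataclass with the two fields above; the function name and the 0.5 threshold are hypothetical choices, and this reference does not state the range of confidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetectedLanguage:
    language_code: str
    confidence: float


def confident_language(
    languages: list[DetectedLanguage], threshold: float = 0.5
) -> Optional[str]:
    """Return the most confident language code, or None if below threshold."""
    if not languages:
        return None
    best = max(languages, key=lambda lang: lang.confidence)
    return best.language_code if best.confidence >= threshold else None
```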

DocumentInputConfig

A document translation request input config.

Fields
mime_type

string

Specifies the input document's mime_type.

If not specified, it will be determined using the file extension for gcs_source provided files. For a file provided through bytes content, the mime_type must be provided. Currently supported MIME types are:

  • application/pdf
  • application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • application/vnd.openxmlformats-officedocument.presentationml.presentation
  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Union field source. Specifies the source for the document's content. The input file size should be <= 20MB for:

  • application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • application/vnd.openxmlformats-officedocument.presentationml.presentation
  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

The input file size should be <= 20MB and the maximum page limit is 20 for:

  • application/pdf

source can be only one of the following:
content

bytes

Document's content represented as a stream of bytes.

gcs_source

GcsSource

Google Cloud Storage location. This must be a single file. For example: gs://example_bucket/example_file.pdf
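The MIME type rule above (explicit for inline bytes, inferred from the extension for Cloud Storage files) can be sketched in Python as follows. The extension-to-type table and the case-insensitive matching are assumptions about how inference might work, and the function name is hypothetical.

```python
import posixpath

# Assumption: the usual file extensions for the supported MIME types.
MIME_BY_EXTENSION = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def resolve_mime_type(mime_type=None, gcs_uri=None) -> str:
    """Return the MIME type to use, or raise ValueError if it cannot be determined."""
    if mime_type:
        if mime_type not in MIME_BY_EXTENSION.values():
            raise ValueError(f"unsupported MIME type: {mime_type}")
        return mime_type
    if gcs_uri is None:
        raise ValueError("mime_type is required for inline bytes content")
    extension = posixpath.splitext(gcs_uri)[1].lower()
    if extension not in MIME_BY_EXTENSION:
        raise ValueError(f"cannot infer MIME type from {gcs_uri!r}")
    return MIME_BY_EXTENSION[extension]
```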

DocumentOutputConfig

A document translation request output config.

Fields
mime_type

string

Optional. Specifies the translated document's mime_type. If not specified, the translated file's MIME type will be the same as the input file's MIME type. Currently, the output MIME type must be the same as the input MIME type. Supported MIME types:

  • application/pdf
  • application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • application/vnd.openxmlformats-officedocument.presentationml.presentation
  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Union field destination. A URI destination for the translated document. It is optional to provide a destination. If provided the results from TranslateDocument will be stored in the destination. Whether a destination is provided or not, the translated documents will be returned within TranslateDocumentResponse.document_translation and TranslateDocumentResponse.glossary_document_translation. destination can be only one of the following:
gcs_destination

GcsDestination

Optional. Google Cloud Storage destination for the translation output, e.g., gs://my_bucket/my_directory/.

The destination directory provided does not have to be empty, but the bucket must exist. If a file with the same name as the output file already exists in the destination an error will be returned.

For a document provided through DocumentInputConfig.content, the output file will have the name "output_[trg]_translations.[ext]", where:

  • [trg] corresponds to the translated file's language code.
  • [ext] corresponds to the translated file's extension according to its MIME type.

For a document provided through DocumentInputConfig.gcs_source, the output file will have a name according to its URI. For example, an input file with URI gs://a/b/c.[extension] stored in a gcs_destination bucket with name "my_bucket" will have the output URI gs://my_bucket/a_b_c_[trg]_translations.[ext], where:

  • [trg] corresponds to the translated file's language code.
  • [ext] corresponds to the translated file's extension according to its MIME type.

If the document was directly provided through the request, then the output document will have the format gs://my_bucket/translated_document_[trg]_translations.[ext], where:

  • [trg] corresponds to the translated file's language code.
  • [ext] corresponds to the translated file's extension according to its MIME type.

If a glossary was provided, then the output URI for the glossary translation will be equal to the default output URI but have glossary_translations instead of translations. For the previous example, its glossary URI would be: gs://my_bucket/a_b_c_[trg]_glossary_translations.[ext].

Thus the max number of output files will be 2 (Translated document, Glossary translated document).

Callers should expect no partial outputs. If there is any error during document translation, no output will be stored in the Cloud Storage bucket.
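The naming rule for Cloud Storage inputs can be reproduced locally to predict where results will land. The Python sketch below follows the gs://a/b/c.[extension] example above; it reuses the input extension for [ext] on the assumption that the output MIME type equals the input MIME type, and the function name is hypothetical.

```python
import posixpath


def expected_output_uris(
    input_uri: str, bucket: str, target_language: str, with_glossary: bool = False
) -> list[str]:
    """Predict output URIs for an input such as gs://a/b/c.pdf."""
    if not input_uri.startswith("gs://"):
        raise ValueError("input_uri must start with gs://")
    path = input_uri[len("gs://"):]        # e.g. "a/b/c.pdf"
    stem, ext = posixpath.splitext(path)   # e.g. ("a/b/c", ".pdf")
    flat = stem.replace("/", "_")          # e.g. "a_b_c"
    base = f"gs://{bucket}/{flat}_{target_language}"
    uris = [f"{base}_translations{ext}"]
    if with_glossary:
        # The glossary output uses glossary_translations instead of translations.
        uris.append(f"{base}_glossary_translations{ext}")
    return uris
```

Because an existing file with the same name causes an error, a caller could use these predicted URIs to check the destination before sending the request.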

DocumentTranslation

A translated document message.

Fields
byte_stream_outputs[]

bytes

The array of translated documents. It is expected to be size 1 for now. We may produce multiple translated documents in the future for other types of file formats.

mime_type

string

The translated document's mime type.

detected_language_code

string

The detected language for the input document. If the user did not provide the source language for the input document, this field will have the language code automatically detected. If the source language was passed, auto-detection of the language does not occur and this field is empty.

Example

A sentence pair.

Fields
name

string

Output only. The resource name of the example, in form of projects/{project-number-or-id}/locations/{location_id}/datasets/{dataset_id}/examples/{example_id}

source_text

string

Sentence in source language.

target_text

string

Sentence in target language.

usage

string

Output only. Usage of the sentence pair. Options are TRAIN|VALIDATION|TEST.

ExportDataMetadata

Metadata of export data operation.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

ExportDataRequest

Request message for ExportData.

Fields
dataset

string

Required. Name of the dataset. In form of projects/{project-number-or-id}/locations/{location-id}/datasets/{dataset-id}

output_config

DatasetOutputConfig

Required. The config for the output content.

FileInputSource

An inlined file.

Fields
mime_type

string

Required. The file's mime type.

content

bytes

Required. The file's byte contents.

display_name

string

Required. The file's display name.

GcsDestination

The Google Cloud Storage location for the output content.

Fields
output_uri_prefix

string

Required. The bucket used in 'output_uri_prefix' must exist and there must be no files under 'output_uri_prefix'. 'output_uri_prefix' must end with "/" and start with "gs://". One 'output_uri_prefix' can only be used by one batch translation job at a time. Otherwise an INVALID_ARGUMENT (400) error is returned.
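The syntactic part of this requirement can be checked without contacting Cloud Storage. A small Python sketch, with a hypothetical function name; it does not verify that the bucket exists, that the prefix is empty, or that no other batch job is using it:

```python
def output_uri_prefix_problems(prefix: str) -> list[str]:
    """Return syntax problems with an output_uri_prefix (empty list if none)."""
    problems = []
    if not prefix.startswith("gs://"):
        problems.append("must start with gs://")
    elif not prefix[len("gs://"):].split("/", 1)[0]:
        problems.append("bucket name is missing")
    if not prefix.endswith("/"):
        problems.append("must end with /")
    return problems
```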

GcsInputSource

The Google Cloud Storage location for the input content.

Fields
input_uri

string

Required. Source data URI. For example, gs://my_bucket/my_object.

GcsOutputDestination

The Google Cloud Storage location for the output content.

Fields
output_uri_prefix

string

Required. Google Cloud Storage URI to output directory. For example, gs://bucket/directory. The requesting user must have write permission to the bucket. The directory will be created if it doesn't exist.

GcsSource

The Google Cloud Storage location for the input content.

Fields
input_uri

string

Required. Source data URI. For example, gs://my_bucket/my_object.

GetAdaptiveMtDatasetRequest

Request message for getting an Adaptive MT dataset.

Fields
name

string

Required. Name of the dataset. In the form of projects/{project-number-or-id}/locations/{location-id}/adaptiveMtDatasets/{adaptive-mt-dataset-id}

GetAdaptiveMtFileRequest

The request for getting an AdaptiveMtFile.

Fields
name

string

Required. The resource name of the file, in form of projects/{project-number-or-id}/locations/{location_id}/adaptiveMtDatasets/{dataset}/adaptiveMtFiles/{file}

GetDatasetRequest

Request message for GetDataset.

Fields
name

string

Required. The resource name of the dataset to retrieve.

GetGlossaryEntryRequest

Request message for GetGlossaryEntry.

Fields
name

string

Required. The resource name of the glossary entry to get.

GetGlossaryRequest

Request message for GetGlossary.

Fields
name

string

Required. The name of the glossary to retrieve.

Authorization requires the following IAM permission on the specified resource name:

  • cloudtranslate.glossaries.get

GetModelRequest

Request message for GetModel.

Fields
name

string

Required. The resource name of the model to retrieve.

GetSupportedLanguagesRequest

The request message for discovering supported languages.

Fields
parent

string

Required. Project or location to make a call. Must refer to a caller's project.

Format: projects/{project-number-or-id} or projects/{project-number-or-id}/locations/{location-id}.

For global calls, use projects/{project-number-or-id}/locations/global or projects/{project-number-or-id}.

Non-global location is required for AutoML models.

Only models within the same region (that have the same location-id) can be used. Otherwise, an INVALID_ARGUMENT (400) error is returned.

display_language_code

string

Optional. The language to use to return localized, human readable names of supported languages. If missing, then display names are not returned in a response.

model

string

Optional. Get supported languages of this model.

The format depends on model type:

  • AutoML Translation models: projects/{project-number-or-id}/locations/{location-id}/models/{model-id}

  • General (built-in) models: projects/{project-number-or-id}/locations/{location-id}/models/general/nmt

Returns languages supported by the specified model. If missing, the supported languages of the Google general NMT model are returned.

Authorization requires one or more of the following IAM permissions on the specified resource model:

  • cloudtranslate.generalModels.get
  • automl.models.get

Glossary

Represents a glossary built from user-provided data.

Fields
name

string

Required. The resource name of the glossary. Glossary names have the form projects/{project-number-or-id}/locations/{location-id}/glossaries/{glossary-id}.

input_config

GlossaryInputConfig

Required. Provides examples to build the glossary from. Total glossary must not exceed 10M Unicode codepoints.

entry_count

int32

Output only. The number of entries defined in the glossary.

submit_time

Timestamp

Output only. When CreateGlossary was called.

end_time

Timestamp

Output only. When the glossary creation was finished.

display_name

string

Optional. The display name of the glossary.

Union field languages. Languages supported by the glossary. languages can be only one of the following:
language_pair

LanguageCodePair

Used with unidirectional glossaries.

language_codes_set

LanguageCodesSet

Used with equivalent term set glossaries.

LanguageCodePair

Used with unidirectional glossaries.

Fields
source_language_code

string

Required. The ISO-639 language code of the input text, for example, "en-US". Expected to be an exact match for GlossaryTerm.language_code.

target_language_code

string

Required. The ISO-639 language code for translation output, for example, "zh-CN". Expected to be an exact match for GlossaryTerm.language_code.

LanguageCodesSet

Used with equivalent term set glossaries.

Fields
language_codes[]

string

The ISO-639 language code(s) for terms defined in the glossary. All entries are unique. The list contains at least two entries. Expected to be an exact match for GlossaryTerm.language_code.

GlossaryEntry

Represents a single entry in a glossary.

Fields
name

string

Required. The resource name of the entry. Format: "projects/*/locations/*/glossaries/*/glossaryEntries/*"

description

string

Describes the glossary entry.

Union field data. The different data for the glossary types (Unidirectional, Equivalent term sets). data can be only one of the following:
terms_pair

GlossaryTermsPair

Used for a unidirectional glossary.

terms_set

GlossaryTermsSet

Used for an equivalent term sets glossary.

GlossaryTermsPair

Represents a single entry for a unidirectional glossary.

Fields
source_term

GlossaryTerm

The source term is the term that will be matched in the text.

target_term

GlossaryTerm

The term that will replace the matched source term.

GlossaryTermsSet

Represents a single entry for an equivalent term set glossary. This is used for equivalent term sets where each term can be replaced by the other terms in the set.

Fields
terms[]

GlossaryTerm

Each term in the set represents a term that can be replaced by the other terms.

GlossaryInputConfig

Input configuration for glossaries.

Fields
Union field source. Required. Specify the input. source can be only one of the following:
gcs_source

GcsSource

Required. Google Cloud Storage location of glossary data. File format is determined based on the filename extension. The API returns google.rpc.Code.INVALID_ARGUMENT for unsupported URIs and file formats. Wildcards are not allowed. This must be a single file in one of the following formats:

For unidirectional glossaries:

  • TSV/CSV (.tsv/.csv): Two column file, tab- or comma-separated. The first column is source text. The second column is target text. No headers in this file. The first row contains data and not column names.

  • TMX (.tmx): TMX file with parallel data defining source/target term pairs.

For equivalent term sets glossaries:

  • CSV (.csv): Multi-column CSV file defining equivalent glossary terms in multiple languages. See the glossaries documentation for more information.
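For the unidirectional TSV format, a file can be produced with Python's standard csv module. The sketch below writes two columns per row and no header row, as required above; the function name and the sample terms are hypothetical.

```python
import csv
import io


def unidirectional_glossary_tsv(pairs: list[tuple[str, str]]) -> str:
    """Serialize (source, target) term pairs as headerless two-column TSV."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for source_text, target_text in pairs:
        writer.writerow([source_text, target_text])
    return buffer.getvalue()


tsv_text = unidirectional_glossary_tsv([("account", "compte"), ("invoice", "facture")])
```

The resulting text would then be uploaded to Cloud Storage as a single .tsv file and referenced through gcs_source.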

GlossaryTerm

Represents a single glossary term.

Fields
language_code

string

The language for this glossary term.

text

string

The text for the glossary term.

ImportAdaptiveMtFileRequest

The request for importing an AdaptiveMt file along with its sentences.

Fields
parent

string

Required. The resource name of the file, in form of projects/{project-number-or-id}/locations/{location_id}/adaptiveMtDatasets/{dataset}

Union field source. The source for the document. source can be only one of the following:
file_input_source

FileInputSource

Inline file source.

gcs_input_source

GcsInputSource

Google Cloud Storage file source.

ImportAdaptiveMtFileResponse

The response for importing an AdaptiveMtFile.

Fields
adaptive_mt_file

AdaptiveMtFile

Output only. The Adaptive MT file that was imported.

ImportDataMetadata

Metadata of import data operation.

Fields
state

OperationState

The current state of the operation.

create_time

Timestamp

The creation time of the operation.

update_time

Timestamp

The last update time of the operation.

error

Status

Only populated when operation doesn't succeed.

ImportDataRequest

Request message for ImportData.

Fields
dataset

string

Required. Name of the dataset. In form of projects/{project-number-or-id}/locations/{location-id}/datasets/{dataset-id}

input_config

DatasetInputConfig

Required. The config for the input content.

InputConfig

Input configuration for BatchTranslateText request.

Fields
mime_type

string

Optional. Can be "text/plain" or "text/html". For .tsv, "text/html" is used if mime_type is missing. For .html, this field must be "text/html" or empty. For .txt, this field must be "text/plain" or empty.

Union field source. Required. Specify the input. source can be only one of the following:
gcs_source

GcsSource

Required. Google Cloud Storage location for the source input. This can be a single file (for example, gs://translation-test/input.tsv) or a wildcard (for example, gs://translation-test/*). If a file extension is .tsv, it can contain either one or two columns. The first column (optional) is the id of the text request. If the first column is missing, we use the row number (0-based) from the input file as the ID in the output file. The second column is the actual text to be translated. We recommend each row be <= 10K Unicode codepoints, otherwise an error might be returned. Note that the input tsv must be RFC 4180 compliant.

You can use https://github.com/Clever/csvlint to check for potential formatting errors in your TSV file, for example: csvlint --delimiter='\t' your_input_file.tsv

The other supported file extensions are .txt and .html, which are treated as a single large chunk of text.
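The one-column and two-column TSV layouts described above can be read locally as in the following Python sketch, which assigns the 0-based row number as the ID when the ID column is absent. This mirrors the documented behavior for illustration only; the function name is hypothetical, and rows with any other column count (including blank lines) are rejected.

```python
import csv
import io


def read_batch_tsv(text: str) -> list[tuple[str, str]]:
    """Return (id, text) pairs from one- or two-column TSV input."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for row_number, row in enumerate(reader):
        if len(row) == 1:
            # No ID column: use the 0-based row number as the ID.
            rows.append((str(row_number), row[0]))
        elif len(row) == 2:
            rows.append((row[0], row[1]))
        else:
            raise ValueError(f"row {row_number} has {len(row)} columns")
    return rows
```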

ListAdaptiveMtDatasetsRequest

Request message for listing all Adaptive MT datasets that the requestor has access to.

Fields
parent

string

Required. The resource name of the project from which to list the Adaptive MT datasets, in the form projects/{project-number-or-id}/locations/{location-id}

page_size

int32

Optional. Requested page size. The server may return fewer results than requested. If unspecified, the server picks an appropriate default.

page_token

string

Optional. A token identifying a page of results the server should return. Typically, this is the value of ListAdaptiveMtDatasetsResponse.next_page_token returned from the previous call to ListAdaptiveMtDatasets method. The first page is returned if page_token is empty or missing.

filter

string

Optional. An expression for filtering the results of the request. Filter is not supported yet.

ListAdaptiveMtDatasetsResponse

A list of AdaptiveMtDatasets.

Fields
adaptive_mt_datasets[]

AdaptiveMtDataset

Output only. A list of Adaptive MT datasets.

next_page_token

string

Optional. A token to retrieve a page of results. Pass this value in the ListAdaptiveMtDatasetsRequest.page_token field in the subsequent call to ListAdaptiveMtDatasets method to retrieve the next page of results.
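The page_token and next_page_token fields are typically consumed in a loop. The Python sketch below shows that loop against a stand-in fetch_page callable; fetch_page, fake_fetch, and the sample data are hypothetical, and the sketch assumes that an empty next_page_token means there are no further pages.

```python
from typing import Callable, Iterator


def iterate_all(fetch_page: Callable[[str], tuple[list, str]]) -> Iterator:
    """Yield every item by following next_page_token until it is empty."""
    page_token = ""  # an empty token requests the first page
    while True:
        items, next_page_token = fetch_page(page_token)
        yield from items
        if not next_page_token:
            return
        page_token = next_page_token


# Stand-in for a real list call: maps a page token to (items, next token).
_FAKE_PAGES = {"": (["ds-1", "ds-2"], "t1"), "t1": (["ds-3"], "")}


def fake_fetch(page_token: str) -> tuple[list, str]:
    return _FAKE_PAGES[page_token]


all_datasets = list(iterate_all(fake_fetch))
```

The same loop shape applies to the other List requests in this package that expose page_token and next_page_token.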

ListAdaptiveMtFilesRequest

The request to list all AdaptiveMt files under a given dataset.

Fields
parent

string

Required. The resource name of the project from which to list the Adaptive MT files, in the form projects/{project}/locations/{location}/adaptiveMtDatasets/{dataset}

page_size

int32

Optional.

page_token

string

Optional. A token identifying a page of results the server should return. Typically, this is the value of ListAdaptiveMtFilesResponse.next_page_token returned from the previous call to ListAdaptiveMtFiles method. The first page is returned if page_token is empty or missing.

ListAdaptiveMtFilesResponse

The response for listing all AdaptiveMt files under a given dataset.

Fields
adaptive_mt_files[]

AdaptiveMtFile

Output only. The Adaptive MT files.

next_page_token

string

Optional. A token to retrieve a page of results. Pass this value in the ListAdaptiveMtFilesRequest.page_token field in the subsequent call to ListAdaptiveMtFiles method to retrieve the next page of results.

ListAdaptiveMtSentencesRequest

The request for listing Adaptive MT sentences from a Dataset/File.

Fields
parent

string

Required. The resource name of the project from which to list the Adaptive MT files.

The following format lists all sentences under a file: projects/{project}/locations/{location}/adaptiveMtDatasets/{dataset}/adaptiveMtFiles/{file}

The following format lists all sentences within a dataset: projects/{project}/locations/{location}/adaptiveMtDatasets/{dataset}

page_size

int32

page_token

string

A token identifying a page of results the server should return. Typically, this is the value of ListAdaptiveMtSentencesResponse.next_page_token returned from the previous call to ListAdaptiveMtSentences method. The first page is returned if page_token is empty or missing.

ListAdaptiveMtSentencesResponse

List AdaptiveMt sentences response.

Fields
adaptive_mt_sentences[]

AdaptiveMtSentence

Output only. The list of AdaptiveMtSentences.

next_page_token

string

Optional.

ListDatasetsRequest

Request message for ListDatasets.

Fields
parent

string

Required. Name of the parent project. In form of projects/{project-number-or-id}/locations/{location-id}

page_size

int32

Optional. Requested page size. The server can return fewer results than requested.

page_token

string

Optional. A token identifying a page of results for the server to return. Typically obtained from next_page_token field in the response of a ListDatasets call.

ListDatasetsResponse

Response message for ListDatasets.

Fields
datasets[]

Dataset

The datasets read.

next_page_token

string

A token to retrieve next page of results. Pass this token to the page_token field in the ListDatasetsRequest to obtain the corresponding page.

ListExamplesRequest

Request message for ListExamples.

Fields
parent

string

Required. Name of the parent dataset. In form of projects/{project-number-or-id}/locations/{location-id}/datasets/{dataset-id}

filter

string

Optional. An expression for filtering the examples that will be returned. Example filter: usage=TRAIN

page_size

int32

Optional. Requested page size. The server can return fewer results than requested.

page_token

string

Optional. A token identifying a page of results for the server to return. Typically obtained from next_page_token field in the response of a ListExamples call.

ListExamplesResponse

Response message for ListExamples.

Fields
examples[]

Example

The sentence pairs.

next_page_token

string

A token to retrieve next page of results. Pass this token to the page_token field in the ListExamplesRequest to obtain the corresponding page.

ListGlossariesRequest

Request message for ListGlossaries.

Fields
parent

string

Required. The name of the project from which to list all of the glossaries.

Authorization requires the following IAM permission on the specified resource parent:

  • cloudtranslate.glossaries.list
page_size

int32

Optional. Requested page size. The server may return fewer glossaries than requested. If unspecified, the server picks an appropriate default.

page_token

string

Optional. A token identifying a page of results the server should return. Typically, this is the value of ListGlossariesResponse.next_page_token returned from the previous call to ListGlossaries method. The first page is returned if page_token is empty or missing.

filter

string

Optional. Filter specifying constraints of a list operation. Specify the constraint by the format of "key=value", where key must be "src" or "tgt", and the value must be a valid language code. For multiple restrictions, concatenate them by "AND" (uppercase only), such as: "src=en-US AND tgt=zh-CN". Notice that the exact match is used here, which means using 'en-US' and 'en' can lead to different results, which depends on the language code you used when you create the glossary. For the unidirectional glossaries, the "src" and "tgt" add restrictions on the source and target language code separately. For the equivalent term set glossaries, the "src" and/or "tgt" add restrictions on the term set. For example: "src=en-US AND tgt=zh-CN" will only pick the unidirectional glossaries which exactly match the source language code as "en-US" and the target language code "zh-CN", but all equivalent term set glossaries which contain "en-US" and "zh-CN" in their language set will be picked. If missing, no filtering is performed.
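The filter syntax and its matching semantics can be sketched in Python as follows. glossary_filter builds the filter string; glossary_matches is a local illustration of the rules described above (exact match for unidirectional glossaries, containment for equivalent term sets), not the server's implementation. Both function names and the dict layout used for glossaries are hypothetical.

```python
from typing import Optional


def glossary_filter(src: Optional[str] = None, tgt: Optional[str] = None) -> str:
    """Build a filter such as 'src=en-US AND tgt=zh-CN'; empty means no filtering."""
    parts = []
    if src:
        parts.append(f"src={src}")
    if tgt:
        parts.append(f"tgt={tgt}")
    return " AND ".join(parts)


def glossary_matches(glossary: dict, src: Optional[str] = None,
                     tgt: Optional[str] = None) -> bool:
    """Apply the documented matching rules to a locally modeled glossary."""
    if "language_pair" in glossary:
        pair_src, pair_tgt = glossary["language_pair"]
        # Unidirectional: src and tgt must match exactly ('en' != 'en-US').
        return (src is None or src == pair_src) and (tgt is None or tgt == pair_tgt)
    # Equivalent term set: every requested code must be in the language set.
    codes = set(glossary["language_codes_set"])
    return all(code in codes for code in (src, tgt) if code is not None)
```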

ListGlossariesResponse

Response message for ListGlossaries.

Fields
glossaries[]

Glossary

The list of glossaries for a project.

next_page_token

string

A token to retrieve a page of results. Pass this value in the [ListGlossariesRequest.page_token] field in the subsequent call to the ListGlossaries method to retrieve the next page of results.
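The page_token and next_page_token fields can be combined into a simple paging loop. The sketch below uses an in-memory stand-in (fake_list_page) instead of a real ListGlossaries call, and represents responses as plain dicts; both are assumptions made for illustration. It also assumes that an empty next_page_token means there are no further pages, which this reference does not state explicitly.

```python
from typing import Callable, Dict, Iterator

# Hypothetical stand-in for a ListGlossaries call: takes a page token and
# returns a dict shaped like ListGlossariesResponse.
ListPage = Callable[[str], Dict[str, object]]


def iterate_glossaries(list_page: ListPage) -> Iterator[object]:
    """Yield every glossary by following next_page_token until it is empty."""
    page_token = ""  # an empty token requests the first page
    while True:
        response = list_page(page_token)
        yield from response.get("glossaries", [])
        page_token = response.get("next_page_token", "")
        if not page_token:  # assumption: empty token means no more pages
            break


# In-memory fake data used only to exercise the loop.
_PAGES = {
    "": {"glossaries": ["g1", "g2"], "next_page_token": "t1"},
    "t1": {"glossaries": ["g3"], "next_page_token": ""},
}


def fake_list_page(page_token: str) -> Dict[str, object]:
    return _PAGES[page_token]


print(list(iterate_glossaries(fake_list_page)))
```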

ListGlossaryEntriesRequest

Request message for ListGlossaryEntries.

Fields
parent

string

Required. The parent glossary resource name for listing the glossary's entries.

page_size

int32

Optional. Requested page size. The server may return fewer glossary entries than requested. If unspecified, the server picks an appropriate default.

page_token

string

Optional. A token identifying a page of results the server should return. Typically, this is the value of [ListGlossaryEntriesResponse.next_page_token] returned from the previous call. The first page is returned if page_token is empty or missing.

ListGlossaryEntriesResponse

Response message for ListGlossaryEntries.

Fields
glossary_entries[]

GlossaryEntry

Optional. The glossary entries.

next_page_token

string

Optional. A token to retrieve a page of results. Pass this value in the [ListGlossaryEntriesRequest.page_token] field in the subsequent calls.

ListModelsRequest

Request message for ListModels.

Fields
parent

string

Required. Name of the parent project, in the form projects/{project-number-or-id}/locations/{location-id}

filter

string

Optional. An expression for filtering the models that will be returned. Supported filter: dataset_id=${dataset_id}

page_size

int32

Optional. Requested page size. The server can return fewer results than requested.

page_token

string

Optional. A token identifying a page of results for the server to return. Typically obtained from the next_page_token field in the response of a ListModels call.

ListModelsResponse

Response message for ListModels.

Fields
models[]

Model

The models read.

next_page_token

string

A token to retrieve the next page of results. Pass this token to the page_token field in the ListModelsRequest to obtain the corresponding page.

Model

A trained translation model.

Fields
name

string

The resource name of the model, in form of projects/{project-number-or-id}/locations/{location_id}/models/{model_id}

display_name

string

The name of the model to show in the interface. The name can be up to 32 characters long and can consist only of ASCII Latin letters A-Z and a-z, underscores (_), and ASCII digits 0-9.

dataset

string

The dataset from which the model is trained, in form of projects/{project-number-or-id}/locations/{location_id}/datasets/{dataset_id}

source_language_code

string

Output only. The BCP-47 language code of the source language.

target_language_code

string

Output only. The BCP-47 language code of the target language.

train_example_count

int32

Output only. Number of examples (sentence pairs) used to train the model.

validate_example_count

int32

Output only. Number of examples (sentence pairs) used to validate the model.

test_example_count

int32

Output only. Number of examples (sentence pairs) used to test the model.

create_time

Timestamp

Output only. Timestamp when the model resource was created, which is also when the training started.

update_time

Timestamp

Output only. Timestamp when this model was last updated.

OperationState

Possible states of long running operations.

Enums
OPERATION_STATE_UNSPECIFIED Invalid.
OPERATION_STATE_RUNNING Request is being processed.
OPERATION_STATE_SUCCEEDED The operation was successful.
OPERATION_STATE_FAILED Failed to process operation.
OPERATION_STATE_CANCELLING Request is in the process of being canceled after caller invoked longrunning.Operations.CancelOperation on the request id.
OPERATION_STATE_CANCELLED The operation request was successfully canceled.
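
When polling, it helps to know which states can still change. The sketch below mirrors the enum names above and treats SUCCEEDED, FAILED, and CANCELLED as finished. That grouping is an interpretation of the descriptions, not something this reference states, and the Python class is a hypothetical local mirror rather than the generated client type.

```python
from enum import Enum


class OperationState(Enum):
    UNSPECIFIED = "OPERATION_STATE_UNSPECIFIED"
    RUNNING = "OPERATION_STATE_RUNNING"
    SUCCEEDED = "OPERATION_STATE_SUCCEEDED"
    FAILED = "OPERATION_STATE_FAILED"
    CANCELLING = "OPERATION_STATE_CANCELLING"
    CANCELLED = "OPERATION_STATE_CANCELLED"


# Assumption: these three states are final; RUNNING and CANCELLING may still change.
FINISHED_STATES = frozenset(
    {OperationState.SUCCEEDED, OperationState.FAILED, OperationState.CANCELLED}
)


def is_finished(state_name: str) -> bool:
    """Return True if the named state is one we treat as final."""
    return OperationState(state_name) in FINISHED_STATES


print(is_finished("OPERATION_STATE_CANCELLING"))
```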

OutputConfig

Output configuration for BatchTranslateText request.

Fields
Union field destination. Required. The destination of output. destination can be only one of the following:
gcs_destination

GcsDestination

Google Cloud Storage destination for output content. For every single input file (for example, gs://a/b/c.[extension]), we generate at most 2 * n output files, where n is the number of target_language_codes in the BatchTranslateTextRequest.

Output files (tsv) generated are compliant with RFC 4180 except that record delimiters are '\n' instead of '\r\n'. We don't provide any way to change record delimiters.

While the input files are being processed, we write/update an index file 'index.csv' under 'output_uri_prefix' (for example, gs://translation-test/index.csv). The index file is generated/updated as new files are being translated. The format is:

input_file,target_language_code,translations_file,errors_file,glossary_translations_file,glossary_errors_file

Each column has the following meaning:

  • input_file: one file we matched using gcs_source.input_uri.
  • target_language_code: provided in the request.
  • translations_file: contains the translations (details provided below).
  • errors_file: contains the errors during processing of the file (details below).
  • glossary_translations_file and glossary_errors_file: always empty strings if the input_file is tsv. They could also be empty if we have no content to output.

Both translations_file and errors_file could be empty strings if we have no content to output.

Once a row is present in index.csv, the input/output matching never changes. Callers should also expect that all the content in input_file has been processed and is ready to be consumed (that is, no partial output file is written).
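
A consumer can read index.csv with a standard CSV parser. The following is a minimal sketch that assumes the file has no header row and that each row carries the six columns in the order listed above; the sample row is made up for illustration.

```python
import csv
import io
from typing import Dict, List

INDEX_COLUMNS = [
    "input_file",
    "target_language_code",
    "translations_file",
    "errors_file",
    "glossary_translations_file",
    "glossary_errors_file",
]


def parse_index(text: str) -> List[Dict[str, str]]:
    """Parse the contents of index.csv into one dict per row."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        # Pad short rows so missing trailing columns read as empty strings.
        padded = record + [""] * (len(INDEX_COLUMNS) - len(record))
        rows.append(dict(zip(INDEX_COLUMNS, padded)))
    return rows


sample = "gs://a/b/c.tsv,fr,gs://translation_test/a_b_c_fr_translations.tsv,,,\n"
for row in parse_index(sample):
    print(row["input_file"], "->", row["translations_file"] or "(no translations)")
```

Because a row's input/output matching never changes once written, rows that have already been handled can be skipped safely on later reads of the file.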

Because index.csv is updated continuously during processing, make sure that no custom retention policy is applied to the output bucket that could prevent the file from being updated. (https://cloud.google.com/storage/docs/bucket-lock#retention-policy)

The format of translations_file (for target language code 'trg') is: gs://translation_test/a_b_c_'trg'_translations.[extension]

If the input file extension is tsv, the output has the following columns:

  • Column 1: ID of the request provided in the input. If it's not provided in the input, the input row number is used (0-based).
  • Column 2: Source sentence.
  • Column 3: Translation without applying a glossary. Empty string if there is an error.
  • Column 4 (only present if a glossary is provided in the request): Translation after applying the glossary. Empty string if there is an error applying the glossary. Could be the same string as column 3 if there is no glossary applied.
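
The column layout above can be read with a tab-delimited CSV parser. In this sketch, the choice to fall back to column 3 when column 4 is absent or empty is a design decision of the example, not API behavior, and the sample rows are made up.

```python
import csv
import io
from typing import List, Tuple


def read_translations(tsv_text: str) -> List[Tuple[str, str, str]]:
    """Return (id, source, translation) tuples from a translations_file in tsv form."""
    results = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 3:
            continue  # not a complete record
        row_id, source, plain = row[0], row[1], row[2]
        # Column 4 exists only when a glossary was provided in the request.
        with_glossary = row[3] if len(row) > 3 else ""
        # Example policy: prefer the glossary translation when it is non-empty.
        results.append((row_id, source, with_glossary or plain))
    return results


sample = "0\tHello\tBonjour\tSalut\n1\tBye\tAu revoir\t\n"
print(read_translations(sample))
```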

If the input file extension is txt or html, the translation is written directly to the output file. If a glossary is requested, a separate glossary_translations_file is written, with the format gs://translation_test/a_b_c_'trg'_glossary_translations.[extension]

The format of errors_file (for target language code 'trg') is: gs://translation_test/a_b_c_'trg'_errors.[extension]

If the input file extension is tsv, errors_file contains the following:

  • Column 1: ID of the request provided in the input. If it's not provided in the input, the input row number is used (0-based).
  • Column 2: Source sentence.
  • Column 3: Error detail for the translation. Could be empty.
  • Column 4 (only present if a glossary is provided in the request): Error when applying the glossary.

If the input file extension is txt or html, a glossary_error_file that contains error details is generated. glossary_error_file has the format gs://translation_test/a_b_c_'trg'_glossary_errors.[extension]
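
The naming patterns above can be used to predict where output for a given input file is likely to appear. This sketch infers the rule from the single example in this section (gs://a/b/c.[extension] becoming a_b_c_'trg'_translations.[extension]) and treats the quotes around 'trg' as placeholder markers; both are assumptions. The index file remains the authoritative source for actual output locations, and the function name is hypothetical.

```python
import posixpath
from typing import Dict


def expected_output_files(input_uri: str, output_uri_prefix: str, target: str) -> Dict[str, str]:
    """Guess output object names for one input file and one target language."""
    if not input_uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    path = input_uri[len("gs://"):]
    stem, extension = posixpath.splitext(path)  # extension keeps its leading dot
    if not extension:
        raise ValueError("input file has no extension")
    # Assumption: bucket and path segments are joined with underscores.
    flat = stem.replace("/", "_")
    prefix = output_uri_prefix if output_uri_prefix.endswith("/") else output_uri_prefix + "/"
    kinds = ["translations", "errors", "glossary_translations", "glossary_errors"]
    return {kind: f"{prefix}{flat}_{target}_{kind}{extension}" for kind in kinds}


names = expected_output_files("gs://a/b/c.tsv", "gs://translation_test/", "fr")
print(names["translations"])
```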

Romanization

A single romanization response.

Fields
romanized_text

string

Romanized text. If an error occurs during romanization, this field might be excluded from the response.

detected_language_code

string

The ISO-639 language code of source text in the initial request, detected automatically, if no source language was passed within the initial request. If the source language was passed, auto-detection of the language does not occur and this field is empty.

RomanizeTextRequest

The request message for synchronous romanization.

Fields
parent

string

Required. Project or location to make a call. Must refer to a caller's project.

Format: projects/{project-number-or-id}/locations/{location-id} or projects/{project-number-or-id}.

For global calls, use projects/{project-number-or-id}/locations/global or projects/{project-number-or-id}.

contents[]

string

Required. The content of the input in string format.

source_language_code

string

Optional. The ISO-639 language code of the input text if known, for example, "hi" or "zh". If the source language isn't specified, the API attempts to identify the source language automatically and returns the source language for each content in the response.

RomanizeTextResponse

The response message for synchronous romanization.

Fields
romanizations[]

Romanization

Text romanization responses. This field has the same length as contents.

SupportedLanguage

A single supported language response corresponds to information related to one supported language.

Fields
language_code

string

Supported language code, generally consisting of its ISO 639-1 identifier, for example, 'en', 'ja'. In certain cases, ISO-639 codes including language and region identifiers are returned (for example, 'zh-TW' and 'zh-CN').

display_name

string

Human-readable name of the language localized in the display language specified in the request.

support_source

bool

Can be used as a source language.

support_target

bool

Can be used as a target language.

SupportedLanguages

The response message for discovering supported languages.

Fields
languages[]

SupportedLanguage

A list of supported language responses. This list contains an entry for each language the Translation API supports.

TranslateDocumentRequest

A document translation request.

Fields
parent

string

Required. Location to make a regional call.

Format: projects/{project-number-or-id}/locations/{location-id}.

For global calls, use projects/{project-number-or-id}/locations/global or projects/{project-number-or-id}.

Non-global location is required for requests using AutoML models or custom glossaries.

Models and glossaries must be within the same region (have the same location-id), otherwise an INVALID_ARGUMENT (400) error is returned.
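
The same-region requirement can be checked on the client before sending a request. This is a minimal sketch that extracts the location-id from resource names of the form shown in this reference; the helper names and the sample resource names are hypothetical.

```python
import re
from typing import Optional

_LOCATION = re.compile(r"^projects/[^/]+/locations/([^/]+)/")


def location_of(resource_name: str) -> Optional[str]:
    """Return the location-id of a resource name, or None if it has none."""
    match = _LOCATION.match(resource_name)
    return match.group(1) if match else None


def same_location(model: str, glossary: str) -> bool:
    """True when both resource names carry the same location-id."""
    model_location = location_of(model)
    return model_location is not None and model_location == location_of(glossary)


print(same_location(
    "projects/my-project/locations/us-central1/models/my-model",
    "projects/my-project/locations/us-central1/glossaries/my-glossary",
))
```

A mismatch caught this way avoids a round trip that would otherwise end in INVALID_ARGUMENT (400).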

source_language_code

string

Optional. The ISO-639 language code of the input document if known, for example, "en-US" or "sr-Latn". Supported language codes are listed in Language Support. If the source language isn't specified, the API attempts to identify the source language automatically and returns the source language within the response. Source language must be specified if the request contains a glossary or a custom model.

target_language_code

string

Required. The ISO-639 language code to use for translation of the input document, set to one of the language codes listed in Language Support.

document_input_config

DocumentInputConfig

Required. Input configurations.

document_output_config

DocumentOutputConfig

Optional. Output configurations. Defines if the output file should be stored within Cloud Storage as well as the desired output format. If not provided the translated file will only be returned through a byte-stream and its output mime type will be the same as the input file's mime type.

model

string

Optional. The model type requested for this translation.

The format depends on model type:

  • AutoML Translation models: projects/{project-number-or-id}/locations/{location-id}/models/{model-id}

  • General (built-in) models: projects/{project-number-or-id}/locations/{location-id}/models/general/nmt

If not provided, the default Google model (NMT) will be used for translation.

Authorization requires one or more of the following IAM permissions on the specified resource model:

  • cloudtranslate.generalModels.docPredict
  • automl.models.predict
glossary_config

TranslateTextGlossaryConfig

Optional. Glossary to be applied. The glossary must be within the same region (have the same location-id) as the model, otherwise an INVALID_ARGUMENT (400) error is returned.

Authorization requires the following IAM permission on the specified resource glossaryConfig:

  • cloudtranslate.glossaries.docPredict
labels

map<string, string>

Optional. The labels with user-defined metadata for the request.

Label keys and values can be no longer than 63 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter.

See https://cloud.google.com/translate/docs/advanced/labels for more information.
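
The label rules above can be checked locally before a request is sent. The sketch below is one reading of those rules: it counts length in Unicode code points, accepts letters that are not uppercase (so non-Latin letters pass), decimal digits, underscores, and dashes, and requires keys to start with a letter. The server's actual validation may differ, and validate_labels is a hypothetical helper.

```python
from typing import Dict, List

MAX_LABEL_LENGTH = 63  # Unicode code points


def _allowed(ch: str) -> bool:
    # Lowercase or caseless letters, decimal digits, underscores, and dashes.
    return ch in "_-" or ch.isdecimal() or (ch.isalpha() and not ch.isupper())


def validate_labels(labels: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    for key, value in labels.items():
        if not key or not key[0].isalpha():
            problems.append(f"key {key!r} must start with a letter")
        for kind, text in (("key", key), ("value", value)):
            if len(text) > MAX_LABEL_LENGTH:
                problems.append(f"{kind} {text!r} is longer than {MAX_LABEL_LENGTH} characters")
            if not all(_allowed(ch) for ch in text):
                problems.append(f"{kind} {text!r} contains a disallowed character")
    return problems


print(validate_labels({"team": "nlp-1", "Env": "prod"}))
```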

customized_attribution

string

Optional. This flag is to support user customized attribution. If not provided, the default is Machine Translated by Google. Customized attribution should follow rules in https://cloud.google.com/translate/attribution#attribution_and_logos

is_translate_native_pdf_only

bool

Optional. is_translate_native_pdf_only field for external customers. If true, the page limit of online native pdf translation is 300 and only native pdf pages will be translated.

enable_shadow_removal_native_pdf

bool

Optional. If true, use the text removal server to remove the shadow text on background image for native pdf translation. Shadow removal feature can only be enabled when is_translate_native_pdf_only: false && pdf_native_only: false

enable_rotation_correction

bool

Optional. If true, enable auto rotation correction in DVS.

TranslateDocumentResponse

A translated document response message.

Fields
document_translation

DocumentTranslation

Translated document.

glossary_document_translation

DocumentTranslation

The document's translation output if a glossary is provided in the request. This can be the same as [TranslateDocumentResponse.document_translation] if no glossary terms apply.

model

string

Only present when 'model' is present in the request. 'model' is normalized to have a project number.

For example: If the 'model' field in TranslateDocumentRequest is: projects/{project-id}/locations/{location-id}/models/general/nmt then model here would be normalized to projects/{project-number}/locations/{location-id}/models/general/nmt.

glossary_config

TranslateTextGlossaryConfig

The glossary_config used for this translation.

TranslateTextGlossaryConfig

Configures which glossary is used for a specific target language and defines options for applying that glossary.

Fields
glossary

string

Required. The glossary to be applied for this translation.

The format depends on the glossary:

  • User-provided custom glossary: projects/{project-number-or-id}/locations/{location-id}/glossaries/{glossary-id}
ignore_case

bool

Optional. Indicates match is case insensitive. The default value is false if missing.

TranslateTextRequest

The request message for synchronous translation.

Fields
contents[]

string

Required. The content of the input in string format. We recommend the total content be less than 30,000 codepoints. The max length of this field is 1024. Use BatchTranslateText for larger text.
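
Callers with many strings can split them into several requests that respect these limits. This sketch reads "max length of this field is 1024" as at most 1024 entries in contents, and applies the recommended 30,000 code point total per request; both readings are assumptions, and batch_contents is a hypothetical helper.

```python
from typing import List


def batch_contents(
    contents: List[str], max_items: int = 1024, max_codepoints: int = 30000
) -> List[List[str]]:
    """Group strings into batches that stay within the item and code point limits."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for text in contents:
        size = len(text)  # len() counts Unicode code points in Python 3
        over_limit = len(current) >= max_items or current_size + size > max_codepoints
        if current and over_limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(text)
        current_size += size
    if current:
        batches.append(current)
    return batches


print([len(batch) for batch in batch_contents(["a"] * 2500)])
```

A single string longer than the code point limit still ends up in a batch of its own; for input of that size, BatchTranslateText is the documented alternative.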

mime_type

string

Optional. The format of the source text, for example, "text/html", "text/plain". If left blank, the MIME type defaults to "text/html".

source_language_code

string

Optional. The ISO-639 language code of the input text if known, for example, "en-US" or "sr-Latn". Supported language codes are listed in Language Support. If the source language isn't specified, the API attempts to identify the source language automatically and returns the source language within the response.

target_language_code

string

Required. The ISO-639 language code to use for translation of the input text, set to one of the language codes listed in Language Support.

parent

string

Required. Project or location to make a call. Must refer to a caller's project.

Format: projects/{project-number-or-id} or projects/{project-number-or-id}/locations/{location-id}.

For global calls, use projects/{project-number-or-id}/locations/global or projects/{project-number-or-id}.

Non-global location is required for requests using AutoML models or custom glossaries.

Models and glossaries must be within the same region (have the same location-id), otherwise an INVALID_ARGUMENT (400) error is returned.

model

string

Optional. The model type requested for this translation.

The format depends on model type:

  • AutoML Translation models: projects/{project-number-or-id}/locations/{location-id}/models/{model-id}

  • General (built-in) models: projects/{project-number-or-id}/locations/{location-id}/models/general/nmt

For global (non-regionalized) requests, use location-id global. For example, projects/{project-number-or-id}/locations/global/models/general/nmt.

If not provided, the default Google model (NMT) will be used.

Authorization requires one or more of the following IAM permissions on the specified resource model:

  • cloudtranslate.generalModels.predict
  • automl.models.predict
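
The two model name formats can be produced with small helpers. This is a minimal sketch; the function names and the sample project, location, and model IDs are hypothetical.

```python
def general_model_name(project: str, location: str = "global") -> str:
    """Resource name of the general (built-in) NMT model in the given location."""
    return f"projects/{project}/locations/{location}/models/general/nmt"


def automl_model_name(project: str, location: str, model_id: str) -> str:
    """Resource name of an AutoML Translation model."""
    return f"projects/{project}/locations/{location}/models/{model_id}"


print(general_model_name("my-project"))
print(automl_model_name("my-project", "us-central1", "my-model-id"))
```

The default location of "global" in the first helper matches the format given for global (non-regionalized) requests.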
glossary_config

TranslateTextGlossaryConfig

Optional. Glossary to be applied. The glossary must be within the same region (have the same location-id) as the model, otherwise an INVALID_ARGUMENT (400) error is returned.

Authorization requires the following IAM permission on the specified resource glossaryConfig:

  • cloudtranslate.glossaries.predict
transliteration_config

TransliterationConfig

Optional. Transliteration to be applied.

labels

map<string, string>

Optional. The labels with user-defined metadata for the request.

Label keys and values can be no longer than 63 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter.

See https://cloud.google.com/translate/docs/advanced/labels for more information.

TranslateTextResponse

Fields
translations[]

Translation

Text translation responses with no glossary applied. This field has the same length as contents.

glossary_translations[]

Translation

Text translation responses if a glossary is provided in the request. This can be the same as translations if no terms apply. This field has the same length as contents.
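
Because both lists have the same length as contents, results can be paired with inputs by position. The sketch below represents the response as a plain dict (an assumption for illustration, not the client library type) and prefers glossary_translations when that list is present.

```python
from typing import Dict, List, Tuple


def pair_translations(contents: List[str], response: Dict[str, list]) -> List[Tuple[str, str]]:
    """Pair each input string with its translated text, preferring glossary output."""
    plain = response.get("translations", [])
    with_glossary = response.get("glossary_translations", [])
    chosen = with_glossary if with_glossary else plain
    if len(chosen) != len(contents):
        raise ValueError("response length does not match contents")
    # translated_text might be excluded if an error occurred, so default to "".
    return [(source, item.get("translated_text", "")) for source, item in zip(contents, chosen)]


response = {
    "translations": [{"translated_text": "Bonjour"}, {"translated_text": "Au revoir"}],
    "glossary_translations": [{"translated_text": "Salut"}, {"translated_text": "Au revoir"}],
}
print(pair_translations(["Hello", "Bye"], response))
```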

Translation

A single translation response.

Fields
translated_text

string

Text translated into the target language. If an error occurs during translation, this field might be excluded from the response.

model

string

Only present when model is present in the request. model here is normalized to have a project number.

For example: If the model requested in TranslateTextRequest is projects/{project-id}/locations/{location-id}/models/general/nmt then model here would be normalized to projects/{project-number}/locations/{location-id}/models/general/nmt.

detected_language_code

string

The ISO-639 language code of source text in the initial request, detected automatically, if no source language was passed within the initial request. If the source language was passed, auto-detection of the language does not occur and this field is empty.

glossary_config

TranslateTextGlossaryConfig

The glossary_config used for this translation.

TransliterationConfig

Configures transliteration feature on top of translation.

Fields
enable_transliteration

bool

If true, source text in romanized form can be translated to the target language.

UpdateGlossaryEntryRequest

Request message for UpdateGlossaryEntry.

Fields
glossary_entry

GlossaryEntry

Required. The glossary entry to update.

UpdateGlossaryMetadata

Stored in the google.longrunning.Operation.metadata field returned by UpdateGlossary.

Fields
glossary

Glossary

The updated glossary object.

state

State

The current state of the glossary update operation. If the glossary input file was not updated, this will be completed immediately.

submit_time

Timestamp

The time when the operation was submitted to the server.

State

Enumerates the possible states that the update request can be in.

Enums
STATE_UNSPECIFIED Invalid.
RUNNING Request is being processed.
SUCCEEDED The glossary was successfully updated.
FAILED Failed to update the glossary.
CANCELLING Request is in the process of being canceled after caller invoked longrunning.Operations.CancelOperation on the request id.
CANCELLED The glossary update request was successfully canceled.

UpdateGlossaryRequest

Request message for the update glossary flow.

Fields
glossary

Glossary

Required. The glossary to update.

update_mask

FieldMask

The list of fields to be updated. Currently, only display_name and input_config can be updated.