The ML.TRANSLATE function

This document describes the ML.TRANSLATE function, which lets you translate text stored in BigQuery tables by using the Cloud Translation API.

Syntax

ML.TRANSLATE(
  MODEL `project_id.dataset.model_name`,
  TABLE `project_id.dataset.bq_table`,
  STRUCT('mode_name' AS translate_mode [, 'target_language_code' AS target_language_code ])
)

Arguments

ML.TRANSLATE takes the following arguments:

  • project_id: Your project ID.

  • dataset: The BigQuery dataset that contains the model.

  • model: The name of a remote model with a REMOTE_SERVICE_TYPE of CLOUD_AI_TRANSLATE_V3.

  • bq_table: The name of the BigQuery table that contains text data. The text analysis is applied on the column namedtext_contentin this table. If your table does not have atext_contentcolumn, use aSELECT` statement for this argument to provide an alias for an existing table column, as shown in the following example:

    SELECT * from ML.TRANSLATE(
      mydataset.mymodel,
      (SELECT comment AS text_content from mydataset.mytable),
      STRUCT('translate_text' AS translate_mode, 'en' AS target_language_code)
    );
    

    An error occurs if no text_content column is available.

  • translate_mode: a STRING value that specifies one of the following supported translation modes:

  • target_language_code: a STRING value that specifies a supported language code for translation. This argument is only required when you use the TRANSLATE_TEXT translation mode.

Output

ML.TRANSLATE returns the input table plus the following columns:

  • ml_translate_result: a JSON value that contains the translation result from Cloud Translation API.
  • ml_translate_status: a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.

Quotas

See Cloud AI service functions quotas and limits.

Example

Example 1

The following example applies text translation on the column name text_content on the bq table mybqtable in mydataset to Chinese.

# Create model
CREATE OR REPLACE MODEL
`myproject.mydataset.mytranslatemodel`
REMOTE WITH CONNECTION `myproject.myregion.myconnection`
OPTIONS (remote_service_type = 'cloud_ai_translate_v3')
# Translate text
SELECT * FROM ML.TRANSLATE(
  MODEL `mydataset.mytranslatemodel`,
  TABLE `mydataset.mybqtable`,
  STRUCT('translate_text' AS translate_mode, 'zh-CN' AS target_language_code));

The output is similar to the following:

ml_translate_result ml_translate_status text_content
{"glossary_translations":[],"translation_memory_translations":[],"translations":[{"detected_language_code":"en","translated_text":"苹果"}]} apple

Example 2

The following example translates the text in the column text_content in the table mybqtable to Chinese, and parses the JSON results into separate columns.

# Translate text and parse the json
CREATE TABLE
  `mydataset.translate_result` AS (
  SELECT
    STRING(ml_translate_result.translations[0].detected_language_code) AS `Original Language`,
    text_content AS `Original Text`,
    "zh-CN" AS `Destination Language`,
    STRING(ml_translate_result.translations[0].translated_text) AS Translation,
    ml_translate_status as `Status`
  FROM ML.TRANSLATE(
    MODEL `mydataset.mytranslatemodel`,
    TABLE `mydataset.mybqtable`,
    STRUCT('translate_text' AS translate_mode, 'zh-CN' AS target_language_code)));

SELECT * FROM `mydataset.translate_result`;

The output is similar to the following:

Original Language Original Text Destination Language Translation Status
en apple zh-cn. 苹果

If you get an error like query limit exceeded, you might have exceeded the quota for this function, which can leave you with unprocessed rows. Use the following query to complete processing the unprocessed rows:

CREATE TABLE
  `mydataset.translate_result_next` AS (
  SELECT
    STRING(ml_translate_result.translations[0].detected_language_code) AS `Original Language`,
    text_content AS `Original Text`,
    'zh-CN' AS `Destination Language`,
    STRING(ml_translate_result.translations[0].translated_text) AS Translation,
    ml_translate_status as `Status`
  FROM ML.TRANSLATE(
    MODEL `mydataset.mytranslatemodel`,
    (SELECT `Original Text` AS text_content
     FROM `mydataset.translate_result`
     WHERE Status != ''),
    STRUCT('translate_text' AS translate_mode, 'zh-CN' AS target_language_code)));

SELECT * FROM `mydataset.translate_result_next`;

What's next