This page shows you how to train and use a custom machine translation model with the AutoML Translation UI. The example trains a custom English-to-Spanish translation model on technology-oriented sentence pairs from software localization.

Set up your project

Open the AutoML Translation UI and select your project from the drop-down list in the upper right of the title bar. (You must have at least roles/editor access to the project.) The application walks you through the necessary setup steps, which are also described in Before you begin.

Create a dataset

  1. Download the archive file containing the sample data for training the model, and extract the file en-es.tsv.

  2. Visit the AutoML Translation UI.

  3. Select the project for which you enabled AutoML Translation.

    Datasets page with one dataset

  4. Click the Create Dataset button.

  5. On the Create dataset page, enter a name for the dataset and select the source and target languages.

    When you select English as the Translate from language, the available Translate to languages appear. Select Spanish.

  6. Click Create.

  7. On the Import tab for your dataset, do the following:

    Import tab for my_dataset

    • Select Upload files from your computer, click Select Files, and choose the en-es.tsv file you downloaded previously.
    • When you upload files from your computer, you must specify the Cloud Storage path where the uploaded files will be stored. The Cloud Storage bucket must be in the us-central1 region.
  8. Click Continue.

    You're returned to the Datasets page, where your dataset shows an in-progress animation while your documents are being imported. When the import completes successfully, you receive a message at the email address that you used to sign up for the program.

  9. Review the dataset.

    After your data has been successfully imported, select the dataset from the dataset listing page (or click the link in the email notification) to see the details about the dataset. The name of the selected dataset appears in the title bar, and the page lists the sentence pairs and which stage of processing they will be used for (TRAIN, VALIDATION, TEST).
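If the import reports errors, it can help to check the file locally before you upload it again. The following is a minimal sketch, assuming that each line of en-es.tsv holds one English sentence and its Spanish translation separated by a single tab. The function name `read_sentence_pairs` and the sample sentences are hypothetical and are not part of AutoML Translation.

```python
def read_sentence_pairs(lines):
    """Parse tab-separated lines into (source, target) sentence pairs.

    Returns (pairs, problems), where problems is a list of
    (line_number, reason) tuples for lines that do not look like a pair.
    Assumption: one pair per line, exactly one tab between the sentences.
    """
    pairs, problems = [], []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 2:
            problems.append((number, f"expected 2 fields, found {len(fields)}"))
        elif not fields[0].strip() or not fields[1].strip():
            problems.append((number, "empty source or target sentence"))
        else:
            pairs.append((fields[0], fields[1]))
    return pairs, problems


# Illustrative in-memory sample; the third line is deliberately malformed.
sample = [
    "Save the file.\tGuarde el archivo.\n",
    "Click OK.\tHaga clic en Aceptar.\n",
    "A line with no tab\n",
]
pairs, problems = read_sentence_pairs(sample)
print(len(pairs), "valid pairs;", len(problems), "problem lines")
```

To check the downloaded file instead of the sample, pass an open file object, for example `read_sentence_pairs(open("en-es.tsv", encoding="utf-8"))`.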

Train your model

To begin training your custom model, click the Train tab just below the title bar, and then click Start Training.

Train tab for the my_dataset dataset

Training a model can take several hours to complete. After the model is successfully trained, you will receive a message at the email address you used to sign up for the program.

When you receive the notification that training is complete, open the email message and click the link to return to the AutoML Translation UI. The Train page shows high-level metrics for the model, most notably its BLEU score. The BLEU (Bilingual Evaluation Understudy) score indicates how similar the model's candidate translations are to the reference translations, with values closer to one (or to 100 when the score is displayed as a percentage) representing more similar texts.

Train tab for the my_dataset dataset showing the model evaluation
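To build intuition for what the BLEU score measures, the following sketch computes a simplified, single-reference BLEU on whitespace-separated tokens: the geometric mean of clipped 1-gram to 4-gram precisions, multiplied by a brevity penalty. This is an illustration only. It is not the evaluation code that AutoML Translation runs, which is likely to differ in tokenization and in how it aggregates scores over the whole test set. The function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU in [0, 1] against one reference.

    No smoothing is applied, so the score is 0.0 whenever any n-gram
    order has no match (including candidates shorter than max_n tokens).
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clipping: a candidate n-gram is credited at most as many times
        # as it occurs in the reference.
        matched = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref) / len(cand))
    return penalty * math.exp(sum(log_precisions) / max_n)


score = simple_bleu("click the save button to continue",
                    "click the save button to proceed")
print(f"BLEU: {score:.3f}")
```

An identical candidate and reference of at least four tokens score 1.0, and a candidate that shares no words with the reference scores 0.0.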

Use the custom model

Click the Predict tab just below the title bar or the Test and use link below the model information. Enter some text to translate and click the Translate button. You can compare the results from your custom model to the Google NMT model.
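The same comparison can also be made programmatically. The following is a hedged sketch, assuming that the google-cloud-translate Python client library is installed, that application default credentials are configured, and that the custom model is hosted in us-central1. The helper names, the project ID `my-project`, and the model ID `TRL1234567890` are hypothetical placeholders; substitute the values shown in the UI for your own project and model.

```python
def build_translate_request(project_id, model_id, text, location="us-central1"):
    """Build a translate_text request that targets a specific model.

    Pass your custom model's ID, or "general/nmt" for the Google NMT model.
    """
    parent = f"projects/{project_id}/locations/{location}"
    return {
        "parent": parent,
        "contents": [text],
        "model": f"{parent}/models/{model_id}",
        "mime_type": "text/plain",
        "source_language_code": "en",
        "target_language_code": "es",
    }


def translate(project_id, model_id, text):
    """Send the request and return the translated strings."""
    # Imported here so the request builder works without the library installed.
    from google.cloud import translate_v3

    client = translate_v3.TranslationServiceClient()
    response = client.translate_text(
        request=build_translate_request(project_id, model_id, text)
    )
    return [t.translated_text for t in response.translations]


# Placeholder IDs; this only builds the request and makes no network call.
request = build_translate_request("my-project", "TRL1234567890", "Save the file.")
print(request["model"])
```

Calling `translate("my-project", "TRL1234567890", text)` and `translate("my-project", "general/nmt", text)` with the same text returns the two translations for a side-by-side comparison.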

Clean up

To avoid unnecessary Google Cloud charges, use the Cloud Console to delete your project if you no longer need it.

What's next