This page shows you how to train and use a custom machine translation model by using the AutoML Translation UI. The walkthrough trains a custom English-to-Spanish translation model on technology-oriented sentence pairs from software localization.
Set up your project
Open the AutoML Translation UI and select your project from the drop-down list in the upper right of the title bar. (You must have at least roles/editor access to the project.) The application walks you through the necessary setup steps, which are also described in Before you begin.
Create a dataset
Download the archive file containing the sample data for training the model, and extract the file en-es.tsv.
Visit the AutoML Translation UI.
Select the project for which you enabled AutoML Translation.
Click the Create Dataset button.
On the Create dataset page, enter a name for the dataset and select the source and target languages.
When you select English as the Translate from language, the available Translate to languages appear. Select Spanish.
Click Create.
On the Import tab for your dataset, do the following:
- Select Upload files from your computer, click Select Files, and choose the en-es.tsv file you downloaded previously.
- When you upload files from your computer, you must specify the Cloud Storage path where the uploaded files are to be stored. The Cloud Storage bucket region must be us-central1.
Click Continue.
You're returned to the Datasets page, where your dataset shows an in-progress animation while your documents are being imported. When the import has completed successfully, you receive a message at the email address that you used to sign up for the program.
Review the dataset.
After your data has been successfully imported, select the dataset from the dataset listing page (or click the link in the email notification) to see the details about the dataset. The name of the selected dataset appears in the title bar, and the page lists the sentence pairs and which stage of processing they will be used for (TRAIN, VALIDATION, TEST).
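Before uploading, it can help to sanity-check the sentence-pair file locally. The following is a minimal sketch, assuming that en-es.tsv holds one tab-separated English/Spanish pair per line; the function name read_sentence_pairs is hypothetical and is not part of AutoML Translation.

```python
import csv


def read_sentence_pairs(path):
    """Read (source, target) pairs from a tab-separated file, one pair per line.

    Returns the valid pairs and the row numbers of any rows that do not
    contain exactly two non-empty fields.
    """
    pairs = []
    problems = []
    with open(path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE treats quotation marks in sentences as ordinary characters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row_number, row in enumerate(reader, start=1):
            if len(row) != 2 or not row[0].strip() or not row[1].strip():
                problems.append(row_number)
                continue
            pairs.append((row[0].strip(), row[1].strip()))
    return pairs, problems
```

Calling read_sentence_pairs("en-es.tsv") returns the usable pairs together with the numbers of any malformed rows, which you can fix before importing the file.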
Train your model
To begin training your custom model, click the Train tab just below the title bar, and then click the Start Training button.
Training a model can take several hours to complete. After the model is successfully trained, you will receive a message at the email address you used to sign up for the program.
When you receive notification that training is complete, open the email message and click the link to return to the AutoML Translation UI. The Train page shows high-level metrics for the model, most notably its BLEU score. The BLEU (Bilingual Evaluation Understudy) score indicates how similar the candidate text is to the reference texts, with values closer to one representing more similar texts.
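To make the BLEU score more concrete, here is a simplified sketch of the idea: clipped n-gram precision combined with a brevity penalty, giving a value between zero and one. It assumes whitespace tokenization, a single reference, and no smoothing. It is not the exact computation that AutoML Translation performs, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU in the range [0, 1]."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 1.0 in this sketch, and a candidate that shares no words with the reference scores 0.0.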
Use the custom model
Click the Predict tab just below the title bar or the Test and use link below the model information. Enter some text to translate and click the Translate button. You can compare the results from your custom model to the Google NMT model.
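The same kind of request can also be sent from code. The following is a hedged sketch, assuming the google-cloud-translate Python client library (Cloud Translation API v3) is installed and authenticated, and that the model is in the us-central1 location. The project ID and model ID are placeholders that you must replace, and the helper names are hypothetical.

```python
def model_path(project_id, model_id, location="us-central1"):
    """Build the resource name that identifies a custom model."""
    return f"projects/{project_id}/locations/{location}/models/{model_id}"


def translate_with_custom_model(text, project_id, model_id, location="us-central1"):
    """Translate English text to Spanish with a custom model (makes a network call)."""
    # Imported here so that model_path works without the library installed.
    from google.cloud import translate_v3 as translate

    client = translate.TranslationServiceClient()
    response = client.translate_text(
        request={
            "parent": f"projects/{project_id}/locations/{location}",
            "contents": [text],
            "model": model_path(project_id, model_id, location),
            "mime_type": "text/plain",
            "source_language_code": "en",
            "target_language_code": "es",
        }
    )
    return [translation.translated_text for translation in response.translations]
```

Omitting the model field from the request would fall back to the default Google NMT model, which gives a programmatic way to make the same comparison as in the UI.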
Clean up
To avoid unnecessary Google Cloud charges, use the Cloud Console to delete your project if you do not need it.
What's next
- When you're ready to create your own dataset for training an AutoML Translation model, read the instructions on how to prepare your data.