Create a machine translation model
This page shows you how to train and use a custom machine translation model by using the AutoML Translation UI. The example trains a custom English-to-Spanish translation model on technology-oriented sentence pairs from software localization.
Before you begin
Open the AutoML Translation UI and select your project from the drop-down list in the upper right of the title bar. (You must have at least roles/editor access to the project.) The application walks you through the necessary set-up steps, which are also described in Before you begin.
Create a dataset
Download the archive file containing the sample data for training the model, and extract its contents.
Visit the AutoML Translation UI.
Select the project for which you enabled AutoML Translation.
Click the Create Dataset button.
On the Create dataset page, enter a name for the dataset and select the source and target languages.
When you select English as the Translate from language, the available Translate to languages appear. Select Spanish.
On the Import tab for your dataset, do the following:
- Select Upload files from your computer, click Select Files, and choose the en-es.tsv file you downloaded previously.
- When you upload files from your computer, you must specify the Cloud Storage path where the uploaded files are to be stored. The Cloud Storage bucket region must be
You're returned to the Datasets page, where your dataset shows an in-progress animation while your documents are being imported. After your dataset has been successfully uploaded, you receive a message at the email address that you used to sign up for the program.
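Before you upload a file of your own, it can help to check that every line is a well-formed sentence pair. The following is a minimal sketch, assuming the TSV file holds one pair per line with the source sentence and the target sentence separated by a single tab. The function name check_sentence_pairs and the sample sentences are hypothetical and are not part of the sample data.

```python
import csv
import io


def check_sentence_pairs(tsv_text):
    """Return (valid_pairs, problems) for tab-separated source/target text.

    Assumption: one sentence pair per line, source and target split by a tab.
    """
    pairs, problems = [], []
    # QUOTE_NONE treats quotation marks inside sentences as literal characters.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            problems.append((line_no, f"expected 2 columns, found {len(row)}"))
        elif not row[0].strip() or not row[1].strip():
            problems.append((line_no, "empty source or target sentence"))
        else:
            pairs.append((row[0], row[1]))
    return pairs, problems


# Hypothetical sample: the second line is missing its translation.
sample = (
    "Save the file.\tGuarde el archivo.\n"
    "Missing target\n"
    "Open settings.\tAbra la configuración.\n"
)
pairs, problems = check_sentence_pairs(sample)
print(len(pairs), problems)
```

To check a real file, read it with open(path, encoding="utf-8").read() and pass the text to the function.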
Review the dataset.
After your data has been successfully imported, select the dataset from the dataset listing page (or click the link in the email notification) to see the details about the dataset. The name of the selected dataset appears in the title bar, and the page lists the sentence pairs and which stage of processing they will be used for (TRAIN, VALIDATION, TEST).
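To illustrate what the TRAIN, VALIDATION, and TEST labels mean, the following sketch assigns sentence pairs to the three stages at random. The 80/10/10 ratio, the fixed seed, and the function name assign_splits are assumptions made for illustration; this page does not state how AutoML Translation divides the data.

```python
import random


def assign_splits(pairs, train=0.8, validation=0.1, seed=0):
    """Shuffle pairs and label each one TRAIN, VALIDATION, or TEST.

    Assumption: an 80/10/10 split; whatever remains after the TRAIN and
    VALIDATION shares goes to TEST.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatable output
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    labeled = []
    for i, pair in enumerate(shuffled):
        if i < n_train:
            label = "TRAIN"
        elif i < n_train + n_val:
            label = "VALIDATION"
        else:
            label = "TEST"
        labeled.append((label, pair))
    return labeled


# Hypothetical data: ten placeholder sentence pairs.
demo_pairs = [(f"source {i}", f"target {i}") for i in range(10)]
labeled = assign_splits(demo_pairs)
counts = {name: sum(1 for label, _ in labeled if label == name)
          for name in ("TRAIN", "VALIDATION", "TEST")}
print(counts)
```

The training pairs fit the model, the validation pairs guide decisions made during training, and the test pairs are held back to compute metrics on sentences the model has not seen.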
Train your model
To begin training your custom model, click the Train tab just below the title bar, then the Start Training button.
Training a model can take several hours to complete. After the model is successfully trained, you will receive a message at the email address you used to sign up for the program.
When you receive notification that training is complete, open the email message and click the link to return to the AutoML Translation UI. The Train page shows high-level metrics for the model, most notably its BLEU score. The BLEU (Bilingual Evaluation Understudy) score indicates how similar the candidate text is to the reference texts, with values closer to one representing more similar texts.
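To make the BLEU score more concrete, the following is a simplified sketch of the calculation for a single candidate sentence and a single reference. It assumes whitespace tokenization, n-grams up to length four, and no smoothing, so it is not the exact computation that AutoML Translation performs, which evaluates the whole test set. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU in the range 0 to 1 (simplified)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives zero
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


reference = "haga clic en el botón Guardar"
print(simple_bleu("haga clic en el botón Guardar", reference))
print(simple_bleu("haga clic en el botón Guardar ahora", reference))
print(simple_bleu("abra el menú de configuración avanzada", reference))
```

An identical candidate scores one, a candidate with one extra word scores somewhat lower, and a candidate that shares no four-word sequence with the reference scores zero.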
Use the custom model
Click the Predict tab just below the title bar or the Test and use link below the model information. Enter some text to translate and click the Translate button. You can compare the results from your custom model to the Google NMT model.
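The same kind of request can also be made programmatically. The following is a minimal sketch that assumes the google-cloud-translate Python client library is installed, that application credentials are configured, and that the model is hosted in the us-central1 location. The project ID and model ID are placeholders that you must replace, and the helper names are hypothetical.

```python
def model_path(project_id, model_id, location="us-central1"):
    """Build the resource name of a custom model (location is an assumption)."""
    return f"projects/{project_id}/locations/{location}/models/{model_id}"


def build_request(project_id, model_id, text, location="us-central1"):
    """Assemble a translate_text request for English-to-Spanish translation."""
    return {
        "parent": f"projects/{project_id}/locations/{location}",
        "contents": [text],
        "mime_type": "text/plain",
        "source_language_code": "en",
        "target_language_code": "es",
        "model": model_path(project_id, model_id, location),
    }


def translate_with_custom_model(project_id, model_id, text):
    """Send the request; requires the client library and valid credentials."""
    # Imported here so that the request builder works without the library.
    from google.cloud import translate

    client = translate.TranslationServiceClient()
    response = client.translate_text(request=build_request(project_id, model_id, text))
    return [t.translated_text for t in response.translations]


# Placeholder identifiers, shown only to illustrate the request shape.
request = build_request("my-project", "my-model-id", "Click the Save button.")
print(request["model"])
```

Calling translate_with_custom_model with real identifiers returns a list of translated strings, one for each entry in contents.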
To avoid incurring charges to your Google Cloud account for the resources used on this page, use the Google Cloud console to delete your project if you no longer need it.
- When you're ready to create your own dataset to create an AutoML Translation model, read the instructions on how to prepare your data.