Language support for custom models

AutoML Translation supports custom models for the following language pairs. A language is supported when Google has an existing neural machine translation (NMT) model for it; AutoML Translation uses that model as the base when it trains a custom model.

The arrows indicate which translation directions are supported. A two-way arrow (<->) means that both directions are supported. For example, both English to German and German to English are supported.

Language pair                       Language codes
Afrikaans <-> English               af <-> en
Albanian <-> English                sq <-> en
Arabic <-> English                  ar <-> en
Azerbaijani <-> English             az <-> en
Bengali <-> English                 bn <-> en
Bulgarian <-> English               bg <-> en
Catalan <-> English                 ca <-> en
Chinese (Simplified) <-> English    zh-CN* <-> en
Chinese (Traditional) <-> English   zh-TW <-> en
Croatian <-> English                hr <-> en
Czech <-> English                   cs <-> en
Danish <-> English                  da <-> en
Dutch <-> English                   nl <-> en
Estonian <-> English                et <-> en
Finnish <-> English                 fi <-> en
French <-> English                  fr <-> en
Galician <-> English                gl <-> en
Georgian <-> English                ka <-> en
German <-> English                  de <-> en
Greek <-> English                   el <-> en
Gujarati <-> English                gu <-> en
Haitian Creole <-> English          ht <-> en
Hebrew <-> English                  iw <-> en
Hindi <-> English                   hi <-> en
Hungarian <-> English               hu <-> en
Icelandic <-> English               is <-> en
Indonesian <-> English              id <-> en
Italian <-> English                 it <-> en
Japanese <-> English                ja <-> en
Korean <-> English                  ko <-> en
Latvian <-> English                 lv <-> en
Lithuanian <-> English              lt <-> en
Malay <-> English                   ms <-> en
Marathi <-> English                 mr <-> en
Norwegian <-> English               no <-> en
Persian <-> English                 fa <-> en
Polish <-> English                  pl <-> en
Portuguese <-> English              pt <-> en
Punjabi <-> English                 pa <-> en
Romanian <-> English                ro <-> en
Russian <-> English                 ru <-> en
Serbian <-> English                 sr <-> en
Slovak <-> English                  sk <-> en
Slovenian <-> English               sl <-> en
Spanish <-> English                 es <-> en
Swahili <-> English                 sw <-> en
Swedish <-> English                 sv <-> en
Thai <-> English                    th <-> en
Turkish <-> English                 tr <-> en
Ukrainian <-> English               uk <-> en
Urdu <-> English                    ur <-> en
Vietnamese <-> English              vi <-> en
Welsh <-> English                   cy <-> en

* Simplified Chinese can be specified either by zh-CN or zh.
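
If you want to check a language pair before you create a dataset, the table above can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of any AutoML Translation client library: the names SUPPORTED_WITH_ENGLISH, ALIASES, normalize, and is_supported_pair are hypothetical, and the sketch assumes that codes are written exactly as they appear in the table (case-sensitive).

```python
# Codes that the table above pairs with English ("en").
# This set is copied from the table; it is not fetched from any API.
SUPPORTED_WITH_ENGLISH = frozenset({
    "af", "sq", "ar", "az", "bn", "bg", "ca", "zh-CN", "zh-TW", "hr", "cs",
    "da", "nl", "et", "fi", "fr", "gl", "ka", "de", "el", "gu", "ht", "iw",
    "hi", "hu", "is", "id", "it", "ja", "ko", "lv", "lt", "ms", "mr", "no",
    "fa", "pl", "pt", "pa", "ro", "ru", "sr", "sk", "sl", "es", "sw", "sv",
    "th", "tr", "uk", "ur", "vi", "cy",
})

# Per the footnote, Simplified Chinese can be written as zh-CN or zh.
ALIASES = {"zh": "zh-CN"}


def normalize(code: str) -> str:
    """Map an accepted alias to the code used in the table."""
    return ALIASES.get(code, code)


def is_supported_pair(source: str, target: str) -> bool:
    """Return True if the table lists this pair in either direction.

    Every listed pair has English on one side, so exactly one of the two
    codes must be "en" and the other must appear in the table.
    """
    source, target = normalize(source), normalize(target)
    if source == "en":
        return target in SUPPORTED_WITH_ENGLISH
    if target == "en":
        return source in SUPPORTED_WITH_ENGLISH
    return False
```

For example, is_supported_pair("zh", "en") is true because zh is treated as zh-CN, while is_supported_pair("fr", "de") is false because neither side is English.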

Supported codes for language variants

You can use the following language codes, which are variants of the supported languages in the previous table. You can use these codes as the source or target language when you create datasets.

Google doesn't have base NMT models for these languages. Instead, AutoML Translation uses the language variant's associated base model for training custom models.

These codes are useful, for example, when you translate content for a particular dialect or region. Suppose you have localized data for zh-HK and use it to create a custom model. When you request translations, you can specify the zh-HK language code, which points to your custom model and produces more accurate translations for that locale.

The following table lists the language codes, their descriptions, and their associated base models that AutoML Translation uses when training custom models.

Language code   Description               Base model
zh-HK           Hong Kong (Traditional)   zh-TW
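
The relationship between a variant code and its base model can be sketched as a simple mapping. The Python below is a minimal illustration under stated assumptions: VARIANT_BASE_MODEL and base_model_code are hypothetical names, the mapping holds only the single row from the table above, and codes that are not variants are assumed to train from their own base model.

```python
# Variant code -> base model code, copied from the table above.
VARIANT_BASE_MODEL = {"zh-HK": "zh-TW"}


def base_model_code(code: str) -> str:
    """Return the base model code used when training with this code.

    A variant resolves to its associated base model. Any other code is
    returned unchanged (an assumption of this sketch; the function does
    not check that the code is a supported language).
    """
    return VARIANT_BASE_MODEL.get(code, code)
```

With this mapping, base_model_code("zh-HK") returns "zh-TW", and a non-variant code such as "de" is returned as is.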