OCR Language Support

The Cloud Vision API's OCR engine can be configured to recognize text in a wide variety of languages. Specify the languages as a list of languageHints in the ImageContext of an annotate image request for TEXT_DETECTION or DOCUMENT_TEXT_DETECTION. Each language hint is typically a BCP-47 identifier. It can take the form language-region, where language is the primary language and the optional region identifies a regional variant (usually by country code). For example, Chinese can be specified as Simplified Chinese as written in the People's Republic of China (zh-CN) or as Traditional Chinese as written in Taiwan (zh-TW).
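As a minimal sketch of where languageHints sits in a request, the following builds the JSON body for an annotate image request using only the Python standard library. The helper name build_annotate_request and the placeholder image bytes are illustrative assumptions, not part of the API; sending the body (for example to the images:annotate REST method) and authenticating are left out.

```python
import base64
import json


def build_annotate_request(image_bytes, language_hints, feature_type="TEXT_DETECTION"):
    """Build an annotate image request body (hypothetical helper for illustration)."""
    return {
        "requests": [
            {
                # Image content is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type}],
                # languageHints lives inside the request's ImageContext.
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


# Placeholder bytes stand in for a real image file's contents.
body = build_annotate_request(b"placeholder image bytes", ["zh-TW"])
print(json.dumps(body, indent=2))
```

Passing the hints as a list keeps the sketch aligned with the description above, where languageHints is a list rather than a single value.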

The list of languages (with associated languageHint codes) supported by TEXT_DETECTION and DOCUMENT_TEXT_DETECTION is shown below. As new languages are added to the Cloud Vision API, this list will be updated.

Language Name    languageHints code    Notes
Afrikaans        af
Arabic           ar
Assamese         as
Azerbaijani      az
Belarusian       be
Bengali          bn
Bulgarian        bg
Catalan          ca
Chinese          zh*                   TEXT_DETECTION only.
Croatian         hr
Czech            cs
Danish           da
Dutch            nl
English          en
Estonian         et
Filipino         fil or tl
Finnish          fi
French           fr
German           de
Greek            el                    TEXT_DETECTION only.
Hebrew           he or iw              TEXT_DETECTION only.
Hindi            hi
Hungarian        hu
Icelandic        is
Indonesian       id
Italian          it
Japanese         ja                    TEXT_DETECTION only.
Kazakh           kk
Korean           ko                    TEXT_DETECTION only.
Kyrgyz           ky
Latvian          lv
Lithuanian       lt
Macedonian       mk
Marathi          mr
Mongolian        mn
Nepali           ne
Norwegian        no
Pashtu           ps
Persian          fa
Polish           pl
Portuguese       pt
Romanian         ro
Russian          ru
Sanskrit         sa
Serbian          sr
Slovak           sk
Slovenian        sl
Spanish          es
Swedish          sv
Tamil            ta
Thai             th                    TEXT_DETECTION only.
Turkish          tr
Ukrainian        uk
Urdu             ur
Uzbek            uz
Vietnamese       vi

* A languageHints code of zh covers both Simplified Chinese (zh-CN) and Traditional Chinese (zh-TW). You may use any of these three codes (zh, zh-CN, or zh-TW) to recognize Chinese text.
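The Notes column can be checked programmatically before a request is sent. The sketch below encodes the table as two sets and reports whether a hint is listed for a given feature type. The names SUPPORTED_CODES, TEXT_DETECTION_ONLY, and supports_hint are hypothetical, and reducing a hint to its primary language by dropping the region is an assumption: this page confirms that behavior only for zh-CN and zh-TW.

```python
# Codes from the table above, including both alternates for Filipino and Hebrew.
SUPPORTED_CODES = {
    "af", "ar", "as", "az", "be", "bn", "bg", "ca", "zh", "hr", "cs", "da",
    "nl", "en", "et", "fil", "tl", "fi", "fr", "de", "el", "he", "iw", "hi",
    "hu", "is", "id", "it", "ja", "kk", "ko", "ky", "lv", "lt", "mk", "mr",
    "mn", "ne", "no", "ps", "fa", "pl", "pt", "ro", "ru", "sa", "sr", "sk",
    "sl", "es", "sv", "ta", "th", "tr", "uk", "ur", "uz", "vi",
}

# Codes marked "TEXT_DETECTION only." in the Notes column.
TEXT_DETECTION_ONLY = {"zh", "el", "he", "iw", "ja", "ko", "th"}


def supports_hint(hint, feature_type):
    """Return True if the table lists the hint's language for feature_type."""
    # Assumption: only the primary language subtag is compared (zh-CN -> zh).
    language = hint.split("-")[0].lower()
    if language not in SUPPORTED_CODES:
        return False
    if feature_type == "TEXT_DETECTION":
        return True
    if feature_type == "DOCUMENT_TEXT_DETECTION":
        return language not in TEXT_DETECTION_ONLY
    return False


print(supports_hint("zh-CN", "TEXT_DETECTION"))
print(supports_hint("zh-CN", "DOCUMENT_TEXT_DETECTION"))
print(supports_hint("fr", "DOCUMENT_TEXT_DETECTION"))
```

Because the list is updated as languages are added, a hard-coded set like this one would need to be refreshed from the current table.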
