OCR Language Support

The Cloud Vision API's text recognition feature can detect a wide variety of languages, including multiple languages within a single image.

A language hint is not required, but you can provide one if the service has trouble detecting the language used in your image.

With the release of Handwriting OCR GA, images with handwriting no longer require a handwriting languageHints flag when using DOCUMENT_TEXT_DETECTION.

Optional language hints are specified in a request's ImageContext as a languageHints list, for both TEXT_DETECTION and DOCUMENT_TEXT_DETECTION requests.
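As a rough sketch of what such a request might look like, the Python snippet below assembles the JSON body for an images:annotate REST call using only the standard library. The helper name build_annotate_request and the Cloud Storage URI are made up for illustration; the field names (requests, image, features, imageContext, languageHints) follow the Vision REST request format. When no hints are passed, imageContext is omitted so that the service auto-detects the language.

```python
import json


def build_annotate_request(image_uri, feature_type="DOCUMENT_TEXT_DETECTION",
                           language_hints=None):
    """Assemble the JSON body for one annotate request (hypothetical helper)."""
    if feature_type not in ("TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION"):
        raise ValueError(
            "language hints apply to TEXT_DETECTION and DOCUMENT_TEXT_DETECTION")
    request = {
        "image": {"source": {"imageUri": image_uri}},
        "features": [{"type": feature_type}],
    }
    # Only attach imageContext when the caller actually supplies hints;
    # leaving it out lets the service pick the language on its own.
    if language_hints:
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


# Placeholder URI, not a real bucket.
body = build_annotate_request("gs://example-bucket/receipt.png",
                              language_hints=["en", "fr"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent with any HTTP client; authentication and the request URL are outside the scope of this sketch.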

Each language code is typically a BCP-47 identifier. It can take the form language-region, where language is the primary language and the optional region (usually a country identifier) selects a particular dialect. A code can also carry a script subtag: for example, Chinese can be given as Simplified Chinese as written in the People's Republic of China (zh-Hans) or Traditional Chinese as written in Taiwan (zh-Hant).
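To make the structure of these codes concrete, here is a deliberately simplified Python sketch that splits a hint into its parts. The function split_language_hint is hypothetical and handles only the language[-Script][-REGION] shape; it is not a full BCP-47 parser, and anything it does not recognize (such as the variant in ru-PETR1708) is passed through under "other".

```python
def split_language_hint(code):
    """Split a hint such as 'zh-Hant-HK' into language, script and region.

    Simplified: a 4-letter subtag is treated as a script, a 2-letter or
    3-digit subtag as a region, and everything else is kept in 'other'.
    """
    parts = code.split("-")
    result = {"language": parts[0].lower(), "script": None,
              "region": None, "other": []}
    for part in parts[1:]:
        is_script = len(part) == 4 and part.isalpha()
        is_region = (len(part) == 2 and part.isalpha()) or \
                    (len(part) == 3 and part.isdigit())
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = part.title()
        elif is_region and result["region"] is None:
            result["region"] = part.upper()
        else:
            result["other"].append(part)
    return result


for hint in ("en-GB", "zh-Hans", "zh-Hant-HK", "es-419", "ru-PETR1708"):
    print(hint, split_language_hint(hint))
```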

There are three levels of language support in the text recognition feature:

  1. Supported languages are those we prioritize and whose performance we regularly evaluate.
  2. Experimental languages are those under active development but not yet regularly evaluated.
  3. Mapped languages are those supported by mapping them to another language code or to a general character recognizer. For example, "en-GB" is supported, but it is not treated any differently from "en" for the purposes of recognizing text. We make a best effort to return the correct mapped language code in the Entity locale field, but mapped languages are more likely than fully supported or experimentally supported languages to be misidentified as a similar language.
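The three levels can be pictured as a simple lookup. The Python sketch below uses a small sample of codes taken from the tables on this page, not the full lists; the set names and the support_level function are illustrative only and are not part of the API.

```python
# Illustrative subsets copied from the language tables on this page.
SUPPORTED = {"en", "fr", "zh", "ja"}
EXPERIMENTAL = {"am", "eu", "cy"}
MAPPED = {"en-GB", "fr-CA", "pt-PT", "es-419", "zh-Hans", "zh-Hant"}


def support_level(code):
    """Return the support level for a hint code, or 'unknown' if unlisted."""
    if code in SUPPORTED:
        return "supported"
    if code in EXPERIMENTAL:
        return "experimental"
    if code in MAPPED:
        return "mapped"
    return "unknown"


for code in ("en", "am", "en-GB", "xx"):
    print(code, support_level(code))
```

A check like this could be used to warn callers that a mapped code such as "en-GB" is recognized with the same model as a related code and may come back under a similar locale.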

The list of languages (with associated languageHint codes) supported by TEXT_DETECTION and DOCUMENT_TEXT_DETECTION is shown below.

If the language hint is left blank, we will attempt to auto-detect the most appropriate language. The TEXT_DETECTION endpoint will auto-detect only a subset of supported languages, while the DOCUMENT_TEXT_DETECTION endpoint will auto-detect the full set of supported languages.

Supported languages

The following languages are prioritized and regularly evaluated.


Language | Language (English name) | languageHints code | Script | Notes
Afrikaans | Afrikaans | af | Latn |
shqip | Albanian | sq | Latn |
العربية | Arabic | ar | Arab | Modern Standard
Հայ | Armenian | hy | Armn |
беларуская | Belarusian | be | Cyrl |
বাংলা | Bengali | bn | Beng |
български | Bulgarian | bg | Cyrl |
Català | Catalan | ca | Latn |
普通话 | Chinese | zh | Hans/Hant |
Hrvatski | Croatian | hr | Latn |
Čeština | Czech | cs | Latn |
Dansk | Danish | da | Latn |
Nederlands | Dutch | nl | Latn |
English | English | en | Latn | American
Eesti keel | Estonian | et | Latn |
Filipino | Filipino | fil | Latn |
Suomi | Finnish | fi | Latn |
Français | French | fr | Latn | European
Deutsch | German | de | Latn |
Ελληνικά | Greek | el | Grek |
ગુજરાતી | Gujarati | gu | Gujr |
עברית | Hebrew | iw | Hebr |
हिन्दी | Hindi | hi | Deva |
Magyar | Hungarian | hu | Latn |
Íslenska | Icelandic | is | Latn |
Bahasa Indonesia | Indonesian | id | Latn |
Italiano | Italian | it | Latn |
日本語 | Japanese | ja | Jpan |
ಕನ್ನಡ | Kannada | kn | Knda |
ភាសាខ្មែរ | Khmer | km | Khmr |
한국어 | Korean | ko | Kore |
ລາວ | Lao | lo | Laoo |
Latviešu | Latvian | lv | Latn |
Lietuvių | Lithuanian | lt | Latn |
Македонски | Macedonian | mk | Cyrl |
Bahasa Melayu | Malay | ms | Latn |
മലയാളം | Malayalam | ml | Mlym |
मराठी | Marathi | mr | Deva |
नेपाली | Nepali | ne | Deva |
Norsk | Norwegian | no | Latn | Bokmål
فارسی | Persian | fa | Arab |
Polski | Polish | pl | Latn |
Português | Portuguese | pt | Latn | Brazilian
ਪੰਜਾਬੀ | Punjabi | pa | Guru | Gurmukhi
Română | Romanian | ro | Latn |
Русский | Russian | ru | Cyrl |
Русский (старая орфография) | Russian | ru-PETR1708 | Cyrl | Old Orthography
Српски | Serbian | sr | Cyrl |
Српски (латиница) | Serbian | sr-Latn | Latn |
Slovenčina | Slovak | sk | Latn |
Slovenščina | Slovenian | sl | Latn |
Español | Spanish | es | Latn | European
Svenska | Swedish | sv | Latn |
Tagalog | Tagalog | tl | Latn |
தமிழ் | Tamil | ta | Taml |
తెలుగు | Telugu | te | Telu |
ไทย | Thai | th | Thai |
Türkçe | Turkish | tr | Latn |
Українська | Ukrainian | uk | Cyrl |
Tiếng Việt | Vietnamese | vi | Latn |
Yiddish | Yiddish | yi | Hebr |

Experimental languages

The following languages are under active development and not yet regularly evaluated.

Language | Language (English name) | languageHints code | Script | Notes
አማርኛ | Amharic | am | Ethi |
Αρχαία ελληνικά | Ancient Greek | grc | Grek |
অসমীয়া | Assamese | as | Beng |
Azərbaycan | Azerbaijani | az | Latn |
Azərbaycan (qədim yazı) | Azerbaijani | az-Cyrl | Cyrl | Old Orthography
Euskara | Basque | eu | Latn |
Bosanski | Bosnian | bs | Latn |
မြန်မာ | Burmese | my | Mymr |
Cebuano | Cebuano | ceb | Latn |
ᏣᎳᎩ ᎦᏬᏂᎯᏍᏗ | Cherokee | chr | Cher |
dhivehi, dhivehi-bas | Dhivehi | dv | Thaa |
རྫོང་ཁ | Dzongkha | dz | Tibt |
Esperanto | Esperanto | eo | Latn |
Galego | Galician | gl | Latn |
ქართული | Georgian | ka | Geor |
Kreyòl Ayisyen | Haitian Creole | ht | Latn |
Gaeilge | Irish | ga | Latn |
Jawa | Javanese | jv | Latn |
Қазақ | Kazakh | kk | Cyrl |
Kirghiz | Kirghiz | ky | Cyrl |
Latine | Latin | la | Latn |
Malti | Maltese | mt | Latn |
Монгол | Mongolian | mn | Cyrl |
ଓଡ଼ିଆ | Oriya | or | Orya |
پښتو | Pashto | ps | Arab |
संस्कृतम् | Sanskrit | sa | Deva |
සිංහල | Sinhala | si | Sinh |
Swahili | Swahili | sw | Latn |
leššānā Suryāyā | Syriac | syr | Syriac |
བོད་སྐད་ | Tibetan | bo | Tibt |
ትግርኛ | Tigrinya | ti | Ethi |
اردو | Urdu | ur | Arab |
oʻzbekcha | Uzbek | uz | Latn | Latin
oʻzbekcha | Uzbek | uz-Cyrl | Cyrl | Old Orthography
Cymraeg | Welsh | cy | Latn |
IsiZulu | Zulu | zu | Latn |

Mapped languages

The following languages are mapped to another language code or mapped to a general character recognizer.

Language | Language (English name) | languageHints code | Script | Notes
بهسا اچيه | Acehnese | ace | Latn | Latin model
Lwo | Acholi | ach | Latn | Latin model
Dangme | Adangme | ada | Latn | Latin model
Akan | Akan | ak | Latn | Latin model
Anicinâbemowin | Algonquinian | alg | Latn | Latin model
Mapudungu | Araucanian/Mapuche | arn | Latn | Latin model
Asturianu | Asturian | ast | Latn | Latin model
Dene | Athabaskan | ath | Latn | Latin model
Aymar aru | Aymara | ay | Latn | Latin model
Bhāṣa Bali | Balinese | ban | Latn | Latin model
Bamanankan | Bambara | bm | Latn | Latin model
Narrow Bantu | Bantu | bnt | Latn | Latin model
башҡорт теле | Bashkir | ba | Cyrl | Cyrillic model
Toba–Batak | Batak | btk | Latn | Latin model
Chibemba | Bemba | bem | Latn | Latin model
Bikol Naga | Bikol | bik | Latn | Latin model
Bichelamar | Bislama | bi | Latn | Latin model
Brezhoneg | Breton | br | Latn | Latin model
нохчийн мотт / noxçiyn mott | Chechen | ce | Cyrl | Cyrillic model
汉语 | Chinese (Mandarin, Simplified) | zh-Hans | Hans | Chinese model
漢語 | Chinese (Mandarin, Traditional) | zh-Hant | Hant | Chinese model
普通話 | Chinese (Mandarin, Hong Kong) | zh-Hant-HK | Hant | Chinese model
Chahta' | Choctaw | cho | Latn | Latin model
Чӑвашла | Chuvash | cv | Cyrl | Cyrillic model
Cree–Montagnais–Naskapi | Cree | cr | Latn | Latin model
Mvskoke | Creek | mus | Latn | Latin model
qırımtatar tili, къырымтатар тили | Crimean Tatar | crh | Latn | Cyrillic model
Dakhótiyapi, Dakȟótiyapi | Dakota | dak | Latn | Latin model
Douala | Duala | dua | Latn | Latin model
Ikɔ Efik | Efik | efi | Latn | Latin model
English (British) | English (British) | en-GB | Latn | Latin model
Èʋegbe | Ewe | ee | Latn | Latin model
føroyskt mál | Faroese | fo | Latn | Latin model
Na Vosa Vakaviti | Fijian | fj | Latn | Latin model
fɔ̀ngbè | Fon | fon | Latn | Latin model
Français canadien | French (Canadian) | fr-CA | Latn | Latin model
Fulani, Fulah, Peul | Fulah | ff | Latn | Latin model
Ga | Ga | gaa | Latn | Latin model
Luganda | Ganda | lg | Latn | Latin model
Basa Gayo | Gayo | gay | Latn | Latin model
Kiribati | Gilbertese | gil | Latn | Latin model
Gothic | Gothic | got | Latn | Latin model
Guaraní | Guarani | gn | Latn | Latin model
Harshen/Halshen Hausa هَرْشَن هَوْسَ | Hausa | ha | Latn | Latin model
ʻŌlelo Hawaiʻi | Hawaiian | haw | Latn | Latin model
Otjiherero | Herero | hz | Latn | Latin model
Ilonggo | Hiligaynon | hil | Latn | Latin model
Jaku Iban | Iban | iba | Latn | Latin model
Asụsụ Igbo | Igbo | ig | Latn | Latin model
Ilokano | Iloko | ilo | Latn | Latin model
Taqbaylit | Kabyle | kab | Latn | Latin model
Jingpho | Kachin | kac | Latn | Latin model
Kalaallisut | Kalaallisut | kl | Latn | Latin model
Kikamba | Kamba | kam | Latn | Latin model
Kanuri | Kanuri | kr | Latn | Latin model
Qaraqalpaq tili, Қарақалпақ тили, قاراقالپاق تىلى | Kara-Kalpak | kaa | Cyrl/Latn | Cyrillic model
Ka Ktien Khasi | Khasi | kha | Latn | Latin model
Gĩkũyũ | Kikuyu | ki | Latn | Latin model
Kinyarwanda | Kinyarwanda | rw | Latn | Latin model
коми кыв | Komi | kv | Cyrl | Cyrillic model
Kikongo | Kongo | kg | Latn | Latin model
Kosraean | Kosraean | kos | Latn | Latin model
Oshikwanyama | Kuanyama | kj | Latn | Latin model
Ngala | Lingala | ln | Latn | Latin model
Plattdütsch, Plattdeutsch, Nedersaksisch | Low German | nds | Latn | Latin model
siLozi | Lozi | loz | Latn | Latin model
Kiluba | Luba-Katanga | lu | Latn | Latin model
Dholuo | Luo | luo | Latn | Latin model
Madhura, Basa Mathura, بَهاسَ مَدورا | Madurese | mad | Latn | Latin model
Malagasy | Malagasy | mg | Latn | Latin model
Mandinka, لغة مندنكا | Mandingo | man | Latn | Latin model
Gaelg, Gailck | Manx | gv | Latn | Latin model
Te reo Māori | Maori | mi | Latn | Latin model
Ebon | Marshallese | mh | Latn | Latin model
Mɛnde yia | Mende | men | Latn | Latin model
Middle English | Middle English | enm | Latn | Latin model
Mittelhochdeutsch | Middle High German | gmh | Latn | Latin model
Baso Minangkabau, باسو مينڠكاباو | Minangkabau | min | Latn | Latin model
Kanienʼkéha | Mohawk | moh | Latn | Latin model
Nkundu | Mongo | lol | Latn | Latin model
Nāhuatl | Nahuatl | nah | Latn | Latin model
Diné bizaad | Navajo | nv | Latn | Latin model
Ndonga | Ndonga | ng | Latn | Latin model
ko e vagahau Niuē | Niuean | niu | Latn | Latin model
Zimbabwe Ndebele | North Ndebele | nd | Latn | Latin model
Sesotho sa Leboa | Northern Sotho | nso | Latn | Latin model
Chichewa, Chinyanja | Nyanja | ny | Latn | Latin model
Runyankore | Nyankole | nyn | Latn | Latin model
Chitonga | Nyasa Tonga | tog | Latn | Latin model
Appolo | Nzima | nzi | Latn | Latin model
Occitan, lenga d'òc, provençal | Occitan | oc | Latn | Latin model
Anishinaabemowin, ᐊᓂᔑᓈᐯᒧᐎᓐ | Ojibwa | oj | Latn | Latin model
Ænglisc, Englisc, Anglisc | Old English | ang | Latn | Latin model
Franceis, François, Romanz | Old French | fro | Latn | Latin model
Diutisk, Althochdeutsch | Old High German | goh | Latn | Latin model
Dǫnsk tunga | Old Norse | non | Latn | Latin model
Occitan ancian | Old Provencal | pro | Latn | Latin model
ирон ӕвзаг | Ossetic | os | Cyrl | Cyrillic model
Kapampangan | Pampanga | pam | Latn | Latin model
Salitan Pangasinan | Pangasinan | pag | Latn | Latin model
Papiamentu | Papiamento | pap | Latn | Latin model
Português (Portugal) | Portuguese (European) | pt-PT | Latn | Latin model
Kechua / Runa Simi | Quechua | qu | Latn | Latin model
Rumantsch | Romansh | rm | Latn | Latin model
Romani čhib | Romany | rom | Latn | Latin model
Ikirundi | Rundi | rn | Latn | Latin model
Sakha | Sakha | sah | Cyrl | Cyrillic model
Gagana faʻa Sāmoa | Samoan | sm | Latn | Latin model
yângâ tî sängö | Sango | sg | Latn | Latin model
(Braid) Scots, Lallans, Doric | Scots | sco | Latn | Latin model
Gàidhlig | Scottish Gaelic | gd | Latn | Latin model
chiShona | Shona | sn | Latn | Latin model
Songhay | Songhai | son | Latn | Latin model
Sesotho | Southern Sotho | st | Latn | Latin model
Español (Latinoamérica) | Spanish (Latin American) | es-419 | Latn | Latin model
ᮘᮞ ᮞᮥᮔ᮪ᮓ, Basa Sunda | Sundanese | su | Latn | Latin model
siSwati | Swati | ss | Latn | Latin model
Reo Tahiti | Tahitian | ty | Latn | Latin model
тоҷикӣ | Tajik | tg | Cyrl | Cyrillic model
татар теле | Tatar | tt | Cyrl/Latn | Cyrillic model
KʌThemnɛ | Temne | tem | Latn | Latin model
lea faka-Tonga | Tongan | to | Latn | Latin model
Xitsonga | Tsonga | ts | Latn | Latin model
Setswana | Tswana | tn | Latn | Latin model
Türkmençe | Turkmen | tk | Latn | Cyrillic model
удмурт кыл | Udmurt | udm | Cyrl | Cyrillic model
Tshivenḓa | Venda | ve | Latn | Latin model
Vod | Votic | vot | Cyrl/Latn | Cyrillic model
Frysk | Western Frisian | fy | Latn | Latin model
Wolof | Wolof | wo | Latn | Latin model
isiXhosa | Xhosa | xh | Latn | Latin model
Èdè Yorùbá | Yoruba | yo | Latn | Latin model
Diidxazá | Zapotec | zap | Latn | Latin model

Handwriting scripts

The following scripts are supported for handwriting recognition. Check the language tables above for languages that use each script.

Script | Name | Support Level
Beng | Bengali | Experimental
Cyrl | Cyrillic | Experimental
Deva | Devanagari | Experimental
Grek | Greek | Experimental
Hani | Chinese | Experimental
Jpan | Japanese | Supported
Kore | Korean | Supported
Latn | Latin | Supported
vi | Vietnamese | Experimental
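Checking the handwriting level for a language means going from the language to its script and then to the table above. The Python sketch below illustrates that two-step lookup; the names are hypothetical and the language-to-script dictionary is only a small sample from the supported languages table. Because the handwriting table lists Vietnamese under the code "vi" rather than under a script code, the sketch assumes that a row keyed by the language itself takes precedence over the script row.

```python
# Handwriting table from this page, keyed by the code in its first column.
# The Vietnamese row is keyed by a language code ("vi"), not a script code.
HANDWRITING_LEVELS = {
    "Beng": "Experimental",
    "Cyrl": "Experimental",
    "Deva": "Experimental",
    "Grek": "Experimental",
    "Hani": "Experimental",
    "Jpan": "Supported",
    "Kore": "Supported",
    "Latn": "Supported",
    "vi": "Experimental",
}

# Small sample of language -> script pairs from the supported languages table.
LANGUAGE_SCRIPT_SAMPLE = {
    "en": "Latn", "ja": "Jpan", "ko": "Kore", "ru": "Cyrl",
    "hi": "Deva", "el": "Grek", "th": "Thai", "vi": "Latn",
}


def handwriting_level(language_code):
    """Return the handwriting support level, or None if nothing is listed."""
    # Assumption: a row keyed by the language code wins over the script row.
    if language_code in HANDWRITING_LEVELS:
        return HANDWRITING_LEVELS[language_code]
    script = LANGUAGE_SCRIPT_SAMPLE.get(language_code)
    return HANDWRITING_LEVELS.get(script)


for code in ("ja", "ru", "vi", "th"):
    print(code, handwriting_level(code))
```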