Cloud Vision API's text recognition feature is able to detect a wide variety of languages and can detect multiple languages within a single image.
Providing a language hint to the service is not required, but can be done if the service is having trouble detecting the language used in your image.
With the release of Handwriting OCR GA, images with handwriting no longer require a handwriting languageHints flag when using DOCUMENT_TEXT_DETECTION.
Optional language hints are specified within a request's ImageContext as a list of languageHints for a TEXT_DETECTION or DOCUMENT_TEXT_DETECTION request.
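As an illustration of where the hints sit in a request, the following minimal sketch builds the JSON body for a REST `images:annotate` call using only the Python standard library. The helper name `build_annotate_request` and the `gs://my-bucket/receipt.png` URI are placeholders invented for this example; sending the request (authentication, HTTP client) is left out.

```python
import json


def build_annotate_request(image_uri, feature_type="DOCUMENT_TEXT_DETECTION",
                           language_hints=None):
    """Build a JSON-serializable body for an images:annotate request.

    image_uri and this helper's name are illustrative assumptions;
    the field names follow the request structure described above.
    """
    request = {
        "image": {"source": {"imageUri": image_uri}},
        "features": [{"type": feature_type}],
    }
    # Hints are optional: only attach an imageContext when some are given,
    # so the service falls back to automatic language detection otherwise.
    if language_hints:
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


# Hypothetical image location, with English and French as hints.
body = build_annotate_request("gs://my-bucket/receipt.png",
                              language_hints=["en", "fr"])
print(json.dumps(body, indent=2))
```

Leaving `language_hints` unset produces a request with no `imageContext` at all, which matches the case where no hint is provided.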
Each language code parameter typically consists of a BCP-47 identifier. This parameter can be of the form language-region, where language refers to the primary language and the optional region refers to a region (usually a country identifier) of a particular dialect. For example, Chinese can be represented as Simplified Chinese as written in the People's Republic of China (zh-Hans) or Traditional Chinese as written in Taiwan (zh-Hant).
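To make the structure of these codes concrete, here is a rough sketch that splits a hint code into its parts. It is a deliberate simplification rather than a full BCP-47 parser: it recognizes only a primary language, an optional four-letter script subtag (as in zh-Hant), and an optional region (two letters or three digits, as in en-GB or es-419). Any other subtag, such as the variant in ru-PETR1708, is ignored. The names `HintCode` and `parse_hint_code` are invented for this example.

```python
from typing import NamedTuple, Optional


class HintCode(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_hint_code(code):
    """Split a hint code into language, script, and region subtags.

    Simplified sketch: variants and extensions are silently skipped.
    """
    parts = code.split("-")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        is_script = len(part) == 4 and part.isalpha()
        is_region = (len(part) == 2 and part.isalpha()) or (
            len(part) == 3 and part.isdigit()
        )
        if is_script and script is None and region is None:
            script = part.title()  # conventional casing, e.g. "Hant"
        elif is_region and region is None:
            region = part.upper()  # conventional casing, e.g. "HK"
    return HintCode(language, script, region)


print(parse_hint_code("zh-Hant-HK"))
print(parse_hint_code("es-419"))
```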
There are three levels of language support in the text recognition feature:
- Supported languages are those we prioritize and regularly evaluate performance against.
- Experimental languages are those under active development but not regularly evaluated against.
- Mapped languages are those supported by mapping them to another language code or to a general character recognizer. For example, "en-GB" is supported, but it is not treated any differently than "en" for the purposes of recognizing text. We make a best effort to return the correct mapped language code in the Entity locale field, but mapped languages are more likely than fully supported or experimentally supported languages to be misidentified as a similar language.
The list of languages (with associated languageHints codes) supported by TEXT_DETECTION and DOCUMENT_TEXT_DETECTION is shown below.
If the language hint is left blank, we will attempt to auto-detect the most appropriate language. The TEXT_DETECTION endpoint will auto-detect only a subset of supported languages, while the DOCUMENT_TEXT_DETECTION endpoint will auto-detect the full set of supported languages.
Supported languages
The following languages are prioritized and regularly evaluated.
Language | Language (English name) | languageHints code | Script / notes |
---|---|---|---|
Afrikaans | Afrikaans | af | Latn |
shqip | Albanian | sq | Latn |
العربية | Arabic | ar | Arab; Modern Standard |
Հայ | Armenian | hy | Armn |
беларуская | Belarusian | be | Cyrl |
বাংলা | Bengali | bn | Beng |
български | Bulgarian | bg | Cyrl |
Català | Catalan | ca | Latn |
普通话 | Chinese | zh | Hans/Hant |
Hrvatski | Croatian | hr | Latn |
Čeština | Czech | cs | Latn |
Dansk | Danish | da | Latn |
Nederlands | Dutch | nl | Latn |
English | English | en | Latn; American |
Eesti keel | Estonian | et | Latn |
Filipino | Filipino | fil (or tl) | Latn |
Suomi | Finnish | fi | Latn |
Français | French | fr | Latn; European |
Deutsch | German | de | Latn |
Ελληνικά | Greek | el | Grek |
ગુજરાતી | Gujarati | gu | Gujr |
עברית | Hebrew | iw | Hebr |
हिन्दी | Hindi | hi | Deva |
Magyar | Hungarian | hu | Latn |
Íslenska | Icelandic | is | Latn |
Bahasa Indonesia | Indonesian | id | Latn |
Italiano | Italian | it | Latn |
日本語 | Japanese | ja | Jpan |
ಕನ್ನಡ | Kannada | kn | Knda |
ភាសាខ្មែរ | Khmer | km | Khmr |
한국어 | Korean | ko | Kore |
ລາວ | Lao | lo | Laoo |
Latviešu | Latvian | lv | Latn |
Lietuvių | Lithuanian | lt | Latn |
Македонски | Macedonian | mk | Cyrl |
Bahasa Melayu | Malay | ms | Latn |
മലയാളം | Malayalam | ml | Mlym |
मराठी | Marathi | mr | Deva |
नेपाली | Nepali | ne | Deva |
Norsk | Norwegian | no | Latn; Bokmål |
فارسی | Persian | fa | Arab |
Polski | Polish | pl | Latn |
Português | Portuguese | pt | Latn; Brazilian |
ਪੰਜਾਬੀ | Punjabi | pa | Guru; Gurmukhi |
Română | Romanian | ro | Latn |
Русский | Russian | ru | Cyrl |
Русский (старая орфография) | Russian | ru-PETR1708 | Cyrl; Old Orthography |
Српски | Serbian | sr | Cyrl & Latn |
Српски (латиница) | Serbian | sr-Latn | Latn |
Slovenčina | Slovak | sk | Latn |
Slovenščina | Slovenian | sl | Latn |
Español | Spanish | es | Latn; European |
Svenska | Swedish | sv | Latn |
தமிழ் | Tamil | ta | Taml |
తెలుగు | Telugu | te | Telu |
ไทย | Thai | th | Thai |
Türkçe | Turkish | tr | Latn |
Українська | Ukrainian | uk | Cyrl |
Tiếng Việt | Vietnamese | vi | Latn |
Yiddish | Yiddish | yi | Hebr |
Experimental languages
The following languages are under active development and not yet regularly evaluated against.
Language | Language (English name) | languageHints code | Script / notes |
---|---|---|---|
አማርኛ | Amharic | am | Ethi |
Αρχαία ελληνικά | Ancient Greek | grc | Grek |
অসমীয়া | Assamese | as | Beng |
Azərbaycan | Azerbaijani | az | Latn |
Azərbaycan (qədim yazı) | Azerbaijani | az-Cyrl | Cyrl; old orthography |
Euskara | Basque | eu | Latn |
Bosanski | Bosnian | bs | Latn |
မြန်မာ | Burmese | my | Mymr |
Cebuano | Cebuano | ceb | Latn |
ᏣᎳᎩ ᎦᏬᏂᎯᏍᏗ | Cherokee | chr | Cher |
dhivehi, dhivehi-bas | Dhivehi | dv | Thaa |
རྫོང་ཁ | Dzongkha | dz | Tibt |
Esperanto | Esperanto | eo | Latn |
Galego | Galician | gl | Latn |
ქართული | Georgian | ka | Geor |
Kreyòl Ayisyen | Haitian Creole | ht | Latn |
Gaeilge | Irish | ga | Latn |
Jawa | Javanese | jv | Latn |
Қазақ | Kazakh | kk | Cyrl |
Kirghiz | Kirghiz | ky | Cyrl |
Latine | Latin | la | Latn |
Malti | Maltese | mt | Latn |
Монгол | Mongolian | mn | Cyrl |
ଓଡ଼ିଆ | Oriya | or | Orya |
پښتو | Pashto | ps | Arab |
संस्कृतम् | Sanskrit | sa | Deva |
සිංහල | Sinhala | si | Sinh |
Swahili | Swahili | sw | Latn |
leššānā Suryāyā | Syriac | syr | Syrc |
བོད་སྐད་ | Tibetan | bo | Tibt |
ትግርኛ | Tigrinya | ti | Ethi |
اردو | Urdu | ur | Arab |
oʻzbekcha | Uzbek | uz | Latn; Latin |
oʻzbekcha | Uzbek | uz-Cyrl | Cyrl; old orthography |
Cymraeg | Welsh | cy | Latn |
IsiZulu | Zulu | zu | Latn |
Mapped languages
The following languages are mapped to another language code or mapped to a general character recognizer.
Language | Language (English name) | languageHints code | Script / notes | Mapped to |
---|---|---|---|---|
بهسا اچيه | Acehnese | ace | Latn | Latin script model |
Lwo | Acholi | ach | Latn | Latin script model |
Dangme | Adangme | ada | Latn | Latin script model |
Akan | Akan | ak | Latn | Latin script model |
Anicinâbemowin | Algonquinian | alg | Latn | Latin script model |
Mapudungu | Araucanian/Mapuche | arn | Latn | Latin script model |
Asturianu | Asturian | ast | Latn | Latin script model |
Dene | Athabaskan | ath | Latn | Latin script model |
Aymar aru | Aymara | ay | Latn | Latin script model |
Bhāṣa Bali | Balinese | ban | Latn | Latin script model |
Bamanankan | Bambara | bm | Latn | Latin script model |
Narrow Bantu | Bantu | bnt | Latn | Latin script model |
башҡорт теле | Bashkir | ba | Cyrl | Cyrillic script model |
Toba–Batak | Batak | btk | Latn | Latin script model |
Chibemba | Bemba | bem | Latn | Latin script model |
Bikol Naga | Bikol | bik | Latn | Latin script model |
Bichelamar | Bislama | bi | Latn | Latin script model |
Brezhoneg | Breton | br | Latn | Latin script model |
нохчийн мотт / noxçiyn mott | Chechen | ce | Cyrl | Cyrillic script model |
汉语 | Chinese | zh-Hans | Hans; Simplified; Mandarin | zh |
漢語 | Chinese | zh-Hant | Hant; Traditional; Mandarin | zh |
普通話 | Chinese | zh-Hant-HK | Hant; Mandarin; Hong Kong | zh |
Chahta' | Choctaw | cho | Latn | Latin script model |
Чӑвашла | Chuvash | cv | Cyrl | Cyrillic script model |
Cree–Montagnais–Naskapi | Cree | cr | Latn | Latin script model |
Mvskoke | Creek | mus | Latn | Latin script model |
qırımtatar tili, къырымтатар тили | Crimean Tatar | crh | Latn | Cyrillic script model |
Dakhótiyapi, Dakȟótiyapi | Dakota | dak | Latn | Latin script model |
Douala | Duala | dua | Latn | Latin script model |
Ikɔ Efik | Efik | efi | Latn | Latin script model |
English (British) | English | en-GB | Latn; British | en |
Èʋegbe | Ewe | ee | Latn | Latin script model |
føroyskt mál | Faroese | fo | Latn | Latin script model |
Na Vosa Vakaviti | Fijian | fj | Latn | Latin script model |
fɔ̀ngbè | Fon | fon | Latn | Latin script model |
Français canadien | French | fr-CA | Latn; Canadian | fr |
Fulani, Fulah, Peul | Fulah | ff | Latn | Latin script model |
Gã | Ga | gaa | Latn | Latin script model |
Luganda | Ganda | lg | Latn | Latin script model |
Basa Gayo | Gayo | gay | Latn | Latin script model |
Kiribati | Gilbertese | gil | Latn | Latin script model |
Gothic | Gothic | got | Latn | Latin script model |
Guaraní | Guarani | gn | Latn | Latin script model |
Harshen/Halshen Hausa هَرْشَن هَوْسَ | Hausa | ha | Latn | Latin script model |
ʻŌlelo Hawaiʻi | Hawaiian | haw | Latn | Latin script model |
Otjiherero | Herero | hz | Latn | Latin script model |
Ilonggo | Hiligaynon | hil | Latn | Latin script model |
Jaku Iban | Iban | iba | Latn | Latin script model |
Asụsụ Igbo | Igbo | ig | Latn | Latin script model |
Ilokano | Iloko | ilo | Latn | Latin script model |
Taqbaylit | Kabyle | kab | Latn | Latin script model |
Jingpho | Kachin | kac | Latn | Latin script model |
Kalaallisut | Kalaallisut | kl | Latn | Latin script model |
Kikamba | Kamba | kam | Latn | Latin script model |
Kanuri | Kanuri | kr | Latn | Latin script model |
Qaraqalpaq tili, Қарақалпақ тили, قاراقالپاق تىلى | Kara-Kalpak | kaa | Cyrl/Latn | Cyrillic script model |
Ka Ktien Khasi | Khasi | kha | Latn | Latin script model |
Gĩkũyũ | Kikuyu | ki | Latn | Latin script model |
Kinyarwanda | Kinyarwanda | rw | Latn | Latin script model |
коми кыв | Komi | kv | Cyrl | Cyrillic script model |
Kikongo | Kongo | kg | Latn | Latin script model |
Kosraean | Kosraean | kos | Latn | Latin script model |
Oshikwanyama | Kuanyama | kj | Latn | Latin script model |
Ngala | Lingala | ln | Latn | Latin script model |
Plattdütsch, Plattdeutsch, Nedersaksisch | Low German | nds | Latn | Latin script model |
siLozi | Lozi | loz | Latn | Latin script model |
Kiluba | Luba-Katanga | lu | Latn | Latin script model |
Dholuo | Luo | luo | Latn | Latin script model |
Madhura, Basa Mathura, بَهاسَ مَدورا | Madurese | mad | Latn | Latin script model |
Malagasy | Malagasy | mg | Latn | Latin script model |
Mandinka, لغة مندنكا | Mandingo | man | Latn | Latin script model |
Gaelg, Gailck | Manx | gv | Latn | Latin script model |
Te reo Māori | Maori | mi | Latn | Latin script model |
Ebon | Marshallese | mh | Latn | Latin script model |
Mɛnde yia | Mende | men | Latn | Latin script model |
Middle English | Middle English | enm | Latn | Latin script model |
Mittelhochdeutsch | Middle High German | gmh | Latn | Latin script model |
Baso Minangkabau, باسو مينڠكاباو | Minangkabau | min | Latn | Latin script model |
Kanienʼkéha | Mohawk | moh | Latn | Latin script model |
Nkundu | Mongo | lol | Latn | Latin script model |
Nāhuatl | Nahuatl | nah | Latn | Latin script model |
Diné bizaad | Navajo | nv | Latn | Latin script model |
Ndonga | Ndonga | ng | Latn | Latin script model |
ko e vagahau Niuē | Niuean | niu | Latn | Latin script model |
Zimbabwe Ndebele | North Ndebele | nd | Latn | Latin script model |
Sesotho sa Leboa | Northern Sotho | nso | Latn | Latin script model |
Chichewa, Chinyanja | Nyanja | ny | Latn | Latin script model |
Runyankore | Nyankole | nyn | Latn | Latin script model |
Chitonga | Nyasa Tonga | tog | Latn | Latin script model |
Appolo | Nzima | nzi | Latn | Latin script model |
Occitan, lenga d'òc, provençal | Occitan | oc | Latn | Latin script model |
Anishinaabemowin, ᐊᓂᔑᓈᐯᒧᐎᓐ | Ojibwa | oj | Latn | Latin script model |
Ænglisc, Englisc, Anglisc | Old English | ang | Latn | Latin script model |
Franceis, François, Romanz | Old French | fro | Latn | Latin script model |
Diutisk, Althochdeutsch | Old High German | goh | Latn | Latin script model |
Dǫnsk tunga | Old Norse | non | Latn | Latin script model |
Occitan ancian | Old Provencal | pro | Latn | Latin script model |
ирон ӕвзаг | Ossetic | os | Cyrl | Cyrillic script model |
Kapampangan | Pampanga | pam | Latn | Latin script model |
Salitan Pangasinan | Pangasinan | pag | Latn | Latin script model |
Papiamentu | Papiamento | pap | Latn | Latin script model |
Português (Portugal) | Portuguese | pt-PT | Latn; European | pt |
Kechua / Runa Simi | Quechua | qu | Latn | Latin script model |
Rumantsch | Romansh | rm | Latn | Latin script model |
Romani čhib | Romany | rom | Latn | Latin script model |
Ikirundi | Rundi | rn | Latn | Latin script model |
Sakha | Sakha | sah | Cyrl | Cyrillic script model |
Gagana faʻa Sāmoa | Samoan | sm | Latn | Latin script model |
yângâ tî sängö | Sango | sg | Latn | Latin script model |
(Braid) Scots, Lallans, Doric | Scots | sco | Latn | Latin script model |
Gàidhlig | Scottish Gaelic | gd | Latn | Latin script model |
chiShona | Shona | sn | Latn | Latin script model |
Songhay | Songhai | son | Latn | Latin script model |
Sesotho | Southern Sotho | st | Latn | Latin script model |
Español (Latinoamérica) | Spanish | es-419 | Latn; Latin American | es |
ᮘᮞ ᮞᮥᮔ᮪ᮓ , Basa Sunda | Sundanese | su | Latn | Latin script model |
siSwati | Swati | ss | Latn | Latin script model |
Reo Tahiti | Tahitian | ty | Latn | Latin script model |
тоҷикӣ | Tajik | tg | Cyrl | Cyrillic script model |
татар теле | Tatar | tt | Cyrl/Latn | Cyrillic script model |
KʌThemnɛ | Temne | tem | Latn | Latin script model |
lea faka-Tonga | Tongan | to | Latn | Latin script model |
Xitsonga | Tsonga | ts | Latn | Latin script model |
Setswana | Tswana | tn | Latn | Latin script model |
Türkmençe | Turkmen | tk | Latn | Cyrillic script model |
удмурт кыл | Udmurt | udm | Cyrl | Cyrillic script model |
Tshivenḓa | Venda | ve | Latn | Latin script model |
Vod | Votic | vot | Cyrl/Latn | Cyrillic script model |
Frysk | Western Frisian | fy | Latn | Latin script model |
Wolof | Wolof | wo | Latn | Latin script model |
isiXhosa | Xhosa | xh | Latn | Latin script model |
Èdè Yorùbá | Yoruba | yo | Latn | Latin script model |
Diidxazá | Zapotec | zap | Latn | Latin script model |
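For client code that wants to anticipate how a hint will be treated, a small lookup can mirror the "Mapped to" column for the rows that map to another language code. The sketch below copies only those rows from the table above. The dictionary and function names are invented for this example, and it says nothing about how the service resolves codes internally.

```python
# Rows from the Mapped languages table whose target is another language code.
MAPPED_TO = {
    "zh-Hans": "zh",
    "zh-Hant": "zh",
    "zh-Hant-HK": "zh",
    "en-GB": "en",
    "fr-CA": "fr",
    "pt-PT": "pt",
    "es-419": "es",
}


def recognizer_code(hint):
    """Return the code a hint maps to, or the hint itself if it is not mapped."""
    return MAPPED_TO.get(hint, hint)


print(recognizer_code("en-GB"))  # en
print(recognizer_code("ja"))     # ja
```

Because the returned locale for a mapped language is best effort, comparing results on the mapped code (for example, treating "en-GB" and "en" as equivalent) can be more robust than matching the exact hint string.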
Handwriting scripts
The following scripts are supported for handwriting recognition. Check the language tables above for languages that use each script.
Script Tag | Name | Support Level |
---|---|---|
Beng | Bengali | Experimental |
Cyrl | Cyrillic | Experimental |
Deva | Devanagari | Experimental |
Grek | Greek | Experimental |
Hani | Chinese | Experimental |
Jpan | Japanese | Supported |
Kore | Korean | Supported |
Latn | Latin | Supported |
vi | Vietnamese | Experimental |
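The handwriting table can be encoded as a simple lookup, for instance to warn before submitting handwritten images in a script that is not listed. This is a sketch that copies the rows above verbatim; the names are invented for this example, and a result of `None` only means the tag does not appear in the table.

```python
# Support levels copied from the Handwriting scripts table.
HANDWRITING_SUPPORT = {
    "Beng": "Experimental",
    "Cyrl": "Experimental",
    "Deva": "Experimental",
    "Grek": "Experimental",
    "Hani": "Experimental",
    "Jpan": "Supported",
    "Kore": "Supported",
    "Latn": "Supported",
    "vi": "Experimental",
}


def handwriting_support(script_tag):
    """Return the listed support level for a script tag, or None if unlisted."""
    return HANDWRITING_SUPPORT.get(script_tag)


print(handwriting_support("Latn"))  # Supported
print(handwriting_support("Arab"))  # None
```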