OCR Language Support

The text recognition feature of the Cloud Vision API can detect many languages, including multiple languages within a single image.

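You do not need to provide language hints to the service. However, if the service has difficulty detecting the language used in your image, you can supply them.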

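With the general availability (GA) release of handwriting OCR, images that contain handwriting no longer require a handwriting languageHints flag when you use DOCUMENT_TEXT_DETECTION.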

Optional language hints are specified in the request's ImageContext, as a list of languageHints for a TEXT_DETECTION or DOCUMENT_TEXT_DETECTION request.
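As a minimal sketch of where the hints go, the snippet below assembles a JSON request body with an `imageContext.languageHints` list. The helper name `build_annotate_request` is hypothetical, and the field layout assumes the REST `images:annotate` JSON shape; the code only builds and prints the body and does not call the service.

```python
import base64
import json


def build_annotate_request(image_bytes, feature_type="DOCUMENT_TEXT_DETECTION",
                           language_hints=None):
    """Build a request body (hypothetical helper; assumes the REST JSON layout)."""
    request = {
        # Image bytes are sent base64-encoded in the JSON body.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": feature_type}],
    }
    # Hints are optional: omit imageContext entirely to rely on auto-detection.
    if language_hints:
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


# Placeholder bytes stand in for a real image file.
body = build_annotate_request(b"placeholder image bytes", language_hints=["en", "zh"])
print(json.dumps(body, indent=2))
```

Leaving `language_hints` unset produces a body with no `imageContext`, which matches the default case where the service detects the language itself.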

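Each language code parameter is typically a BCP-47 identifier. It can take the form "language-region", where "language" is the primary language and the optional "region" identifies the region of a specific dialect (usually a country or region identifier). For example, Chinese can be expressed as Simplified Chinese, as used in the People's Republic of China (zh-Hans), or as Traditional Chinese, as used in Taiwan (zh-Hant).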
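To illustrate how such identifiers break down, here is a simplified sketch that splits a tag into its subtags. The function `split_language_tag` is hypothetical and covers only the common shapes that appear on this page (language, an optional four-letter script, an optional region); it is not a full BCP-47 parser.

```python
def split_language_tag(tag):
    """Split a language tag into subtags (simplified; not a full BCP-47 parser)."""
    subtags = tag.split("-")
    parts = {"language": subtags[0].lower(), "script": None, "region": None, "other": []}
    for sub in subtags[1:]:
        is_script = len(sub) == 4 and sub.isalpha()
        is_region = (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit())
        if is_script and parts["script"] is None and parts["region"] is None:
            parts["script"] = sub.title()      # four letters, e.g. Hans, Hant, Latn
        elif is_region and parts["region"] is None:
            parts["region"] = sub.upper()      # two letters or three digits, e.g. GB, 419
        else:
            parts["other"].append(sub)         # variants and anything else, e.g. PETR1708
    return parts


for tag in ["zh-Hans", "zh-Hant-HK", "en-GB", "es-419", "ru-PETR1708"]:
    print(tag, split_language_tag(tag))
```

Under these rules, "Hans" and "Hant" land in the script slot, while "GB", "HK", and "419" land in the region slot.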

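Text recognition supports languages at three levels:

  1. Supported languages, which are prioritized and evaluated regularly for performance.
  2. Experimental languages, which are under development but are not evaluated regularly.
  3. Mapped languages, which are supported by mapping to another language code or to a general character recognizer. For example, "en-GB" is supported, but for the purpose of recognizing text it is not treated differently from "en". The service makes a best effort to return the correct mapped language code in the "entities" locale field, but mapped languages are more likely than fully supported or experimental languages to be misidentified as a similar language.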
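The three levels can be sketched as a simple lookup. The sets below are small excerpts copied from the language tables on this page, not the full lists, and `support_tier` is a hypothetical helper for illustration rather than anything the API provides.

```python
# Small excerpts from the tables on this page (not the complete lists).
SUPPORTED = {"en", "fr", "pt", "es", "zh", "ru"}
EXPERIMENTAL = {"am", "eu", "cy", "la"}
MAPPED = {
    "en-GB": "en",
    "fr-CA": "fr",
    "pt-PT": "pt",
    "es-419": "es",
    "zh-Hans": "zh",
    "zh-Hant": "zh",
    "br": "Latin script model",
    "ba": "Cyrillic script model",
}


def support_tier(code):
    """Return (tier, target) for a languageHints code (hypothetical helper)."""
    if code in SUPPORTED:
        return ("supported", code)
    if code in EXPERIMENTAL:
        return ("experimental", code)
    if code in MAPPED:
        # Mapped codes resolve to another code or to a general script model.
        return ("mapped", MAPPED[code])
    return ("unknown", None)


for code in ["en", "en-GB", "cy", "ba", "xx"]:
    print(code, support_tier(code))
```

A lookup like this makes the practical consequence visible: "en-GB" resolves to "en", so supplying it as a hint is not expected to change recognition relative to "en".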

The languages supported by TEXT_DETECTION and DOCUMENT_TEXT_DETECTION are listed below, together with their associated languageHints codes.

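If the language hints are left blank, the service attempts to detect the most suitable language automatically. The TEXT_DETECTION endpoint automatically detects only a subset of the supported languages, whereas the DOCUMENT_TEXT_DETECTION endpoint automatically detects all of them.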

Supported languages

The following languages are prioritized and evaluated regularly.

Language Language (English name) languageHints code Script / notes
Afrikaans Afrikaans af Latn
shqip Albanian sq Latn
العربية Arabic ar Arab; Modern Standard
Հայ Armenian hy Armn
беларускі Belarusian be Cyrl
বাংলা Bengali bn Beng
български Bulgarian bg Cyrl
Català Catalan ca Latn
普通话 Chinese zh Hans/Hant
Hrvatski Croatian hr Latn
Čeština Czech cs Latn
Dansk Danish da Latn
Nederlands Dutch nl Latn
English English en Latn; American
Eesti keel Estonian et Latn
Filipino Filipino fil (or tl) Latn
Suomi Finnish fi Latn
Français French fr Latn; European
Deutsch German de Latn
Ελληνικά Greek el Grek
ગુજરાતી Gujarati gu Gujr
עברית Hebrew iw Hebr
हिन्दी Hindi hi Deva
Magyar Hungarian hu Latn
Íslenska Icelandic is Latn
Bahasa Indonesia Indonesian id Latn
Italiano Italian it Latn
日本語 Japanese ja Jpan
ಕನ್ನಡ Kannada kn Knda
ភាសាខ្មែរ Khmer km Khmr
한국어 Korean ko Kore
ລາວ Lao lo Laoo
Latviešu Latvian lv Latn
Lietuvių Lithuanian lt Latn
Македонски Macedonian mk Cyrl
Bahasa Melayu Malay ms Latn
മലയാളം Malayalam ml Mlym
मराठी Marathi mr Deva
नेपाली Nepali ne Deva
Norsk Norwegian no Latn; Bokmål
فارسی Persian fa Arab
Polski Polish pl Latn
Português Portuguese pt Latn; Brazilian
ਪੰਜਾਬੀ Punjabi pa Guru; Gurmukhi
Română Romanian ro Latn
Русский Russian ru Cyrl
Русский (старая орфография) Russian ru-PETR1708 Cyrl; Old Orthography
Српски Serbian sr Cyrl & Latn
Српски (латиница) Serbian sr-Latn Latn
Slovenčina Slovak sk Latn
Slovenščina Slovenian sl Latn
Español Spanish es Latn; European
Svenska Swedish sv Latn
தமிழ் Tamil ta Taml
తెలుగు Telugu te Telu
ไทย Thai th Thai
Türkçe Turkish tr Latn
Українська Ukrainian uk Cyrl
Tiếng Việt Vietnamese vi Latn
Yiddish Yiddish yi Hebr

Experimental languages

The following languages are under development and are not evaluated regularly.

Language Language (English name) languageHints code Script / notes
አማርኛ Amharic am Ethi
Αρχαία ελληνικά Ancient Greek grc Grek
অসমীয়া Assamese as Beng
Azərbaycan Azerbaijani az Latn
Azərbaycan (qədim yazı) Azerbaijani az-Cyrl Cyrl; old orthography
Euskara Basque eu Latn
Bosanski Bosnian bs Latn
မြန်မာ Burmese my Mymr
Cebuano Cebuano ceb Latn
ᏣᎳᎩ ᎦᏬᏂᎯᏍᏗ Cherokee chr Cher
dhivehi, dhivehi-bas Dhivehi dv Thaa
རྫོང་ཁ Dzongkha dz Tibt
Esperanto Esperanto eo Latn
Galego Galician gl Latn
ქართული Georgian ka Geor
Kreyòl Ayisyen Haitian Creole ht Latn
Gaeilge Irish ga Latn
Jawa Javanese jv Latn
Қазақ Kazakh kk Cyrl
Kirghiz Kirghiz ky Cyrl
Latine Latin la Latn
Malti Maltese mt Latn
Монгол Mongolian mn Cyrl
ଓଡ଼ିଆ Oriya or Orya
پښتو Pashto ps Arab
संस्कृतम् Sanskrit sa Deva
සිංහල Sinhala si Sinh
Swahili Swahili sw Latn
leššānā Suryāyā Syriac syr Syrc
བོད་སྐད་ Tibetan bo Tibt
ትግርኛ Tigrinya ti Ethi
اردو Urdu ur Arab
oʻzbekcha Uzbek uz Latn; Latin
oʻzbekcha Uzbek uz-Cyrl Cyrl; old orthography
Cymraeg Welsh cy Latn
IsiZulu Zulu zu Latn

Mapped languages

The following languages are mapped to another language code or to a general character recognizer.

Language Language (English name) languageHints code Script / notes Mapped to
بهسا اچيه Acehnese ace Latn Latin script model
Lwo Acholi ach Latn Latin script model
Dangme Adangme ada Latn Latin script model
Akan Akan ak Latn Latin script model
Anicinâbemowin Algonquinian alg Latn Latin script model
Mapudungu Araucanian/Mapuche arn Latn Latin script model
Asturianu Asturian ast Latn Latin script model
Dene Athabaskan ath Latn Latin script model
Aymar aru Aymara ay Latn Latin script model
Bhāṣa Bali Balinese ban Latn Latin script model
Bamanankan Bambara bm Latn Latin script model
Narrow Bantu Bantu bnt Latn Latin script model
башҡорт теле Bashkir ba Cyrl Cyrillic script model
Toba–Batak Batak btk Latn Latin script model
Chibemba Bemba bem Latn Latin script model
Bikol Naga Bikol bik Latn Latin script model
Bichelamar Bislama bi Latn Latin script model
Brezhoneg Breton br Latn Latin script model
нохчийн мотт / noxçiyn mott Chechen ce Cyrl Cyrillic script model
汉语 Chinese zh-Hans Hans; Simplified; Mandarin zh
漢語 Chinese zh-Hant Hant; Traditional; Mandarin zh
普通話 Chinese zh-Hant-HK Hant; Mandarin; Hong Kong zh
Chahta' Choctaw cho Latn Latin script model
Чӑвашла Chuvash cv Cyrl Cyrillic script model
Cree–Montagnais–Naskapi Cree cr Latn Latin script model
Mvskoke Creek mus Latn Latin script model
qırımtatar tili, къырымтатар тили Crimean Tatar crh Latn Cyrillic script model
Dakhótiyapi, Dakȟótiyapi Dakota dak Latn Latin script model
Douala Duala dua Latn Latin script model
Ikɔ Efik Efik efi Latn Latin script model
English (British) English en-GB Latn; British en
Èʋegbe Ewe ee Latn Latin script model
føroyskt mál Faroese fo Latn Latin script model
Na Vosa Vakaviti Fijian fj Latn Latin script model
fɔ̀ngbè Fon fon Latn Latin script model
Français canadien French fr-CA Latn; Canadian fr
Fulani, Fulah, Peul Fulah ff Latn Latin script model
Ga gaa Latn Latin script model
Luganda Ganda lg Latn Latin script model
Basa Gayo Gayo gay Latn Latin script model
Kiribati Gilbertese gil Latn Latin script model
Gothic Gothic got Latn Latin script model
Guaraní Guarani gn Latn Latin script model
Harshen/Halshen Hausa هَرْشَن هَوْسَ Hausa ha Latn Latin script model
ʻŌlelo Hawaiʻi Hawaiian haw Latn Latin script model
Otjiherero Herero hz Latn Latin script model
Ilonggo Hiligaynon hil Latn Latin script model
Jaku Iban Iban iba Latn Latin script model
Asụsụ Igbo Igbo ig Latn Latin script model
Ilokano Iloko ilo Latn Latin script model
Taqbaylit Kabyle kab Latn Latin script model
Jingpho Kachin kac Latn Latin script model
Kalaallisut Kalaallisut kl Latn Latin script model
Kikamba Kamba kam Latn Latin script model
Kanuri Kanuri kr Latn Latin script model
Qaraqalpaq tili, Қарақалпақ тили, قاراقالپاق تىلى Kara-Kalpak kaa Cyrl/Latn Cyrillic script model
Ka Ktien Khasi Khasi kha Latn Latin script model
Gĩkũyũ Kikuyu ki Latn Latin script model
Kinyarwanda Kinyarwanda rw Latn Latin script model
коми кыв Komi kv Cyrl Cyrillic script model
Kikongo Kongo kg Latn Latin script model
Kosraean Kosraean kos Latn Latin script model
Oshikwanyama Kuanyama kj Latn Latin script model
Ngala Lingala ln Latn Latin script model
Plattdütsch, Plattdeutsch, Nedersaksisch Low German nds Latn Latin script model
siLozi Lozi loz Latn Latin script model
Kiluba Luba-Katanga lu Latn Latin script model
Dholuo Luo luo Latn Latin script model
Madhura, Basa Mathura, بَهاسَ مَدورا Madurese mad Latn Latin script model
Malagasy Malagasy mg Latn Latin script model
Mandinka, لغة مندنكا Mandingo man Latn Latin script model
Gaelg, Gailck Manx gv Latn Latin script model
Te reo Māori Maori mi Latn Latin script model
Ebon Marshallese mh Latn Latin script model
Mɛnde yia Mende men Latn Latin script model
Middle English Middle English enm Latn Latin script model
Mittelhochdeutsch Middle High German gmh Latn Latin script model
Baso Minangkabau, باسو مينڠكاباو Minangkabau min Latn Latin script model
Kanienʼkéha Mohawk moh Latn Latin script model
Nkundu Mongo lol Latn Latin script model
Nāhuatl Nahuatl nah Latn Latin script model
Diné bizaad Navajo nv Latn Latin script model
Ndonga Ndonga ng Latn Latin script model
ko e vagahau Niuē Niuean niu Latn Latin script model
Zimbabwe Ndebele North Ndebele nd Latn Latin script model
Sesotho sa Leboa Northern Sotho nso Latn Latin script model
Chichewa, Chinyanja Nyanja ny Latn Latin script model
Runyankore Nyankole nyn Latn Latin script model
Chitonga Nyasa Tonga tog Latn Latin script model
Appolo Nzima nzi Latn Latin script model
Occitan, lenga d'òc, provençal Occitan oc Latn Latin script model
Anishinaabemowin, ᐊᓂᔑᓈᐯᒧᐎᓐ Ojibwa oj Latn Latin script model
Ænglisc, Englisc, Anglisc Old English ang Latn Latin script model
Franceis, François, Romanz Old French fro Latn Latin script model
Diutisk, Althochdeutsch Old High German goh Latn Latin script model
Dǫnsk tunga Old Norse non Latn Latin script model
Occitan ancian Old Provencal pro Latn Latin script model
ирон ӕвзаг Ossetic os Cyrl Cyrillic script model
Kapampangan Pampanga pam Latn Latin script model
Salitan Pangasinan Pangasinan pag Latn Latin script model
Papiamentu Papiamento pap Latn Latin script model
Português (Portugal) Portuguese pt-PT Latn; European pt
Kechua / Runa Simi Quechua qu Latn Latin script model
Rumantsch Romansh rm Latn Latin script model
Romani čhib Romany rom Latn Latin script model
Ikirundi Rundi rn Latn Latin script model
Sakha Sakha sah Cyrl Cyrillic script model
Gagana faʻa Sāmoa Samoan sm Latn Latin script model
yângâ tî sängö Sango sg Latn Latin script model
(Braid) Scots, Lallans, Doric Scots sco Latn Latin script model
Gàidhlig Scottish Gaelic gd Latn Latin script model
chiShona Shona sn Latn Latin script model
Songhay Songhai son Latn Latin script model
Sesotho Southern Sotho st Latn Latin script model
Español (Latinoamérica) Spanish es-419 Latn; Latin American es
ᮘᮞ ᮞᮥᮔ᮪ᮓ , Basa Sunda Sundanese su Latn Latin script model
siSwati Swati ss Latn Latin script model
Reo Tahiti Tahitian ty Latn Latin script model
тоҷикӣ Tajik tg Cyrl Cyrillic script model
татар теле Tatar tt Cyrl/Latn Cyrillic script model
KʌThemnɛ Temne tem Latn Latin script model
lea faka-Tonga Tongan to Latn Latin script model
Xitsonga Tsonga ts Latn Latin script model
Setswana Tswana tn Latn Latin script model
Türkmençe Turkmen tk Latn Cyrillic script model
удмурт кыл Udmurt udm Cyrl Cyrillic script model
Tshivenḓa Venda ve Latn Latin script model
Vod Votic vot Cyrl/Latn Cyrillic script model
Frysk Western Frisian fy Latn Latin script model
Wolof Wolof wo Latn Latin script model
isiXhosa Xhosa xh Latn Latin script model
Èdè Yorùbá Yoruba yo Latn Latin script model
Diidxazá Zapotec zap Latn Latin script model