Types overview

AnalyzeEntitiesRequest

The entity analysis request message.
Fields
document

object (Document)

Required. Input document.

encodingType

enum

The encoding type used by the API to calculate offsets.

Enum type. Can be one of the following:
NONE If EncodingType is not specified, encoding-dependent information (such as begin_offset) will be set to -1.
UTF8 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-8 encoding of the input. C++ and Go are examples of languages that use this encoding natively.
UTF16 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-16 encoding of the input. Java and JavaScript are examples of languages that use this encoding natively.
UTF32 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-32 encoding of the input. Python is an example of a language that uses this encoding natively.
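
The same position in a text has a different offset under each encoding type. The following Python sketch illustrates this by counting the units that precede a substring. The helper offset_in_units is hypothetical and is not part of the API; it only mirrors the unit definitions listed above.

```python
def offset_in_units(text: str, char_index: int, encoding_type: str) -> int:
    """Offset of text[char_index], counted in the units of encoding_type.

    Hypothetical helper for illustration; it is not part of the API.
    """
    prefix = text[:char_index]
    if encoding_type == "NONE":
        return -1  # no offsets are calculated
    if encoding_type == "UTF8":
        return len(prefix.encode("utf-8"))  # bytes
    if encoding_type == "UTF16":
        return len(prefix.encode("utf-16-le")) // 2  # 16-bit code units
    if encoding_type == "UTF32":
        return len(prefix)  # code points, which is how Python indexes str
    raise ValueError(f"unknown encoding type: {encoding_type}")


# U+00E9 takes 2 bytes in UTF-8; U+1F600 takes 4 UTF-8 bytes
# and 2 UTF-16 code units.
text = "caf\u00e9 \U0001F600 Google"
index = text.index("Google")

for encoding_type in ("NONE", "UTF8", "UTF16", "UTF32"):
    print(encoding_type, offset_in_units(text, index, encoding_type))
```

For this sample string, the word "Google" starts at offset 11 in UTF8, 8 in UTF16, and 7 in UTF32, so the encoding type should match how the calling language indexes strings.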

AnalyzeEntitiesResponse

The entity analysis response message.
Fields
entities[]

object (Entity)

The recognized entities in the input document.

language

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.

AnalyzeEntitySentimentRequest

The entity-level sentiment analysis request message.
Fields
document

object (Document)

Required. Input document.

encodingType

enum

The encoding type used by the API to calculate offsets.

Enum type. Can be one of the following:
NONE If EncodingType is not specified, encoding-dependent information (such as begin_offset) will be set to -1.
UTF8 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-8 encoding of the input. C++ and Go are examples of languages that use this encoding natively.
UTF16 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-16 encoding of the input. Java and JavaScript are examples of languages that use this encoding natively.
UTF32 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-32 encoding of the input. Python is an example of a language that uses this encoding natively.

AnalyzeEntitySentimentResponse

The entity-level sentiment analysis response message.
Fields
entities[]

object (Entity)

The recognized entities in the input document with associated sentiments.

language

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.

AnalyzeSentimentRequest

The sentiment analysis request message.
Fields
document

object (Document)

Required. Input document.

encodingType

enum

The encoding type used by the API to calculate sentence offsets.

Enum type. Can be one of the following:
NONE If EncodingType is not specified, encoding-dependent information (such as begin_offset) will be set to -1.
UTF8 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-8 encoding of the input. C++ and Go are examples of languages that use this encoding natively.
UTF16 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-16 encoding of the input. Java and JavaScript are examples of languages that use this encoding natively.
UTF32 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-32 encoding of the input. Python is an example of a language that uses this encoding natively.

AnalyzeSentimentResponse

The sentiment analysis response message.
Fields
documentSentiment

object (Sentiment)

The overall sentiment of the input document.

language

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.

sentences[]

object (Sentence)

The sentiment for all the sentences in the document.

AnalyzeSyntaxRequest

The syntax analysis request message.
Fields
document

object (Document)

Required. Input document.

encodingType

enum

The encoding type used by the API to calculate offsets.

Enum type. Can be one of the following:
NONE If EncodingType is not specified, encoding-dependent information (such as begin_offset) will be set to -1.
UTF8 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-8 encoding of the input. C++ and Go are examples of languages that use this encoding natively.
UTF16 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-16 encoding of the input. Java and JavaScript are examples of languages that use this encoding natively.
UTF32 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-32 encoding of the input. Python is an example of a language that uses this encoding natively.

AnalyzeSyntaxResponse

The syntax analysis response message.
Fields
language

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.

sentences[]

object (Sentence)

Sentences in the input document.

tokens[]

object (Token)

Tokens, along with their syntactic information, in the input document.

AnnotateTextRequest

The request message for the text annotation API, which can perform multiple analysis types (sentiment, entities, and syntax) in one call.
Fields
document

object (Document)

Required. Input document.

encodingType

enum

The encoding type used by the API to calculate offsets.

Enum type. Can be one of the following:
NONE If EncodingType is not specified, encoding-dependent information (such as begin_offset) will be set to -1.
UTF8 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-8 encoding of the input. C++ and Go are examples of languages that use this encoding natively.
UTF16 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-16 encoding of the input. Java and JavaScript are examples of languages that use this encoding natively.
UTF32 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-32 encoding of the input. Python is an example of a language that uses this encoding natively.
features

object (Features)

Required. The enabled features.
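
As a rough illustration of how the three fields fit together, the following Python sketch assembles a request body as a plain dictionary. It assumes the JSON field names are the camelCase names listed in this reference. The helper build_annotate_text_request is hypothetical, and sending the request (endpoint, authentication) is out of scope here.

```python
import json

# Feature flags listed under the Features type in this reference.
KNOWN_FEATURES = {
    "classifyText",
    "extractDocumentSentiment",
    "extractEntities",
    "extractEntitySentiment",
    "extractSyntax",
}


def build_annotate_text_request(content, features, encoding_type="UTF8"):
    """Build a JSON-serializable AnnotateTextRequest body (hypothetical helper)."""
    unknown = set(features) - KNOWN_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {
        "document": {"type": "PLAIN_TEXT", "content": content},
        # Each requested feature is enabled by setting it to true.
        "features": {name: True for name in sorted(features)},
        "encodingType": encoding_type,
    }


body = build_annotate_text_request(
    "Google builds tools.", ["extractSyntax", "extractEntities"]
)
print(json.dumps(body, indent=2))
```

Rejecting unknown feature names on the client side is a design choice of this sketch, not documented API behavior.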

AnnotateTextResponse

The text annotations response message.
Fields
categories[]

object (ClassificationCategory)

Categories identified in the input document.

documentSentiment

object (Sentiment)

The overall sentiment for the document. Populated if the user enables AnnotateTextRequest.Features.extract_document_sentiment.

entities[]

object (Entity)

Entities, along with their semantic information, in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_entities.

language

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.

sentences[]

object (Sentence)

Sentences in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_syntax.

tokens[]

object (Token)

Tokens, along with their syntactic information, in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_syntax.

ClassificationCategory

Represents a category returned from the text classifier.
Fields
confidence

number (float format)

The classifier's confidence in the category. This number represents how certain the classifier is that this category represents the given text.

name

string

The name of the category representing the document, from the predefined taxonomy.
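
A caller will often keep only the categories above some confidence level. The following Python sketch shows one way to do that. The helper confident_categories, the 0.5 default threshold, and the placeholder category names are all assumptions made for illustration; the sample list is hand-written and is not API output.

```python
def confident_categories(categories, min_confidence=0.5):
    """Return category names at or above min_confidence, most confident first."""
    kept = [c for c in categories if c.get("confidence", 0.0) >= min_confidence]
    kept.sort(key=lambda c: c.get("confidence", 0.0), reverse=True)
    return [c["name"] for c in kept]


# Hand-written sample shaped like a list of ClassificationCategory objects.
sample_categories = [
    {"name": "/Placeholder/A", "confidence": 0.62},
    {"name": "/Placeholder/B", "confidence": 0.91},
    {"name": "/Placeholder/C", "confidence": 0.18},
]

print(confident_categories(sample_categories))
```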

ClassifyTextRequest

The document classification request message.
Fields
document

object (Document)

Required. Input document.

ClassifyTextResponse

The document classification response message.
Fields
categories[]

object (ClassificationCategory)

Categories representing the input document.

DependencyEdge

Represents dependency parse tree information for a token. For more information on dependency labels, see http://www.aclweb.org/anthology/P13-2017
Fields
headTokenIndex

integer (int32 format)

Represents the head of this token in the dependency tree. This is the index of the token which has an arc going to this token. The index is the position of the token in the array of tokens returned by the API method. If this token is a root token, then the head_token_index is its own index.
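
Because every token points at its head by index, the tree can be rebuilt from the token array alone. The following Python sketch groups tokens under their heads and collects the roots. The helper build_dependency_tree is hypothetical, the sample tokens are hand-written rather than API output, and treating a missing headTokenIndex as 0 is a defensive assumption about the JSON encoding.

```python
from collections import defaultdict


def build_dependency_tree(tokens):
    """Return (root_indices, children) where children maps head index -> dependents."""
    children = defaultdict(list)
    roots = []
    for index, token in enumerate(tokens):
        # Assumption: a missing headTokenIndex is treated as 0.
        head = token.get("dependencyEdge", {}).get("headTokenIndex", 0)
        if head == index:
            roots.append(index)  # a root token is its own head
        else:
            children[head].append(index)
    return roots, dict(children)


# Hand-written sample for the sentence "Google builds tools".
sample_tokens = [
    {"text": {"content": "Google"}, "dependencyEdge": {"headTokenIndex": 1, "label": "NSUBJ"}},
    {"text": {"content": "builds"}, "dependencyEdge": {"headTokenIndex": 1, "label": "ROOT"}},
    {"text": {"content": "tools"}, "dependencyEdge": {"headTokenIndex": 1, "label": "DOBJ"}},
]

roots, children = build_dependency_tree(sample_tokens)
print(roots, children)
```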

label

enum

The parse label for the token.

Enum type. Can be one of the following:
UNKNOWN Unknown
ABBREV Abbreviation modifier
ACOMP Adjectival complement
ADVCL Adverbial clause modifier
ADVMOD Adverbial modifier
AMOD Adjectival modifier of an NP
APPOS Appositional modifier of an NP
ATTR Attribute dependent of a copular verb
AUX Auxiliary (non-main) verb
AUXPASS Passive auxiliary
CC Coordinating conjunction
CCOMP Clausal complement of a verb or adjective
CONJ Conjunct
CSUBJ Clausal subject
CSUBJPASS Clausal passive subject
DEP Dependency (unable to determine)
DET Determiner
DISCOURSE Discourse
DOBJ Direct object
EXPL Expletive
GOESWITH Goes with (part of a word in a text not well edited)
IOBJ Indirect object
MARK Marker (word introducing a subordinate clause)
MWE Multi-word expression
MWV Multi-word verbal expression
NEG Negation modifier
NN Noun compound modifier
NPADVMOD Noun phrase used as an adverbial modifier
NSUBJ Nominal subject
NSUBJPASS Passive nominal subject
NUM Numeric modifier of a noun
NUMBER Element of compound number
P Punctuation mark
PARATAXIS Parataxis relation
PARTMOD Participial modifier
PCOMP The complement of a preposition is a clause
POBJ Object of a preposition
POSS Possession modifier
POSTNEG Postverbal negative particle
PRECOMP Predicate complement
PRECONJ Preconjunct
PREDET Predeterminer
PREF Prefix
PREP Prepositional modifier
PRONL The relationship between a verb and verbal morpheme
PRT Particle
PS Associative or possessive marker
QUANTMOD Quantifier phrase modifier
RCMOD Relative clause modifier
RCMODREL Complementizer in relative clause
RDROP Ellipsis without a preceding predicate
REF Referent
REMNANT Remnant
REPARANDUM Reparandum
ROOT Root
SNUM Suffix specifying a unit of number
SUFF Suffix
TMOD Temporal modifier
TOPIC Topic marker
VMOD Clause headed by a non-finite form of the verb that modifies a noun
VOCATIVE Vocative
XCOMP Open clausal complement
SUFFIX Name suffix
TITLE Name title
ADVPHMOD Adverbial phrase modifier
AUXCAUS Causative auxiliary
AUXVV Helper auxiliary
DTMOD Rentaishi (Prenominal modifier)
FOREIGN Foreign words
KW Keyword
LIST List for chains of comparable items
NOMC Nominalized clause
NOMCSUBJ Nominalized clausal subject
NOMCSUBJPASS Nominalized clausal passive
NUMC Compound of numeric modifier
COP Copula
DISLOCATED Dislocated relation (for fronted/topicalized elements)
ASP Aspect marker
GMOD Genitive modifier
GOBJ Genitive object
INFMOD Infinitival modifier
MES Measure
NCOMP Nominal complement of a noun

Document

Represents the input to API methods.
Fields
content

string

The content of the input in string format. This field is exempt from Cloud Audit Logging because it is based on user data.

gcsContentUri

string

The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.

language

string

The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. The Language Support page lists the languages currently supported for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an INVALID_ARGUMENT error is returned.

type

enum

Required. If the type is not set or is TYPE_UNSPECIFIED, returns an INVALID_ARGUMENT error.

Enum type. Can be one of the following:
TYPE_UNSPECIFIED The content type is not specified.
PLAIN_TEXT Plain text
HTML HTML
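
Two of the constraints above can be checked before a request is sent: the type must be set, and a Cloud Storage URI must have the gs://bucket_name/object_name form. The following Python sketch is one possible client-side check. The helper check_document is hypothetical, and the service remains the authority on what it accepts.

```python
def check_document(document):
    """Return a list of problems found in a Document dict (empty if none)."""
    problems = []

    # An unset type and TYPE_UNSPECIFIED are both rejected by the API.
    if document.get("type", "TYPE_UNSPECIFIED") not in ("PLAIN_TEXT", "HTML"):
        problems.append("type must be PLAIN_TEXT or HTML")

    uri = document.get("gcsContentUri")
    if uri is not None:
        rest = uri[len("gs://"):] if uri.startswith("gs://") else ""
        bucket, _, object_name = rest.partition("/")
        if not bucket or not object_name:
            problems.append("gcsContentUri must look like gs://bucket_name/object_name")

    return problems


print(check_document({"type": "PLAIN_TEXT", "content": "Hello"}))
print(check_document({"content": "Hello"}))
print(check_document({"type": "HTML", "gcsContentUri": "gs://bucket_name"}))
```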

Entity

Represents a phrase in the text that is a known entity, such as a person, an organization, or location. The API associates information, such as salience and mentions, with entities.
Fields
mentions[]

object (EntityMention)

The mentions of this entity in the input document. The API currently supports proper noun mentions.

metadata

map (key: string, value: string)

Metadata associated with the entity. For most entity types, the metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid), if they are available. For the metadata associated with other entity types, see the Type table below.

name

string

The representative name for the entity.

salience

number (float format)

The salience score associated with the entity in the [0, 1.0] range. The salience score for an entity provides information about the importance or centrality of that entity to the entire document text. Scores closer to 0 are less salient, while scores closer to 1.0 are highly salient.
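
Since salience measures how central an entity is to the document, a common first step is to rank entities by it. The following Python sketch does so and also reads the wikipedia_url metadata key when it is present. The helper rank_entities is hypothetical, and the sample entities and their scores are hand-written for illustration, not API output.

```python
def rank_entities(entities, limit=3):
    """Return (name, salience, wikipedia_url) tuples, most salient first."""
    ranked = sorted(entities, key=lambda e: e.get("salience", 0.0), reverse=True)
    return [
        (
            entity["name"],
            entity.get("salience", 0.0),
            entity.get("metadata", {}).get("wikipedia_url"),  # None if absent
        )
        for entity in ranked[:limit]
    ]


# Hand-written sample shaped like a list of Entity objects.
sample_entities = [
    {"name": "tools", "type": "OTHER", "salience": 0.25, "metadata": {}},
    {
        "name": "Google",
        "type": "ORGANIZATION",
        "salience": 0.75,
        "metadata": {"wikipedia_url": "https://en.wikipedia.org/wiki/Google"},
    },
]

for name, salience, url in rank_entities(sample_entities):
    print(name, salience, url)
```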

sentiment

object (Sentiment)

For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the aggregate sentiment expressed for this entity in the provided document.

type

enum

The entity type.

Enum type. Can be one of the following:
UNKNOWN Unknown
PERSON Person
LOCATION Location
ORGANIZATION Organization
EVENT Event
WORK_OF_ART Artwork
CONSUMER_GOOD Consumer product
OTHER Other types of entities
PHONE_NUMBER Phone number. The metadata lists the phone number, formatted according to local convention, plus whichever additional elements appear in the text:
* number - the actual number, broken down into sections as per local convention
* national_prefix - country code, if detected
* area_code - region or area code, if detected
* extension - phone extension (to be dialed after connection), if detected
ADDRESS Address. The metadata identifies the street number and locality plus whichever additional elements appear in the text:
* street_number - street number
* locality - city or town
* street_name - street/route name, if detected
* postal_code - postal code, if detected
* country - country, if detected
* broad_region - administrative area, such as the state, if detected
* narrow_region - smaller administrative area, such as county, if detected
* sublocality - used in Asian addresses to demark a district within a city, if detected
DATE Date. The metadata identifies the components of the date:
* year - four digit year, if detected
* month - two digit month number, if detected
* day - two digit day number, if detected
NUMBER Number The metadata is the number itself.
PRICE Price The metadata identifies the value and currency.

EntityMention

Represents a mention for an entity in the text. Currently, proper noun mentions are supported.
Fields
sentiment

object (Sentiment)

For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the sentiment expressed for this mention of the entity in the provided document.

text

object (TextSpan)

The mention text.

type

enum

The type of the entity mention.

Enum type. Can be one of the following:
TYPE_UNKNOWN Unknown
PROPER Proper name
COMMON Common noun (or noun compound)

Features

All available features for sentiment, syntax, and semantic analysis. Setting each one to true will enable that specific analysis for the input.
Fields
classifyText

boolean

Classify the full document into categories.

extractDocumentSentiment

boolean

Extract document-level sentiment.

extractEntities

boolean

Extract entities.

extractEntitySentiment

boolean

Extract entities and their associated sentiment.

extractSyntax

boolean

Extract syntax information.

PartOfSpeech

Represents part of speech information for a token. Parts of speech are as defined in http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf
Fields
aspect

enum

The grammatical aspect.

Enum type. Can be one of the following:
ASPECT_UNKNOWN Aspect is not applicable in the analyzed language or is not predicted.
PERFECTIVE Perfective
IMPERFECTIVE Imperfective
PROGRESSIVE Progressive
case

enum

The grammatical case.

Enum type. Can be one of the following:
CASE_UNKNOWN Case is not applicable in the analyzed language or is not predicted.
ACCUSATIVE Accusative
ADVERBIAL Adverbial
COMPLEMENTIVE Complementive
DATIVE Dative
GENITIVE Genitive
INSTRUMENTAL Instrumental
LOCATIVE Locative
NOMINATIVE Nominative
OBLIQUE Oblique
PARTITIVE Partitive
PREPOSITIONAL Prepositional
REFLEXIVE_CASE Reflexive
RELATIVE_CASE Relative
VOCATIVE Vocative
form

enum

The grammatical form.

Enum type. Can be one of the following:
FORM_UNKNOWN Form is not applicable in the analyzed language or is not predicted.
ADNOMIAL Adnomial
AUXILIARY Auxiliary
COMPLEMENTIZER Complementizer
FINAL_ENDING Final ending
GERUND Gerund
REALIS Realis
IRREALIS Irrealis
SHORT Short form
LONG Long form
ORDER Order form
SPECIFIC Specific form
gender

enum

The grammatical gender.

Enum type. Can be one of the following:
GENDER_UNKNOWN Gender is not applicable in the analyzed language or is not predicted.
FEMININE Feminine
MASCULINE Masculine
NEUTER Neuter
mood

enum

The grammatical mood.

Enum type. Can be one of the following:
MOOD_UNKNOWN Mood is not applicable in the analyzed language or is not predicted.
CONDITIONAL_MOOD Conditional
IMPERATIVE Imperative
INDICATIVE Indicative
INTERROGATIVE Interrogative
JUSSIVE Jussive
SUBJUNCTIVE Subjunctive
number

enum

The grammatical number.

Enum type. Can be one of the following:
NUMBER_UNKNOWN Number is not applicable in the analyzed language or is not predicted.
SINGULAR Singular
PLURAL Plural
DUAL Dual
person

enum

The grammatical person.

Enum type. Can be one of the following:
PERSON_UNKNOWN Person is not applicable in the analyzed language or is not predicted.
FIRST First
SECOND Second
THIRD Third
REFLEXIVE_PERSON Reflexive
proper

enum

The grammatical properness.

Enum type. Can be one of the following:
PROPER_UNKNOWN Proper is not applicable in the analyzed language or is not predicted.
PROPER Proper
NOT_PROPER Not proper
reciprocity

enum

The grammatical reciprocity.

Enum type. Can be one of the following:
RECIPROCITY_UNKNOWN Reciprocity is not applicable in the analyzed language or is not predicted.
RECIPROCAL Reciprocal
NON_RECIPROCAL Non-reciprocal
tag

enum

The part of speech tag.

Enum type. Can be one of the following:
UNKNOWN Unknown
ADJ Adjective
ADP Adposition (preposition and postposition)
ADV Adverb
CONJ Conjunction
DET Determiner
NOUN Noun (common and proper)
NUM Cardinal number
PRON Pronoun
PRT Particle or other function word
PUNCT Punctuation
VERB Verb (all tenses and modes)
X Other: foreign words, typos, abbreviations
AFFIX Affix
tense

enum

The grammatical tense.

Enum type. Can be one of the following:
TENSE_UNKNOWN Tense is not applicable in the analyzed language or is not predicted.
CONDITIONAL_TENSE Conditional
FUTURE Future
PAST Past
PRESENT Present
IMPERFECT Imperfect
PLUPERFECT Pluperfect
voice

enum

The grammatical voice.

Enum type. Can be one of the following:
VOICE_UNKNOWN Voice is not applicable in the analyzed language or is not predicted.
ACTIVE Active
CAUSATIVE Causative
PASSIVE Passive

Sentence

Represents a sentence in the input document.
Fields
sentiment

object (Sentiment)

For calls to AnalyzeSentiment or if AnnotateTextRequest.Features.extract_document_sentiment is set to true, this field will contain the sentiment for the sentence.

text

object (TextSpan)

The sentence text.

Sentiment

Represents the feeling associated with the entire text or entities in the text.
Fields
magnitude

number (float format)

A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).

score

number (float format)

Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
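
The two fields are usually read together: score gives the direction, and magnitude gives the amount of sentiment regardless of direction. The following Python sketch shows one possible way to turn them into a coarse label. The helper describe_sentiment and both thresholds are assumptions chosen for illustration, not values defined by the API.

```python
def describe_sentiment(score, magnitude, score_threshold=0.25, magnitude_threshold=1.0):
    """Map a Sentiment (score, magnitude) pair to a coarse label.

    The thresholds are illustrative assumptions and should be tuned per use case.
    """
    if score >= score_threshold:
        return "positive"
    if score <= -score_threshold:
        return "negative"
    # A score near zero with a large magnitude suggests that positive and
    # negative passages cancel out; a small magnitude suggests little sentiment.
    return "mixed" if magnitude >= magnitude_threshold else "neutral"


print(describe_sentiment(0.8, 3.2))
print(describe_sentiment(-0.6, 4.0))
print(describe_sentiment(0.1, 0.2))
print(describe_sentiment(0.0, 4.5))
```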

Status

The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC. Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide.
Fields
code

integer (int32 format)

The status code, which should be an enum value of google.rpc.Code.

details[]

object

A list of messages that carry the error details. There is a common set of message types for APIs to use.

message

string

A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
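
When logging a Status, it can help to show the google.rpc.Code name next to the numeric code. The following Python sketch does this for a small subset of codes. The helper format_status is hypothetical, the mapping is deliberately incomplete, and the sample Status and its message are hand-written for illustration.

```python
# Partial mapping of google.rpc.Code values to names (subset only).
RPC_CODE_NAMES = {
    0: "OK",
    3: "INVALID_ARGUMENT",
    5: "NOT_FOUND",
    7: "PERMISSION_DENIED",
    16: "UNAUTHENTICATED",
}


def format_status(status):
    """Render a Status dict as 'NAME: message' for logs (hypothetical helper)."""
    code = status.get("code", 0)
    name = RPC_CODE_NAMES.get(code, f"CODE_{code}")
    return f"{name}: {status.get('message', '')}"


# Hand-written sample; the message text is illustrative only.
sample_status = {"code": 3, "message": "Document type is not set.", "details": []}
print(format_status(sample_status))
```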

TextSpan

Represents an output piece of text.
Fields
beginOffset

integer (int32 format)

The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
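
To locate a span in the original text, the offset has to be applied in the same units that were requested. The following Python sketch handles the UTF8 case, where beginOffset counts bytes. The helper span_from_utf8_offset is hypothetical, and the sample values are hand-written rather than taken from an API response.

```python
def span_from_utf8_offset(document_text, begin_offset, content):
    """Slice the span out of document_text, given a byte offset (UTF8 encoding type)."""
    if begin_offset < 0:
        # -1 means no encoding type was specified in the request.
        raise ValueError("beginOffset was not calculated")
    raw = document_text.encode("utf-8")
    end = begin_offset + len(content.encode("utf-8"))
    return raw[begin_offset:end].decode("utf-8")


document_text = "caf\u00e9 \U0001F600 Google"
# Hand-written TextSpan: "Google" starts at byte 11 of the UTF-8 encoding.
text_span = {"content": "Google", "beginOffset": 11}

recovered = span_from_utf8_offset(
    document_text, text_span["beginOffset"], text_span["content"]
)
print(recovered == text_span["content"])
```

Slicing the Python string directly with this offset would return the wrong characters here, because Python indexes strings by code point rather than by byte.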

content

string

The content of the output text.

Token

Represents the smallest syntactic building block of the text.
Fields
dependencyEdge

object (DependencyEdge)

Dependency tree parse for this token.

lemma

string

Lemma of the token.

partOfSpeech

object (PartOfSpeech)

Part-of-speech tag for this token.

text

object (TextSpan)

The token text.
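
As an example of combining the Token fields, the following Python sketch collects the lemmas of all tokens that carry a given part-of-speech tag. The helper lemmas_with_tag is hypothetical, and the sample tokens are hand-written for illustration, not API output.

```python
def lemmas_with_tag(tokens, tag):
    """Return the lemma of every token whose partOfSpeech.tag equals tag."""
    return [
        token["lemma"]
        for token in tokens
        if token.get("partOfSpeech", {}).get("tag") == tag
    ]


# Hand-written sample for the sentence "Google builds tools".
sample_tokens = [
    {"text": {"content": "Google"}, "lemma": "Google", "partOfSpeech": {"tag": "NOUN"}},
    {"text": {"content": "builds"}, "lemma": "build", "partOfSpeech": {"tag": "VERB"}},
    {"text": {"content": "tools"}, "lemma": "tool", "partOfSpeech": {"tag": "NOUN"}},
]

print(lemmas_with_tag(sample_tokens, "NOUN"))
print(lemmas_with_tag(sample_tokens, "VERB"))
```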