Using the Language Client
Documents
The Google Natural Language API supports the methods described below, and each method uses a Document to represent text.
>>> from google.cloud import language
>>> document = language.types.Document(
... content='Google, headquartered in Mountain View, unveiled the '
... 'new Android phone at the Consumer Electronic Show. '
... 'Sundar Pichai said in his keynote that users love '
... 'their new Android phones.',
... language='en',
... type='PLAIN_TEXT',
... )
The document’s language defaults to None, which will cause the API to auto-detect the language.
In addition, you can construct an HTML document:
>>> html_content = """\
... <html>
... <head>
... <title>El Tiempo de las Historias</title>
... </head>
... <body>
... <p>La vaca saltó sobre la luna.</p>
... </body>
... </html>
... """
>>> document = language.types.Document(
... content=html_content,
... language='es',
... type='HTML',
... )
The language argument can be either an ISO-639-1 or a BCP-47 language code. The API reference page contains the full list of supported languages.
In addition to supplying text or HTML content directly, a document can refer to content stored in Google Cloud Storage.
>>> document = language.types.Document(
... gcs_content_uri='gs://my-text-bucket/sentiment-me.txt',
... type=language.enums.Document.Type.HTML,
... )
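A document takes its text from one source: either inline through content or from Cloud Storage through gcs_content_uri. As a minimal sketch, assuming the client accepts a plain dict whose keys mirror the keyword arguments used above, the hypothetical helper make_document below enforces that choice before any request is sent. It is not part of the client library.

```python
def make_document(content=None, gcs_content_uri=None,
                  doc_type='PLAIN_TEXT', language=None):
    """Build a dict with the same fields as the Document examples above.

    Hypothetical helper for illustration; not part of the client library.
    """
    # Exactly one text source must be given.
    if (content is None) == (gcs_content_uri is None):
        raise ValueError('Pass exactly one of content or gcs_content_uri.')
    document = {'type': doc_type}
    if content is not None:
        document['content'] = content
    else:
        document['gcs_content_uri'] = gcs_content_uri
    if language is not None:
        # Leaving the language out lets the API auto-detect it.
        document['language'] = language
    return document


inline_doc = make_document(content='La vaca saltó sobre la luna.', language='es')
stored_doc = make_document(gcs_content_uri='gs://my-text-bucket/sentiment-me.txt')
print(inline_doc)
print(stored_doc)
```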
Analyze Entities
The analyze_entities() method finds named entities (i.e. proper names) in the text. This method returns an AnalyzeEntitiesResponse.
>>> document = language.types.Document(
... content='Michelangelo Caravaggio, Italian painter, is '
... 'known for "The Calling of Saint Matthew".',
... type=language.enums.Document.Type.PLAIN_TEXT,
... )
>>> client = language.LanguageServiceClient()
>>> response = client.analyze_entities(
... document=document,
... encoding_type='UTF32',
... )
>>> for entity in response.entities:
...     print('=' * 20)
...     print(' name: {0}'.format(entity.name))
...     print(' type: {0}'.format(entity.type))
...     print(' metadata: {0}'.format(entity.metadata))
...     print(' salience: {0}'.format(entity.salience))
====================
name: Michelangelo Caravaggio
type: PERSON
metadata: {'wikipedia_url': 'https://en.wikipedia.org/wiki/Caravaggio'}
salience: 0.7615959
====================
name: Italian
type: LOCATION
metadata: {'wikipedia_url': 'https://en.wikipedia.org/wiki/Italy'}
salience: 0.19960518
====================
name: The Calling of Saint Matthew
type: EVENT
metadata: {'wikipedia_url': 'https://en.wikipedia.org/wiki/The_Calling_of_St_Matthew_(Caravaggio)'}
salience: 0.038798928
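The salience value indicates how central an entity is to the text as a whole, so it is a natural key for ranking or filtering. The sketch below makes no API call: it uses a hypothetical Entity namedtuple as a stand-in for the response objects, filled with the values printed above, and an arbitrary threshold of 0.1.

```python
from collections import namedtuple

# Stand-in for the entities of an AnalyzeEntitiesResponse (hypothetical type).
Entity = namedtuple('Entity', ['name', 'type', 'salience'])

entities = [
    Entity('Italian', 'LOCATION', 0.19960518),
    Entity('The Calling of Saint Matthew', 'EVENT', 0.038798928),
    Entity('Michelangelo Caravaggio', 'PERSON', 0.7615959),
]


def salient_entities(entities, threshold=0.1):
    """Return entities with salience >= threshold, most salient first."""
    kept = [e for e in entities if e.salience >= threshold]
    return sorted(kept, key=lambda e: e.salience, reverse=True)


top = salient_entities(entities)
for entity in top:
    print('{0}: {1:.2f}'.format(entity.name, entity.salience))
```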
NOTE: It is recommended to send an encoding_type argument to Natural Language methods, so they provide useful offsets for the data they return. While the correct value varies by environment, in Python you usually want UTF32.
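To see why UTF32 is usually the convenient choice in Python, the sketch below compares where a word starts when counted in UTF-8 bytes, UTF-16 code units, and UTF-32 code points. It makes no API call. The offset_in helper is hypothetical, and the sample text is the Spanish sentence used earlier with an emoji prepended to force the three counts apart.

```python
text = '\U0001F404 La vaca saltó sobre la luna.'
word = 'luna'


def offset_in(text, word, encoding):
    """Offset of word in text, counted in code units of the given encoding."""
    # Bytes per code unit; the little-endian codecs avoid a byte-order mark.
    unit_size = {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}[encoding]
    prefix = text[:text.find(word)]
    return len(prefix.encode(encoding)) // unit_size


offsets = {
    'UTF8': offset_in(text, word, 'utf-8'),
    'UTF16': offset_in(text, word, 'utf-16-le'),
    'UTF32': offset_in(text, word, 'utf-32-le'),
}
print(offsets)
print(text.find(word))
```

Only the UTF32 count equals the index that str.find() returns, because Python 3 strings are indexed by code point. With the other encodings, an offset would need converting before it could be used to slice the original string.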
Analyze Sentiment
The analyze_sentiment() method analyzes the sentiment of the provided text. This method returns an AnalyzeSentimentResponse.
>>> document = language.types.Document(
... content='Jogging is not very fun.',
... type='PLAIN_TEXT',
... )
>>> response = client.analyze_sentiment(
... document=document,
... encoding_type='UTF32',
... )
>>> sentiment = response.document_sentiment
>>> print(sentiment.score)
-1
>>> print(sentiment.magnitude)
0.8
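The score gives the overall polarity of the text, from -1 (negative) to 1 (positive), while the magnitude gives the overall strength of emotion and is never negative. One way to read the two together is sketched below. The describe_sentiment helper and its thresholds are illustrative assumptions, not values defined by the API.

```python
def describe_sentiment(score, magnitude, neutral_band=0.25, weak_magnitude=0.5):
    """Map a (score, magnitude) pair to a rough label.

    The thresholds are arbitrary illustrative values; tune them for your data.
    """
    if abs(score) < neutral_band:
        # A near-zero score means either little emotion overall, or positive
        # and negative passages cancelling out (signalled by a large magnitude).
        return 'neutral' if magnitude < weak_magnitude else 'mixed'
    return 'positive' if score > 0 else 'negative'


print(describe_sentiment(-1.0, 0.8))
print(describe_sentiment(0.0, 0.1))
print(describe_sentiment(0.1, 4.0))
```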
Analyze Entity Sentiment
The analyze_entity_sentiment() method effectively combines analyze_entities() and analyze_sentiment(). This method returns an AnalyzeEntitySentimentResponse.
>>> document = language.types.Document(
... content='Mona said that jogging is very fun.',
... type='PLAIN_TEXT',
... )
>>> response = client.analyze_entity_sentiment(
... document=document,
... encoding_type='UTF32',
... )
>>> entities = response.entities
>>> entities[0].name
'Mona'
>>> entities[1].name
'jogging'
>>> entities[1].sentiment.magnitude
0.8
>>> entities[1].sentiment.score
0.8
Annotate Text
The annotate_text() method analyzes a document and is intended for users who are familiar with machine learning and need in-depth text features to build upon. This method returns an AnnotateTextResponse.
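A request to annotate_text() also names the features to compute. The sketch below assumes the features can be passed as a dict with the keys extract_syntax, extract_entities, and extract_document_sentiment, and that a dict is accepted for the document. The annotate helper is hypothetical, as is the RecordingClient stand-in, which exists only so the snippet runs without credentials.

```python
def annotate(client, text, syntax=True, entities=True, sentiment=True):
    """Call annotate_text() on plain text with the selected features."""
    features = {
        'extract_syntax': syntax,
        'extract_entities': entities,
        'extract_document_sentiment': sentiment,
    }
    return client.annotate_text(
        document={'content': text, 'type': 'PLAIN_TEXT'},
        features=features,
        encoding_type='UTF32',
    )


class RecordingClient(object):
    """Stand-in that records the request instead of contacting the API."""

    def annotate_text(self, **kwargs):
        self.last_request = kwargs
        return kwargs


fake = RecordingClient()
annotate(fake, 'Jogging is not very fun.', syntax=False)
print(fake.last_request['features'])
```

With a real client, pass the object returned by language.LanguageServiceClient() in place of the stand-in.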