While most Natural Language methods analyze what a given text is about,
the analyzeSyntax
method inspects the structure of the language itself.
Syntactic Analysis breaks up the given text into a series of sentences and
tokens (generally, words) and provides linguistic information about those tokens.
See Morphology & Dependency Trees for details
about the linguistic analysis and Language Support
for a list of the languages whose syntax the Natural Language API can analyze.
This section demonstrates a few ways to detect syntax in a document. For each document, you must submit a separate request.
Analyzing Syntax in a String
Here is an example of performing syntactic analysis on a text string sent directly to the Natural Language API:
Protocol
To analyze syntax in a document, make a POST
request to the
documents:analyzeSyntax
REST method and provide
the appropriate request body as shown in the following example.
The example uses the gcloud auth application-default print-access-token
command to obtain an access token for a service account set up for the
project using the Google Cloud Platform gcloud CLI.
For instructions on installing the gcloud CLI and setting up a project with
a service account, see the Quickstart.
curl -X POST \
  -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
  -H "Content-Type: application/json; charset=utf-8" \
  --data "{
    'encodingType': 'UTF8',
    'document': {
      'type': 'PLAIN_TEXT',
      'content': 'Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show. Sundar Pichai said in his keynote that users love their new Android phones.'
    }
  }" "https://language.googleapis.com/v1/documents:analyzeSyntax"
If you don't specify document.language, then the language will be automatically
detected. For information on which languages are supported by the Natural Language API,
see Language Support. See the Document reference documentation for more
information on configuring the request body.
If the request is successful, the server returns a 200 OK
HTTP status code and
the response in JSON format:
{ "sentences": [ { "text": { "content": "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.", "beginOffset": 0 } }, { "text": { "content": "Sundar Pichai said in his keynote that users love their new Android phones.", "beginOffset": 105 } } ], "tokens": [ { "text": { "content": "Google", "beginOffset": 0 }, "partOfSpeech": { "tag": "NOUN", "aspect": "ASPECT_UNKNOWN", "case": "CASE_UNKNOWN", "form": "FORM_UNKNOWN", "gender": "GENDER_UNKNOWN", "mood": "MOOD_UNKNOWN", "number": "SINGULAR", "person": "PERSON_UNKNOWN", "proper": "PROPER", "reciprocity": "RECIPROCITY_UNKNOWN", "tense": "TENSE_UNKNOWN", "voice": "VOICE_UNKNOWN" }, "dependencyEdge": { "headTokenIndex": 7, "label": "NSUBJ" }, "lemma": "Google" }, ... { "text": { "content": ".", "beginOffset": 179 }, "partOfSpeech": { "tag": "PUNCT", "aspect": "ASPECT_UNKNOWN", "case": "CASE_UNKNOWN", "form": "FORM_UNKNOWN", "gender": "GENDER_UNKNOWN", "mood": "MOOD_UNKNOWN", "number": "NUMBER_UNKNOWN", "person": "PERSON_UNKNOWN", "proper": "PROPER_UNKNOWN", "reciprocity": "RECIPROCITY_UNKNOWN", "tense": "TENSE_UNKNOWN", "voice": "VOICE_UNKNOWN" }, "dependencyEdge": { "headTokenIndex": 20, "label": "P" }, "lemma": "." } ], "language": "en" }
The tokens
array contains Token
objects representing the detected sentence tokens, which include information
such as a token's part of speech and its position in the sentence.
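If you save the JSON response to a file, a few lines of Python can list each token next to its syntactic head. This snippet is illustrative only and not part of the API; the file name response.json is an assumption, and it expects a complete (non-elided) tokens array:
# Illustrative only: assumes the analyzeSyntax JSON response was saved to
# a local file named response.json.
import json

with open("response.json", encoding="utf-8") as f:
    response = json.load(f)

tokens = response["tokens"]
for token in tokens:
    # headTokenIndex is an index into the same tokens array.
    head = tokens[token["dependencyEdge"]["headTokenIndex"]]
    print(
        f'{token["text"]["content"]:<12} '
        f'{token["partOfSpeech"]["tag"]:<6} '
        f'{token["dependencyEdge"]["label"]:<8} -> {head["text"]["content"]}'
    )
In the sample response above, for example, the Google token's headTokenIndex of 7 points at unveiled, its syntactic head, with the NSUBJ (nominal subject) label.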
gcloud
Refer to the analyze-syntax command for complete details.
To perform syntax analysis, use the gcloud CLI with the --content flag to
identify the content to analyze:
gcloud ml language analyze-syntax --content="Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show. Sundar Pichai said in his keynote that users love their new Android phones."
If the request is successful, the server returns a response in JSON format:
{ "sentences": [ { "text": { "content": "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.", "beginOffset": 0 } }, { "text": { "content": "Sundar Pichai said in his keynote that users love their new Android phones.", "beginOffset": 105 } } ], "tokens": [ { "text": { "content": "Google", "beginOffset": 0 }, "partOfSpeech": { "tag": "NOUN", "aspect": "ASPECT_UNKNOWN", "case": "CASE_UNKNOWN", "form": "FORM_UNKNOWN", "gender": "GENDER_UNKNOWN", "mood": "MOOD_UNKNOWN", "number": "SINGULAR", "person": "PERSON_UNKNOWN", "proper": "PROPER", "reciprocity": "RECIPROCITY_UNKNOWN", "tense": "TENSE_UNKNOWN", "voice": "VOICE_UNKNOWN" }, "dependencyEdge": { "headTokenIndex": 7, "label": "NSUBJ" }, "lemma": "Google" }, ... { "text": { "content": ".", "beginOffset": 179 }, "partOfSpeech": { "tag": "PUNCT", "aspect": "ASPECT_UNKNOWN", "case": "CASE_UNKNOWN", "form": "FORM_UNKNOWN", "gender": "GENDER_UNKNOWN", "mood": "MOOD_UNKNOWN", "number": "NUMBER_UNKNOWN", "person": "PERSON_UNKNOWN", "proper": "PROPER_UNKNOWN", "reciprocity": "RECIPROCITY_UNKNOWN", "tense": "TENSE_UNKNOWN", "voice": "VOICE_UNKNOWN" }, "dependencyEdge": { "headTokenIndex": 20, "label": "P" }, "lemma": "." } ], "language": "en" }
The tokens
array contains Token
objects representing the detected sentence tokens, which include information
such as a token's part of speech and its position in the sentence.
Go
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Go API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Java API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Node.js API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Python API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
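A minimal sketch of syntactic analysis on a string with the google-cloud-language Python client might look like the following; the helper name and output format are illustrative, not part of the API:
# Sketch only: assumes the google-cloud-language package is installed and
# Application Default Credentials are configured.
from google.cloud import language_v1

def sample_analyze_syntax(text_content: str) -> None:
    """Analyze syntax of a plain-text string and print basic token info."""
    client = language_v1.LanguageServiceClient()

    # Build a Document with inline plain-text content.
    document = language_v1.Document(
        content=text_content,
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )

    response = client.analyze_syntax(
        request={
            "document": document,
            "encoding_type": language_v1.EncodingType.UTF8,
        }
    )

    # Print each token's text, part-of-speech tag, and dependency edge.
    for token in response.tokens:
        tag = language_v1.PartOfSpeech.Tag(token.part_of_speech.tag).name
        label = language_v1.DependencyEdge.Label(token.dependency_edge.label).name
        print(
            f"{token.text.content}\t{tag}\t"
            f"head={token.dependency_edge.head_token_index}\tlabel={label}"
        )

if __name__ == "__main__":
    sample_analyze_syntax(
        "Google, headquartered in Mountain View, unveiled the new Android "
        "phone at the Consumer Electronic Show."
    )
The request dictionary mirrors the REST body shown earlier: a Document with inline content plus an encodingType.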
Additional languages
C#: Please follow the C# setup instructions on the client libraries page and then visit the Natural Language reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Natural Language reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Natural Language reference documentation for Ruby.
Analyzing Syntax from Cloud Storage
For your convenience, the Natural Language API can perform syntactic analysis directly on a file located in Cloud Storage, without the need to send the contents of the file in the body of your request.
Here is an example of performing syntactic analysis on a file located in Cloud Storage.
Protocol
To analyze syntax in a document stored in Cloud Storage,
make a POST
request to the
documents:analyzeSyntax
REST method and provide
the appropriate request body with the path to the document
as shown in the following example.
curl -X POST \
  -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
  -H "Content-Type: application/json; charset=utf-8" \
  --data "{
    'encodingType': 'UTF8',
    'document': {
      'type': 'PLAIN_TEXT',
      'gcsContentUri': 'gs://<bucket-name>/<object-name>'
    }
  }" "https://language.googleapis.com/v1/documents:analyzeSyntax"
If you don't specify document.language, then the language will be automatically
detected. For information on which languages are supported by the Natural Language API,
see Language Support. See the Document reference documentation for more
information on configuring the request body.
If the request is successful, the server returns a 200 OK
HTTP status code and
the response in JSON format:
{ "sentences": [ { "text": { "content": "Hello, world!", "beginOffset": 0 } } ], "tokens": [ { "text": { "content": "Hello", "beginOffset": 0 }, "partOfSpeech": { "tag": "X", // ... }, "dependencyEdge": { "headTokenIndex": 2, "label": "DISCOURSE" }, "lemma": "Hello" }, { "text": { "content": ",", "beginOffset": 5 }, "partOfSpeech": { "tag": "PUNCT", // ... }, "dependencyEdge": { "headTokenIndex": 2, "label": "P" }, "lemma": "," }, // ... ], "language": "en" }
The tokens
array contains Token
objects representing the detected sentence tokens, which include information
such as a token's part of speech and its position in the sentence.
gcloud
Refer to the analyze-syntax command for complete details.
To perform syntax analysis on a file in Cloud Storage, use the gcloud CLI
with the --content-file flag to identify the file path that contains the
content to analyze:
gcloud ml language analyze-syntax --content-file=gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME
If the request is successful, the server returns a response in JSON format:
{ "sentences": [ { "text": { "content": "Hello, world!", "beginOffset": 0 } } ], "tokens": [ { "text": { "content": "Hello", "beginOffset": 0 }, "partOfSpeech": { "tag": "X", // ... }, "dependencyEdge": { "headTokenIndex": 2, "label": "DISCOURSE" }, "lemma": "Hello" }, { "text": { "content": ",", "beginOffset": 5 }, "partOfSpeech": { "tag": "PUNCT", // ... }, "dependencyEdge": { "headTokenIndex": 2, "label": "P" }, "lemma": "," }, // ... ], "language": "en" }
The tokens
array contains Token
objects representing the detected sentence tokens, which include information
such as a token's part of speech and its position in the sentence.
Go
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Go API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Java API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Node.js API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Python API reference documentation.
To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
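A comparable sketch with the google-cloud-language Python client only changes how the Document is constructed, pointing gcs_content_uri at the Cloud Storage object instead of passing inline content; the bucket and object names below are placeholders:
# Sketch only: assumes the google-cloud-language package, configured
# Application Default Credentials, and a readable Cloud Storage object.
from google.cloud import language_v1

def sample_analyze_syntax_gcs(
    gcs_uri: str = "gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME",
) -> None:
    """Analyze syntax of a plain-text file stored in Cloud Storage."""
    client = language_v1.LanguageServiceClient()

    # Reference the file by URI rather than sending its contents inline.
    document = language_v1.Document(
        gcs_content_uri=gcs_uri,
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )

    response = client.analyze_syntax(
        request={
            "document": document,
            "encoding_type": language_v1.EncodingType.UTF8,
        }
    )

    for token in response.tokens:
        print(token.text.content, token.lemma)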
Additional languages
C#: Please follow the C# setup instructions on the client libraries page and then visit the Natural Language reference documentation for .NET.
PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Natural Language reference documentation for PHP.
Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the Natural Language reference documentation for Ruby.