Analyzing Entities

Entity Analysis inspects the given text for known entities (proper nouns such as public figures, landmarks, and so on) and returns information about those entities. Entity analysis is performed with the analyzeEntities method. For information on which languages are supported by the Natural Language API, see Language Support.

This section demonstrates a few ways to detect entities in a document.

Analyzing Entities in a String

Here is an example of performing entity analysis on a text string sent directly to the Natural Language API:

Protocol

Refer to the documents:analyzeEntities API endpoint for complete details.

To perform entity analysis, make a POST request and provide the appropriate request body:

POST https://language.googleapis.com/v1/documents:analyzeEntities?key=YOUR_API_KEY
{
  "encodingType": "UTF8",
  "document": {
    "type": "PLAIN_TEXT",
    "content": "President Obama is speaking at the White House."
  }
}
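For reference, the request body above can also be constructed programmatically. Here is a minimal Python sketch; the endpoint URL is copied from the request shown above, and actually sending the request still requires a valid API key (not shown):

```python
import json

# Endpoint from the request shown above; append ?key=YOUR_API_KEY when sending.
ANALYZE_ENTITIES_URL = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_analyze_entities_body(text):
    """Builds the JSON request body for documents:analyzeEntities on plain text."""
    return {
        "encodingType": "UTF8",
        "document": {
            "type": "PLAIN_TEXT",
            "content": text,
        },
    }


body = build_analyze_entities_body(
    "President Obama is speaking at the White House.")
print(json.dumps(body, indent=2))
```

The `build_analyze_entities_body` helper name is illustrative, not part of any client library; it simply reproduces the request body documented above.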

If you don't specify document.language, then the language will be automatically detected. For information on which languages are supported by the Natural Language API, see Language Support. See the Document reference documentation for more information on configuring the request body.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

{
  "entities": [
    {
      "name": "Obama",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/02mjmr",
        "wikipedia_url": "http://en.wikipedia.org/wiki/Barack_Obama"
      },
      "salience": 0.9143443,
      "mentions": [
        {
          "text": {
            "content": "Obama",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "http://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.08565566,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 35
          },
          "type": "PROPER"
        }
      ]
    }
  ],
  "language": "en"
}

The entities array contains Entity objects representing the detected entities, which include information such as the entity name and type.
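Because the response is plain JSON, it can be post-processed without any client library. The sketch below (plain Python, operating on a response shaped like the sample above, abbreviated to the fields it uses) ranks the detected entities by salience:

```python
import json


def rank_entities_by_salience(response_json):
    """Returns (name, salience) pairs sorted from most to least salient."""
    entities = json.loads(response_json)["entities"]
    ranked = sorted(entities, key=lambda e: e["salience"], reverse=True)
    return [(e["name"], e["salience"]) for e in ranked]


# A cut-down version of the sample response shown above.
sample_response = json.dumps({
    "entities": [
        {"name": "White House", "type": "LOCATION", "salience": 0.08565566},
        {"name": "Obama", "type": "PERSON", "salience": 0.9143443},
    ],
    "language": "en",
})

# Prints the most salient entity ("Obama") first.
for name, salience in rank_entities_by_salience(sample_response):
    print(name, salience)
```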

gcloud

Refer to the analyze-entities command for complete details.

To perform entity analysis, use the gcloud command-line tool with the --content flag to identify the content to analyze:

gcloud ml language analyze-entities --content="President Obama is speaking at the White House."

If the request is successful, the server returns a response in JSON format:

{
  "entities": [
    {
      "name": "Obama",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/02mjmr",
        "wikipedia_url": "http://en.wikipedia.org/wiki/Barack_Obama"
      },
      "salience": 0.9143443,
      "mentions": [
        {
          "text": {
            "content": "Obama",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "http://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.08565566,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 35
          },
          "type": "PROPER"
        }
      ]
    }
  ],
  "language": "en"
}

The entities array contains Entity objects representing the detected entities, which include information such as the entity name and type.

C#

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

private static void AnalyzeEntitiesFromText(string text)
{
    var client = LanguageServiceClient.Create();
    var response = client.AnalyzeEntities(new Document()
    {
        Content = text,
        Type = Document.Types.Type.PlainText
    });
    WriteEntities(response.Entities);
}

private static void WriteEntities(IEnumerable<Entity> entities)
{
    Console.WriteLine("Entities:");
    foreach (var entity in entities)
    {
        Console.WriteLine($"\tName: {entity.Name}");
        Console.WriteLine($"\tType: {entity.Type}");
        Console.WriteLine($"\tSalience: {entity.Salience}");
        Console.WriteLine("\tMentions:");
        foreach (var mention in entity.Mentions)
            Console.WriteLine($"\t\t{mention.Text.BeginOffset}: {mention.Text.Content}");
        Console.WriteLine("\tMetadata:");
        foreach (var keyval in entity.Metadata)
        {
            Console.WriteLine($"\t\t{keyval.Key}: {keyval.Value}");
        }
    }
}

Go

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

func analyzeEntities(ctx context.Context, client *language.Client, text string) (*languagepb.AnalyzeEntitiesResponse, error) {
	return client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_Content{
				Content: text,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

Java

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  Document doc = Document.newBuilder()
      .setContent(text)
      .setType(Type.PLAIN_TEXT)
      .build();
  AnalyzeEntitiesRequest request = AnalyzeEntitiesRequest.newBuilder()
      .setDocument(doc)
      .setEncodingType(EncodingType.UTF16)
      .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s\n", entity.getName());
    System.out.printf("Salience: %.3f\n", entity.getSalience());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s\n", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n\n", mention.getType());
    }
  }
}

Node.js

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following line to run this code.
 */
// const text = 'Your text to analyze, e.g. Hello, world!';

// Prepares a document, representing the provided text
const document = {
  content: text,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
client
  .analyzeEntities({document: document})
  .then(results => {
    const entities = results[0].entities;

    console.log('Entities:');
    entities.forEach(entity => {
      console.log(entity.name);
      console.log(` - Type: ${entity.type}, Salience: ${entity.salience}`);
      if (entity.metadata && entity.metadata.wikipedia_url) {
        console.log(` - Wikipedia URL: ${entity.metadata.wikipedia_url}`);
      }
    });
  })
  .catch(err => {
    console.error('ERROR:', err);
  });

PHP

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

namespace Google\Cloud\Samples\Language;

use Google\Cloud\Language\LanguageClient;

/**
 * Find the entities in text.
 * ```
 * analyze_entities('Do you know the way to San Jose?');
 * ```
 *
 * @param string $text The text to analyze.
 * @param string $projectId (optional) Your Google Cloud Project ID
 *
 */
function analyze_entities($text, $projectId = null)
{
    // Create the Natural Language client
    $language = new LanguageClient([
        'projectId' => $projectId,
    ]);

    // Call the analyzeEntities function
    $annotation = $language->analyzeEntities($text);

    // Print out information about each entity
    $entities = $annotation->entities();
    foreach ($entities as $entity) {
        printf('Name: %s' . PHP_EOL, $entity['name']);
        printf('Type: %s' . PHP_EOL, $entity['type']);
        printf('Salience: %s' . PHP_EOL, $entity['salience']);
        if (array_key_exists('wikipedia_url', $entity['metadata'])) {
            printf('Wikipedia URL: %s' . PHP_EOL, $entity['metadata']['wikipedia_url']);
        }
        if (array_key_exists('mid', $entity['metadata'])) {
            printf('Knowledge Graph MID: %s' . PHP_EOL, $entity['metadata']['mid']);
        }
        printf(PHP_EOL);
    }
}

Python

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types
import six


def entities_text(text):
    """Detects entities in the text."""
    client = language.LanguageServiceClient()

    if isinstance(text, six.binary_type):
        text = text.decode('utf-8')

    # Instantiates a plain text document.
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    # Detects entities in the document. You can also analyze HTML with:
    #   document.type == enums.Document.Type.HTML
    entities = client.analyze_entities(document).entities

    # entity types from enums.Entity.Type
    entity_type = ('UNKNOWN', 'PERSON', 'LOCATION', 'ORGANIZATION',
                   'EVENT', 'WORK_OF_ART', 'CONSUMER_GOOD', 'OTHER')

    for entity in entities:
        print('=' * 20)
        print(u'{:<16}: {}'.format('name', entity.name))
        print(u'{:<16}: {}'.format('type', entity_type[entity.type]))
        print(u'{:<16}: {}'.format('metadata', entity.metadata))
        print(u'{:<16}: {}'.format('salience', entity.salience))
        print(u'{:<16}: {}'.format('wikipedia_url',
              entity.metadata.get('wikipedia_url', '-')))

Ruby

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

# text_content = "Text to extract entities from"

require "google/cloud/language"

language = Google::Cloud::Language.new

response = language.analyze_entities content: text_content, type: :PLAIN_TEXT

entities = response.entities

entities.each do |entity|
  puts "Entity #{entity.name} #{entity.type}"

  if entity.metadata["wikipedia_url"]
    puts "URL: #{entity.metadata['wikipedia_url']}"
  end
end

Analyzing Entities in a Remote File

For your convenience, the Natural Language API can perform entity analysis directly on a file located in Google Cloud Storage, so you don't need to send the contents of the file in the body of your request.

Here is an example of performing entity analysis on a file located in Cloud Storage.

Protocol

Refer to the documents:analyzeEntities API endpoint for complete details.

To perform entity analysis on a file in Google Cloud Storage, make a POST request and provide the appropriate request body:

POST https://language.googleapis.com/v1/documents:analyzeEntities?key=YOUR_API_KEY
{
  "encodingType": "UTF8",
  "document": {
    "type": "PLAIN_TEXT",
    "gcsContentUri": "gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME"
  }
}
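As with the inline-text request, this body can be built programmatically. A minimal Python sketch; the bucket and file names are placeholders, exactly as in the request above, and the helper name is illustrative rather than part of any client library:

```python
import json


def build_gcs_analyze_entities_body(gcs_uri):
    """Builds the documents:analyzeEntities body for a file in Cloud Storage."""
    return {
        "encodingType": "UTF8",
        "document": {
            "type": "PLAIN_TEXT",
            "gcsContentUri": gcs_uri,
        },
    }


body = build_gcs_analyze_entities_body("gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME")
print(json.dumps(body, indent=2))
```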

If you don't specify document.language, then the language will be automatically detected. For information on which languages are supported by the Natural Language API, see Language Support. See the Document reference documentation for more information on configuring the request body.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

{
  "entities": [
    {
      "name": "Obama",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/02mjmr",
        "wikipedia_url": "http://en.wikipedia.org/wiki/Barack_Obama"
      },
      "salience": 0.9143443,
      "mentions": [
        {
          "text": {
            "content": "Obama",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "http://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.08565566,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 35
          },
          "type": "PROPER"
        }
      ]
    }
  ],
  "language": "en"
}

The entities array contains Entity objects representing the detected entities, which include information such as the entity name and type.

gcloud

Refer to the analyze-entities command for complete details.

To perform entity analysis on a file in Google Cloud Storage, use the gcloud command-line tool with the --content-file flag to identify the Cloud Storage path of the file to analyze:

gcloud ml language analyze-entities --content-file=gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME

If the request is successful, the server returns a response in JSON format:

{
  "entities": [
    {
      "name": "Obama",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/02mjmr",
        "wikipedia_url": "http://en.wikipedia.org/wiki/Barack_Obama"
      },
      "salience": 0.9143443,
      "mentions": [
        {
          "text": {
            "content": "Obama",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "http://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.08565566,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 35
          },
          "type": "PROPER"
        }
      ]
    }
  ],
  "language": "en"
}

The entities array contains Entity objects representing the detected entities, which include information such as the entity name and type.

C#

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

private static void AnalyzeEntitiesFromFile(string gcsUri)
{
    var client = LanguageServiceClient.Create();
    var response = client.AnalyzeEntities(new Document()
    {
        GcsContentUri = gcsUri,
        Type = Document.Types.Type.PlainText
    });
    WriteEntities(response.Entities);
}
private static void WriteEntities(IEnumerable<Entity> entities)
{
    Console.WriteLine("Entities:");
    foreach (var entity in entities)
    {
        Console.WriteLine($"\tName: {entity.Name}");
        Console.WriteLine($"\tType: {entity.Type}");
        Console.WriteLine($"\tSalience: {entity.Salience}");
        Console.WriteLine("\tMentions:");
        foreach (var mention in entity.Mentions)
            Console.WriteLine($"\t\t{mention.Text.BeginOffset}: {mention.Text.Content}");
        Console.WriteLine("\tMetadata:");
        foreach (var keyval in entity.Metadata)
        {
            Console.WriteLine($"\t\t{keyval.Key}: {keyval.Value}");
        }
    }
}

Go

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

func analyzeEntitiesFromGCS(ctx context.Context, client *language.Client, gcsURI string) (*languagepb.AnalyzeEntitiesResponse, error) {
	return client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_GcsContentUri{
				GcsContentUri: gcsURI,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

Java

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  // set the GCS Content URI path to the file to be analyzed
  Document doc = Document.newBuilder()
      .setGcsContentUri(gcsUri)
      .setType(Type.PLAIN_TEXT)
      .build();
  AnalyzeEntitiesRequest request = AnalyzeEntitiesRequest.newBuilder()
      .setDocument(doc)
      .setEncodingType(EncodingType.UTF16)
      .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s\n", entity.getName());
    System.out.printf("Salience: %.3f\n", entity.getSalience());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s\n", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n\n", mention.getType());
    }
  }
}

Node.js

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following lines to run this code
 */
// const bucketName = 'Your bucket name, e.g. my-bucket';
// const fileName = 'Your file name, e.g. my-file.txt';

// Prepares a document, representing a text file in Cloud Storage
const document = {
  gcsContentUri: `gs://${bucketName}/${fileName}`,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
client
  .analyzeEntities({document: document})
  .then(results => {
    const entities = results[0].entities;

    console.log('Entities:');
    entities.forEach(entity => {
      console.log(entity.name);
      console.log(` - Type: ${entity.type}, Salience: ${entity.salience}`);
      if (entity.metadata && entity.metadata.wikipedia_url) {
        console.log(` - Wikipedia URL: ${entity.metadata.wikipedia_url}`);
      }
    });
  })
  .catch(err => {
    console.error('ERROR:', err);
  });

PHP

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

namespace Google\Cloud\Samples\Language;

use Google\Cloud\Language\LanguageClient;
use Google\Cloud\Storage\StorageClient;

/**
 * Find the entities in text stored in a Cloud Storage bucket.
 * ```
 * analyze_entities_from_file('my-bucket', 'file_with_text.txt');
 * ```
 *
 * @param string $bucketName The Cloud Storage bucket.
 * @param string $objectName The Cloud Storage object with text.
 * @param string $projectId (optional) Your Google Cloud Project ID
 *
 */
function analyze_entities_from_file($bucketName, $objectName, $projectId = null)
{
    // Create the Cloud Storage object
    $storage = new StorageClient();
    $bucket = $storage->bucket($bucketName);
    $storageObject = $bucket->object($objectName);

    // Create the Natural Language client
    $language = new LanguageClient([
        'projectId' => $projectId,
    ]);

    // Call the analyzeEntities function
    $annotation = $language->analyzeEntities($storageObject);

    // Print out information about each entity
    $entities = $annotation->entities();
    foreach ($entities as $entity) {
        printf('Name: %s' . PHP_EOL, $entity['name']);
        printf('Type: %s' . PHP_EOL, $entity['type']);
        printf('Salience: %s' . PHP_EOL, $entity['salience']);
        if (array_key_exists('wikipedia_url', $entity['metadata'])) {
            printf('Wikipedia URL: %s' . PHP_EOL, $entity['metadata']['wikipedia_url']);
        }
        if (array_key_exists('mid', $entity['metadata'])) {
            printf('Knowledge Graph MID: %s' . PHP_EOL, $entity['metadata']['mid']);
        }
        printf(PHP_EOL);
    }
}

Python

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def entities_file(gcs_uri):
    """Detects entities in the file located in Google Cloud Storage."""
    client = language.LanguageServiceClient()

    # Instantiates a plain text document.
    document = types.Document(
        gcs_content_uri=gcs_uri,
        type=enums.Document.Type.PLAIN_TEXT)

    # Detects entities in the document. You can also analyze HTML with:
    #   document.type == enums.Document.Type.HTML
    entities = client.analyze_entities(document).entities

    # entity types from enums.Entity.Type
    entity_type = ('UNKNOWN', 'PERSON', 'LOCATION', 'ORGANIZATION',
                   'EVENT', 'WORK_OF_ART', 'CONSUMER_GOOD', 'OTHER')

    for entity in entities:
        print('=' * 20)
        print(u'{:<16}: {}'.format('name', entity.name))
        print(u'{:<16}: {}'.format('type', entity_type[entity.type]))
        print(u'{:<16}: {}'.format('metadata', entity.metadata))
        print(u'{:<16}: {}'.format('salience', entity.salience))
        print(u'{:<16}: {}'.format('wikipedia_url',
              entity.metadata.get('wikipedia_url', '-')))

Ruby

For more on installing and creating a Natural Language API client, refer to Natural Language API Client Libraries.

# storage_path = "Path to file in Google Cloud Storage, e.g. gs://bucket/file"

require "google/cloud/language"

language = Google::Cloud::Language.new
response = language.analyze_entities gcs_content_uri: storage_path, type: :PLAIN_TEXT

entities = response.entities

entities.each do |entity|
  puts "Entity #{entity.name} #{entity.type}"

  if entity.metadata["wikipedia_url"]
    puts "URL: #{entity.metadata['wikipedia_url']}"
  end
end
