Esta página foi traduzida pela API Cloud Translation.
Switch to English

Como analisar entidades

A Análise de entidade inspeciona o texto fornecido das entidades conhecidas, ou seja, nomes próprios como personalidades públicas, pontos de referência etc., e retorna informações sobre elas. A análise de entidade é realizada com o método analyzeEntities. Para mais informações sobre os tipos de entidades que o Natural Language identifica, consulte a documentação da Entidade. Para mais informações sobre quais idiomas são compatíveis com a API Natural Language, consulte Compatibilidade de idiomas.

Nesta seção, você verá algumas maneiras de detectar entidades em um documento.

Como analisar entidades em uma string

Veja um exemplo de análise de entidade em uma string de texto enviada diretamente para a Natural Language API:

Protocolo

Para analisar as entidades em um documento, crie uma solicitação POST para o método REST documents:analyzeEntities e forneça o corpo da solicitação apropriada, como mostrado no exemplo a seguir.

No exemplo, o comando gcloud auth application-default print-access-token é usado para gerar um token de acesso para uma conta de serviço configurada para o projeto usando o SDK do Cloud do Google Cloud Platform. Para ver instruções sobre como instalar o SDK do Cloud, configurar um projeto com uma conta de serviço, consulte o Guia de início rápido.

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'encodingType': 'UTF8',
  'document': {
    'type': 'PLAIN_TEXT',
    'content': 'President Trump will speak from the White House, located
  at 1600 Pennsylvania Ave NW, Washington, DC, on October 7.'
  }
}" "https://language.googleapis.com/v1/documents:analyzeEntities"

Se você não especificar document.language, o idioma será detectado automaticamente. Para mais informações sobre quais idiomas são compatíveis com a API Natural Language, consulte Compatibilidade de idiomas. Consulte a documentação de referência Document para mais informações sobre como configurar o corpo da solicitação.

Quando a solicitação é bem-sucedida, o servidor retorna um código de status HTTP 200 OK e a resposta no formato JSON:

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}

A matriz entities contém objetos Entity que representam as entidades detectadas, com informações como o nome e o tipo da entidade.

gcloud

Consulte o comando analyze-entities para ver todos os detalhes.

Para realizar uma análise de entidade, use a ferramenta de linha de comando gcloud e use a sinalização --content para identificar o conteúdo a ser analisado:

gcloud ml language analyze-entities --content="President Trump will speak from the White House, located
  at 1600 Pennsylvania Ave NW, Washington, DC, on October 7."

Se a solicitação for bem-sucedida, o servidor retornará uma resposta no formato JSON:

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
       "name": "7",
       "type": "NUMBER",
       "metadata": {
         "value": "7"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "7",
             "beginOffset": 113
           },
         "type": "TYPE_UNKNOWN"
         }
        ]
     }
  ],
  "language": "en"
}

A matriz entities contém objetos Entity que representam as entidades detectadas, com informações como o nome e o tipo da entidade.

C#

private static void AnalyzeEntitiesFromText(string text)
{
    var client = LanguageServiceClient.Create();
    var response = client.AnalyzeEntities(new Document()
    {
        Content = text,
        Type = Document.Types.Type.PlainText
    });
    WriteEntities(response.Entities);
}

private static void WriteEntities(IEnumerable<Entity> entities)
{
    Console.WriteLine("Entities:");
    foreach (var entity in entities)
    {
        Console.WriteLine($"\tName: {entity.Name}");
        Console.WriteLine($"\tType: {entity.Type}");
        Console.WriteLine($"\tSalience: {entity.Salience}");
        Console.WriteLine("\tMentions:");
        foreach (var mention in entity.Mentions)
            Console.WriteLine($"\t\t{mention.Text.BeginOffset}: {mention.Text.Content}");
        Console.WriteLine("\tMetadata:");
        foreach (var keyval in entity.Metadata)
        {
            Console.WriteLine($"\t\t{keyval.Key}: {keyval.Value}");
        }
    }
}

Go


func analyzeEntities(ctx context.Context, client *language.Client, text string) (*languagepb.AnalyzeEntitiesResponse, error) {
	return client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_Content{
				Content: text,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

Java

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  Document doc = Document.newBuilder().setContent(text).setType(Type.PLAIN_TEXT).build();
  AnalyzeEntitiesRequest request =
      AnalyzeEntitiesRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(EncodingType.UTF16)
          .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s", entity.getName());
    System.out.printf("Salience: %.3f\n", entity.getSalience());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n\n", mention.getType());
    }
  }
}

Node.js

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following line to run this code.
 */
// const text = 'Your text to analyze, e.g. Hello, world!';

// Prepares a document, representing the provided text
const document = {
  content: text,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
const [result] = await client.analyzeEntities({document});

const entities = result.entities;

console.log('Entities:');
entities.forEach(entity => {
  console.log(entity.name);
  console.log(` - Type: ${entity.type}, Salience: ${entity.salience}`);
  if (entity.metadata && entity.metadata.wikipedia_url) {
    console.log(` - Wikipedia URL: ${entity.metadata.wikipedia_url}`);
  }
});

PHP

use Google\Cloud\Language\V1\Document;
use Google\Cloud\Language\V1\Document\Type;
use Google\Cloud\Language\V1\LanguageServiceClient;
use Google\Cloud\Language\V1\Entity\Type as EntityType;

/** Uncomment and populate these variables in your code */
// $text = 'The text to analyze.';

// Create the Natural Language client
$languageServiceClient = new LanguageServiceClient();
try {
    // Create a new Document, add text as content and set type to PLAIN_TEXT
    $document = (new Document())
        ->setContent($text)
        ->setType(Type::PLAIN_TEXT);

    // Call the analyzeEntities function
    $response = $languageServiceClient->analyzeEntities($document, []);
    $entities = $response->getEntities();
    // Print out information about each entity
    foreach ($entities as $entity) {
        printf('Name: %s' . PHP_EOL, $entity->getName());
        printf('Type: %s' . PHP_EOL, EntityType::name($entity->getType()));
        printf('Salience: %s' . PHP_EOL, $entity->getSalience());
        if ($entity->getMetadata()->offsetExists('wikipedia_url')) {
            printf('Wikipedia URL: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('wikipedia_url'));
        }
        if ($entity->getMetadata()->offsetExists('mid')) {
            printf('Knowledge Graph MID: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('mid'));
        }
        printf(PHP_EOL);
    }
} finally {
    $languageServiceClient->close();
}

Python

from google.cloud import language_v1

def sample_analyze_entities(text_content):
    """
    Analyzing Entities in a String

    Args:
      text_content The text content to analyze
    """

    client = language_v1.LanguageServiceClient()

    # text_content = 'California is a state.'

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"content": text_content, "type_": type_, "language": language}

    # Available values: NONE, UTF8, UTF16, UTF32
    encoding_type = language_v1.EncodingType.UTF8

    response = client.analyze_entities(request = {'document': document, 'encoding_type': encoding_type})

    # Loop through entitites returned from the API
    for entity in response.entities:
        print(u"Representative name for the entity: {}".format(entity.name))

        # Get entity type, e.g. PERSON, LOCATION, ADDRESS, NUMBER, et al
        print(u"Entity type: {}".format(language_v1.Entity.Type(entity.type_).name))

        # Get the salience score associated with the entity in the [0, 1.0] range
        print(u"Salience score: {}".format(entity.salience))

        # Loop over the metadata associated with entity. For many known entities,
        # the metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid).
        # Some entity types may have additional metadata, e.g. ADDRESS entities
        # may have metadata for the address street_name, postal_code, et al.
        for metadata_name, metadata_value in entity.metadata.items():
            print(u"{}: {}".format(metadata_name, metadata_value))

        # Loop over the mentions of this entity in the input document.
        # The API currently supports proper noun mentions.
        for mention in entity.mentions:
            print(u"Mention text: {}".format(mention.text.content))

            # Get the mention type, e.g. PROPER for proper noun
            print(
                u"Mention type: {}".format(language_v1.EntityMention.Type(mention.type_).name)
            )

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(u"Language of the text: {}".format(response.language))

Ruby

# text_content = "Text to extract entities from"

require "google/cloud/language"

language = Google::Cloud::Language.language_service

document = { content: text_content, type: :PLAIN_TEXT }
response = language.analyze_entities document: document

entities = response.entities

entities.each do |entity|
  puts "Entity #{entity.name} #{entity.type}"

  puts "URL: #{entity.metadata['wikipedia_url']}" if entity.metadata["wikipedia_url"]
end

Analisar entidades do Google Cloud Storage

Para sua comodidade, a Natural Language API faz a análise da entidade diretamente em um arquivo localizado no Google Cloud Storage, sem a necessidade de enviar o conteúdo do arquivo no corpo da solicitação.

Veja um exemplo de análise de entidade em um arquivo localizado no Google Cloud Storage.

Protocolo

Para analisar as entidades de um documento armazenado no Google Cloud Storage, crie uma solicitação POST para o método REST documents:analyzeEntities e forneça o caminho para o documento ao corpo da solicitação apropriada, como mostrado no exemplo a seguir.

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'document':{
    'type':'PLAIN_TEXT',
    'gcsContentUri':'gs://<bucket-name>/<object-name>'
  }
}" "https://language.googleapis.com/v1/documents:analyzeEntities"

Se você não especificar document.language, o idioma será detectado automaticamente. Para ver mais informações sobre quais idiomas são compatíveis com a API Natural Language, consulte Compatibilidade de idiomas. Consulte a documentação de referência Document para mais informações sobre como configurar o corpo da solicitação.

Quando a solicitação é bem-sucedida, o servidor retorna um código de status HTTP 200 OK e a resposta no formato JSON:

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}

A matriz entities contém objetos Entity que representam as entidades detectadas, com informações como o nome e o tipo da entidade.

gcloud

Consulte o comando analyze-entities para ver todos os detalhes.

Para realizar uma análise de entidade em um arquivo no Google Cloud Storage, use a ferramenta de linha de comando gcloud e use a sinalização --content-file para identificar o caminho do arquivo que contém o conteúdo a ser analisado:

gcloud ml language analyze-entities --content-file=gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME

Se a solicitação for bem-sucedida, o servidor retornará uma resposta no formato JSON:

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}

A matriz entities contém objetos Entity que representam as entidades detectadas, com informações como o nome e o tipo da entidade.

C#

private static void AnalyzeEntitiesFromFile(string gcsUri)
{
    var client = LanguageServiceClient.Create();
    var response = client.AnalyzeEntities(new Document()
    {
        GcsContentUri = gcsUri,
        Type = Document.Types.Type.PlainText
    });
    WriteEntities(response.Entities);
}
private static void WriteEntities(IEnumerable<Entity> entities)
{
    Console.WriteLine("Entities:");
    foreach (var entity in entities)
    {
        Console.WriteLine($"\tName: {entity.Name}");
        Console.WriteLine($"\tType: {entity.Type}");
        Console.WriteLine($"\tSalience: {entity.Salience}");
        Console.WriteLine("\tMentions:");
        foreach (var mention in entity.Mentions)
            Console.WriteLine($"\t\t{mention.Text.BeginOffset}: {mention.Text.Content}");
        Console.WriteLine("\tMetadata:");
        foreach (var keyval in entity.Metadata)
        {
            Console.WriteLine($"\t\t{keyval.Key}: {keyval.Value}");
        }
    }
}

Go


func analyzeEntitiesFromGCS(ctx context.Context, gcsURI string) (*languagepb.AnalyzeEntitiesResponse, error) {
	return client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_GcsContentUri{
				GcsContentUri: gcsURI,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

Java

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  // set the GCS Content URI path to the file to be analyzed
  Document doc =
      Document.newBuilder().setGcsContentUri(gcsUri).setType(Type.PLAIN_TEXT).build();
  AnalyzeEntitiesRequest request =
      AnalyzeEntitiesRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(EncodingType.UTF16)
          .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s\n", entity.getName());
    System.out.printf("Salience: %.3f\n", entity.getSalience());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n\n", mention.getType());
    }
  }
}

Node.js

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following lines to run this code
 */
// const bucketName = 'Your bucket name, e.g. my-bucket';
// const fileName = 'Your file name, e.g. my-file.txt';

// Prepares a document, representing a text file in Cloud Storage
const document = {
  gcsContentUri: `gs://${bucketName}/${fileName}`,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
const [result] = await client.analyzeEntities({document});
const entities = result.entities;

console.log('Entities:');
entities.forEach(entity => {
  console.log(entity.name);
  console.log(` - Type: ${entity.type}, Salience: ${entity.salience}`);
  if (entity.metadata && entity.metadata.wikipedia_url) {
    console.log(` - Wikipedia URL: ${entity.metadata.wikipedia_url}`);
  }
});

PHP

use Google\Cloud\Language\V1\Document;
use Google\Cloud\Language\V1\Document\Type;
use Google\Cloud\Language\V1\LanguageServiceClient;
use Google\Cloud\Language\V1\Entity\Type as EntityType;

/** Uncomment and populate these variables in your code */
// $uri = 'The cloud storage object to analyze (gs://your-bucket-name/your-object-name)';

// Create the Natural Language client
$languageServiceClient = new LanguageServiceClient();
try {
    // Create a new Document, pass GCS URI and set type to PLAIN_TEXT
    $document = (new Document())
        ->setGcsContentUri($uri)
        ->setType(Type::PLAIN_TEXT);

    // Call the analyzeEntities function
    $response = $languageServiceClient->analyzeEntities($document, []);
    $entities = $response->getEntities();
    // Print out information about each entity
    foreach ($entities as $entity) {
        printf('Name: %s' . PHP_EOL, $entity->getName());
        printf('Type: %s' . PHP_EOL, EntityType::name($entity->getType()));
        printf('Salience: %s' . PHP_EOL, $entity->getSalience());
        if ($entity->getMetadata()->offsetExists('wikipedia_url')) {
            printf('Wikipedia URL: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('wikipedia_url'));
        }
        if ($entity->getMetadata()->offsetExists('mid')) {
            printf('Knowledge Graph MID: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('mid'));
        }
        printf(PHP_EOL);
    }
} finally {
    $languageServiceClient->close();
}

Python

from google.cloud import language_v1

def sample_analyze_entities(gcs_content_uri):
    """
    Analyzing Entities in text file stored in Cloud Storage

    Args:
      gcs_content_uri Google Cloud Storage URI where the file content is located.
      e.g. gs://[Your Bucket]/[Path to File]
    """

    client = language_v1.LanguageServiceClient()

    # gcs_content_uri = 'gs://cloud-samples-data/language/entity.txt'

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"gcs_content_uri": gcs_content_uri, "type_": type_, "language": language}

    # Available values: NONE, UTF8, UTF16, UTF32
    encoding_type = language_v1.EncodingType.UTF8

    response = client.analyze_entities(request = {'document': document, 'encoding_type': encoding_type})
    # Loop through entitites returned from the API
    for entity in response.entities:
        print(u"Representative name for the entity: {}".format(entity.name))
        # Get entity type, e.g. PERSON, LOCATION, ADDRESS, NUMBER, et al
        print(u"Entity type: {}".format(language_v1.Entity.Type(entity.type_).name))
        # Get the salience score associated with the entity in the [0, 1.0] range
        print(u"Salience score: {}".format(entity.salience))
        # Loop over the metadata associated with entity. For many known entities,
        # the metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid).
        # Some entity types may have additional metadata, e.g. ADDRESS entities
        # may have metadata for the address street_name, postal_code, et al.
        for metadata_name, metadata_value in entity.metadata.items():
            print(u"{}: {}".format(metadata_name, metadata_value))

        # Loop over the mentions of this entity in the input document.
        # The API currently supports proper noun mentions.
        for mention in entity.mentions:
            print(u"Mention text: {}".format(mention.text.content))
            # Get the mention type, e.g. PROPER for proper noun
            print(
                u"Mention type: {}".format(language_v1.EntityMention.Type(mention.type_).name)
            )

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(u"Language of the text: {}".format(response.language))

Ruby

# storage_path = "Path to file in Google Cloud Storage, eg. gs://bucket/file"

require "google/cloud/language"

language = Google::Cloud::Language.language_service

document = { gcs_content_uri: storage_path, type: :PLAIN_TEXT }
response = language.analyze_entities document: document

entities = response.entities

entities.each do |entity|
  puts "Entity #{entity.name} #{entity.type}"

  puts "URL: #{entity.metadata['wikipedia_url']}" if entity.metadata["wikipedia_url"]
end