
Entity analysis

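Entity analysis inspects the given text for known entities (proper nouns such as public figures and landmarks) and returns information about those entities. To perform entity analysis, use the analyzeEntities method. For the types of entities that Natural Language identifies, see the Entity documentation. For the languages supported by the Natural Language API, see Language Support.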

This section demonstrates a few ways to detect entities in a document. You must submit a separate request for each document.

Analyzing entities in a string

Here is an example of performing entity analysis on a text string sent directly to the Natural Language API:

Protocol

To analyze entities in a document, make a POST request to the documents:analyzeEntities REST method and provide the appropriate request body, as shown in the following example.

The example uses the gcloud auth application-default print-access-token command to obtain an access token for a service account set up for the project with the Google Cloud Platform gcloud CLI. For instructions on installing the gcloud CLI and setting up a project with a service account, see the Quickstart.

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     --data '{
  "encodingType": "UTF8",
  "document": {
    "type": "PLAIN_TEXT",
    "content": "President Trump will speak from the White House, located at 1600 Pennsylvania Ave NW, Washington, DC, on October 7."
  }
}' "https://language.googleapis.com/v2/documents:analyzeEntities"

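If you don't specify document.language_code, the language is detected automatically. For the languages supported by the Natural Language API, see Language Support. For more information on configuring the request body, see the Document reference documentation.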
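The request body described above can also be assembled programmatically. The following is a minimal Python sketch, not an official sample: the helper name build_analyze_entities_body is hypothetical, and the lowerCamelCase key languageCode is an assumption based on the JSON field naming used elsewhere in the request and response (the text above refers to the field as document.language_code).

```python
import json


def build_analyze_entities_body(text, language_code=None, encoding_type="UTF8"):
    """Build a JSON-serializable body for a documents:analyzeEntities request.

    Hypothetical helper for illustration only.
    """
    document = {"type": "PLAIN_TEXT", "content": text}
    if language_code is not None:
        # Assumption: the JSON key is the lowerCamelCase form of language_code.
        # When it is omitted, the language is detected automatically.
        document["languageCode"] = language_code
    return {"encodingType": encoding_type, "document": document}


body = build_analyze_entities_body(
    "President Trump will speak from the White House, located at "
    "1600 Pennsylvania Ave NW, Washington, DC, on October 7."
)
# json.dumps produces strict JSON (double-quoted keys and strings).
print(json.dumps(body, indent=2))
```

The serialized output could then be passed to curl with --data or sent with any HTTP client.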

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format:

{
  "entities": [
    {
      "name": "October 7",
      "type": "DATE",
      "metadata": {
        "month": "10",
        "day": "7"
      },
      "mentions": [
        {
          "text": {
            "content": "October 7",
            "beginOffset": -1
          },
          "type": "TYPE_UNKNOWN",
          "probability": 1
        }
      ]
    },
    {
      "name": "1600",
      "type": "NUMBER",
      "metadata": {
        "value": "1600"
      },
      "mentions": [
        {
          "text": {
            "content": "1600",
            "beginOffset": -1
          },
          "type": "TYPE_UNKNOWN",
          "probability": 1
        }
      ]
    },
    {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": -1
          },
          "type": "TYPE_UNKNOWN",
          "probability": 1
        }
      ]
    },
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "locality": "Washington",
        "narrow_region": "District of Columbia",
        "street_name": "Pennsylvania Avenue Northwest",
        "street_number": "1600",
        "broad_region": "District of Columbia",
        "country": "US"
      },
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": -1
          },
          "type": "TYPE_UNKNOWN",
          "probability": 1
        }
      ]
    },
    {
      "name": "1600 Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {},
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW",
            "beginOffset": -1
          },
          "type": "PROPER",
          "probability": 0.901
        }
      ]
    },
    {
      "name": "President",
      "type": "PERSON",
      "metadata": {},
      "mentions": [
        {
          "text": {
            "content": "President",
            "beginOffset": -1
          },
          "type": "COMMON",
          "probability": 0.941
        }
      ]
    },
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {},
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": -1
          },
          "type": "PROPER",
          "probability": 0.948
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {},
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": -1
          },
          "type": "PROPER",
          "probability": 0.92
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {},
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": -1
          },
          "type": "PROPER",
          "probability": 0.785
        }
      ]
    }
  ],
  "languageCode": "en",
  "languageSupported": true
}

The entities array contains Entity objects representing the detected entities. Each object includes information such as the entity's name and type.
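To illustrate how the entities array might be consumed, here is a small Python sketch that uses only the standard library. It parses a trimmed excerpt of the sample response above, groups entity names by type, and filters mentions by their probability. The function names and the 0.9 threshold are illustrative choices, not part of the API.

```python
import json
from collections import defaultdict

# Trimmed excerpt of the sample response shown above (three of the entities).
SAMPLE_RESPONSE = """
{
  "entities": [
    {"name": "Trump", "type": "PERSON", "metadata": {},
     "mentions": [{"text": {"content": "Trump", "beginOffset": -1},
                   "type": "PROPER", "probability": 0.948}]},
    {"name": "Washington, DC", "type": "LOCATION", "metadata": {},
     "mentions": [{"text": {"content": "Washington, DC", "beginOffset": -1},
                   "type": "PROPER", "probability": 0.92}]},
    {"name": "White House", "type": "LOCATION", "metadata": {},
     "mentions": [{"text": {"content": "White House", "beginOffset": -1},
                   "type": "PROPER", "probability": 0.785}]}
  ],
  "languageCode": "en",
  "languageSupported": true
}
"""


def group_entities_by_type(response):
    """Return a dict mapping each entity type to the entity names of that type."""
    grouped = defaultdict(list)
    for entity in response.get("entities", []):
        grouped[entity.get("type", "UNKNOWN")].append(entity.get("name", ""))
    return dict(grouped)


def confident_mentions(response, min_probability=0.9):
    """Return (entity name, probability) pairs for mentions at or above the threshold."""
    results = []
    for entity in response.get("entities", []):
        for mention in entity.get("mentions", []):
            probability = mention.get("probability", 0.0)
            if probability >= min_probability:
                results.append((entity.get("name", ""), probability))
    return results


response = json.loads(SAMPLE_RESPONSE)
print(group_entities_by_type(response))
print(confident_mentions(response))
```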

gcloud

For details, see the analyze-entities command.

To perform entity analysis, use the gcloud CLI and specify the content to analyze with the --content flag:

gcloud ml language analyze-entities --content="President Trump will speak from the White House, located at 1600 Pennsylvania Ave NW, Washington, DC, on October 7."

If the request is successful, the server returns a response in JSON format:

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    },
    {
      "name": "1600",
      "type": "NUMBER",
      "metadata": {
        "value": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    },
    {
      "name": "October 7",
      "type": "DATE",
      "metadata": {
        "day": "7",
        "month": "10"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "October 7",
            "beginOffset": 105
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    },
    {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}
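The gcloud output above carries a salience value per entity and, for some entities, a wikipedia_url in the metadata. As a rough Python sketch of how such output might be post-processed, the snippet below ranks entities by salience. It assumes that a larger salience value marks a more prominent entity, which the text above does not state. The function name rank_by_salience is hypothetical, and the sample is an abridged excerpt of the output above with mentions omitted.

```python
import json

# Abridged excerpt of the gcloud output shown above (mentions omitted).
SAMPLE_OUTPUT = """
{
  "entities": [
    {"name": "Trump", "type": "PERSON",
     "metadata": {"mid": "/m/0cqt90",
                  "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"},
     "salience": 0.7936003},
    {"name": "White House", "type": "LOCATION",
     "metadata": {"mid": "/m/081sq",
                  "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"},
     "salience": 0.09172433},
    {"name": "Pennsylvania Ave NW", "type": "LOCATION",
     "metadata": {"mid": "/g/1tgb87cq"},
     "salience": 0.085507184},
    {"name": "1600", "type": "NUMBER",
     "metadata": {"value": "1600"},
     "salience": 0}
  ],
  "language": "en"
}
"""


def rank_by_salience(response, limit=3):
    """Return up to `limit` (name, salience, wikipedia_url) tuples, highest salience first."""
    entities = sorted(
        response.get("entities", []),
        key=lambda entity: entity.get("salience", 0.0),
        reverse=True,
    )
    return [
        (
            entity.get("name", ""),
            entity.get("salience", 0.0),
            # wikipedia_url is absent for some entities; None is returned then.
            entity.get("metadata", {}).get("wikipedia_url"),
        )
        for entity in entities[:limit]
    ]


for name, salience, url in rank_by_salience(json.loads(SAMPLE_OUTPUT)):
    print(f"{name}: salience={salience}, wikipedia_url={url}")
```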

Go

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Go API reference documentation.

To authenticate to Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	language "cloud.google.com/go/language/apiv2"
	"cloud.google.com/go/language/apiv2/languagepb"
)

// analyzeEntities sends a string of text to the Cloud Natural Language API to
// detect the entities of the text.
func analyzeEntities(w io.Writer, text string) error {
	ctx := context.Background()

	// Initialize client.
	client, err := language.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	resp, err := client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_Content{
				Content: text,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})

	if err != nil {
		return fmt.Errorf("AnalyzeEntities: %w", err)
	}
	fmt.Fprintf(w, "Response: %q\n", resp)

	return nil
}

Java

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Java API reference documentation.

// Instantiate the Language client com.google.cloud.language.v2.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  Document doc = Document.newBuilder().setContent(text).setType(Type.PLAIN_TEXT).build();
  AnalyzeEntitiesRequest request =
      AnalyzeEntitiesRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(EncodingType.UTF16)
          .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s\n", entity.getName());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s\n", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n", mention.getType());
      System.out.printf("Probability: %s\n\n", mention.getProbability());
    }
    }
  }
}

Node.js

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Node.js API reference documentation.

// Imports the Google Cloud client library
const language = require('@google-cloud/language').v2;

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following line to run this code.
 */
// const text = 'Your text to analyze, e.g. Hello, world!';

// Prepares a document, representing the provided text
const document = {
  content: text,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
const [result] = await client.analyzeEntities({document});

const entities = result.entities;

console.log('Entities:');
entities.forEach(entity => {
  console.log(entity.name);
  console.log(` - Type: ${entity.type}`);
  if (entity.metadata) {
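    // Serialize the metadata map; interpolating the object directly prints "[object Object]".
    console.log(` - Metadata: ${JSON.stringify(entity.metadata)}`);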
  }
});

Python

To learn how to install and use the client library for Natural Language, see Natural Language client libraries. For more information, see the Natural Language Python API reference documentation.

from google.cloud import language_v2


def sample_analyze_entities(text_content: str = "California is a state.") -> None:
    """
    Analyzes Entities in a string.

    Args:
      text_content: The text content to analyze
    """

    client = language_v2.LanguageServiceClient()

    # Available types: PLAIN_TEXT, HTML
    document_type_in_plain_text = language_v2.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language_code = "en"
    document = {
        "content": text_content,
        "type_": document_type_in_plain_text,
        "language_code": language_code,
    }

    # Available values: NONE, UTF8, UTF16, UTF32.
    # See https://cloud.google.com/natural-language/docs/reference/rest/v2/EncodingType.
    encoding_type = language_v2.EncodingType.UTF8

    response = client.analyze_entities(
        request={"document": document, "encoding_type": encoding_type}
    )

    for entity in response.entities:
        print(f"Representative name for the entity: {entity.name}")

        # Get entity type, e.g. PERSON, LOCATION, ADDRESS, NUMBER, et al.
        # See https://cloud.google.com/natural-language/docs/reference/rest/v2/Entity#type.
        print(f"Entity type: {language_v2.Entity.Type(entity.type_).name}")

        # Loop over the metadata associated with entity.
        # Some entity types may have additional metadata, e.g. ADDRESS entities
        # may have metadata for the address street_name, postal_code, et al.
        for metadata_name, metadata_value in entity.metadata.items():
            print(f"{metadata_name}: {metadata_value}")

        # Loop over the mentions of this entity in the input document.
        # The API currently supports proper noun mentions.
        for mention in entity.mentions:
            print(f"Mention text: {mention.text.content}")

            # Get the mention type, e.g. PROPER for proper noun
            print(f"Mention type: {language_v2.EntityMention.Type(mention.type_).name}")

            # Get the probability score associated with the first mention of the entity in the (0, 1.0] range.
            print(f"Probability score: {mention.probability}")

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(f"Language of the text: {response.language_code}")

Other languages

C#: Follow the C# setup instructions on the client libraries page, and then see the Natural Language reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then see the Natural Language reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then see the Natural Language reference documentation for Ruby.

Analyzing entities from Cloud Storage

The Natural Language API can perform entity analysis directly on a file stored in Cloud Storage. You don't need to send the contents of the file in the body of your request.

Here is an example of performing entity analysis on a file located in Cloud Storage.
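As a rough sketch of how such a request body might differ from the inline-text case, the Python snippet below points the document at a Cloud Storage object instead of embedding its content. It assumes the Document resource accepts a gcsContentUri field in place of content. The helper name, bucket, and object path are hypothetical placeholders.

```python
import json


def build_gcs_analyze_entities_body(gcs_uri, encoding_type="UTF8"):
    """Build a request body that references a Cloud Storage file.

    Hypothetical helper; gcsContentUri is assumed to be the Document field
    that replaces the inline content field.
    """
    if not gcs_uri.startswith("gs://"):
        raise ValueError("Expected a Cloud Storage URI of the form gs://bucket/object")
    return {
        "encodingType": encoding_type,
        "document": {
            "type": "PLAIN_TEXT",
            # The file content is not included in the request body.
            "gcsContentUri": gcs_uri,
        },
    }


# Placeholder bucket and object names, for illustration only.
body = build_gcs_analyze_entities_body("gs://example-bucket/example-file.txt")
print(json.dumps(body, indent=2))
```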