항목 분석

항목 분석은 주어진 텍스트에서 알려진 항목(유명 인사, 명소 등의 고유 명사)을 검사하고 항목의 정보를 반환합니다. 항목 분석은 analyzeEntities 메서드를 통해 수행됩니다. Natural Language가 식별하는 엔티티 유형에 대한 정보는 Entity 문서를 참조하십시오. Natural Language가 지원하는 언어에 대한 정보는 언어 지원을 참조하십시오.

이 섹션에서는 문서에서 항목을 감지하는 몇 가지 방법을 보여줍니다.

문자열 내 항목 분석

다음은 Natural Language에 직접 전송된 텍스트 문자열에 대한 항목 분석을 수행하는 예시입니다.

프로토콜

문서의 항목을 분석하려면 documents:analyzeEntities REST 메서드에 POST 요청을 하고 다음 예시와 같이 적절한 요청 본문을 제공해야 합니다.

이 예시에서는 gcloud auth application-default print-access-token 명령어를 사용하여 Google Cloud Platform Cloud SDK를 사용하는 프로젝트용으로 설정된 서비스 계정에 대한 액세스 토큰을 얻습니다. Cloud SDK 설치 및 서비스 계정을 통한 프로젝트 설정 지침을 보려면 빠른 시작을 참조하세요.

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'encodingType': 'UTF8',
  'document': {
    'type': 'PLAIN_TEXT',
    'content': 'President Trump will speak from the White House, located
  at 1600 Pennsylvania Ave NW, Washington, DC, on October 7.'
  }
}" "https://language.googleapis.com/v1/documents:analyzeEntities"

document.language를 지정하지 않으면 언어가 자동으로 감지됩니다. Natural Language에서 지원되는 언어에 대한 자세한 내용은 언어 지원을 참조하세요. 요청 본문 구성에 대한 자세한 내용은 Document 참조 문서를 확인하세요.

요청이 성공하면 서버가 200 OK HTTP 상태 코드와 응답을 JSON 형식으로 반환합니다.

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}

entities 배열에는 항목 이름 및 유형과 같은 정보를 담고 있으며 감지된 항목을 나타내는 Entity 객체가 포함되어 있습니다.

gcloud 명령어

자세한 내용은 analyze-entities 명령어를 참조하세요.

항목 분석을 수행하려면 gcloud 명령줄 도구와 --content 플래그를 사용하여 분석할 콘텐츠를 식별합니다.

gcloud ml language analyze-entities --content="President Trump will speak from the White House, located
  at 1600 Pennsylvania Ave NW, Washington, DC, on October 7."

요청이 성공하면 서버는 JSON 형식의 응답을 반환합니다.

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
       "name": "7",
       "type": "NUMBER",
       "metadata": {
         "value": "7"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "7",
             "beginOffset": 113
           },
         "type": "TYPE_UNKNOWN"
         }
        ]
     }
  ],
  "language": "en"
}

entities 배열에는 항목 이름 및 유형과 같은 정보를 담고 있으며 감지된 항목을 나타내는 Entity 객체가 포함되어 있습니다.

C#

private static void AnalyzeEntitiesFromText(string text)
{
    var client = LanguageServiceClient.Create();
    var response = client.AnalyzeEntities(new Document()
    {
        Content = text,
        Type = Document.Types.Type.PlainText
    });
    WriteEntities(response.Entities);
}

private static void WriteEntities(IEnumerable<Entity> entities)
{
    Console.WriteLine("Entities:");
    foreach (var entity in entities)
    {
        Console.WriteLine($"\tName: {entity.Name}");
        Console.WriteLine($"\tType: {entity.Type}");
        Console.WriteLine($"\tSalience: {entity.Salience}");
        Console.WriteLine("\tMentions:");
        foreach (var mention in entity.Mentions)
            Console.WriteLine($"\t\t{mention.Text.BeginOffset}: {mention.Text.Content}");
        Console.WriteLine("\tMetadata:");
        foreach (var keyval in entity.Metadata)
        {
            Console.WriteLine($"\t\t{keyval.Key}: {keyval.Value}");
        }
    }
}

Go


func analyzeEntities(ctx context.Context, client *language.Client, text string) (*languagepb.AnalyzeEntitiesResponse, error) {
	return client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_Content{
				Content: text,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

자바

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  Document doc = Document.newBuilder().setContent(text).setType(Type.PLAIN_TEXT).build();
  AnalyzeEntitiesRequest request =
      AnalyzeEntitiesRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(EncodingType.UTF16)
          .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s", entity.getName());
    System.out.printf("Salience: %.3f\n", entity.getSalience());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n\n", mention.getType());
    }
  }
}

Node.js

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following line to run this code.
 */
// const text = 'Your text to analyze, e.g. Hello, world!';

// Prepares a document, representing the provided text
const document = {
  content: text,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
const [result] = await client.analyzeEntities({document});

const entities = result.entities;

console.log('Entities:');
entities.forEach(entity => {
  console.log(entity.name);
  console.log(` - Type: ${entity.type}, Salience: ${entity.salience}`);
  if (entity.metadata && entity.metadata.wikipedia_url) {
    console.log(` - Wikipedia URL: ${entity.metadata.wikipedia_url}`);
  }
});

PHP

use Google\Cloud\Language\V1\Document;
use Google\Cloud\Language\V1\Document\Type;
use Google\Cloud\Language\V1\LanguageServiceClient;
use Google\Cloud\Language\V1\Entity\Type as EntityType;

/** Uncomment and populate these variables in your code */
// $text = 'The text to analyze.';

// Create the Natural Language client
$languageServiceClient = new LanguageServiceClient();
try {
    // Create a new Document, add text as content and set type to PLAIN_TEXT
    $document = (new Document())
        ->setContent($text)
        ->setType(Type::PLAIN_TEXT);

    // Call the analyzeEntities function
    $response = $languageServiceClient->analyzeEntities($document, []);
    $entities = $response->getEntities();
    // Print out information about each entity
    foreach ($entities as $entity) {
        printf('Name: %s' . PHP_EOL, $entity->getName());
        printf('Type: %s' . PHP_EOL, EntityType::name($entity->getType()));
        printf('Salience: %s' . PHP_EOL, $entity->getSalience());
        if ($entity->getMetadata()->offsetExists('wikipedia_url')) {
            printf('Wikipedia URL: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('wikipedia_url'));
        }
        if ($entity->getMetadata()->offsetExists('mid')) {
            printf('Knowledge Graph MID: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('mid'));
        }
        printf(PHP_EOL);
    }
} finally {
    $languageServiceClient->close();
}

Python

from google.cloud import language_v1
from google.cloud.language_v1 import enums

def sample_analyze_entities(text_content):
    """
    Analyzing Entities in a String

    Args:
      text_content The text content to analyze
    """

    client = language_v1.LanguageServiceClient()

    # text_content = 'California is a state.'

    # Available types: PLAIN_TEXT, HTML
    type_ = enums.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"content": text_content, "type": type_, "language": language}

    # Available values: NONE, UTF8, UTF16, UTF32
    encoding_type = enums.EncodingType.UTF8

    response = client.analyze_entities(document, encoding_type=encoding_type)

    # Loop through entitites returned from the API
    for entity in response.entities:
        print(u"Representative name for the entity: {}".format(entity.name))

        # Get entity type, e.g. PERSON, LOCATION, ADDRESS, NUMBER, et al
        print(u"Entity type: {}".format(enums.Entity.Type(entity.type).name))

        # Get the salience score associated with the entity in the [0, 1.0] range
        print(u"Salience score: {}".format(entity.salience))

        # Loop over the metadata associated with entity. For many known entities,
        # the metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid).
        # Some entity types may have additional metadata, e.g. ADDRESS entities
        # may have metadata for the address street_name, postal_code, et al.
        for metadata_name, metadata_value in entity.metadata.items():
            print(u"{}: {}".format(metadata_name, metadata_value))

        # Loop over the mentions of this entity in the input document.
        # The API currently supports proper noun mentions.
        for mention in entity.mentions:
            print(u"Mention text: {}".format(mention.text.content))

            # Get the mention type, e.g. PROPER for proper noun
            print(
                u"Mention type: {}".format(enums.EntityMention.Type(mention.type).name)
            )

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(u"Language of the text: {}".format(response.language))

Ruby

# text_content = "Text to extract entities from"

require "google/cloud/language"

language = Google::Cloud::Language.language_service

document = { content: text_content, type: :PLAIN_TEXT }
response = language.analyze_entities document: document

entities = response.entities

entities.each do |entity|
  puts "Entity #{entity.name} #{entity.type}"

  puts "URL: #{entity.metadata['wikipedia_url']}" if entity.metadata["wikipedia_url"]
end

Google Cloud Storage에서 항목 분석하기

편의를 위해 Natural Language는 요청 본문에 파일 내용을 보내지 않고도 Google Cloud Storage에 있는 파일에서 직접 엔티티 분석을 수행할 수 있습니다.

다음은 Cloud Storage에 있는 파일에서 항목 분석을 수행하는 예입니다.

프로토콜

Google Cloud Storage에 저장된 문서에서 항목을 분석하려면 documents:analyzeEntities REST 메서드에 POST 요청을 하고 다음 예시와 같이 적절한 요청 본문 및 문서 경로를 제공해야 합니다.

curl -X POST \
     -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
     -H "Content-Type: application/json; charset=utf-8" \
     --data "{
  'document':{
    'type':'PLAIN_TEXT',
    'gcsContentUri':'gs://<bucket-name>/<object-name>'
  }
}" "https://language.googleapis.com/v1/documents:analyzeEntities"

document.language를 지정하지 않으면 언어가 자동으로 감지됩니다. Natural Language에서 지원되는 언어에 대한 자세한 내용은 언어 지원을 참조하세요. 요청 본문 구성에 대한 자세한 내용은 Document 참조 문서를 확인하세요.

요청이 성공하면 서버가 200 OK HTTP 상태 코드와 응답을 JSON 형식으로 반환합니다.

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}

entities 배열에는 항목 이름 및 유형과 같은 정보를 담고 있으며 감지된 항목을 나타내는 Entity 객체가 포함되어 있습니다.

gcloud 명령어

자세한 내용은 analyze-entities 명령어를 참조하세요.

Google Cloud Storage의 파일에 대한 항목 분석을 수행하려면 gcloud 명령줄 도구와 --content-file 플래그를 사용하여 분석할 콘텐츠가 포함된 파일 경로를 식별합니다.

gcloud ml language analyze-entities --content-file=gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME

요청이 성공하면 서버는 JSON 형식의 응답을 반환합니다.

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    }
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
      }
    }
    {
      "name": "1600",
       "type": "NUMBER",
       "metadata": {
           "value": "1600"
       },
       "salience": 0,
       "mentions": [
         {
          "text": {
              "content": "1600",
              "beginOffset": 60
           },
           "type": "TYPE_UNKNOWN"
        }
     ]
     },
     {
       "name": "October 7",
       "type": "DATE",
       "metadata": {
         "day": "7",
         "month": "10"
       },
       "salience": 0,
       "mentions": [
         {
           "text": {
             "content": "October 7",
             "beginOffset": 105
            },
           "type": "TYPE_UNKNOWN"
         }
       ]
     }
     {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}

entities 배열에는 항목 이름 및 유형과 같은 정보를 담고 있으며 감지된 항목을 나타내는 Entity 객체가 포함되어 있습니다.

C#

private static void AnalyzeEntitiesFromFile(string gcsUri)
{
    var client = LanguageServiceClient.Create();
    var response = client.AnalyzeEntities(new Document()
    {
        GcsContentUri = gcsUri,
        Type = Document.Types.Type.PlainText
    });
    WriteEntities(response.Entities);
}
private static void WriteEntities(IEnumerable<Entity> entities)
{
    Console.WriteLine("Entities:");
    foreach (var entity in entities)
    {
        Console.WriteLine($"\tName: {entity.Name}");
        Console.WriteLine($"\tType: {entity.Type}");
        Console.WriteLine($"\tSalience: {entity.Salience}");
        Console.WriteLine("\tMentions:");
        foreach (var mention in entity.Mentions)
            Console.WriteLine($"\t\t{mention.Text.BeginOffset}: {mention.Text.Content}");
        Console.WriteLine("\tMetadata:");
        foreach (var keyval in entity.Metadata)
        {
            Console.WriteLine($"\t\t{keyval.Key}: {keyval.Value}");
        }
    }
}

Go


func analyzeEntitiesFromGCS(ctx context.Context, gcsURI string) (*languagepb.AnalyzeEntitiesResponse, error) {
	return client.AnalyzeEntities(ctx, &languagepb.AnalyzeEntitiesRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_GcsContentUri{
				GcsContentUri: gcsURI,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
		EncodingType: languagepb.EncodingType_UTF8,
	})
}

자바

// Instantiate the Language client com.google.cloud.language.v1.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  // set the GCS Content URI path to the file to be analyzed
  Document doc =
      Document.newBuilder().setGcsContentUri(gcsUri).setType(Type.PLAIN_TEXT).build();
  AnalyzeEntitiesRequest request =
      AnalyzeEntitiesRequest.newBuilder()
          .setDocument(doc)
          .setEncodingType(EncodingType.UTF16)
          .build();

  AnalyzeEntitiesResponse response = language.analyzeEntities(request);

  // Print the response
  for (Entity entity : response.getEntitiesList()) {
    System.out.printf("Entity: %s\n", entity.getName());
    System.out.printf("Salience: %.3f\n", entity.getSalience());
    System.out.println("Metadata: ");
    for (Map.Entry<String, String> entry : entity.getMetadataMap().entrySet()) {
      System.out.printf("%s : %s", entry.getKey(), entry.getValue());
    }
    for (EntityMention mention : entity.getMentionsList()) {
      System.out.printf("Begin offset: %d\n", mention.getText().getBeginOffset());
      System.out.printf("Content: %s\n", mention.getText().getContent());
      System.out.printf("Type: %s\n\n", mention.getType());
    }
  }
}

Node.js

// Imports the Google Cloud client library
const language = require('@google-cloud/language');

// Creates a client
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following lines to run this code
 */
// const bucketName = 'Your bucket name, e.g. my-bucket';
// const fileName = 'Your file name, e.g. my-file.txt';

// Prepares a document, representing a text file in Cloud Storage
const document = {
  gcsContentUri: `gs://${bucketName}/${fileName}`,
  type: 'PLAIN_TEXT',
};

// Detects entities in the document
const [result] = await client.analyzeEntities({document});
const entities = result.entities;

console.log('Entities:');
entities.forEach(entity => {
  console.log(entity.name);
  console.log(` - Type: ${entity.type}, Salience: ${entity.salience}`);
  if (entity.metadata && entity.metadata.wikipedia_url) {
    console.log(` - Wikipedia URL: ${entity.metadata.wikipedia_url}`);
  }
});

PHP

use Google\Cloud\Language\V1\Document;
use Google\Cloud\Language\V1\Document\Type;
use Google\Cloud\Language\V1\LanguageServiceClient;
use Google\Cloud\Language\V1\Entity\Type as EntityType;

/** Uncomment and populate these variables in your code */
// $uri = 'The cloud storage object to analyze (gs://your-bucket-name/your-object-name)';

// Create the Natural Language client
$languageServiceClient = new LanguageServiceClient();
try {
    // Create a new Document, pass GCS URI and set type to PLAIN_TEXT
    $document = (new Document())
        ->setGcsContentUri($uri)
        ->setType(Type::PLAIN_TEXT);

    // Call the analyzeEntities function
    $response = $languageServiceClient->analyzeEntities($document, []);
    $entities = $response->getEntities();
    // Print out information about each entity
    foreach ($entities as $entity) {
        printf('Name: %s' . PHP_EOL, $entity->getName());
        printf('Type: %s' . PHP_EOL, EntityType::name($entity->getType()));
        printf('Salience: %s' . PHP_EOL, $entity->getSalience());
        if ($entity->getMetadata()->offsetExists('wikipedia_url')) {
            printf('Wikipedia URL: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('wikipedia_url'));
        }
        if ($entity->getMetadata()->offsetExists('mid')) {
            printf('Knowledge Graph MID: %s' . PHP_EOL, $entity->getMetadata()->offsetGet('mid'));
        }
        printf(PHP_EOL);
    }
} finally {
    $languageServiceClient->close();
}

Python

from google.cloud import language_v1
from google.cloud.language_v1 import enums

def sample_analyze_entities(gcs_content_uri):
    """
    Analyzing Entities in text file stored in Cloud Storage

    Args:
      gcs_content_uri Google Cloud Storage URI where the file content is located.
      e.g. gs://[Your Bucket]/[Path to File]
    """

    client = language_v1.LanguageServiceClient()

    # gcs_content_uri = 'gs://cloud-samples-data/language/entity.txt'

    # Available types: PLAIN_TEXT, HTML
    type_ = enums.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"gcs_content_uri": gcs_content_uri, "type": type_, "language": language}

    # Available values: NONE, UTF8, UTF16, UTF32
    encoding_type = enums.EncodingType.UTF8

    response = client.analyze_entities(document, encoding_type=encoding_type)
    # Loop through entitites returned from the API
    for entity in response.entities:
        print(u"Representative name for the entity: {}".format(entity.name))
        # Get entity type, e.g. PERSON, LOCATION, ADDRESS, NUMBER, et al
        print(u"Entity type: {}".format(enums.Entity.Type(entity.type).name))
        # Get the salience score associated with the entity in the [0, 1.0] range
        print(u"Salience score: {}".format(entity.salience))
        # Loop over the metadata associated with entity. For many known entities,
        # the metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid).
        # Some entity types may have additional metadata, e.g. ADDRESS entities
        # may have metadata for the address street_name, postal_code, et al.
        for metadata_name, metadata_value in entity.metadata.items():
            print(u"{}: {}".format(metadata_name, metadata_value))

        # Loop over the mentions of this entity in the input document.
        # The API currently supports proper noun mentions.
        for mention in entity.mentions:
            print(u"Mention text: {}".format(mention.text.content))
            # Get the mention type, e.g. PROPER for proper noun
            print(
                u"Mention type: {}".format(enums.EntityMention.Type(mention.type).name)
            )

    # Get the language of the text, which will be the same as
    # the language specified in the request or, if not specified,
    # the automatically-detected language.
    print(u"Language of the text: {}".format(response.language))

Ruby

# storage_path = "Path to file in Google Cloud Storage, eg. gs://bucket/file"

require "google/cloud/language"

language = Google::Cloud::Language.language_service

document = { gcs_content_uri: storage_path, type: :PLAIN_TEXT }
response = language.analyze_entities document: document

entities = response.entities

entities.each do |entity|
  puts "Entity #{entity.name} #{entity.type}"

  puts "URL: #{entity.metadata['wikipedia_url']}" if entity.metadata["wikipedia_url"]
end