实体分析检测已知实体(如公众人物、地标等专用名词)的给定文本,并返回这些实体的相关信息。实体分析是通过 analyzeEntities
方法执行的。如需了解 Natural Language 标识的实体类型的信息,请参见 Entity 文档。如需了解 Natural Language API 支持的语言,请参阅语言支持。
本部分介绍几种在文档中检测实体的方法。 您必须针对每个文档分别提交请求。
分析字符串中的实体
下面的示例说明如何对直接发送至 Natural Language API 的文本字符串进行实体分析:
协议
如需分析文档中的实体,请按照下面示例中所示,向 documents:analyzeEntities
REST 方法发出 POST
请求,并提供相应的请求正文。
该示例使用 gcloud auth application-default print-access-token
命令获取通过 Google Cloud Platform gcloud CLI 为项目设置的服务账号的访问令牌。如需了解有关安装 gcloud CLI 以及使用服务账号设置项目的说明,请参阅快速入门。
curl -X POST \ -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \ -H "Content-Type: application/json; charset=utf-8" \ --data "{ 'encodingType': 'UTF8', 'document': { 'type': 'PLAIN_TEXT', 'content': 'President Trump will speak from the White House, located at 1600 Pennsylvania Ave NW, Washington, DC, on October 7.' } }" "https://language.googleapis.com/v2/documents:analyzeEntities"
如果您不指定 document.language_code
,则系统将自动检测语言。有关 Natural Language API 支持哪些语言的信息,请参阅语言支持。如需详细了解如何配置请求正文,请参阅 Document
参考文档。
如果请求成功,服务器将返回一个 200 OK
HTTP 状态代码以及 JSON 格式的响应:
{ "entities": [ { "name": "October 7", "type": "DATE", "metadata": { "month": "10", "day": "7" }, "mentions": [ { "text": { "content": "October 7", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600", "type": "NUMBER", "metadata": { "value": "1600" }, "mentions": [ { "text": { "content": "1600", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "7", "type": "NUMBER", "metadata": { "value": "7" }, "mentions": [ { "text": { "content": "7", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600 Pennsylvania Ave NW, Washington, DC", "type": "ADDRESS", "metadata": { "locality": "Washington", "narrow_region": "District of Columbia", "street_name": "Pennsylvania Avenue Northwest", "street_number": "1600", "broad_region": "District of Columbia", "country": "US" }, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW, Washington, DC", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600 Pennsylvania Ave NW", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW", "beginOffset": -1 }, "type": "PROPER", "probability": 0.901 } ] }, { "name": "President", "type": "PERSON", "metadata": {}, "mentions": [ { "text": { "content": "President", "beginOffset": -1 }, "type": "COMMON", "probability": 0.941 } ] }, { "name": "Trump", "type": "PERSON", "metadata": {}, "mentions": [ { "text": { "content": "Trump", "beginOffset": -1 }, "type": "PROPER", "probability": 0.948 } ] }, { "name": "Washington, DC", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "Washington, DC", "beginOffset": -1 }, "type": "PROPER", "probability": 0.92 } ] }, { "name": "White House", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "White House", "beginOffset": -1 }, "type": "PROPER", "probability": 0.785 } ] } ], "languageCode": "en", "languageSupported": true }
entities
数组包含代表检测到的实体的 Entity
对象,其中包括实体名称和类型等信息。
gcloud
如需查看完整的详细信息,请参阅 analyze-entities
命令。
如需执行实体分析,请使用 gcloud CLI 并使用 --content
标志来标识要分析的内容:
gcloud ml language analyze-entities --content="President Trump will speak from the White House, located at 1600 Pennsylvania Ave NW, Washington, DC, on October 7."
如果请求成功,则服务器返回 JSON 格式的响应:
{ "entities": [ { "name": "Trump", "type": "PERSON", "metadata": { "mid": "/m/0cqt90", "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump" }, "salience": 0.7936003, "mentions": [ { "text": { "content": "Trump", "beginOffset": 10 }, "type": "PROPER" }, { "text": { "content": "President", "beginOffset": 0 }, "type": "COMMON" } ] }, { "name": "White House", "type": "LOCATION", "metadata": { "mid": "/m/081sq", "wikipedia_url": "https://en.wikipedia.org/wiki/White_House" }, "salience": 0.09172433, "mentions": [ { "text": { "content": "White House", "beginOffset": 36 }, "type": "PROPER" } ] }, { "name": "Pennsylvania Ave NW", "type": "LOCATION", "metadata": { "mid": "/g/1tgb87cq" }, "salience": 0.085507184, "mentions": [ { "text": { "content": "Pennsylvania Ave NW", "beginOffset": 65 }, "type": "PROPER" } ] }, { "name": "Washington, DC", "type": "LOCATION", "metadata": { "mid": "/m/0rh6k", "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C." }, "salience": 0.029168168, "mentions": [ { "text": { "content": "Washington, DC", "beginOffset": 86 }, "type": "PROPER" } ] } { "name": "1600 Pennsylvania Ave NW, Washington, DC", "type": "ADDRESS", "metadata": { "country": "US", "sublocality": "Fort Lesley J. McNair", "locality": "Washington", "street_name": "Pennsylvania Avenue Northwest", "broad_region": "District of Columbia", "narrow_region": "District of Columbia", "street_number": "1600" }, "salience": 0, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW, Washington, DC", "beginOffset": 60 }, "type": "TYPE_UNKNOWN" } ] } } { "name": "1600", "type": "NUMBER", "metadata": { "value": "1600" }, "salience": 0, "mentions": [ { "text": { "content": "1600", "beginOffset": 60 }, "type": "TYPE_UNKNOWN" } ] }, { "name": "October 7", "type": "DATE", "metadata": { "day": "7", "month": "10" }, "salience": 0, "mentions": [ { "text": { "content": "October 7", "beginOffset": 105 }, "type": "TYPE_UNKNOWN" } ] } { "name": "7", "type": "NUMBER", "metadata": { "value": "7" }, "salience": 0, "mentions": [ { "text": { "content": "7", "beginOffset": 113 }, "type": "TYPE_UNKNOWN" } ] } ], "language": "en" }
entities
数组包含代表检测到的实体的 Entity
对象,其中包括实体名称和类型等信息。
Go
如需了解如何安装和使用 Natural Language 客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Go API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
Java
如需了解如何安装和使用 Natural Language 客户端库,请参阅 Natural Language 客户端库。 如需了解详情,请参阅 Natural Language Java API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
Node.js
如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 如需了解详情,请参阅 Natural Language Node.js API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
Python
如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Python API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
其他语言
C#: 请遵循 C# 设置说明 在客户端库页面上 然后访问 适用于 .NET 的 Natural Language 参考文档。
PHP:请按照客户端库页面上的 PHP 设置说明操作,然后访问 PHP 版 Natural Language 参考文档。
Ruby: 请遵循 Ruby 设置说明 在客户端库页面上 然后访问 Ruby 版 Natural Language 参考文档。
分析 Cloud Storage 中的实体
为您方便起见,Natural Language API 可以直接对位于 Cloud Storage 的文件执行实体分析,而无需在请求正文中发送文件内容。
以下示例介绍如何对 Cloud Storage 中的文件执行实体分析。
协议
如需分析 Cloud Storage 中存储的文档的实体,请向 documents:analyzeEntities
REST 方法发出 POST
请求,并提供带有文档路径的相应请求正文,如以下示例所示。
curl -X POST \ -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \ -H "Content-Type: application/json; charset=utf-8" \ --data "{ 'document':{ 'type':'PLAIN_TEXT', 'gcsContentUri':'gs://<bucket-name>/<object-name>' } }" "https://language.googleapis.com/v2/documents:analyzeEntities"
如果您不指定 document.language_code
,则系统将自动检测语言。有关 Natural Language API 支持哪些语言的信息,请参阅语言支持。如需详细了解如何配置请求正文,请参阅 Document
参考文档。
如果请求成功,服务器将返回一个 200 OK
HTTP 状态代码以及 JSON 格式的响应:
{ "entities": [ { "name": "October 7", "type": "DATE", "metadata": { "month": "10", "day": "7" }, "mentions": [ { "text": { "content": "October 7", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600", "type": "NUMBER", "metadata": { "value": "1600" }, "mentions": [ { "text": { "content": "1600", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "7", "type": "NUMBER", "metadata": { "value": "7" }, "mentions": [ { "text": { "content": "7", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600 Pennsylvania Ave NW, Washington, DC", "type": "ADDRESS", "metadata": { "locality": "Washington", "narrow_region": "District of Columbia", "street_name": "Pennsylvania Avenue Northwest", "street_number": "1600", "broad_region": "District of Columbia", "country": "US" }, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW, Washington, DC", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600 Pennsylvania Ave NW", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW", "beginOffset": -1 }, "type": "PROPER", "probability": 0.901 } ] }, { "name": "President", "type": "PERSON", "metadata": {}, "mentions": [ { "text": { "content": "President", "beginOffset": -1 }, "type": "COMMON", "probability": 0.941 } ] }, { "name": "Trump", "type": "PERSON", "metadata": {}, "mentions": [ { "text": { "content": "Trump", "beginOffset": -1 }, "type": "PROPER", "probability": 0.948 } ] }, { "name": "Washington, DC", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "Washington, DC", "beginOffset": -1 }, "type": "PROPER", "probability": 0.92 } ] }, { "name": "White House", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "White House", "beginOffset": -1 }, "type": "PROPER", "probability": 0.785 } ] } ], "languageCode": "en", "languageSupported": true }
entities
数组包含代表检测到的实体的 Entity
对象,其中包括实体名称和类型等信息。
gcloud
如需查看完整的详细信息,请参阅 analyze-entities
命令。
如需对 Cloud Storage 中的文件执行实体分析,请使用 gcloud
命令行工具并使用 --content-file
标志来标识包含待分析内容的文件路径:
gcloud ml language analyze-entities --content-file=gs://YOUR_BUCKET_NAME/YOUR_FILE_NAME
如果请求成功,则服务器返回 JSON 格式的响应:
{ "entities": [ { "name": "October 7", "type": "DATE", "metadata": { "month": "10", "day": "7" }, "mentions": [ { "text": { "content": "October 7", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600", "type": "NUMBER", "metadata": { "value": "1600" }, "mentions": [ { "text": { "content": "1600", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "7", "type": "NUMBER", "metadata": { "value": "7" }, "mentions": [ { "text": { "content": "7", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600 Pennsylvania Ave NW, Washington, DC", "type": "ADDRESS", "metadata": { "locality": "Washington", "narrow_region": "District of Columbia", "street_name": "Pennsylvania Avenue Northwest", "street_number": "1600", "broad_region": "District of Columbia", "country": "US" }, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW, Washington, DC", "beginOffset": -1 }, "type": "TYPE_UNKNOWN", "probability": 1 } ] }, { "name": "1600 Pennsylvania Ave NW", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "1600 Pennsylvania Ave NW", "beginOffset": -1 }, "type": "PROPER", "probability": 0.901 } ] }, { "name": "President", "type": "PERSON", "metadata": {}, "mentions": [ { "text": { "content": "President", "beginOffset": -1 }, "type": "COMMON", "probability": 0.941 } ] }, { "name": "Trump", "type": "PERSON", "metadata": {}, "mentions": [ { "text": { "content": "Trump", "beginOffset": -1 }, "type": "PROPER", "probability": 0.948 } ] }, { "name": "Washington, DC", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "Washington, DC", "beginOffset": -1 }, "type": "PROPER", "probability": 0.92 } ] }, { "name": "White House", "type": "LOCATION", "metadata": {}, "mentions": [ { "text": { "content": "White House", "beginOffset": -1 }, "type": "PROPER", "probability": 0.785 } ] } ], "languageCode": "en", "languageSupported": true }
entities
数组包含代表检测到的实体的 Entity
对象,其中包括实体名称和类型等信息。
Go
如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Go API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
Java
如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Java API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
Node.js
如需了解如何安装和使用 Natural Language 客户端库,请参阅 Natural Language 客户端库。 如需了解详情,请参阅 Natural Language Node.js API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
Python
如需了解如何安装和使用 Natural Language 客户端库,请参阅 Natural Language 客户端库。 如需了解详情,请参阅 Natural Language Python API 参考文档。
如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证。
其他语言
C#: 请遵循 C# 设置说明 在客户端库页面上 然后访问 适用于 .NET 的 Natural Language 参考文档。
PHP: 请遵循 PHP 设置说明 在客户端库页面上 然后访问 适用于 PHP 的 Natural Language 参考文档。
Ruby: 请遵循 Ruby 设置说明 在客户端库页面上 然后访问 Ruby 版 Natural Language 参考文档。