Using the Healthcare Natural Language API

This page explains how to enable the Healthcare Natural Language API, configure permissions, and call the analyzeEntities method to extract medical insights from medical text.

Overview

The Healthcare Natural Language API provides machine learning solutions for deriving insights from medical text. The Healthcare Natural Language API is part of the Cloud Healthcare API. For an overview of the Healthcare Natural Language API, see the Healthcare Natural Language API conceptual documentation.

The Healthcare Natural Language API parses unstructured medical text such as medical records or insurance claims. It then generates a structured data representation of the medical knowledge entities stored in these data sources for downstream analysis and automation. For example, you can:

  • Extract information about medical concepts like diseases, medications, medical devices, procedures, and their clinically relevant attributes
  • Map medical concepts to standard medical vocabularies such as RxNorm, ICD-10, and MeSH
  • Derive medical insights from text and integrate them with data analytics products in Google Cloud

Available locations

The Healthcare Natural Language API is available in the following locations:

Location name Location description
us-central1 Iowa, USA
europe-west4 Netherlands

Enabling the Healthcare Natural Language API

Before you begin using the Healthcare Natural Language API, you must enable the API for your Google Cloud project. You can use the Healthcare Natural Language API without enabling or using features of the Cloud Healthcare API.

To enable the API, complete the following steps:

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Set up authentication:
    1. In the Cloud Console, go to the Create service account key page.

      Go to the Create Service Account Key page
    2. From the Service account list, select New service account.
    3. In the Service account name field, enter a name.
    4. From the Role list, select Project > Owner.

    5. Click Create. A JSON file that contains your key downloads to your computer.
  5. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.

  6. Enable the Cloud Healthcare API.

    Enable the API

  7. Install and initialize the Cloud SDK.

Set up permissions

To use the features in this guide, you must have the healthcare.nlpservce.analyzeEntities permission, which is included in the healthcare.nlpServiceViewer role.

To assign this role, run the gcloud projects add-iam-policy-binding command:

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member serviceAccount:SERVICE_ACCOUNT_ID \
    --role roles/healthcare.nlpServiceViewer

Extracting entities, relations, and contextual attributes

The Healthcare Natural Language API uses context-aware models to extract medical entities, relations, and contextual attributes. Each text entity is extracted into a medical dictionary entry. To extract this level of medical insights from medical text, use the projects.locations.services.nlp.analyzeEntities method.

To extract medical insights from medical text using the Healthcare Natural Language API, make a POST request and specify the following information in the request:

  • The name of the parent service, including the project ID and location
  • The target text. The maximum size is 10,000 unicode characters.

curl

The following sample shows a POST request using curl:

curl -X POST \
   -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
   -H "Content-Type:application/json" \
   --data "{
    'nlpService':'projects/PROJECT_ID/locations/LOCATION/services/nlp',
    'documentContent':'Insulin regimen human 5 units IV administered.'
   }" \
   "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/services/nlp:analyzeEntities"

PowerShell

The following sample shows a POST request using Windows PowerShell:

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'nlpService':'projects/PROJECT_ID/locations/LOCATION/services/nlp',
    'documentContent':'Insulin regimen human 5 units IV administered.'
   }" `
   -Uri "https://healthcare.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/services/nlp:analyzeEntities"  | Select-Object -Expand Content

If the request is successful, the response includes the following information:

  • Recognized medical knowledge entities
  • Functional features
  • Relations between the recognized entities
  • Contextual attributes
  • Mappings of the medical knowledge entities into standard terminologies

For a list of supported entity, attribute, and relation types, see the Healthcare Natural Language API conceptual documentation.

The following response from the preceding samples identified Therapeutic Insulin, the entity with code C581 in the NCI terminology system, as the medication. The response also includes the confidence score assigned to the response. For more information about the response fields, see the analyzeEntities documentation.

{
  "entityMentions": [
    {
      "mentionId": "1",
      "type": "MEDICINE",
      "text": {
        "content": "Insulin regimen human"
      },
      "linkedEntities": [
        {
          "entityId": "UMLS/3537244"
        },
        {
          "entityId": "UMLS/3714501"
        },
        {
          "entityId": "UMLS/21641"
        },
        {
          "entityId": "UMLS/795635"
        },
        {
          "entityId": "UMLS/1533581"
        },
        {
          "entityId": "UMLS/4721402"
        }
      ],
      "temporalAssessment": {
        "value": "CURRENT",
        "confidence": 0.87631082534790039
      },
      "certaintyAssessment": {
        "value": "LIKELY",
        "confidence": 0.9999774694442749
      },
      "subject": {
        "value": "PATIENT",
        "confidence": 0.99999970197677612
      },
      "confidence": 0.41636556386947632
    },
    {
      "mentionId": "2",
      "type": "MED_DOSE",
      "text": {
        "content": "5 units",
        "beginOffset": 22
      },
      "confidence": 0.56910794973373413
    },
    {
      "mentionId": "3",
      "type": "MED_ROUTE",
      "text": {
        "content": "IV",
        "beginOffset": 30
      },
      "linkedEntities": [
        {
          "entityId": "UMLS/348016"
        }
      ],
      "confidence": 0.9180646538734436
    }
  ],
  "entities": [
    {
      "entityId": "UMLS/1533581",
      "preferredTerm": "Therapeutic Insulin",
      "vocabularyCodes": [
        "MTH/NOCODE",
        "NCI/C581"
      ]
    },
    {
      "entityId": "UMLS/21641",
      "preferredTerm": "Insulin",
      "vocabularyCodes": [
        "FMA/83365",
        "LNC/LA15805-7",
        "LNC/LP14676-8",
        "LNC/LP16325-0",
        "LNC/LP32542-0",
        "LNC/LP70329-5",
        "LNC/MTHU002108",
        "LNC/MTHU019392",
        "MSH/D007328",
        "MTH/NOCODE"
      ]
    },
    {
      "entityId": "UMLS/348016",
      "preferredTerm": "Intravenous",
      "vocabularyCodes": [
        "LNC/LA9437-0",
        "LNC/LP32453-0",
        "MTH/NOCODE",
        "NCI/C13346"
      ]
    },
    {
      "entityId": "UMLS/3537244",
      "preferredTerm": "Insulins",
      "vocabularyCodes": [
        "MSH/D061385",
        "MTH/NOCODE"
      ]
    },
    {
      "entityId": "UMLS/3714501",
      "preferredTerm": "Insulin Drug Class",
      "vocabularyCodes": [
        "MTH/NOCODE",
        "VANDF/4021631"
      ]
    },
    {
      "entityId": "UMLS/4721402",
      "preferredTerm": "INS protein, human",
      "vocabularyCodes": [
        "MTH/NOCODE",
        "NCI/C2271"
      ]
    },
    {
      "entityId": "UMLS/795635",
      "preferredTerm": "insulin, regular, human",
      "vocabularyCodes": [
        "LNC/LP17001-6",
        "MSH/D061386",
        "MTH/NOCODE",
        "NCI/C29125",
        "RXNORM/253182",
        "VANDF/4017559",
        "VANDF/4017569",
        "VANDF/4019786"
      ]
    }
  ],
  "relationships": [
    {
      "subjectId": "1",
      "objectId": "2",
      "confidence": 0.53775161504745483
    },
    {
      "subjectId": "1",
      "objectId": "3",
      "confidence": 0.95007365942001343
    }
  ]
}