This page explains how to enable the Healthcare Natural Language API, configure permissions, and call the analyzeEntities method to extract medical insights from medical text.
Overview
The Healthcare Natural Language API provides machine learning solutions for deriving insights from medical text. The Healthcare Natural Language API is part of the Cloud Healthcare API. For an overview of the Healthcare Natural Language API, see the Healthcare Natural Language API conceptual documentation.
The Healthcare Natural Language API parses unstructured medical text such as medical records or insurance claims. It then generates a structured data representation of the medical knowledge entities stored in these data sources for downstream analysis and automation. For example, you can:
- Extract information about medical concepts like diseases, medications, medical devices, procedures, and their clinically relevant attributes
- Map medical concepts to standard medical vocabularies such as RxNorm, ICD-10, MeSH, and SNOMED CT (US users only)
- Derive medical insights from text and integrate them with data analytics products in Google Cloud
Available locations
The Healthcare Natural Language API is available in the following locations:
Location name | Location description |
---|---|
asia-south1 | Mumbai, India |
australia-southeast1 | Sydney, Australia |
europe-west2 | London, UK |
europe-west4 | Netherlands |
northamerica-northeast1 | Montréal, Canada |
us-central1 | Iowa, USA |
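The location name becomes part of the service path used in requests later on this page (projects/PROJECT_ID/locations/LOCATION/services/nlp). As a minimal sketch, a small helper can reject unsupported locations before any request is made. The function name `nlp_service_name` and the `SUPPORTED_LOCATIONS` mapping are illustrative and not part of any Google library.

```python
# Locations copied from the table above; names in this sketch are hypothetical.
SUPPORTED_LOCATIONS = {
    "asia-south1": "Mumbai, India",
    "australia-southeast1": "Sydney, Australia",
    "europe-west2": "London, UK",
    "europe-west4": "Netherlands",
    "northamerica-northeast1": "Montréal, Canada",
    "us-central1": "Iowa, USA",
}


def nlp_service_name(project_id: str, location: str) -> str:
    """Return the NLP service path for a location listed in the table."""
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(f"unsupported location: {location!r}")
    return f"projects/{project_id}/locations/{location}/services/nlp"


print(nlp_service_name("my-project", "europe-west4"))
```

Here `my-project` is a placeholder project ID.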
Enabling the Healthcare Natural Language API
Before you begin using the Healthcare Natural Language API, you must enable the API for your Google Cloud project. You can use the Healthcare Natural Language API without enabling or using features of the Cloud Healthcare API.
To enable the API, complete the following steps:
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
- Create a service account:
  - In the Google Cloud console, go to the Create service account page.
  - Select your project.
  - In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name. In the Service account description field, enter a description. For example, Service account for quickstart.
  - Click Create and continue.
  - To provide access to your project, grant the Project > Owner role to your service account. To grant the role, find the Select a role list, then select Project > Owner.
  - Click Continue.
  - Click Done to finish creating the service account. Do not close your browser window. You will use it in the next step.
- Create a service account key:
  - In the Google Cloud console, click the email address for the service account that you created.
  - Click Keys.
  - Click Add key, and then click Create new key.
  - Click Create. A JSON key file is downloaded to your computer.
  - Click Close.
- Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.
- Enable the Cloud Healthcare API.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:
  gcloud init
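Because GOOGLE_APPLICATION_CREDENTIALS applies only to the current shell session, it can help to confirm that the variable is set and points at a service account key before making requests. The following is a rough sketch, not an official check. The function name `check_credentials` is hypothetical, and it relies on downloaded service account key files containing the `type` and `client_email` fields.

```python
import json
import os


def check_credentials(env=os.environ) -> str:
    """Return the service account email from the key file named in env.

    Raises RuntimeError if the variable is unset or the file does not
    look like a service account key.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(path, encoding="utf-8") as key_file:
        key = json.load(key_file)
    if key.get("type") != "service_account":
        raise RuntimeError(f"{path} is not a service account key")
    return key["client_email"]
```

Calling `check_credentials()` with no arguments reads the current process environment. Passing a dictionary instead makes the check easy to exercise without touching real credentials.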
Set up permissions
To use the features in this guide, you must have the healthcare.nlpservice.analyzeEntities permission, which is included in the healthcare.nlpServiceViewer role.
To assign this role, run the gcloud projects add-iam-policy-binding command:
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member serviceAccount:SERVICE_ACCOUNT_ID \
    --role roles/healthcare.nlpServiceViewer
Extracting entities, relations, and contextual attributes
The Healthcare Natural Language API uses context-aware models to extract medical entities, relations, and contextual attributes. Each text entity is extracted into a medical dictionary entry. To extract this level of medical insights from medical text, use the projects.locations.services.nlp.analyzeEntities method.
To include licensed vocabularies, such as International Classification of Diseases, Tenth Revision, Clinical Modification (ICD10CM) and SNOMED Clinical Terms, US Version (SNOMEDCT_US), in your request, see Including licensed vocabularies.
To extract medical insights from medical text using the Healthcare Natural Language API, make a POST request and specify the following information in the request:
- The name of the parent service, including the project ID and location
- The target text. The maximum size is 20,000 Unicode characters.
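The two pieces of information listed above can be assembled as a request body before any HTTP call is made. The sketch below is illustrative only. The field names `nlpService` and `documentContent` are taken from the samples on this page, while the function name `build_analyze_body` is hypothetical. It also assumes that the 20,000-character limit counts Unicode code points, which is what Python's `len` measures. The service may count differently.

```python
MAX_CHARS = 20_000  # limit stated on this page


def build_analyze_body(project_id: str, location: str, text: str) -> dict:
    """Build the analyzeEntities request body, enforcing the size limit.

    Assumption: the limit is measured in Unicode code points (len(text)).
    """
    if len(text) > MAX_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; the maximum is {MAX_CHARS}"
        )
    return {
        "nlpService": f"projects/{project_id}/locations/{location}/services/nlp",
        "documentContent": text,
    }


body = build_analyze_body(
    "my-project", "us-central1", "Insulin regimen human 5 units IV administered."
)
print(body["nlpService"])
```

Longer documents would need to be split into chunks before submission. How to split them without cutting through a clinically meaningful phrase is outside the scope of this sketch.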
curl
The following sample shows a POST request using curl:
curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    -H "Content-Type:application/json" \
    --data "{
      'nlpService':'projects/PROJECT_ID/locations/LOCATION/services/nlp',
      'documentContent':'Insulin regimen human 5 units IV administered.'
    }" \
    "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/services/nlp:analyzeEntities"
PowerShell
The following sample shows a POST request using Windows PowerShell:
$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'nlpService':'projects/PROJECT_ID/locations/LOCATION/services/nlp',
    'documentContent':'Insulin regimen human 5 units IV administered.'
  }" `
  -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/services/nlp:analyzeEntities" | Select-Object -Expand Content
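The same request can be prepared from Python using only the standard library. This is a sketch under stated assumptions rather than an official client. The function name `build_analyze_request` is hypothetical, and the access token is expected to come from `gcloud auth application-default print-access-token`, as in the samples above. The function only constructs the request; sending it is left to the caller.

```python
import json
import urllib.request


def build_analyze_request(project_id, location, text, access_token):
    """Construct (but do not send) the analyzeEntities POST request."""
    service = f"projects/{project_id}/locations/{location}/services/nlp"
    url = f"https://healthcare.googleapis.com/v1/{service}:analyzeEntities"
    body = json.dumps({"nlpService": service, "documentContent": text})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


request = build_analyze_request(
    "my-project",
    "us-central1",
    "Insulin regimen human 5 units IV administered.",
    "ACCESS_TOKEN",  # placeholder; substitute a real token before sending
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and decode the response body with `json.load`. Separating construction from sending keeps the URL and headers easy to inspect.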
If the request is successful, the response includes the following information:
- Recognized medical knowledge entities
- Functional features
- Relations between the recognized entities
- Contextual attributes
- Mappings of the medical knowledge entities into standard terminologies
For a list of supported entity, attribute, and relation types, see the Healthcare Natural Language API conceptual documentation.
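The parts of the response listed above refer to each other by ID, so a typical first step is to join them. The sketch below assumes the response shape shown in the sample response on this page: mentions link to entities through `linkedEntities[].entityId`, and relationships refer to mentions through `subjectId` and `objectId`. The `sample` dictionary is a heavily abbreviated excerpt with rounded confidence values, and the function name `summarize` is hypothetical.

```python
def summarize(response: dict) -> list:
    """Return readable lines joining mentions, entities, and relationships."""
    entities = {e["entityId"]: e for e in response.get("entities", [])}
    mentions = {m["mentionId"]: m for m in response.get("entityMentions", [])}
    lines = []
    for mention in mentions.values():
        # Resolve each linked entity ID to its preferred term, if present.
        terms = [
            entities[link["entityId"]]["preferredTerm"]
            for link in mention.get("linkedEntities", [])
            if link["entityId"] in entities
        ]
        lines.append(f'{mention["type"]}: {mention["text"]["content"]} -> {terms}')
    for rel in response.get("relationships", []):
        subject = mentions[rel["subjectId"]]["text"]["content"]
        obj = mentions[rel["objectId"]]["text"]["content"]
        lines.append(f"relation: {subject} -> {obj}")
    return lines


# Abbreviated excerpt for illustration; confidence values are rounded.
sample = {
    "entityMentions": [
        {
            "mentionId": "1",
            "type": "MEDICINE",
            "text": {"content": "Insulin regimen human"},
            "linkedEntities": [{"entityId": "UMLS/1533581"}],
            "confidence": 0.42,
        },
        {
            "mentionId": "2",
            "type": "MED_DOSE",
            "text": {"content": "5 units", "beginOffset": 22},
            "confidence": 0.57,
        },
    ],
    "entities": [
        {
            "entityId": "UMLS/1533581",
            "preferredTerm": "Therapeutic Insulin",
            "vocabularyCodes": ["MTH/NOCODE", "NCI/C581"],
        }
    ],
    "relationships": [{"subjectId": "1", "objectId": "2", "confidence": 0.54}],
}

for line in summarize(sample):
    print(line)
```

Mentions without `linkedEntities`, such as the dose, produce an empty list of terms, which is why the lookup uses `get` with a default.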
The following response to the preceding samples identifies Therapeutic Insulin, the entity with code C581 in the NCI terminology system, as the medication. The response also includes confidence scores for the recognized mentions, their contextual attributes, and the relationships between them. For more information about the response fields, see the analyzeEntities documentation.
{
  "entityMentions": [
    {
      "mentionId": "1",
      "type": "MEDICINE",
      "text": { "content": "Insulin regimen human" },
      "linkedEntities": [
        { "entityId": "UMLS/3537244" },
        { "entityId": "UMLS/3714501" },
        { "entityId": "UMLS/21641" },
        { "entityId": "UMLS/795635" },
        { "entityId": "UMLS/1533581" },
        { "entityId": "UMLS/4721402" }
      ],
      "temporalAssessment": { "value": "CURRENT", "confidence": 0.87631082534790039 },
      "certaintyAssessment": { "value": "LIKELY", "confidence": 0.9999774694442749 },
      "subject": { "value": "PATIENT", "confidence": 0.99999970197677612 },
      "confidence": 0.41636556386947632
    },
    {
      "mentionId": "2",
      "type": "MED_DOSE",
      "text": { "content": "5 units", "beginOffset": 22 },
      "confidence": 0.56910794973373413
    },
    {
      "mentionId": "3",
      "type": "MED_ROUTE",
      "text": { "content": "IV", "beginOffset": 30 },
      "linkedEntities": [
        { "entityId": "UMLS/348016" }
      ],
      "confidence": 0.9180646538734436
    }
  ],
  "entities": [
    {
      "entityId": "UMLS/1533581",
      "preferredTerm": "Therapeutic Insulin",
      "vocabularyCodes": [ "MTH/NOCODE", "NCI/C581" ]
    },
    {
      "entityId": "UMLS/21641",
      "preferredTerm": "Insulin",
      "vocabularyCodes": [
        "FMA/83365", "LNC/LA15805-7", "LNC/LP14676-8", "LNC/LP16325-0",
        "LNC/LP32542-0", "LNC/LP70329-5", "LNC/MTHU002108", "LNC/MTHU019392",
        "MSH/D007328", "MTH/NOCODE"
      ]
    },
    {
      "entityId": "UMLS/348016",
      "preferredTerm": "Intravenous",
      "vocabularyCodes": [ "LNC/LA9437-0", "LNC/LP32453-0", "MTH/NOCODE", "NCI/C13346" ]
    },
    {
      "entityId": "UMLS/3537244",
      "preferredTerm": "Insulins",
      "vocabularyCodes": [ "MSH/D061385", "MTH/NOCODE" ]
    },
    {
      "entityId": "UMLS/3714501",
      "preferredTerm": "Insulin Drug Class",
      "vocabularyCodes": [ "MTH/NOCODE", "VANDF/4021631" ]
    },
    {
      "entityId": "UMLS/4721402",
      "preferredTerm": "INS protein, human",
      "vocabularyCodes": [ "MTH/NOCODE", "NCI/C2271" ]
    },
    {
      "entityId": "UMLS/795635",
      "preferredTerm": "insulin, regular, human",
      "vocabularyCodes": [
        "LNC/LP17001-6", "MSH/D061386", "MTH/NOCODE", "NCI/C29125",
        "RXNORM/253182", "VANDF/4017559", "VANDF/4017569", "VANDF/4019786"
      ]
    }
  ],
  "relationships": [
    { "subjectId": "1", "objectId": "2", "confidence": 0.53775161504745483 },
    { "subjectId": "1", "objectId": "3", "confidence": 0.95007365942001343 }
  ]
}
Including licensed vocabularies
You can include the following licensed vocabularies in your requests to the Healthcare Natural Language API:
- International Classification of Diseases, Tenth Revision, Clinical Modification (ICD10CM): used to code and classify morbidity data from inpatient and outpatient records, physician offices, and most National Center for Health Statistics (NCHS) surveys.
- SNOMED Clinical Terms, US Version (SNOMEDCT_US): provides core general terminology for electronic health records.
The following samples show how to make a POST request to the Healthcare Natural Language API and include both of the available licensed vocabularies in the licensedVocabularies object. You can specify one or more of the available licensed vocabularies.
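As a rough sketch of how the licensedVocabularies field fits into a request and how its effect shows up in a response, the helpers below add the field to a request body and pick out the codes that belong to a licensed vocabulary. The names `build_body` and `licensed_codes` are hypothetical. The sketch assumes that each entry in `vocabularyCodes` has the form VOCABULARY/CODE, as in the sample responses on this page.

```python
LICENSED_VOCABULARIES = ("ICD10CM", "SNOMEDCT_US")  # listed on this page


def build_body(service: str, text: str, vocabularies=()) -> dict:
    """Build a request body, optionally requesting licensed vocabularies."""
    unknown = sorted(set(vocabularies) - set(LICENSED_VOCABULARIES))
    if unknown:
        raise ValueError(f"unknown licensed vocabularies: {unknown}")
    body = {"nlpService": service, "documentContent": text}
    if vocabularies:
        body["licensedVocabularies"] = list(vocabularies)
    return body


def licensed_codes(entity: dict) -> list:
    """Return the entity's codes whose prefix is a licensed vocabulary."""
    return [
        code
        for code in entity.get("vocabularyCodes", [])
        if code.split("/", 1)[0] in LICENSED_VOCABULARIES
    ]


body = build_body(
    "projects/my-project/locations/us-central1/services/nlp",
    "Diabetes. Insulin regimen human 5 units IV administered.",
    ["SNOMEDCT_US", "ICD10CM"],
)
print(body["licensedVocabularies"])

# Subset of the codes for "Diabetes Mellitus" from the sample output.
entity = {"vocabularyCodes": ["ICD10CM/E08-E13", "ICD9CM/250", "SNOMEDCT_US/73211009"]}
print(licensed_codes(entity))
```

Matching on the exact prefix before the slash, rather than on a substring, keeps ICD9CM codes from being mistaken for ICD10CM codes.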
curl
The following sample shows a POST request using curl:
curl -X POST \
    -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
    -H "Content-Type:application/json" \
    --data "{
      'nlpService':'projects/PROJECT_ID/locations/us-central1/services/nlp',
      'documentContent':'Diabetes. Insulin regimen human 5 units IV administered.',
      'licensedVocabularies':['SNOMEDCT_US','ICD10CM']
    }" \
    "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/services/nlp:analyzeEntities"
Replace PROJECT_ID with your Google Cloud project ID.
The output is the following. The vocabularyCodes entries that begin with ICD10CM/ or SNOMEDCT_US/ result from specifying the licensed vocabularies:
{
  "entityMentions": [
    {
      "mentionId": "1",
      "type": "PROBLEM",
      "text": { "content": "Diabetes" },
      "linkedEntities": [
        { "entityId": "UMLS/C0011847" },
        { "entityId": "UMLS/C0011849" },
        { "entityId": "UMLS/C0241863" }
      ],
      "temporalAssessment": { "value": "CURRENT", "confidence": 0.98781299591064453 },
      "certaintyAssessment": { "value": "LIKELY", "confidence": 0.872421145439148 },
      "subject": { "value": "PATIENT", "confidence": 0.99975031614303589 },
      "confidence": 0.99663406610488892
    },
    {
      "mentionId": "2",
      "type": "MEDICINE",
      "text": { "content": "Insulin regimen", "beginOffset": 10 },
      "linkedEntities": [
        { "entityId": "UMLS/C0795635" },
        { "entityId": "UMLS/C0021641" },
        { "entityId": "UMLS/C3537244" },
        { "entityId": "UMLS/C1533581" },
        { "entityId": "UMLS/C3714501" }
      ],
      "temporalAssessment": { "value": "CURRENT", "confidence": 0.91042423248291016 },
      "certaintyAssessment": { "value": "LIKELY", "confidence": 0.99766635894775391 },
      "subject": { "value": "PATIENT", "confidence": 0.999998152256012 },
      "confidence": 0.716249406337738
    },
    {
      "mentionId": "3",
      "type": "MEDICINE",
      "text": { "content": "human", "beginOffset": 26 },
      "temporalAssessment": { "value": "CURRENT", "confidence": 0.64570724964141846 },
      "certaintyAssessment": { "value": "LIKELY", "confidence": 0.90325617790222168 },
      "subject": { "value": "PATIENT", "confidence": 0.97613298892974854 },
      "confidence": 0.57638454437255859
    },
    {
      "mentionId": "4",
      "type": "MED_DOSE",
      "text": { "content": "5 units", "beginOffset": 32 },
      "confidence": 0.92076742649078369
    },
    {
      "mentionId": "5",
      "type": "MED_ROUTE",
      "text": { "content": "IV", "beginOffset": 40 },
      "linkedEntities": [
        { "entityId": "UMLS/C0348016" }
      ],
      "confidence": 0.967033863067627
    }
  ],
  "entities": [
    {
      "entityId": "UMLS/C0011847",
      "preferredTerm": "Diabetes",
      "vocabularyCodes": [
        "ICD10CM/E11", "LNC/LA10529-8", "LNC/LP128793-9", "LNC/MTHU040702",
        "MTH/NOCODE", "OMIM/MTHU050182"
      ]
    },
    {
      "entityId": "UMLS/C0011849",
      "preferredTerm": "Diabetes Mellitus",
      "vocabularyCodes": [
        "HPO/HP:0000819", "ICD10CM/E08-E13", "ICD9CM/250", "LNC/LA14291-1",
        "LNC/LA27539-8", "LNC/MTHU020781", "MEDLINEPLUS/4", "MEDLINEPLUS/45",
        "MSH/D003920", "MTH/NOCODE", "MTH/U000263", "NCI/C2985",
        "NCI/OMFAQ", "NCI/TCGA", "OMIM/MTHU036798", "SNOMEDCT_US/73211009"
      ]
    },
    {
      "entityId": "UMLS/C0021641",
      "preferredTerm": "Insulin",
      "vocabularyCodes": [
        "FMA/83365", "LNC/LA15805-7", "LNC/LP14676-8", "LNC/LP16325-0",
        "LNC/LP32542-0", "LNC/LP70329-5", "LNC/MTHU002108", "LNC/MTHU019392",
        "MEDLINEPLUS/4935", "MSH/D007328", "MTH/NOCODE", "SNOMEDCT_US/39487003",
        "SNOMEDCT_US/412222002", "SNOMEDCT_US/67866001"
      ]
    },
    {
      "entityId": "UMLS/C0241863",
      "preferredTerm": "Diabetic",
      "vocabularyCodes": [ "LNC/LA26134-9" ]
    },
    {
      "entityId": "UMLS/C0348016",
      "preferredTerm": "Intravenous",
      "vocabularyCodes": [
        "LNC/LA9437-0", "LNC/LP32453-0", "MTH/NOCODE", "NCI/C13346",
        "SNOMEDCT_US/255560000"
      ]
    },
    {
      "entityId": "UMLS/C0795635",
      "preferredTerm": "insulin, regular, human",
      "vocabularyCodes": [
        "LNC/LP17001-6", "MSH/D061386", "MTH/NOCODE", "NCI/C29125",
        "RXNORM/253182", "SNOMEDCT_US/126210001", "SNOMEDCT_US/417423002",
        "SNOMEDCT_US/420609005", "SNOMEDCT_US/96367001", "VANDF/4017559",
        "VANDF/4017569", "VANDF/4019786"
      ]
    },
    {
      "entityId": "UMLS/C1533581",
      "preferredTerm": "Therapeutic Insulin",
      "vocabularyCodes": [ "MTH/NOCODE", "NCI/C581" ]
    },
    {
      "entityId": "UMLS/C3537244",
      "preferredTerm": "Insulins",
      "vocabularyCodes": [ "MSH/D061385", "MTH/NOCODE" ]
    },
    {
      "entityId": "UMLS/C3714501",
      "preferredTerm": "Insulin Drug Class",
      "vocabularyCodes": [ "MTH/NOCODE", "VANDF/4021631" ]
    }
  ],
  "relationships": [
    { "subjectId": "2", "objectId": "4", "confidence": 0.99827027320861816 },
    { "subjectId": "2", "objectId": "5", "confidence": 0.99729859828948975 },
    { "subjectId": "3", "objectId": "4", "confidence": 0.80851161479949951 },
    { "subjectId": "3", "objectId": "5", "confidence": 0.67507040500640869 }
  ]
}
PowerShell
The following sample shows a POST request using Windows PowerShell:
$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
    'nlpService':'projects/PROJECT_ID/locations/us-central1/services/nlp',
    'documentContent': 'Diabetes. Insulin regimen human 5 units IV administered.',
    'licensedVocabularies': ['SNOMEDCT_US','ICD10CM']
  }" `
  -Uri "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/services/nlp:analyzeEntities" | Select-Object -Expand Content
Replace PROJECT_ID with your Google Cloud project ID.
The output is identical to the output shown for the preceding curl sample.