Healthcare Natural Language API


The Healthcare Natural Language API is a part of the Cloud Healthcare API that uses natural language models to extract healthcare information from medical text.

This conceptual guide explains the basics of using the Healthcare Natural Language API, including:

  • The types of requests you can make to the Healthcare Natural Language API
  • How to construct requests to the Healthcare Natural Language API
  • How to handle responses from the Healthcare Natural Language API

Overview

The Healthcare Natural Language API extracts healthcare information from medical text. This healthcare information can include:

  • Medical concepts, such as medications, procedures, and medical conditions
  • Functional features, such as temporal relationships, subjects, and certainty assessments
  • Relations, such as side effects and medication dosage

Choosing between the Healthcare Natural Language API and AutoML Entity Extraction for Healthcare

The Healthcare Natural Language API offers pre-trained natural language models to extract medical concepts and relationships from medical text. The Healthcare Natural Language API maps text into a predefined set of medical knowledge categories.

AutoML Entity Extraction for Healthcare allows you to create a custom entity extraction model trained using your own annotated medical text and using your own categories. For more information, see the AutoML Entity Extraction for Healthcare documentation.

Available locations

The Healthcare Natural Language API is available in the following locations:

Location name            Location description
asia-south1              Mumbai, India
australia-southeast1     Sydney, Australia
europe-west2             London, UK
europe-west4             Netherlands
northamerica-northeast1  Montréal, Canada
us-central1              Iowa, USA

Healthcare Natural Language API features

The Healthcare Natural Language API inspects medical text for medical concepts and relations. You perform entity analysis using the analyzeEntities method.

Entity analysis request fields

The Healthcare Natural Language API is a REST API that uses JSON requests and responses. The following sample shows a simple Healthcare Natural Language API request using curl:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data '{
    "nlpService": "projects/PROJECT_ID/locations/LOCATION/services/nlp",
    "documentContent": "Insulin regimen human 5 units IV administered."
}' "https://healthcare.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/services/nlp:analyzeEntities"

The sample request demonstrates the following fields:

  • nlpService contains the resource name of the NLP service.
  • documentContent contains the data for the request, which consists of medical text. The maximum size of the medical text is 10,000 Unicode characters.
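
As a rough sketch, the same request can be assembled in Python using only the standard library. The helper name build_analyze_entities_request is hypothetical and not part of any Google client library. The length check mirrors the limit stated above and assumes that Python's len(), which counts Unicode code points, matches the API's notion of a character. The access token is a placeholder that you would supply yourself, for example from the gcloud command used in the curl sample.

```python
import json
import urllib.request

MAX_DOCUMENT_CHARS = 10_000  # limit stated in this guide


def build_analyze_entities_request(project_id, location, text, access_token):
    """Build (but do not send) an analyzeEntities request.

    Hypothetical helper for illustration only.
    """
    # Assumption: len() on a Python str approximates the API's character count.
    if len(text) > MAX_DOCUMENT_CHARS:
        raise ValueError("documentContent exceeds 10,000 characters")

    nlp_service = f"projects/{project_id}/locations/{location}/services/nlp"
    url = f"https://healthcare.googleapis.com/v1/{nlp_service}:analyzeEntities"
    body = {"nlpService": nlp_service, "documentContent": text}

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = build_analyze_entities_request(
    "PROJECT_ID",
    "LOCATION",
    "Insulin regimen human 5 units IV administered.",
    access_token="ACCESS_TOKEN",  # placeholder, not a real token
)
# Sending is left out of this sketch; urllib.request.urlopen(request)
# would perform the call and return the JSON response.
```

Keeping request construction separate from sending makes the URL and body easy to inspect before any medical text leaves your environment.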

Entity analysis response fields

Entity analysis returns a set of detected medical knowledge mentions, medical concepts, and relations between medical knowledge mentions.

The following sample shows the response to the sample request in Entity analysis request fields:

{
  "entityMentions": [
    {
      "mentionId": "1",
      "type": "MEDICINE",
      "text": {
        "content": "Insulin regimen human"
      },
      "linkedEntities": [
        {
          "entityId": "UMLS/C3537244"
        },
        {
          "entityId": "UMLS/C3714501"
        },
        {
          "entityId": "UMLS/C0021641"
        },
        {
          "entityId": "UMLS/C0795635"
        },
        {
          "entityId": "UMLS/C1533581"
        },
        {
          "entityId": "UMLS/C4721402"
        }
      ],
      "temporalAssessment": {
        "value": "CURRENT",
        "confidence": 0.87631082534790039
      },
      "certaintyAssessment": {
        "value": "LIKELY",
        "confidence": 0.9999774694442749
      },
      "subject": {
        "value": "PATIENT",
        "confidence": 0.99999970197677612
      },
      "confidence": 0.41636556386947632
    },
    {
      "mentionId": "2",
      "type": "MED_DOSE",
      "text": {
        "content": "5 units",
        "beginOffset": 22
      },
      "confidence": 0.56910794973373413
    },
    {
      "mentionId": "3",
      "type": "MED_ROUTE",
      "text": {
        "content": "IV",
        "beginOffset": 30
      },
      "linkedEntities": [
        {
          "entityId": "UMLS/C0348016"
        }
      ],
      "confidence": 0.9180646538734436
    }
  ],
  "entities": [
    {
      "entityId": "UMLS/C1533581",
      "preferredTerm": "Therapeutic Insulin",
      "vocabularyCodes": [
        "MTH/NOCODE",
        "NCI/C581"
      ]
    },
    {
      "entityId": "UMLS/C0021641",
      "preferredTerm": "Insulin",
      "vocabularyCodes": [
        "FMA/83365",
        "LNC/LA15805-7",
        "LNC/LP14676-8",
        "LNC/LP16325-0",
        "LNC/LP32542-0",
        "LNC/LP70329-5",
        "LNC/MTHU002108",
        "LNC/MTHU019392",
        "MSH/D007328",
        "MTH/NOCODE"
      ]
    },
    {
      "entityId": "UMLS/C0348016",
      "preferredTerm": "Intravenous",
      "vocabularyCodes": [
        "LNC/LA9437-0",
        "LNC/LP32453-0",
        "MTH/NOCODE",
        "NCI/C13346"
      ]
    },
    {
      "entityId": "UMLS/C3537244",
      "preferredTerm": "Insulins",
      "vocabularyCodes": [
        "MSH/D061385",
        "MTH/NOCODE"
      ]
    },
    {
      "entityId": "UMLS/C3714501",
      "preferredTerm": "Insulin Drug Class",
      "vocabularyCodes": [
        "MTH/NOCODE",
        "VANDF/4021631"
      ]
    },
    {
      "entityId": "UMLS/C4721402",
      "preferredTerm": "INS protein, human",
      "vocabularyCodes": [
        "MTH/NOCODE",
        "NCI/C2271"
      ]
    },
    {
      "entityId": "UMLS/C0795635",
      "preferredTerm": "insulin, regular, human",
      "vocabularyCodes": [
        "LNC/LP17001-6",
        "MSH/D061386",
        "MTH/NOCODE",
        "NCI/C29125",
        "RXNORM/253182",
        "VANDF/4017559",
        "VANDF/4017569",
        "VANDF/4019786"
      ]
    }
  ],
  "relationships": [
    {
      "subjectId": "1",
      "objectId": "2",
      "confidence": 0.53775161504745483
    },
    {
      "subjectId": "1",
      "objectId": "3",
      "confidence": 0.95007365942001343
    }
  ]
}

The sample demonstrates the following response fields:

  • entityMentions are occurrences of medical knowledge entities in the source medical text. Each entity mention has the following fields:

    • mentionId is a unique identifier for an entity mention in the response.
    • type is the medical knowledge category of the entity mention.
    • text consists of content, the excerpt of the medical text containing the entity mention, and beginOffset, the location of the entity mention in the source medical text.
    • temporalAssessment specifies how the entity mention relates to the subject temporally, one of CURRENT, CLINICAL_HISTORY, FAMILY_HISTORY, UPCOMING, or OTHER.
    • certaintyAssessment is the negation or qualification of the medical concept, one of LIKELY, SOMEWHAT_LIKELY, UNCERTAIN, SOMEWHAT_UNLIKELY, UNLIKELY, or CONDITIONAL.
    • subject specifies the subject that the medical concept relates to, one of PATIENT, FAMILY_MEMBER, or OTHER.
    • linkedEntities lists medical concepts that could be related to this entity mention. Each linked entity specifies an entityId, which links the entity mention to an entity in entities.
  • entities describes the medical concepts from the linked entities fields. Each entity is described using the following fields:

    • entityId is the unique identifier from linkedEntities.
    • preferredTerm is the preferred term for the medical concept.
    • vocabularyCodes are the representation of the medical concept in supported medical vocabularies.
  • relationships define directed relationships between entity mentions. Each relationship identifies its subject and object by mentionId. In the first relationship in the sample, the subject is "Insulin regimen human" (mentionId 1) and the object is "5 units" (mentionId 2). confidence indicates the model's confidence in the relationship as a number between 0 and 1.
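
The lookup from linkedEntities to entities described above can be sketched in Python as follows. The response dictionary is a trimmed copy of the sample response with the confidence and assessment fields left out. The function names preferred_terms and mention_span are hypothetical. The span helper assumes that a missing beginOffset means the mention starts at offset 0 and that offsets count Unicode characters; both assumptions agree with the sample but are not stated in this guide.

```python
SOURCE_TEXT = "Insulin regimen human 5 units IV administered."

# Trimmed copy of the sample response (illustrative subset of fields).
response = {
    "entityMentions": [
        {"mentionId": "1", "type": "MEDICINE",
         "text": {"content": "Insulin regimen human"},
         "linkedEntities": [{"entityId": "UMLS/C3537244"},
                            {"entityId": "UMLS/C0021641"}]},
        {"mentionId": "2", "type": "MED_DOSE",
         "text": {"content": "5 units", "beginOffset": 22}},
        {"mentionId": "3", "type": "MED_ROUTE",
         "text": {"content": "IV", "beginOffset": 30},
         "linkedEntities": [{"entityId": "UMLS/C0348016"}]},
    ],
    "entities": [
        {"entityId": "UMLS/C0021641", "preferredTerm": "Insulin"},
        {"entityId": "UMLS/C0348016", "preferredTerm": "Intravenous"},
        {"entityId": "UMLS/C3537244", "preferredTerm": "Insulins"},
    ],
}


def preferred_terms(response):
    """Map each mentionId to the preferred terms of its linked entities."""
    entities_by_id = {e["entityId"]: e for e in response.get("entities", [])}
    result = {}
    for mention in response.get("entityMentions", []):
        result[mention["mentionId"]] = [
            entities_by_id[linked["entityId"]]["preferredTerm"]
            for linked in mention.get("linkedEntities", [])
            if linked["entityId"] in entities_by_id
        ]
    return result


def mention_span(mention):
    """Return (start, end) offsets of a mention in the source text."""
    # Assumption: an absent beginOffset means the mention starts at 0.
    start = mention["text"].get("beginOffset", 0)
    return start, start + len(mention["text"]["content"])


terms = preferred_terms(response)
spans = {m["mentionId"]: mention_span(m) for m in response["entityMentions"]}
```

With this trimmed sample, terms maps mention 1 to "Insulins" and "Insulin", mention 2 to an empty list because it has no linked entities, and mention 3 to "Intravenous".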

Supported languages

The Healthcare Natural Language API supports extracting healthcare information from English text only.

Supported medical vocabularies

The Healthcare Natural Language API supports the following medical vocabularies:

  • Foundational Model of Anatomy
  • Gene Ontology
  • HUGO Gene Nomenclature Committee
  • Human Phenotype Ontology
  • ICD-10 Procedure Coding System
  • ICD-10-CM (available for US users only)
  • ICD-9-CM
  • LOINC
  • MeSH
  • MedlinePlus Health Topics
  • Metathesaurus Names
  • NCBI Taxonomy
  • NCI Thesaurus
  • National Drug File
  • Online Mendelian Inheritance in Man
  • RXNORM
  • SNOMED CT (available for US users only)
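
In the sample response, each vocabularyCodes entry appears as a prefix and a code separated by a slash, such as RXNORM/253182. Assuming that format holds in general, a minimal Python sketch for grouping codes by prefix could look like the following. The function name group_vocabulary_codes is hypothetical, and the sketch deliberately does not map prefixes to the vocabulary names listed above, because this guide does not state that mapping.

```python
from collections import defaultdict


def group_vocabulary_codes(vocabulary_codes):
    """Group 'PREFIX/CODE' strings by prefix, preserving input order."""
    grouped = defaultdict(list)
    for item in vocabulary_codes:
        # Split on the first slash only, in case a code contains one.
        prefix, _, code = item.partition("/")
        grouped[prefix].append(code)
    return dict(grouped)


# vocabularyCodes of "insulin, regular, human" from the sample response.
codes = group_vocabulary_codes([
    "LNC/LP17001-6",
    "MSH/D061386",
    "MTH/NOCODE",
    "NCI/C29125",
    "RXNORM/253182",
    "VANDF/4017559",
    "VANDF/4017569",
    "VANDF/4019786",
])
```

Here codes["VANDF"] holds the three VANDF codes and codes["RXNORM"] holds the single code "253182".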

Supported medical knowledge categories

The Healthcare Natural Language API supports the following medical knowledge categories:

Medical knowledge category  Description
ANATOMICAL_STRUCTURE        Complex part of the human body
BF_RESULT                   Body function result
BM_RESULT                   Body measurement result
BM_UNIT                     Body measurement unit
BM_VALUE                    Value of a body measurement
BODY_FUNCTION               Function carried out by the human body
BODY_MEASUREMENT            A normal measurement of the human body, such as a vital sign
LABORATORY_DATA             Results of testing a bodily sample
LAB_RESULT                  Qualitative description of laboratory data, such as increased, decreased, positive, or negative
LAB_UNIT                    Unit of measurement for a laboratory value
LAB_VALUE                   Value of an instance of laboratory data
MEDICAL_DEVICE              Physical or virtual instrument
MEDICINE                    Drug or other preparation for treatment or prevention of disease
MED_DOSE                    Medication dose
MED_DURATION                Medication duration
MED_FORM                    Physical characteristics of the specific medication
MED_FREQUENCY               How often the medication is taken
MED_ROUTE                   Body location at which the medication is administered
MED_STATUS                  For an existing medication, status can be a modifier such as "continue", "start", "restart", "stop", "switch", "increase", "decrease".
MED_STRENGTH                Amount of active ingredient in a dose of medication
MED_TOTALDOSE               Quantity of medication to take at one time
MED_UNIT                    Unit of measurement for the active ingredient in a medication
PROBLEM                     Medical condition, including findings and diseases
PROCEDURE_RESULT            Results of a procedure
PROCEDURE                   Diagnostic or treatment procedure
PROC_METHOD                 Method in which a procedure is conducted
SEVERITY                    Severity of the medical condition
SUBSTANCE_ABUSE             Abuse of a psychoactive substance

Supported functional feature categories

The Healthcare Natural Language API can infer functional features, or attributes, of an entity mention from context. For example, in the statement "Kusuma's mother has diabetes", the condition "diabetes" has the functional feature of subject FAMILY_MEMBER.

Temporal relationships

Temporal relationships, returned in the temporalAssessment field, describe how an entity mention relates to the subject in time.

The Healthcare Natural Language API supports the following temporal relationships:

  • CURRENT
  • CLINICAL_HISTORY
  • FAMILY_HISTORY
  • UPCOMING
  • OTHER

Subjects

Subjects, returned in the subject field, describe the individual the entity mention relates to.

The Healthcare Natural Language API supports the following subjects:

  • PATIENT
  • FAMILY_MEMBER
  • OTHER

Certainty assessments

Certainty assessments, returned in the certaintyAssessment field, describe the original note taker's confidence in a statement. For example, if the original note contains "The patient has a sore throat", the certainty assessment returns LIKELY, indicating that the note taker considered it likely that the patient had a sore throat. If the original note contains "The patient does not have a sore throat", the certainty assessment returns UNLIKELY, indicating that the note taker considered it unlikely that the patient had a sore throat.

Certainty assessments can be one of the following values:

  • LIKELY
  • SOMEWHAT_LIKELY
  • UNCERTAIN
  • SOMEWHAT_UNLIKELY
  • UNLIKELY
  • CONDITIONAL
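
Functional features are often used to filter entity mentions, for example to keep only findings that the note affirms for the patient. The following Python sketch shows one way to do that. The function name affirmed_patient_mentions, the choice of LIKELY and SOMEWHAT_LIKELY as "affirmed" values, and the 0.5 confidence threshold are all assumptions made for illustration. The sample mentions are hand-written to match the examples in this section and are not real API output.

```python
# Assumption: these two values are treated as "affirmed" in this sketch.
AFFIRMED_VALUES = {"LIKELY", "SOMEWHAT_LIKELY"}


def affirmed_patient_mentions(entity_mentions, min_confidence=0.5):
    """Return the text of mentions affirmed for the patient."""
    selected = []
    for mention in entity_mentions:
        certainty = mention.get("certaintyAssessment", {})
        subject = mention.get("subject", {})
        if (
            certainty.get("value") in AFFIRMED_VALUES
            and certainty.get("confidence", 0.0) >= min_confidence
            and subject.get("value") == "PATIENT"
        ):
            selected.append(mention["text"]["content"])
    return selected


# Hand-written mentions for illustration only.
sample_mentions = [
    {"text": {"content": "sore throat"},
     "certaintyAssessment": {"value": "LIKELY", "confidence": 0.9},
     "subject": {"value": "PATIENT", "confidence": 0.9}},
    {"text": {"content": "fever"},
     "certaintyAssessment": {"value": "UNLIKELY", "confidence": 0.9},
     "subject": {"value": "PATIENT", "confidence": 0.9}},
    {"text": {"content": "diabetes"},
     "certaintyAssessment": {"value": "LIKELY", "confidence": 0.9},
     "subject": {"value": "FAMILY_MEMBER", "confidence": 0.9}},
]

affirmed = affirmed_patient_mentions(sample_mentions)
```

In this sample only "sore throat" is kept: "fever" is negated and "diabetes" belongs to a family member. Mentions without a certaintyAssessment or subject field, such as the MED_DOSE mention in the sample response, are skipped by this filter.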

Supported relationships between entity mentions

The Healthcare Natural Language API can infer relationships between entity mentions based on the surrounding medical text. In the response, the subject of the relationship is identified by subjectId and the object of the relationship is identified by objectId.

The Healthcare Natural Language API supports the following relationships between entity mentions:

Subject               Object
ANATOMICAL_STRUCTURE  MEDICAL_DEVICE
BODY_FUNCTION         BF_RESULT
BODY_MEASUREMENT      BM_RESULT
BODY_MEASUREMENT      BM_UNIT
BODY_MEASUREMENT      BM_VALUE
LABORATORY_DATA       LAB_RESULT
LABORATORY_DATA       LAB_UNIT
LABORATORY_DATA       LAB_VALUE
MEDICINE              MED_DOSE
MEDICINE              MED_DURATION
MEDICINE              MED_FORM
MEDICINE              MED_FREQUENCY
MEDICINE              MED_ROUTE
MEDICINE              MED_STATUS
MEDICINE              MED_STRENGTH
MEDICINE              MED_TOTALDOSE
MEDICINE              MED_UNIT
PROBLEM               ANATOMICAL_STRUCTURE
PROBLEM               MEDICINE
PROBLEM               PROCEDURE
PROBLEM               SEVERITY
PROCEDURE             ANATOMICAL_STRUCTURE
PROCEDURE             PROC_METHOD
PROCEDURE             PROCEDURE_RESULT
SUBSTANCE_ABUSE       SEVERITY
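
Because each relationship points from a subject mention to an object mention, the relationships can be folded into one record per subject, keyed by the object's medical knowledge category. The following Python sketch does this for a trimmed copy of the sample response. The function name attributes_by_subject is hypothetical, and the sketch ignores relationship confidence, which a real application would likely want to threshold.

```python
def attributes_by_subject(response):
    """Group related mentions under their subject, keyed by object type."""
    mentions = {m["mentionId"]: m for m in response.get("entityMentions", [])}
    grouped = {}
    for rel in response.get("relationships", []):
        subject = mentions[rel["subjectId"]]
        obj = mentions[rel["objectId"]]
        attrs = grouped.setdefault(subject["text"]["content"], {})
        attrs.setdefault(obj["type"], []).append(obj["text"]["content"])
    return grouped


# Trimmed copy of the sample response (confidence values omitted).
sample_response = {
    "entityMentions": [
        {"mentionId": "1", "type": "MEDICINE",
         "text": {"content": "Insulin regimen human"}},
        {"mentionId": "2", "type": "MED_DOSE",
         "text": {"content": "5 units", "beginOffset": 22}},
        {"mentionId": "3", "type": "MED_ROUTE",
         "text": {"content": "IV", "beginOffset": 30}},
    ],
    "relationships": [
        {"subjectId": "1", "objectId": "2"},
        {"subjectId": "1", "objectId": "3"},
    ],
}

medication_records = attributes_by_subject(sample_response)
```

For the sample, medication_records contains a single entry for "Insulin regimen human" with a MED_DOSE of "5 units" and a MED_ROUTE of "IV", which corresponds to the MEDICINE rows in the table above.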