talk - xml structuring clinical narrative
TRANSCRIPT
XML Structuring of Clinical Narrative Using Natural Language Processing
Naomi Sager
HL7-CDA2 Acapulco, Mexico
October 20, 2004
October 20, 2004 XML Structuring of Clinical Narrative using NLP—2
Good morning. I would like to thank the Program Committee for this opportunity to
introduce you to Natural Language Processing (NLP). Perhaps my presence
here means that people no longer pose the question:
Why Process Clinical Narrative?
Why process clinical narrative?
• Natural language patient documents contain important information
  — details and context of findings
  — time features of disease process
• Structured Data Entry (SDE) cannot capture it all
  — menus too detailed?
  — menus too brief?
• Natural language is natural
  — known
  — powerful
  — habitual
SOME BACKGROUND
NLP goes back some 45 years.
In the late 1950’s, the US National Science Foundation was concerned with
the post-war explosion of the scientific and technical literature. They sought
new means of processing and retrieving textual information. They turned to
linguists to help solve the problem, with surprising initial success.
First English parsing program
• University of Pennsylvania, Department of Linguistics, 1959
• UNIVAC I
  — Vacuum tubes
  — 1,000 words of storage (backed by tapes)
• Parsed 1-page scientific text.
The first English parsing program ran successfully in 1959, on the Univac, one of
the very first computers. Few of you can call up an image of the
Univac, but I can, because my office at the University of Pennsylvania was on top of it.
That is, I was on the second floor and the Univac occupied a very large room on the
ground floor below. The walls of that room were lined from floor
to ceiling with racks of chassis filled with vacuum tubes. Yes, vacuum tubes. It was
one person's sole job to hunt down and replace tubes that were no longer lit.
Technical issues
• Need high speed and large memory
• Need large and rich lexicon
• Need new forms of rules
If a parser functioned 45 years ago, you may ask: why is it taking so long for NLP
applications to emerge?
For one, the technology had to catch up with the possibility.
At first it took 20 minutes to parse a sentence.
Also, a sample text needed only a small dictionary—just the words of the text with
their parts of speech and certain attributes. So dictionaries, or lexicons,
had to be built.
Then unanticipated issues arose. For example, if the parsing grammar covered all the possible ways you can compose a
sentence, then unless constrained, it would build many parses for a single
sentence.
So new kinds of constraining rules had to be implemented.
NLP issues
• Massive detail of language
  — How to organize it
• Meaning
  — How to characterize it
• Information
  — How to represent it
As the issues became defined, researchers generally associated these three major
issues with three major levels of processing:
Syntax, Semantics, Pragmatics.
On the Syntax level, the process was to obtain the grammatical structure of
sentences (what came to be known as parsing).
On the Semantics level, the process was to treat word meanings and relations by some operational system of attributes
attached to words.
On the Pragmatics level, it was understood that we had to develop representational structures (perhaps in
the then emerging database framework) and create the appropriate application
algorithms.
Major levels of processing
• Syntax
  — grammatical structure of sentences
• Semantics
  — word meanings and relations
• Pragmatics
  — representational structures and application algorithms
Not so easy! Progress came to be measured first in years, and then in decades. Here, briefly, is how those
decades were spent.
The first decade saw different theories of grammar being implemented in a variety
of parsing algorithms.
NLP … by decades
1965-1975: Parsing using linguistics
  — Rule collections [Harvard]
  — Transformational Generative Grammar [IBM]
  — Linguistic String Analysis [NYU]
  — Augmented Transition Network [BBN]
Strangely, parsing a sentence, that is, obtaining a grammatical representation that corresponded to the meaning of the
sentence, proved to be unexpectedly difficult.
By the end of the decade, after much time and money had been spent, most of the
parsing efforts were abandoned.
Almost uniquely, not at NYU.
NLP … by decades
1975-1985: Semantic Representation
  — Semantic Primitives [Yale]
  — Conceptual Graphs [IBM]
  — Semantic Nets
  — Artificial Intelligence (Block World)
  — Sublanguage Analysis [NYU]
In the second decade, along came the semanticists and the new field of artificial
intelligence. No more parsing. The attack on meaning was direct.
A Yale researcher proposed a set of semantic primitives into which all word
meanings and relations were to be decomposed.
Others worked with representational formalisms, such as Conceptual Graphs
and Semantic Nets.
A successful AI program was able to interpret such instructions as “Place the green block on the red block” and cause
the block images on the screen to carry out the action. Unfortunately, these
efforts remained on the block world level.
At NYU, we continued with linguistic methods, specializing the general NLP
system for the “sublanguage” of medicine.
NLP … by decades
1985-1995: Parsing Using Statistics
  — Corpus-Based Text Processing
Decade 3. Back to parsing. For information, it was inescapable.
The new power of computers suggested to some researchers that grammatical
relations and word associations could be discovered automatically from gigabytes
of text strings.
This work is ongoing.
NLP … by decades
1995-now: Diverse
Google search of “natural language processing”
  — 785 actual hits (out of 496,000 reported hits)
Over the last decade, the technology of the Internet has spawned diverse efforts. A Google search for ‘Natural Language
Processing’ yielded 785 relevant hits.
Only time will tell which ones prove successful.
NLP … by decades
1995-now: Medical Language Processing
  — MLP
  — MedLEE
  — Language and Computing
  — A-Life
Some work of medical interest in the last decade includes
+ The MLP System, which I will focus on today;
+ MedLEE, developed at Columbia Presbyterian Medical Center;
+ Language and Computing, from Europe, and
+ A-Life, a relative newcomer to the field.
The first two are academic projects; the second two are commercial.
Since the remainder of my talk will be devoted to medical language processing,
I thought I would start by sharing with you some gems of clinical narrative.
Medical Memos
The following quotes were taken from actual medical records dictated by physicians. They appeared in a column written by Richard Lederer, Ph.D., for the Journal of Court Reporting:
• “Discharge status: Alive but without permission. The patient will need disposition, and therefore we will get Dr. Blank to dispose of him.”
• “By the time he was admitted, his rapid heart had stopped, and he was feeling better.”
• “On the second day the knee was better and on the third day it had completely disappeared.”
• “The patient has been depressed ever since she began seeing me in 1983.”
Actually my favorite is the statement: “Discharge Status: Alive but
without permission.”
When we process this sentence into its informational components, what we call Health Information Units, or HIUs, the
result is less funny but more regular; the information content is made explicit.
Health Information Units (HIUs)
“Discharge status : alive but without permission .”
• HIU #1: “Discharge status : alive”
• CONNECTIVE: “but”
• HIU #2: “Discharge status : without permission”
Here, “Discharge Status” has been copied into the second HIU to create a
complete information unit. So we have 2 units:
“Discharge Status: alive”
“Discharge Status: without permission”
occurring with the connective “but”. This is a small example of what language
does to information.
In its millennia of evolution, language developed ways of shortening the
message without losing content. For example, here, as readers, we fill out a
statement with its missing words because the missing words are repeats of previous
words in a parallel position: before and after the conjunction “but”.
A major job of Natural Language Processing is, so to speak, to undo
evolution and present the underlying content in a more regular form.
Transformation into information units
Original sentence from an anonymized patient document:
Today, she has no cough, chest pain, or shortness of breath.
is transformed into single information units:
Today, she has no cough, today, she has no chest pain, and today, she has no shortness of breath.
where:
• Time word “today” is distributed to the basic statements;
• Negative word “no” is distributed to every object;
• “or” in distributed negative statement is transformed to “and”.
Here is another example of NLP restoring a complex sentence to its
underlying information units.
The sentence is “Today she has no cough, chest pain or shortness of breath”. It is
transformed into 3 single information units:
“Today she has no cough”,
“Today she has no chest pain”, and
“Today she has no shortness of breath”.
NLP obtained them from the original sentence by expanding around the
conjunctions, copying parallel material, and changing ‘or’ to ‘and’ under
negation.
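The expansion just described can be sketched in a few lines of Python. This is only an illustration working on an already segmented sentence, not the MLP transformation component itself; the function name `expand_negated_conjunction` and its three arguments are hypothetical.

```python
def expand_negated_conjunction(prefix, negation, conjuncts):
    """Distribute a shared prefix and a negation word over each conjunct.

    prefix    -- material shared by all conjuncts, e.g. "Today she has"
    negation  -- the negative word copied to every object, e.g. "no"
    conjuncts -- the coordinated objects, e.g. ["cough", "chest pain"]
    """
    # Copy the parallel material (prefix and negation) into every unit.
    units = [f"{prefix} {negation} {item}" for item in conjuncts]
    # "not (A or B or C)" means "not A and not B and not C",
    # so the connective between the expanded units becomes "and".
    return units, "and"


units, connective = expand_negated_conjunction(
    "Today she has", "no", ["cough", "chest pain", "shortness of breath"]
)
for unit in units:
    print(unit)
```

In the real system the prefix, negation, and conjuncts would come from the parse tree rather than being supplied by hand.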
Still, you might ask: “What is the utility in breaking up complex information into
its more elementary components?” The answer is: while text contains
valuable information, there is simply too much of it.
How to treat textual content
• Granted: Text contains valuable information.
• Text is too voluminous for sequential viewing.
• Develop a method for selective viewing:
  — Identify fact units
  — Tag words with their medical content
  — Provide a means of sorting facts by their tags
  — Link sorted facts to their textual context.
A clinician may confront a patient chart containing 20 or 30 or even 50
documents. It is not possible to read through them all in order to find the facts
relevant to an immediate concern; for example, to follow a particular patient
problem.
By now it is accepted that textual content must be included in the Electronic Health
Record. But blobs of text are unwieldy when it comes to accessing specific
content.
Here is where Natural Language Processing may help.
By identifying discrete information units, tagging the words with their
medical content, providing a means of sorting facts by their tags and linking the
sorted facts to their textual context, the system creates “hooks” into the text for
selective viewing and other applications.
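As a rough sketch of what such “hooks” might look like in code, the Python below groups fact units by tag while keeping a pointer back to the source sentence. The record layout, the tag names, and the sentence identifiers `S1` to `S3` are assumptions made for this illustration, not the MLP data model.

```python
from collections import defaultdict

# Hypothetical fact units: the text of the fact, one medical tag,
# and a pointer back to the sentence it came from.
facts = [
    {"text": "Hypertension, well-controlled", "tag": "cardiovascular", "sentence_id": "S1"},
    {"text": "referred for further asthma management", "tag": "respiratory", "sentence_id": "S2"},
    {"text": "no cough", "tag": "respiratory", "sentence_id": "S3"},
]


def index_by_tag(fact_units):
    """Group fact units under their tag, keeping the link to the source sentence."""
    index = defaultdict(list)
    for fact in fact_units:
        index[fact["tag"]].append((fact["text"], fact["sentence_id"]))
    return dict(index)


index = index_by_tag(facts)
for text, sentence_id in index["respiratory"]:
    print(sentence_id, text)
```

A viewer can then list the facts under one tag and use the sentence identifier to display the surrounding text.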
• Viewer of Dolin
Here you see an example of selective viewing, using a viewer specifically
developed to work with patient documents that have been processed by MLP.
Along the top you see (in red) that CHART QUERIES have been chosen.
The selected Query Type is SUMMARY SHEET.
About SUMMARY SHEETS: Clinicians use a variety of approaches to organizing information contained in the
chart, for efficient retrieval and rapid review of historical information. One of the most common views is a SUMMARY
SHEET that summarizes key information useful in managing a patient's medical
problems.
MLP-processed documents allow for multiple query approaches and different
organizational views of the clinical information contained in the chart. This
is due to the comprehensive tagging of the data, starting at the document level
information, and progressing down to the clinical content contained within each
HIU. We will demonstrate the retrieval and display of clinical information on one
patient, but many other views are possible.
The Patient is HL0130, a male whose date of birth is September 24, 1932.
We choose to sort the medical facts, the HIUs, by Anatomic System.
After clicking on SUBMIT you see on the left the SUMMARY SHEET for Patient
HL0130, who is represented in the database by 1 document.
The main headings of the SUMMARY SHEET are DIAGNOSIS, SIGNS AND SYMPTOMS, MEDICATION, ALLERGIES, and
HEALTH-RELATED HABITS.
The subheadings in the SUMMARY SHEET depend on the content of the
given patient's documents.
Patient HL0130, for example, has findings in 4 anatomic systems: the
cardiovascular, integumentary, musculoskeletal, and respiratory.
The HIUs obtained by the MLP system for this document are sorted (due to our
choice above) by Anatomic System. Thus, for example, if we choose (under
DIAGNOSIS, SIGNS AND SYMPTOMS) to see data regarding the patient's
CARDIOVASCULAR SYSTEM,
VIEWER Click on CARDIOVASCULAR SYSTEM, under DIAGNOSIS, SIGNS AND
SYMPTOMS
and click on this subheading, we see the HIUs that contain a tag for the
Cardiovascular System. Each HIU carries the date of the visit, arranged in
reverse chronological order.
VIEWER Click on “Hypertension, well-controlled”
By clicking on an HIU, for example, the first HIU under Cardiovascular System,
“Hypertension, well-controlled”, the sentence containing that HIU appears at the top of the right screen in the context
of the given document.
You will perhaps recognize here the text of the Consultation Note concerning
Henry Levin the 7th, here anonymized to Patient HL0130. The text appears as an
example in the HL7 CDA Release 2 Document.
VIEWER Click on Respiratory System
If we click on Respiratory System, we see a greater number of HIUs than under
Cardiovascular System, a quick indication that this is a major problem
area for this patient,
VIEWER Click on the last HIU under Respiratory System
who has, in fact, been referred for management of his asthma, as we see in
one of the Respiratory System HIUs.
Under Health Related Habits,
VIEWER Click on HIU "prior smoking history",
we find that the patient has a “prior smoking history”,
VIEWER Click on HIU “1 pack per day between the ages of 20 and 55”.
detailed as “1 pack per day between the ages of 20 and 55”.
Note that negative statements are highlighted. Here, the HIU “Smoking:
then he quit” is understood to be a negation of smoking.
VIEWER Click on HIU “Smoking: then he quit”
This came from ‘and then he quit’ under ‘Smoking’. ‘Smoking’ was copied into the
HIU the way ‘Discharge Status’ was copied in ‘Discharge Status: Alive but
without permission’.
As we noted before, the HIUs for this document were sorted by Anatomic System. Another view of the data is
obtained by choosing to sort by BODY REGION.
VIEWER Sort on BODY REGION, SUBMIT
Now we may see, for example, the data that pertains to EXTREMITY,
VIEWER Click on EXTREMITY, Illuminate HIUs
“Skin: erythematous rash, left index finger” and “Osteoarthritis, right knee”.
where we see several HIUs:
“Skin: erythematous rash, left index finger” and “Osteoarthritis, right knee”.
Selective viewing can become important when a patient has numerous documents.
For example,
VIEWER Pull down PATIENTS to SPF and click
Return Sort to Anatomic System
This patient, with 36 documents in the database, has problems in almost every
anatomic system, and is, or was, on a dozen types of medication. The
SUMMARY SHEET for this patient thus contains more headings than were displayed for the previous patient.
A Viewer is one possible application of Medical Language Processing, perhaps
the most important in terms of patient care.
We will return to the Viewer for further examples later.
Now, how is all this accomplished?
For medical language processing, we need to determine the regular forms that
are specific for clinical content, yet based on general properties of language.
Summarizing in 6 points what lies behind what we have seen thus far, first, we
recognize that
Basis of Medical Language Processing
1. There is an underlying informational structure in all natural language sentences.
2. The information content of a sentence is given by its syntactic structure and the meaning of the individual words:
   structure + word meaning = information
3. Structure is given by parsing. A word’s meaning is determined by what other words it occurs with.
A word may have intrinsic meaning but it functions as a part of language by its
relations to other words.
In linguist Firth’s words: Know a word by the company it keeps.
Basis of Medical Language Processing
4. A semantic class is formed of words that occur in similar environments

SUBJECT   VERB        OBJECT
patient   developed   mild cold
patient   developed   fever
patient   developed   severe pain in abdomen
patient   developed   cough
patient   developed   severe respiratory distress

SYMPTOM CLASS
A semantic class is formed of words that occur in similar environments. Thus, for
example, the formation of the symptom class.
‘Cold’ in ‘mild cold’ is the central noun in the object of ‘developed’, in ‘Patient developed mild cold’;
‘Fever’ occurs similarly in ‘Patient developed fever’.
So also ‘pain’ in ‘Patient developed severe pain in abdomen’.
And again, for ‘cough’ in ‘Patient developed cough’.
And for ‘distress’ in ‘Patient developed severe respiratory distress’.
While we know these words are all symptoms based on medical knowledge,
the significance for computer processing is that they form a class that, together
with other classes, forms patterned occurrences.
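A toy version of this idea, in Python, groups the object nouns that occur in the same subject-verb environment. It is a deliberate simplification: real class formation would draw on many environments across a large body of text, and the function name `group_by_environment` is hypothetical.

```python
from collections import defaultdict

# (subject, verb, central noun of the object) triples from the examples above.
triples = [
    ("patient", "developed", "cold"),
    ("patient", "developed", "fever"),
    ("patient", "developed", "pain"),
    ("patient", "developed", "cough"),
    ("patient", "developed", "distress"),
]


def group_by_environment(svo_triples):
    """Collect the object nouns that occur with the same subject and verb."""
    classes = defaultdict(set)
    for subject, verb, noun in svo_triples:
        classes[(subject, verb)].add(noun)
    return dict(classes)


classes = group_by_environment(triples)
print(sorted(classes[("patient", "developed")]))
```

The five nouns fall into one group because they share an environment, with no medical knowledge supplied to the program.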
Basis of Medical Language Processing
5. A statement type in a subject area is formed of semantic classes co-occurring frequently in a syntactic relation.
6. Clinical statement types comprise a computable semantic structure for housing narrative clinical information.

PT    V-PT   NEG   PTPART   SYMPT   TIME
she   has    no    chest    pain    today
Here we see an example of a patient-state statement type, based on the frequent
co-occurrence of words in the classes for Patient, Patient-verb, and Symptom in the Subject-Verb-Object relation,
corresponding to the words ‘She has pain’.
We see here only a flattened version of the structure. It does not display explicitly the modifier relations:
‘chest’ as a patient-part modifier of ‘pain’;
‘no’ as a negation modifier of ‘pain’ and
‘today’ as a time modifier of the whole statement.
Instances of statement types in texts, when output as XML structures enriched with
tags that represent their medical content, become HIUs, Health Information Units.
For each document sentence, the first step is to produce a parse tree.
A parse of one sentence
*SID=990318P2 030.20B.03.02
Today , she has no cough .
[Parse tree diagram. Top levels: SENTENCE, TEXTLET, ONESENT, CENTER, ASSERTION, closed by ENDMARK ‘.’. The ASSERTION branches into SA, SUBJECT, TENSE, VERB, OBJECT, and SA. The terminal nodes carry the words Today, she, has, no, and cough, together with lexicon attributes including NTIME2, H-PT/H-FAMILY, VHAVE, H-NEG, and H-INDIC.]
The overall structure is an Assertion, with a Subject, Verb and Object.
Words are associated with the bottom-most or “terminal” nodes, which are
parts of speech. Thus, the last word ‘cough’ is a noun N in the lexicon that
matches the terminal node N in the parse tree. The pink symbols here are attributes
carried by the matched word in the lexicon. For example, ‘no’ has the
attribute H-NEG in the lexicon, H for Healthcare sublanguage.
XML output of one sentence
<SID id="990318P2 098.20B.03.02">
  <!-- Today , she has no cough . -->
  <PATIENT-STATE-HIU id="990318P2 098.20B.03.02" sect="REVIEW OF SYSTEMS" row="1">
    <EVENT-TIME>
      <REF-PT> <_4152><tm><tm_tm-loc> Today </tm_tm-loc></tm></_4152> , </REF-PT>
    </EVENT-TIME>
    <PT-DEMOG><GENDER>[FEMALE]</GENDER></PT-DEMOG>
    <SUBJECT> <_5705><per> she </per></_5705> </SUBJECT>
    <VERB> <_7168><li><li_vhv> has </li_vhv></li></_7168>
      <TENSE>[PRESENT]</TENSE>
    </VERB>
    <PSTATE-DATA>
      <SIGN-SYMP>
        <MODS><NEG> <_3440><md><md_ng> no </md_ng></md></_3440></NEG></MODS>
        <_802><s-s><a-s_resp><b-r_m-r> cough </b-r_m-r></a-s_resp></s-s></_802>
      </SIGN-SYMP>
    </PSTATE-DATA>
  </PATIENT-STATE-HIU>
</SID>
After several stages of processing the parse tree has been transformed into a
medically labeled XML structure, an HIU, in which the individual terms carry
XML tags that represent their medical content.
This is a Patient State type HIU.
The EVENT-TIME is “today”.
The Patient DEMOGraphic information is the GENDER ‘female’ from the lexicon
entry for the SUBJECT word ‘she’.
The SUBJECT is ‘she’ and the VERB is ‘has’.
The PSTATE DATA is a SIGN-SYMPTOM ‘cough’ with the modifier
NEG whose value is the word “no”.
Notice that each word is carrying XML tags of its own.
For example, ‘cough’ is tagged as s-s (sign-symptom),
a-s_resp (anatomic system, respiratory), b-r_m-r (body region, multi-region).
These tags are drawn from the Structured Health Markup Language,
or SHML.
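To show how an application might read these word-level tags back out of an HIU, here is a small Python sketch using the standard `xml.etree.ElementTree` module. The XML is the HIU from the slide with straight quotes and without the comment. The rule that lowercase element names are word-level tags, while uppercase names are HIU fields and names such as `_802` are wrappers, is an assumption drawn from this one example; `word_tags` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The HIU from the slide, with straight quotes so that it is well-formed XML.
HIU_XML = """<SID id="990318P2 098.20B.03.02">
<PATIENT-STATE-HIU id="990318P2 098.20B.03.02" sect="REVIEW OF SYSTEMS" row="1">
<EVENT-TIME><REF-PT> <_4152><tm><tm_tm-loc> Today </tm_tm-loc></tm></_4152> , </REF-PT></EVENT-TIME>
<PT-DEMOG><GENDER>[FEMALE]</GENDER></PT-DEMOG>
<SUBJECT> <_5705><per> she </per></_5705> </SUBJECT>
<VERB> <_7168><li><li_vhv> has </li_vhv></li></_7168><TENSE>[PRESENT]</TENSE></VERB>
<PSTATE-DATA><SIGN-SYMP>
<MODS><NEG> <_3440><md><md_ng> no </md_ng></md></_3440></NEG></MODS>
<_802><s-s><a-s_resp><b-r_m-r> cough </b-r_m-r></a-s_resp></s-s></_802>
</SIGN-SYMP></PSTATE-DATA>
</PATIENT-STATE-HIU></SID>"""


def word_tags(xml_text):
    """Map each word to the list of lowercase tags wrapped around it."""
    result = {}

    def walk(element, path):
        tag = element.tag
        # Assumption: lowercase names are word-level tags. Uppercase HIU
        # fields and numeric wrappers such as "_802" are skipped, because
        # str.islower() is False when there is no lowercase letter.
        here = path + [tag] if tag.islower() else path
        word = (element.text or "").strip()
        if word and here:
            result[word] = here
        for child in element:
            walk(child, here)

    walk(ET.fromstring(xml_text), [])
    return result


tags = word_tags(HIU_XML)
print(tags["cough"])
```

The bracketed values such as [FEMALE] are not picked up, since they sit directly inside uppercase field elements rather than inside word-level tags.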
We will return to the SHML as it is used in the overall process of converting clinical narrative into a structured, medically tagged, representation of
content.
MLP with SHML linkage
Medical documents
→ Preprocessing (standardization)
→ Documents with SIDs
→ MLP, with MLP and SHML Dictionaries
→ Documents in HIUs with SHML and MLP tags
→ Viewer and Other Applications
GENERATORS
• SHML/DTD
• SHML/XSL
• SHML/XQL
In the overall process, medical documents first pass through a
preprocessing stage where every sentence receives a Sentence Identifier.
The documents are then processed by the MLP system, and then, by drawing on a dictionary containing the SHML tags of
the words, the documents obtain a representation as HIUs. This
representation, along with the original documents, serves as input to a viewer or
other applications.
Preprocessing
• Identification of sections and SIDs
• Name identification (person, geographical location, institution,…)
• Spelling
• Punctuation
• Time, date, unit, number standardization (ANSI standard)
In the course of processing documents from 15 institutional sources,
encompassing close to 118,000 sentences, we have encountered
37 document types and 491 different section names.
Document types
Acute Care Visit
Admission Note
Breast Clinic Note
Breast ONC Interval Note
Cardiology Assoc Clinic Note
Cardiology Assoc. Clinic Note
Cardiology Associate CCU Admission Note
Cardiology Associates Admission Note
Cardiology Associates Progress Note
Clinic Note
Clinic Notes
Consultant Note
Consultation Report
Discharge Summary
EEG Report
Emergency Department Report
Encounter Note
Follow-Up Clinic Note
GIM Acute Care Visit
GIM Return Visit
Good Health Clinic Consultation note
Interval Note
Neurology New Patient Eval
New Patient Evaluation
OHNS Clinic Note
Operative Report
Orthopaedic Clinic Note
Physical and Occ. Therapy Note
Pre-Operative Visit
Procedure Note/Report
Pulmonary Consultation Note
Pulmonary Return Visit
Renal New Patient Evaluation
Renal Return Patient Eval
Renal Return Patient Evaluation
Return Visit
Rheumatology Clinic Note
117,581 sentences
Clearly, this is only a sample of what lies out there.
But a sufficient sample to reveal a number of issues in preparing documents for standardized processing, whether by
NLP or other means.
Sections
PROCEDURES/THERAPIES   LAB - D   LAB / TELEMETRY
PROCEDURES/THERAPIES   LAB - D   LAB WORK BLOOD DRAW
PROCEDURES/THERAPIES   LAB - D   LABORATORIES
PROCEDURES/THERAPIES   LAB - D   LABORATORIES AT ADMISSION
PROCEDURES/THERAPIES   LAB - D   LABORATORY
PROCEDURES/THERAPIES   LAB - D   LABORATORY ADDENDUM
PROCEDURES/THERAPIES   LAB - D   LABORATORY DATA
PROCEDURES/THERAPIES   LAB - D   LABORATORY DATA / TEST RESULTS
PROCEDURES/THERAPIES   LAB - D   LABORATORY DATA AT ADMISSION
PROCEDURES/THERAPIES   LAB - D   LABORATORY DATA ON ADMISSION
PROCEDURES/THERAPIES   LAB - D   LABORATORY EVALUATION
PROCEDURES/THERAPIES   LAB - D   LABORATORY OF NOTE
PROCEDURES/THERAPIES   LAB - D   LABORATORY ON ADMISSION
PROCEDURES/THERAPIES   LAB - D   LABORATORY RESULTS
PROCEDURES/THERAPIES   LAB - D   LABORATORY STUDIES
PROCEDURES/THERAPIES   LAB - D   LABORATORY TESTS ON ADMISSION
PROCEDURES/THERAPIES   LAB - D   LABORATORY VALUES
PROCEDURES/THERAPIES   LAB - D   LABS
As you well know, section names range over a wide gamut. The 491 different
section names that we have encountered probably only scratch the surface. It is
not easy to group section names into gross classes but some are variants on a
single theme.
Here, for example, are different section names concerning Laboratory Tests. For document processing, variants are given
a single designation, in this case, for LAB, the letter “D”.
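One simple way to collapse such variants is a table of patterns, sketched below in Python. Only the letter “D” for laboratory sections comes from the talk; the regular expression, the `SECTION_RULES` table, and the `section_letter` function are assumptions for illustration and are not how the MLP preprocessor is known to work.

```python
import re

# Hypothetical rule table: a pattern over section names and the section
# letter it maps to. Only "D" for laboratory sections comes from the talk.
SECTION_RULES = [
    (re.compile(r"^LAB(S|ORATORY|ORATORIES)?\b"), "D"),
]


def section_letter(section_name):
    """Return the section letter for a section name, or None if no rule applies."""
    name = section_name.strip().upper()
    for pattern, letter in SECTION_RULES:
        if pattern.search(name):
            return letter
    return None


for name in ["LAB / TELEMETRY", "Laboratory Data on Admission", "LABS"]:
    print(name, "->", section_letter(name))
```

A production table would need one rule per section class and a policy for names that match no rule.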
Sentence id (SID)
*SID=MLPC15 HL0130.001B.01.002
XPT-HL0130 is a 67 year old male referred for further asthma management .

MLPC15   Institutional document set
HL0130   Patient number
001      Record number
B        Section id
01       Paragraph number
002      Sentence number within paragraph
The section letter appears in the Sentence Identifier, the SID, along with other
coded information. The SID identifies:
• the institutional document set,
• a patient number,
• record number,
• section identifier (here, B for mainly historical information),
• the paragraph number within the section, and
• the sentence number within the paragraph.
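The SID can be taken apart with a short regular expression, as in the Python sketch below. The field layout is inferred from the single example on the slide, so the pattern, the field names, and the `parse_sid` function are assumptions rather than a published SID format.

```python
import re

# Assumed layout, inferred from "MLPC15 HL0130.001B.01.002":
# <document set> <patient>.<record><section letter>.<paragraph>.<sentence>
SID_PATTERN = re.compile(
    r"^(?P<document_set>\S+) "
    r"(?P<patient>[^.]+)\."
    r"(?P<record>\d+)(?P<section>[A-Z])\."
    r"(?P<paragraph>\d+)\."
    r"(?P<sentence>\d+)$"
)


def parse_sid(sid):
    """Split a sentence identifier into its named parts."""
    match = SID_PATTERN.match(sid.strip())
    if match is None:
        raise ValueError(f"unrecognized SID: {sid!r}")
    return match.groupdict()


parts = parse_sid("MLPC15 HL0130.001B.01.002")
print(parts["patient"], parts["section"], parts["sentence"])
```

The parts are kept as strings so that leading zeros such as “001” survive a round trip.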
We return to the overall process. After preprocessing, documents enter the
MLP proper.
[Diagram: components of the MLP system]
Per language (ENGLISH, FRENCH, GERMAN): Source Text -> PARSING -> SELECTION -> Transformation
Then: REGULARIZATION -> INFORMATION FORMATTING
Resources: PARSER; SYNTACTIC & MEDICAL LEXICON; GRAMMAR RULES: BNF & RESTRICTIONS; MEDICAL COOCCURRENCE PATTERNS
Output: MEDICAL REPRESENTATION STRUCTURE; RELATIONAL dBMS / XML & SHML VIEWER
It makes good sense to modularize natural language processing.
We noted earlier that initially researchers identified 3 major levels of processing:
syntax, semantics, pragmatics. In practice, the functions become more
specific, particularly in response to implementation issues.
Here we see the process divided into 5 sequential components.
The first component is PARSING. It draws on a lexicon and a grammar. The MLP system has been implemented in 3
languages, actually 4, as Dutch has recently been added by a European colleague.
The lexicons are, naturally, language-specific; the grammars for related
languages are similar. In fact, the French and German MLP
grammars were developed as updates to the English MLP grammar.
The second component, SELECTION, resolves ambiguity where possible, based
on medical word-class co-occurrence patterns. We will see some examples.
The third component transforms complex sentences into their individual
information units, as we saw for the sentence:
"Today she has no cough, chest pain or shortness of breath".
The fourth component provides a uniform connective structure for sentences with
more than 1 information unit. By this time the different language
versions can use the same program.
The fifth component maps each information unit into the appropriate medical statement type, which has an
XML representation and will become an HIU with the addition of SHML tags.
HIU Types
• PATIENT-STATE (34.59%)
• ALLERGIES (0.73%)
• MEDICATIONS-INFO (8.27%)
• IMAGING-INFO (1.67%)
• MED-SURG-PROCEDURES (4.32%)
• LAB-TEST (5.17%)
• FAMILY-FRIEND (0.19%)
• MISC-TREATMENTS (1.80%)
• EKG-TEST (0.61%)
• DOCUMENT-INFO (2.46%)
• TEXTPLUS (8.19%)
A working set of HIU types classifies information at the highest level and is
useful for retrieval of information.
PATIENT-STATE is by far the most frequently occurring HIU type. It covers
all descriptions of the patient, the patient's problems, risk factors, functionality, and
historical information.
ALLERGIES, as an HIU type, has been singled out from the PATIENT-STATE
HIU type because of its singular importance in patient care.
MEDICATION-INFO HIUs include the named medication and dose, along with
whatever time or change information occurs in the statement.
IMAGING-INFO has its own importance among diagnostic tools. Other
procedures could also be singled out to become HIU types.
MED-SURG-PROCEDURES also could be further divided. It functions on a high level to distinguish procedures from all
other information.
LAB-TEST HIUs cover blood work,
urinalysis, culturing and the like.
FAMILY-FRIEND HIUs cover Family
History, and other statements involving persons other than the patient or caregivers.
MISC-TREATMENTS covers such complementary treatments as bedrest,
physical therapy and the like, but also an MLP statement type of "general medical
management".
EKG-INFO as an HIU type is an example
where the special language of a diagnostic test requires, virtually, a
subgrammar to process it and a special structure to house the information. Many more of this type will certainly be needed.
DOCUMENT-INFO is an HIU type to hold information about a document being
referenced or discussed.
TEXTPLUS is an HIU type that picks up un-analyzed sentences. The words are
tagged, but no parse was possible over that sentence or stretch of words within a
sentence.
I am often asked, which part of the MLP process is the hardest?
The answer is PARSING. Who would have thought there could be so many ways to analyze a sentence? Or how important the correct assignment of
structure is to obtaining a correct representation of information?
Equally important, though, is the resolution of ambiguity. Ambiguity comes in 2 flavors: word-sense and
syntactic.
In word sense ambiguity, a word has 2 or more distinct meanings.
Consider ‘depression’ in 2 occurrences:
Word sense ambiguity
Patient suffers from severe depression.
vs.
Electrocardiogram shows ST depression in lead 5.
‘Patient suffers from severe depression’
versus
‘Electrocardiogram shows ST depression in lead 5’
Clearly, the different contexts distinguish the different meanings. But storing
contexts on the level of the words themselves is not feasible because of the
large number of words and the variety of contexts. However, storing contexts in
terms of word classes is manageable.
Resolution of word sense ambiguity
no growth or masses felt: growth (H-INDIC, H-PTFUNC), or (OR), masses (H-INDIC)
growth and development normal: growth (H-INDIC, H-PTFUNC), and (AND), development (H-PTFUNC)
For example, word class patterns resolve the ambiguity of 'growth'.
• ‘Growth’ in ‘growth and development normal’ is a word describing a normal patient physiological function, in the Healthcare sublanguage word class H-PTFUNC.
• ‘Growth’ in ‘no growth or masses felt’ is a disease-indicator word, in the class H-INDIC.
Conjoined nouns should be in the same
class (or certain compatible classes), so this decides which sense of ‘growth’ is
correct in these occurrences.
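The conjunction rule just described can be sketched in a few lines. The toy lexicon below holds only the three words from the slide, and the function name `resolve_conjoined` is hypothetical; the sketch also handles only the "same class" case and leaves out the "certain compatible classes" that the real co-occurrence patterns allow.

```python
# Toy lexicon: each word lists the sublanguage classes it can belong to.
LEXICON = {
    "growth": {"H-INDIC", "H-PTFUNC"},  # ambiguous
    "masses": {"H-INDIC"},
    "development": {"H-PTFUNC"},
}


def resolve_conjoined(left, right, lexicon=LEXICON):
    """Pick the single class shared by two conjoined nouns, if there is one.

    Returns None when the nouns share no class or share more than one,
    in which case this simple rule cannot decide.
    """
    shared = lexicon[left] & lexicon[right]
    if len(shared) == 1:
        return next(iter(shared))
    return None


print(resolve_conjoined("growth", "masses"))
print(resolve_conjoined("growth", "development"))
```

The first call selects the disease-indicator sense of ‘growth’ and the second the physiological-function sense, matching the two readings on the slide.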
Resolution of syntactic ambiguity
Word sequence: N P N CONJ N
swelling in (knees and hands): swelling H-INDIC, knees H-PTPART, hands H-PTPART; read as: swelling in knees and swelling in hands
(swelling in knees) and fever: swelling H-INDIC, knees H-PTPART, fever H-INDIC; read as: swelling in knees and fever
In addition to word sense ambiguity, there is syntactic ambiguity.
From the same sequence of Noun, Preposition, Noun, Conjunction, Noun we
can have two different groupings.
Again we rely on matching medical word classes.
• In ‘swelling in knees and hands’ the match is H-PTPART: ‘knees and hands’
• In ‘swelling in knees and fever’, the match is H-INDIC: ‘swelling and fever’
This matching of subclasses is important because, if we treat the two structures as equivalent, the system could generate the
incorrect ‘swelling in fever’.
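A small sketch of this attachment decision for the sequence Noun, Preposition, Noun, Conjunction, Noun follows. The word-class table contains only the four words of the example, and `expand_npn_conj_n` is a hypothetical name; the real system draws on its stored co-occurrence patterns rather than on a direct class comparison like this one.

```python
# Word classes for the four nouns used on the slide.
WORD_CLASS = {
    "swelling": "H-INDIC",
    "fever": "H-INDIC",
    "knees": "H-PTPART",
    "hands": "H-PTPART",
}


def expand_npn_conj_n(n1, prep, n2, n3, word_class=WORD_CLASS):
    """Expand 'n1 prep n2 CONJ n3' into its two conjoined units.

    If n3 matches the class of n2, the conjunction is inside the
    prepositional phrase; if it matches n1, the conjunction joins the
    whole phrase with n3. Returns None when neither class matches.
    """
    c1, c2, c3 = word_class[n1], word_class[n2], word_class[n3]
    if c3 == c2:
        return [f"{n1} {prep} {n2}", f"{n1} {prep} {n3}"]
    if c3 == c1:
        return [f"{n1} {prep} {n2}", n3]
    return None


print(expand_npn_conj_n("swelling", "in", "knees", "hands"))
print(expand_npn_conj_n("swelling", "in", "knees", "fever"))
```

Because ‘fever’ never matches the body-part class of ‘knees’, the sketch cannot produce ‘swelling in fever’.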
Word class patterns
• 58 semantic classes
• Conjunction Equivalent Classes: 47 patterns.
• Computed Phrase (Left Adjunct+Noun): 147 patterns.
• Computed Phrase (Noun+Noun): 58 patterns.
• Computed Phrase (Noun+Right Adjunct): 165 patterns.
• Noun−Preposition−Noun: 3,383 patterns.
• Adjective−Noun: 727 patterns.
• Noun−Noun: 546 patterns.
• Subject−Verb−Object: 566 patterns.
• Subject−Be−Object: 100 patterns.
Resolution of ambiguity by matching subclasses requires the accumulation of
many instances of well-formed word class co-occurrence patterns.
On the one hand, it is daunting to see the amount of language data that is needed
to accomplish the conversion of free text to structured information.
On the other hand, it is rather remarkable that it can be done with as
few as 58 word classes.
MLP with SHML linkage
Medical documents -> Preprocessing (standardization) -> Documents with SIDs -> MLP (with MLP and SHML Dictionaries) -> Documents in HIU’s with SHML and MLP tags
GENERATORS: • SHML/DTD • SHML/XSL • SHML/XQL
Viewer
Other Applications
Now, having dwelt on the MLP part of the overall process, let us return to the SHML tags, the “hooks” into specific
medical content.
SHML
Structured Health Markup Language
• Medical knowledge XML tag set
• Designed to work with Medical Language Processing via XML
• Authors: David Rothwell, MD, Richard Wheeler, MD, Ngô Thanh Nhàn, Ph.D.
SHML is a medical knowledge XML tag set, designed to work with medical
language processing. It has been developed over the past several years, primarily by Dr.
David Rothwell, well known for his authorship of Snomed, along with Dr.
Richard Wheeler, formerly Chief Medical Manager of Healthmatics, and Dr. Ngô
Thanh Nhàn, the computer scientist who created the XML implementation of the
SHML.
Dr. Nhàn will be speaking at this Conference on Friday.
SHML tags are a mix of linguistic and medical categories needed to perform
information-sensitive tasks: For example, to provide a physician selective viewing of the content of a patient’s documents,
as we have seen; or to perform a retrieval over a patient population, such as to meet a JCAHO requirement, an application we
will see shortly.
Some of the features of SHML are
SHML
• Uses XML formalism
• Data and document are combined
• SHML tags are metadata: medical information not explicit in text
• Preserves fundamental structure of document (EHR)
• Users can create their own tags and tag extensions
SHML tags: example
<dx>
  <a-s_resp_l-r>
    <b-r_m-r>
      <dx-prcss_imm_all>
        <dx-kind_d-k-resp_r-a-d>Asthma</dx-kind_d-k-resp_r-a-d>
      </dx-prcss_imm_all>
    </b-r_m-r>
  </a-s_resp_l-r>
</dx>
As an example of SHML tagging, ‘Asthma’ is a diagnosis dx, associated
with the anatomic system a-s, specifically respiratory resp, lower respiratory
system l-r.
It is associated with the body region b-r,
of the type multi-region m-r. It is associated with the disease process
immunologic allergic. And in a typing of diagnoses by group, it is respiratory,
more specifically reactive airways disease r-a-d.
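To make the nesting concrete, here is a minimal sketch that wraps a term in the sequence of tags read out above, using Python's standard `xml.etree.ElementTree`. It only illustrates the nested structure; the function `nest_tags` is hypothetical and is not the SHML generator itself.

```python
import xml.etree.ElementTree as ET


def nest_tags(tag_names, text):
    """Wrap `text` in the given tags, outermost first."""
    root = ET.Element(tag_names[0])
    current = root
    for name in tag_names[1:]:
        current = ET.SubElement(current, name)
    current.text = text
    return root


# Tag names copied from the 'Asthma' slide, outermost to innermost.
ASTHMA_TAGS = [
    "dx",
    "a-s_resp_l-r",
    "b-r_m-r",
    "dx-prcss_imm_all",
    "dx-kind_d-k-resp_r-a-d",
]

xml_text = ET.tostring(nest_tags(ASTHMA_TAGS, "Asthma"), encoding="unicode")
print(xml_text)
```

Because every tag encloses the term itself, a query on any one of them (diagnosis, anatomic system, body region, process, or group) reaches the word ‘Asthma’ in the text.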
Example with Snomed code
<dx>
  <a-s_resp_l-r>
    <b-r_m-r>
      <dx-prcss_imm_all>
        <dx-kind_d-k-resp_r-a-d>
          <Snomed_D2-51000>Asthma</Snomed_D2-51000>
        </dx-kind_d-k-resp_r-a-d>
      </dx-prcss_imm_all>
    </b-r_m-r>
  </a-s_resp_l-r>
</dx>
SHML tagging is not coding, but codes can be added as additional tags. As you
see here, the Snomed code for Asthma has been added as another tag.
Dr. Rothwell has kindly allowed me to use a number of his slides to illustrate
various features of the SHML.
SHML tags: formal taxonomies
• Anatomy <a-s>
• Body region <b-r>
• Organisms <or>
• Chemicals <chem>
• Meds <med>
• Diagnoses <dx>
• Procedures <pr>
• …
One can describe SHML tags as formal taxonomies.
SHML Tag Types
• Activities (sports,…)
• Medications: (Multum), med-class
• Chemicals
• Time: freq, repetition, exact, begin, end
• Links
• Modifiers: modal, negation, changes, amount, desc, s-q
• Person: kin, civil
• Demographic
• Anatomic structure
• Body region
• Sign-symptom
• Diagnosis
• Dx-process
• Dx group by system
• Procedures
• Organisms
• Allergies
• Pt social behavior
• Health status (adl…)
These are the main tag types now in use.
SHML tag system
Tag name                  Anatomic Structure      Body Region
anatomic system           a-s                     b-r
neurologic system         a-s_nr                  b-r_h-n_hd
central nervous system    a-s_nr_cns              b-r_h-n_hd
brain                     a-s_nr_cns_brn          b-r_h-n_hd
respiratory system        a-s_rsp                 b-r_tk_thx
upper respiratory tract   a-s_rsp_u-r             b-r_tk_thx
lower respiratory tract   a-s_rsp_l-r             b-r_tk_thx
lung                      a-s_rsp_l-r_lng         b-r_tk_thx
stomach                   a-s_gi_gi-tr_u-gi_stm   b-r_tk_thx
The encapsulated hierarchic structure of the anatomic system is illustrated here by
the tag for 'brain' in the central nervous system, of the neurologic system, an
anatomic system.
A feature of the SHML is that a term which has an a-s tag also has a b-r tag.
Medical conditions are located both with regard to anatomy and the body region in
which they occur.
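Because each tag carries its whole ancestry, the hierarchy can be recovered from the tag name alone. The sketch below assumes, from the tags shown on the slide, that an underscore separates levels and a hyphen belongs to a level's own abbreviation; the function names are hypothetical.

```python
def tag_ancestors(tag):
    """List a tag's levels from the top of the taxonomy down to the tag itself."""
    parts = tag.split("_")
    return ["_".join(parts[:i]) for i in range(1, len(parts) + 1)]


def is_within(tag, ancestor):
    """True if `tag` is `ancestor` or lies below it in the taxonomy."""
    return ancestor in tag_ancestors(tag)


print(tag_ancestors("a-s_nr_cns_brn"))
print(is_within("a-s_rsp_l-r_lng", "a-s_rsp"))
```

A retrieval for everything in the respiratory system can then be phrased as a test against `a-s_rsp`, without listing each lower-level tag.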
SHML tags
Organisms <or>
– microorganism <or_mc>
  – bacteria <or_mc_bct>
    – Gram positive <or_mc_bct_gm-pos>
    – Gram negative <or_mc_bct_gm-neg>
  – virus <or_mc_vr>
  – Rickettsia <or_mc_rck>
  – fungus <or_mc_fgs>
  – parasite <or_mc_par>
  – arthropod <or_mc_arthropod>
This is an example of the tag classes for organisms:
at the highest level: ‘or’ for organisms;
the subclass microorganisms ‘or_mc’;
the further subclass bacteria ‘or_mc_bct’;
and so forth.
SHML tag system
Term            SHML tag   SHML support tags
Penicillin      med        med-cl_antiinf_pcn
Ampicillin      med        med-cl_antiinf_pcn
Ace inhibitor   med        med-cl_aceinh
Captopril       med        med-cl_aceinh
Here we see a sample of how medications are tagged: with ‘med’ and the “support class” of their medication
class, for example, penicillin, carrying the
support class tag: med-cl_antiinf_pcn,
indicating its medication class is anti-infective, type penicillin.
SHML tag system
Tag Name                        dx and Support Tags
Diagnosis                       dx
Diagnostic process              dx-prcss
Infectious diagnostic process   dx-prcss_inf
Immune diagnostic process       dx-prcss_imm
Neoplastic diagnostic process   dx-prcss_np
Diagnosis group                 dx-kind
Neurologic disease              dx-kind_nr
Migraine                        dx-kind_nr_migr
Reactive Airway Disease         dx-kind_resp_r-a-d
Asthma                          dx-kind_resp_r-a-d
Here we see the dx tag for diagnosis with examples of 2 types of support tags,
one for the diagnostic process and one for the diagnostic group. We saw earlier
examples of both support tags in the tagging of the diagnosis ASTHMA.
Where the tags deal with general language, they follow the classes of the
MLP system. For example, words expressing time.
Terms expressing time
Term           MLP class   Part of Speech   SHML tag
since          H-TMPREP    P                tm_prep
habitual       H-TMDUR     ADJ              tm_dur
fleeting       H-TMDUR     ADJ              tm_dur
end stage      H-TMEND     N                tm_end
discontinue    H-TMEND     V                tm_end
emergent       H-TMBEG     ADJ              tm_beg
initially      H-TMBEG     D                tm_beg
on admission   H-TMLOC     D                tm_loc
antecede       H-TMLOC     V                tm_loc
and negation.
Terms expressing negation
Term            MLP class   Part of Speech   SHML Tag
without         H-NEG       P                md_ng
reject          H-NEG       V                md_ng
nothing         H-NEG       PRO              md_ng
not able        H-NEG       ADJ              md_ng
never           H-NEG       D                md_ng
exclude         H-NEG       V                md_ng
in absence of   H-NEG       P                md_ng
deny            H-NEG       V                md_ng
decline         H-NEG       V                md_ng
There are a remarkable number of ways to express negation. The MLP lexicon contains 282 negation terms — nouns,
adjectives, adverbs and verbs. For example, the verb ‘declined’ in ‘patient
declined surgery’ is a patient action, but in terms of whether surgery was
performed, it is a negation.
Of course, ‘decline’ has another meaning in relation to quantities, indicating a lessened value, as in
‘hemoglobin declined slightly from 12.3’
but that is a matter of ambiguity
resolution.
Terms expressing uncertainty
Term           MLP class   Part of Speech   SHML tag
hypothetical   H-MODAL     ADJ              md_modal
hypothesize    H-MODAL     TV               md_modal
hypothesis     H-MODAL     N                md_modal
doubtful       H-MODAL     ADJ              md_modal
conceivably    H-MODAL     D                md_modal
assumption     H-MODAL     N                md_modal
assume         H-MODAL     TV               md_modal
allegedly      H-MODAL     D                md_modal
The tag ‘md_modal’ corresponds to the MLP class H-MODAL that contains
terms that express uncertainty, a surprising 905 terms in the MLP lexicon.
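The correspondence between lexicon entries and SHML tags for time, negation and uncertainty can be pictured as a simple lookup. The entries below are a handful copied from the three slides; the `Entry` structure and the functions are hypothetical, and the sketch handles single words only, with no multi-word terms such as ‘on admission’ and no ambiguity resolution for words like ‘decline’.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    mlp_class: str
    pos: str
    shml_tag: str


# A few rows copied from the time, negation and uncertainty slides.
LEXICON = {
    "since": Entry("H-TMPREP", "P", "tm_prep"),
    "initially": Entry("H-TMBEG", "D", "tm_beg"),
    "discontinue": Entry("H-TMEND", "V", "tm_end"),
    "without": Entry("H-NEG", "P", "md_ng"),
    "never": Entry("H-NEG", "D", "md_ng"),
    "deny": Entry("H-NEG", "V", "md_ng"),
    "doubtful": Entry("H-MODAL", "ADJ", "md_modal"),
    "allegedly": Entry("H-MODAL", "D", "md_modal"),
}


def shml_tag_for(word):
    """Return the SHML tag for a word, or None if it is not in the toy lexicon."""
    entry = LEXICON.get(word.lower())
    return entry.shml_tag if entry else None


def annotate(sentence):
    """Pair each whitespace-separated word with its SHML tag (or None)."""
    return [(word, shml_tag_for(word)) for word in sentence.split()]


print(annotate("Patient never smoked"))
```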
MLP with SHML linkage
Medical documents -> Preprocessing (standardization) -> Documents with SIDs -> MLP (with MLP and SHML Dictionaries) -> Documents in HIU’s with SHML and MLP tags
GENERATORS: • SHML/DTD • SHML/XSL • SHML/XQL
Viewer
Other Applications
To see how the combined MLP-SHML system functions, we return to the viewer,
this time for an example involving a patient population.
JCAHO ORYX core measures
• JCAHO – Joint Commission on Accreditation of Healthcare Organizations
• ORYX – JCAHO’s Quality Measurement System to allow quality care comparison between organizations.
A number of accrediting organizations are requiring healthcare providers to
demonstrate the quality of the care being delivered. An example is the JCAHO
ORYX Core Measures.
As these measures have been refined, more clinical information is being
required, which often must be abstracted from clinic notes. This puts an added
burden and cost on provider organizations.
A JCAHO ORYX core measure for congestive heart failure
• What percent of patients with congestive heart failure (CHF) and a low ejection fraction (EF) are on ACE inhibitors?
One of the requirements of the Congestive Heart Failure measure is
what percent of patients with CHF, and a low ejection fraction, are on ACE
Inhibitors.
ACE is Angiotensin Converting Enzyme. ACE Inhibitor is a medication that can be
used to lower blood pressure, but has also been shown to decrease morbidity
and mortality in patients with Congestive Heart Failure, or a recent MI.
For most organizations, accessing the Ejection Fraction requires abstracting
the chart.
This information can also be accessed by processing the documents and running a
query on the processed documents.
Viewer: JCAHO requirement
• A set of CHF patients, with EF < 40% on ACE inhibitors
VIEWER:
- Select JCAHO Core Measures at the top of the Viewer
- Select JCAHO CHF in the Query Type field.
- Select idtmerg in the Document Set field,
- push Submit
I have selected a data set of discharge summaries for 95 patients who were hospitalized for a variety of cardiac
conditions.
We find that there were 42 CHF Patients.
• of those 42, 11 had a documented Ejection Fraction of < 40% (the query also accepted 40%)
• of those 11, 8 were on ACE Inhibitors.
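The three-step narrowing behind these counts can be sketched as a filter over per-patient summaries. Everything in the block is an assumption for illustration: the `PatientSummary` record, the function name, and the five toy patients are invented, and the actual Viewer runs its query over tagged HIUs rather than over a table like this. The threshold test uses "less than or equal to 40" because the query also accepted 40%.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientSummary:
    patient_id: str
    has_chf: bool
    ejection_fraction: Optional[float]  # percent; None if not documented
    on_ace_inhibitor: bool


def chf_ace_measure(patients, ef_threshold=40.0):
    """Return (CHF count, low-EF count, on-ACE count, percent on ACE)."""
    chf = [p for p in patients if p.has_chf]
    low_ef = [
        p for p in chf
        if p.ejection_fraction is not None and p.ejection_fraction <= ef_threshold
    ]
    treated = [p for p in low_ef if p.on_ace_inhibitor]
    percent = 100.0 * len(treated) / len(low_ef) if low_ef else None
    return len(chf), len(low_ef), len(treated), percent


# Invented toy records, not taken from the data set in the talk.
TOY_PATIENTS = [
    PatientSummary("P1", True, 20.0, True),
    PatientSummary("P2", True, 40.0, False),
    PatientSummary("P3", True, 55.0, True),
    PatientSummary("P4", True, None, True),
    PatientSummary("P5", False, 30.0, True),
]

print(chf_ace_measure(TOY_PATIENTS))
```

Applied to the counts reported above, 8 of 11 patients corresponds to roughly 72.7 percent on ACE inhibitors.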
We can open some of the HIU's that gave us the results for these patients.
VIEWER Open 42 CHF Pts
As an example of CHF patients
VIEWER Choose IDT051
Click on HIU ‘She was found to be in Congestive Heart Failure’
One HIU: ‘She was found to be in Congestive Heart Failure’
By clicking on this HIU, we see the text context.
The link to the text is important. NLP is not perfect, and 1 HIU may not be the
whole story.
Other patients have more than 1 HIU
VIEWER Choose IDT061
Click on HIU ‘Chest Xray showed severe congestive heart failure’
This patient qualifies for CHF by the first HIU: ‘Chest Xray showed severe congestive heart failure’
and by others.
VIEWER Choose IDT061
Click on last HIU ‘congestive heart failure’
CHF was the Discharge Diagnosis.
Of the 42 CHF patients, 11 had EF less than 40 %
Clicking on the first patient in this group
VIEWER Go to 11 Pts with EF < 40 %
Click on IDT001
we see the HIU ‘Most recent echocardiogram in 05/00/92 (May 1992)
showed an ejection fraction of 20 %’.
VIEWER Click on HIU ‘Most recent echocardiogram in 05/00/92
showed an ejection fraction of 20 %’.
For this patient there might be a question whether the ORYX measure is met since
the ‘most recent’ EF was obtained in May, and this admission is in September.
The other criteria are met. We find the HIU ‘IMPRESSION: Congestive Heart
Failure’
VIEWER Click on HIU ‘IMPRESSION: Congestive Heart
Failure’
and numerous HIUs showing the patient is taking Captopril, an ACE Inhibitor.
VIEWER Highlight (without clicking) the 4 Captopril HIUs
If time permits, or at some other time during the conference, we may explore
other examples.
To conclude, I would like to summarize the program for treating text in the EHR, using natural language processing, that I
have just presented.
Text in EHR
TEXT MLP … XML/SHML HIU’s VIEWER
Text is captured by whatever means are available at the time, and is
preprocessed.
Text in EHR
TEXT MLP … XML/SHML HIU’s VIEWER
1. Electronic form: Transcription, Voice, OCR
2. Preprocessing
Text sentences are then passed through the 5 components of the MLP system to
produce XML trees. NIMPH is a quality control procedure
applied to NLP output trees. XML-SHML tags are added to text words,
creating HIUs, a representation of clinical facts, which are input to the
Viewer.
Text in EHR
TEXT MLP … XML/SHML HIU’s VIEWER
1. Electronic form: Transcription, Voice, OCR
2. Preprocessing
5 MLP steps
dBtrees + NIMPH
tagging clinical facts
Templates are provided to sort and display HIUs in their document context,
or to perform other tasks on the database created by MLP.
Text in EHR
TEXT MLP … XML/SHML HIU’s VIEWER
1. Electronic form: Transcription, Voice, OCR
2. Preprocessing
5 MLP steps
dBtrees + NIMPH
tagging clinical facts
Templates: Sort, Display/Access, Perform analysis, Perform tasks
Whether or not an NLP system, the MLP or another, will come to play a role in the
future EHR, I hope today I have given you an inkling of what is involved in
creating an NLP system, and also a hint of what might be its contribution.
And I thank you for your patient attention.
The end