Artificial Intelligence in Medicine 16 (1999) 251–282

Formal description of temporal knowledge in case reports

A.A.F. van der Maas a,*, A.H.M. ter Hofstede b,1, P.F. de Vries Robbé a,2

a Department of Medical Informatics, Epidemiology and Statistics, Faculty of Medicine, University of Nijmegen, Kapittelweg 54, 6525 Nijmegen, The Netherlands

b Cooperative Information Systems Research Centre, Faculty of Information Technology, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia

Received 6 July 1998; received in revised form 13 October 1998; accepted 29 October 1998

Abstract

Patient case analysis is an elementary and crucial process with which clinicians are confronted daily. Its importance and complexity are reflected in the need to discuss cases in clinicopathological conferences and in the documentation of more than 70 000 patient cases in MEDLINE. This paper introduces a generic patient case report language (PCRL) based on general medical temporal concepts to formalise temporal knowledge as present in case descriptions. The lack of such a generic technique is reflected by the fact that computers are very restrictive in accepting patient specific temporal information. Acceptance is almost always controlled and guided by specific predefined disease or treatment models. We strive for a case library consisting of unambiguous patient case descriptions formulated independently of future use. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Formal representation technique; Patient case report modelling; Temporal knowledge representation; Incompleteness; Granularity

* Corresponding author. Tel.: +31-24-3613125; fax: +31-24-3613505.

E-mail address: [email protected] (A.A.F. van der Maas)

1 [email protected]

2 [email protected]

0933-3657/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved.

PII: S0933-3657(99)00007-X

1. Introduction

In jurisdiction, two types of case analysis can be distinguished, i.e. one is applying jurisprudence, the other is applying law. The difference between these approaches is reflected in the difference between the education of case analysis in English law schools and law schools on the continent. Jurisdiction in England is almost entirely based on jurisprudence. Therefore, in English law schools, thousands of juridical cases are analysed, compared and retained. The law on the European continent is much more refined and students are much more trained in interpreting the law. The first approach contains elements of pattern matching. The second approach uses laws to analyse cases and contains elements of deductive logic. This approach can be characterised by the term deductive reasoning.

These approaches to case analysis can also be found in medical case analysis. For example, we use pattern matching when we explicitly refer to other known analogous patient cases and we use deductive logic when we explicitly refer to generic knowledge in medical literature. Applying jurisprudence in jurisdiction is conceptually very similar to what is known as case based reasoning in the medical domain.

Patient case analysis in practice is to a large extent an intuitive process. When interventions are planned, references to either case based knowledge or literature are often not traceable. In jurisdiction on the other hand, the explicit justification of a verdict is most important and gets most of the resources. Then why compare medical case analysis with juridical case analysis? The comparison with jurisdiction is made from the case analysis support viewpoint. Patient case analysis is a crucial and complex process which clinicians are confronted with daily. The need of management for large amounts of relevant medical information is already recognised [20]. Also, the complexity of the case analysis process itself makes computer assistance logical [27]. Realising computer assistance however, requires an approach of case analysis as in jurisdiction. The patient case analysis process itself is to be analysed extensively and made explicit before computer assistance has any chance. The objective of this PCRL-project as a whole is computer assistance of the process of patient case analysis. We distinguish the following patient case analysis functions:

Application of generic medical knowledge in medical decisions. Aspects of formal application of different types of clinical and nonclinical medical knowledge are important if we want to ‘activate’ the knowledge present in medical literature with a computer in order to bridge the gap between theory and practice.

Specific aspects of application of medical knowledge include: analysis of usefulness and use of specific literature knowledge with respect to solving clinical decision problems, analysis of blind spots in medical knowledge, presentation of knowledge with respect to its use in medical decisions, and information retrieval guided by the process of medical decision making.

Structure analysis of individual patient cases. Structure analysis can focus on the extrinsic or intrinsic structure of a patient case. The case structure part influenced by a clinician can be characterised by the term extrinsic structure. The case structure part determined by patient specific knowledge can be characterised by the term intrinsic structure. For example, temporal relations between specific symptoms are considered part of the intrinsic structure.

Medical analysis of patient cases in order to verify and validate medical reasoning processes. Formal analysis in a medical context deals with the theoretic basis of computer support of the process of diagnosis, prognosis, selecting examinations, selecting treatments, etc. A specific example could be protocol usage verification.

Aspects of medical analysis are not only the medical interpretation of all kinds of patient specific knowledge of one patient, but also the general strategies used in analysis processes.

Matching of patient cases. Archiving patient cases in an electronic library with automated retrieval functions using the contents of the archived patient cases. In the WAREL-system [10], a step is made on the path of retrieval of clinical data considering courses of diseases.

These functions have in common that the subject of analysis is a patient case description. As a result, the above mentioned objectives require a case description technique, which is the focus of this paper. In MEDLINE, approximately 90% of the descriptions of patient cases are called patient case reports (PCR) or case reports for short. Other terms include: case history, patient description, presentation of case and clinical description. These case reports are natural language descriptions. A formal artificial language is needed to communicate these case report data with a computer, a language which shares elements of natural language as well as coding systems [12].

Fig. 1. An INTERNIST-1 dialogue.

Fig. 2. PCRL: a patient case description language to communicate patient case data with the computer.

Fig. 1 shows the interface of an INTERNIST-1 dialogue [22]. This is an example of a data entry dialogue which is fully directed by a specific application and one which requires renewed interaction when the disease profiles in the knowledge base are changed. In this paper, a more flexible PCR language (PCRL) is introduced. Fig. 2 illustrates the multiple use and re-use of a PCRL case report.

In PCRL, the focus is on modelling of temporal concepts as present in natural language PCRs. The modelling of temporal concepts is a very difficult and also crucial part in the representation of case reports. Medical patient case analysis is heavily based on the interpretation of temporal occurrence patterns of all kinds of patient characteristics. When specifying temporal concepts in a patient case, the case becomes more detailed and realistic and as a result cases are easier to differentiate. An endless amount of different cases can be composed using the same patient characteristics with different temporal contexts. PCRL should have the following properties:

formally defined: to realise computer manipulation of any kind, the subject of manipulation is to be defined unambiguously. We need a modelling technique embedded in a mathematical framework [9]. The term formal in this paper is used to refer to a mathematical level of formalisation;

expressive: a patient case modelling technique is expressive if all case report data can be specified as detailed as needed in the process of case analysis;

conceptual: indicates that the modelling of a patient case is not driven by any specific disease or treatment model. A conceptual modelling technique allows the process of modelling of patient specific knowledge to be ‘clinician driven’. Conceptual patient case modelling techniques allow formulation of patient specific knowledge to be independent from any specific (future) interpretation or implementation;

suitable: a patient case modelling technique is suitable if patient case specifications are easy to formulate and easy to read. To enable short elegant formulations, a high level of abstraction is needed. Domain specific concepts familiar to the clinician are to be recognised and admitted in the technique, i.e. a suitable specification technique needs to be concept and problem oriented instead of machine oriented [15].

Formalisation of a modelling technique is useful with respect to verification and validation. Formalisation is essential if the specifications formulated by means of the modelling technique are to be interpreted by a computer program. Also, without formalisation, proof of properties, assessment and comparison of modelling techniques cannot be carried out in a constructive way [16]. Informal definitions easily introduce ambiguities, which block the possibility of automated consistency checks. For example, the widely used is-a link in semantic networks seems to be used in a very similar way by designers of automated systems. Yet in [5], six conceptually different is-a links are catalogued and the importance of a clean definition to avoid ambiguity is illustrated. Regarding the importance of suitability, one can say that the development of situation specific techniques and methods is to be strongly recommended, i.e. there is no ‘silver bullet’ [4]. The requirement of a patient modelling technique to be expressive is hard to realise. A rich model of time is needed to capture the diversity of aspects involved in medical problems [19]. Developing a conceptual modelling technique is a challenge, but it has major advantages. A major advantage is that the clinician can determine the extrinsic patient case structure himself. For example, the clinician can determine the order of the questions to ask, i.e. the order of asking is only specified when it is needed conceptually. Another major advantage is the fact that the clinician can specify any case on his or her mind, even unusual or unknown cases, for the formulation of a case is not conceptually restricted by existing disease or treatment models. As a result, a ‘clinician driven’ patient case illustrates the author's view on the case. As a consequence, archiving these cases for future use makes sense. An old patient case once memorised in the computer can be re-evaluated without renewed interaction with the author. If, for example, a knowledge base is extended, re-evaluation of a case could be of use when formerly only a small part of the case was interpreted. A side effect of clinician driven patient cases is that restrictions or shortcomings of computer case analysis become clear when a large part of a case report is ignored. Such criteria, based on quality of use of evidence, should get more attention as assessing formal patient case analysis procedures is of current interest [13].

The Canon group [29] has a similar ambition: ‘ … developing a deeper representation formalism for use in exchanging data and developing applications … ’ [11]. The Group works on the development of a medical-concept representation language (MCRL) [11]. However, their ambition is not yet focused on detailed representation of temporal knowledge in patient cases. Next to MCRL, other advanced generic formal modelling techniques like KL-ONE [6], GRAIL [26] and the ARDEN syntax [17] do exist, but are not focused on representation of temporal knowledge. A patient case modelling technique, which meets the requirements mentioned above, does not yet exist. In this paper, the approach and nomenclature of the European committee for standardisation CEN/TC 251 ‘Time standards for health care specific problems’ [8] is adopted and explored.

2. Toward formalisation of patient cases

In this paper, we introduce a generic concept based patient case modelling technique referred to as PCRL. The objective of PCRL is the formulation of formal suitable descriptions of temporal knowledge as present in patient cases. In Section 2.1, the subject to be modelled in PCRL is described. In Section 2.2, the way PCRL is defined is described.

2.1. Essentials of case reports

Cases in general can be characterised as descriptions of practical instances of situations implicitly presenting one or more problems. Analogously, we characterise patient cases as medical descriptions of specific patients implicitly representing medical problems. A patient case contains information about the physical, mental and social state of the patient as well as information about the decision making process. These elements are all visible in the representation of patient cases. In the case records of the Massachusetts General Hospital, described in the New England Journal of Medicine, the elements concerning the decision making process are deliberately separated from the so called objective patient data on the physical, mental and social state. The latter information is gathered under the title ‘presentation of case’, whereas the explicit interpretation of the patient specific information is gathered under the title ‘differential diagnosis’. Formalisation of the ‘differential diagnosis’ part of PCRs, i.e. the explicit representation of the patient case reasoning processes, is a long term objective. This part is not within the scope of this paper. In this paper, we focus on representation of temporal knowledge as present in the ‘presentation of case’ part being the requisite for future computerised analysis of PCRs. To put it another way, we will realise an epistemological analysis of case reports, focused on the ontology of time. Epistemological analysis is a high-level conceptual knowledge analysis concerning ontology and inference (or interpretation) [23]. Ontology represents the conceptual model of entities and relationships of which the domain of interest is composed [23].

We give some examples of arbitrary patient cases to illustrate the type of descriptions we like to model in PCRL. The first example is also used in ref. [14] to illustrate temporal relations in a patient case. The second example is the first part of the final presentation of an arbitrary case of the year 1993 (CASE 52-1993) published in [28] and it is part of a clinicopathological exercise:

Example 1. The palpitations started when I stopped taking the tablets, or perhaps just before. I didn’t feel any chest pain while taking the tablets. But I remember that I did feel some chest pain at the same time as the palpitations. Yes, that’s right, I stopped taking the tablets, and the chest pain started at once, or perhaps a little later.

Example 2. A 17-year-old girl was admitted to the hospital because of massive hemoptysis and renal failure. The patient had been well until about 3 days earlier, when she began to have symptoms of an upper respiratory tract infection, with headache and anorexia. She became nauseous and vomited 36 h before entry. On the evening before admission, chest pain developed, and she began to cough up blood-tinged sputum; dyspnea occurred, with intermittent nausea and vomiting. During the night, dyspnea persisted, and the next morning she began to cough up large quantities of blood and experienced diffuse weakness.

2.2. Approach

We choose PCRL to be defined as a language, for a language is considered easy to read and, most important, language descriptions are directly related to the native textual patient descriptions we are familiar with. This paper has nothing to do with natural language processing (NLP). It has to do with formal representation of a very restricted language, which can be used, e.g. as an interface language. To illustrate this and also to relate the abstract definitions to a concrete practical tool, we have added in Section 5 some examples of screens of a window based general case report specification interface which can generate PCRL-sentences for the purpose of computer aided data analysis. The exercise of defining a PCRL grammar is very similar to the exercise presented by the European Standardisation Committee CEN/TC 251 mentioned earlier, except for the fact that it is more expressive, more conceptual and more suitable.

3. Definition of the PCR language

To define PCRL, in contrast to the CEN/TC 251 definition, a Backus-Naur Form two level grammar [1] (see also Appendix A) is used. A two level grammar allows a comprehensive definition of extensive and complex languages. A two level grammar, e.g. allows one to define context dependencies. As a result, an important property of PCRL is its context dependent use of value types, i.e. PCRL statements use value types consistently. This is only realised when the concepts, defined on a meta-level, are substituted in the low-level definitions consistently (consistent substitution is explained in Appendix A).

The two level grammar is presented as an abstract syntax definition, i.e. terminals are omitted. The term abstract indicates that the focus is on concepts and their relations and not on the way these concepts are represented. The way concepts are represented in PCRL is laid down in the concrete syntax definition. In this paper, we focus mainly on the definition of a suitable abstract syntax. The definition of a suitable concrete syntax has its own difficulties which are not in the scope of this paper. Only for illustration purposes, every low-level BNF syntax definition is followed by an emphasised very simple example of a ‘one to one’ translation of the abstract syntax into a concrete syntax. In Section 4, using this concrete syntax definition, we can show a PCRL-based translation of the medical natural language case report examples 1 and 2.

In Sections 3.1 and 3.2, PCRL concepts and the ‘perception of the real world’ are presented, respectively. These sections determine what kind of patient specific knowledge is represented in PCRL. The subsequent sections contain syntax definitions. These definitions determine how this patient specific knowledge is described in PCRL. These sections contain small PCRL-description examples to illustrate introduced syntax definitions. Section 4 contains an elaborate PCRL-description example. Section 5 illustrates some input parts of a PCRL-based application as well as some output examples.

3.1. Ontology

Before introducing a model of time, definitions of temporal concepts as presented by the European Standardisation Committee CEN/TC 251 [8] are given. These terms are adopted in a formal framework. While doing this, these terms inevitably get a more precise and specific meaning. This will be done without major violation of the following informal CEN/TC 251 definitions of temporal concepts:

time-interval: a portion of time of which the duration in a given context is considered to be significant and relevant;

time-point: a portion of time of which the duration in a given context is considered to be insignificant or irrelevant;

situation: a phenomenon occurring over time in a given context;

episode: a situation considered to occupy a time-interval;

event: a situation considered to occur at a time-point.

In PCRL, phenomena are the one and only basic elements which are put in a temporal context. This seems very simple, as in formalisms found in literature an explicit distinction between properties, events, and processes is made. This distinction is based upon different temporal characteristics of properties, events, and processes. A useful test to make a distinction between events and processes is based on the fact that one can count the number of times an event occurs but one cannot count the number of times a process is occurring [3]. Note that in this specific context, one should not confuse occurring with occurrence. The examples: ‘age of 45’, ‘car accident’ and ‘bleeding of the aorta’ are considered a property, an event, and a process, respectively. However, in patient case descriptions, items are never explicitly typified as being a property, an event or a process. These descriptions allow a common view on properties, events, and processes. The reason for this is the absence of complex property specific, event specific, and process specific structures like recursion in properties and local (time dependent) parameters in processes. When using a common view on the representation of properties, events, and processes, the modeller, when describing temporal features, is not a priori forced to explicitly select either one of them. The use of the generic concept phenomena in PCRL instead of property, event and process can be compared with, e.g. the generic use of the concept finding in [18]. In that paper, findings are generically defined just as being patient specific pieces of information. It can also be compared with the generic use of observations in [24]. In that paper, diagnoses, symptoms and signs in the medical reasoning process all are referred to as observations. Consequently findings like ‘fever’, ‘dyspnea’, ‘blood loss’, ‘pain’, etc. in our model all are referred to as phenomena. We conclude that:

• case reports do contain properties as well as events and processes which are not typified;

• we do not want to force a clinician to explicitly typify properties, events, and processes;

• we do want the formulation of temporally consistent PCRL sentences.

Inspired by the above mentioned observations, a common temporal view on properties, events, and processes is defined. The common term situation is introduced to indicate either the ongoing of a process, the occurrence of an event or the presence of a property. The logical concepts of property, event and process are adopted from [3]:

property: a (logical) proposition which can be true or false during a particular time;

event: a description of an activity which involves a product or outcome;

process: a reference to some activity not involving a culmination or anticipated result.

The common temporal view on properties, events, and processes is one of the characteristics of PCRL and makes it suitable for practical use in the medical domain for which it is primarily designed, without losing the computational possibilities of, e.g. Allen’s general theory of action and time [3].

3.2. Introducing a model of time

When we formulate a case report, we have a specific patient case in mind. Although the intention might be to describe just this specific patient, the resulting case report is an incomplete description and is therefore an abstraction of reality. Consequently, when we discuss a case report, others will have a set of compatible patient cases in mind. Therefore, we will discriminate between the concept of patient case, and the concept of PCR. The set of all compatible patient cases determines the semantics of a specific PCR. To define the semantics of PCRL, we need to define the concept patient case. What patient cases should be reasoned about? What is our view on the real world? PCRL is to describe temporal knowledge as present in patient case descriptions. Consequently, from this point of view a patient case will be an instance of a model of time. Now we will define such a model of time. Firstly, the following auxiliary concepts are introduced: situations, profiles and phenomena. The general concept phenomenon is adapted from the GALEN high-level ontology [25]:

Definition 1. X is the set of phenomena. X consists of all diseases, signs, symptoms, interventions and other patient state indicators as present in PCRs.

These phenomena are the basic PCRL elements temporally reasoned about. Phenomena will be used to compose situations. In the PCRL context, situations are considered descriptions of moments of time. In the following definition of SITUATION, we use the functions HOLDS, OCCUR and OCCURRING as defined by Allen in [3]. These three functions are defined on properties, events, and processes respectively, which are well known concepts also formally introduced by Allen. Before a common view on properties, events, and processes is defined, the concepts of time point and time interval are introduced:

Definition 2. $T$ is the set of time points and $I$ is the set of time intervals, $I = \{\langle t_1, t_2\rangle \in T \times T \mid t_1 < t_2\}$, where ‘$<$’ is a chronological total order of time points. Intervals have the following access functions: for each interval $i = \langle t_1, t_2\rangle$, $\mathrm{ONSET}(i) = t_1$ and $\mathrm{END}(i) = t_2$.

A common view on properties, events, and processes is now defined. SITUATION(x, t) is to be read as ‘a moment with x at time point t’:

Definition 3. A situation indicates either the presence of a property, the occurrence of an event or the ongoing of a process. For $x \in X$ and $t \in T$,

$$\mathrm{SITUATION}(x, t) \overset{\mathrm{def}}{=} \exists i\,[\mathrm{ONSET}(i) < t < \mathrm{END}(i) \land \mathrm{PRESENT}(x, i)],$$

where PRESENT represents one of the following functions defined by Allen [3]: HOLDS when x is a property, OCCUR when x is an event and OCCURRING when x is a process.
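To make Definitions 2 and 3 concrete, the following minimal Python sketch may help. It assumes that time points are plain numbers (the paper leaves the set of time points abstract) and that PRESENT can be looked up in a table of intervals per phenomenon. The names `Interval`, `situation` and `present` are hypothetical and not part of PCRL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A time interval <t1, t2> with t1 < t2 (Definition 2)."""
    onset: float  # plays the role of ONSET(i)
    end: float    # plays the role of END(i)

    def __post_init__(self):
        if not self.onset < self.end:
            raise ValueError("an interval requires onset < end")


def situation(x, t, present):
    """SITUATION(x, t): some interval i with ONSET(i) < t < END(i) and PRESENT(x, i).

    `present` is an assumed stand-in for Allen's HOLDS/OCCUR/OCCURRING:
    a dict mapping each phenomenon to the intervals over which it is present.
    """
    return any(i.onset < t < i.end for i in present.get(x, []))


# Hypothetical data: palpitations are present over the interval <0, 10>.
present = {"palpitations": [Interval(0, 10)]}
print(situation("palpitations", 5, present))   # strictly inside the interval
print(situation("palpitations", 10, present))  # on the boundary, excluded by '<'
```

Note that the strict inequalities of Definition 3 exclude the onset and end points themselves, which the second call illustrates.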

We now resume the definition of patient cases. With SITUATION, we now can define isolated momentary states. To define a complete individual patient case the entire history of the patient is to be determined. To do this the following occurrence function will be used:

Definition 4. $F$ is the set of occurrence functions, i.e. the set of functions from time points to Boolean values: $F = \{\mathrm{true}, \mathrm{false}\}^{T}$.

An occurrence function can be graphically represented by a straight dotted line extended on a time axis. The line segments indicate that the occurrence function has value true and the interruptions indicate that the occurrence function has value false (Fig. 4). Each line segment associated with a phenomenon will be referred to as an occurrence. An occurrence is interpreted either as one continuous presence of a property, as one process, or as one event. The occurrence function is used in the following definition of PROFILE. PROFILE(x, f) is to be read as: ‘occurrences of phenomenon x are described by occurrence function f’. Consequently, the predicate PROFILE(x, f) fully determines the history of phenomenon x:
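The reading of an occurrence function as line segments and interruptions can be sketched as follows. This is only an illustration under the assumption of a discretised (sampled) time axis; the function `occurrences` and the example function `headache` are hypothetical names.

```python
def occurrences(f, time_points):
    """Return the maximal runs of consecutive sampled time points where f is true.

    Each run (first, last) corresponds to one line segment, i.e. one occurrence.
    """
    runs = []
    start = last = None
    for t in time_points:
        if f(t):
            if start is None:
                start = t      # a new line segment begins
            last = t
        elif start is not None:
            runs.append((start, last))  # an interruption closes the segment
            start = None
    if start is not None:
        runs.append((start, last))
    return runs


# Hypothetical occurrence function for 'headache' on an hourly axis.
def headache(t):
    return 2 <= t <= 4 or t == 7


print(occurrences(headache, range(10)))  # [(2, 4), (7, 7)]
```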

Definition 5. For each $x \in X$ and $f \in F$,

$$\mathrm{PROFILE}(x, f) \overset{\mathrm{def}}{=} \forall t \in T\,[f(t) \Leftrightarrow \mathrm{SITUATION}(x, t)]$$

Using the concept PROFILE, we can define our patient case concept easily. A patient case is an extensive indication of presence of phenomena. A patient case indicates whether any phenomenon is present at any time-point:

Definition 6. $Y$ is the set of patient cases, where a patient case is a set of pairs of phenomena and occurrence functions that are true profiles, i.e. $\{\langle x, f\rangle \in X \times F \mid \mathrm{PROFILE}(x, f)\}$.
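As a rough illustration of Definitions 5 and 6, the sketch below checks PROFILE on a finite sample of time points, which only approximates the quantification over all time points, and builds a patient case by pairing each phenomenon with its true profile. The table `present` of (onset, end) pairs and all function names are assumptions made for this example.

```python
def situation(x, t, present):
    """SITUATION(x, t) over an assumed table of (onset, end) pairs per phenomenon."""
    return any(onset < t < end for onset, end in present.get(x, []))


def is_profile(x, f, present, time_points):
    """PROFILE(x, f), checked only on the sampled time points."""
    return all(f(t) == situation(x, t, present) for t in time_points)


def patient_case(phenomena, present):
    """Pair every phenomenon with the occurrence function that is its true profile."""
    return {x: (lambda t, x=x: situation(x, t, present)) for x in phenomena}


# Hypothetical data: dyspnea is present over <0, 12>, fever never.
present = {"dyspnea": [(0, 12)]}
axis = range(-5, 20)
case = patient_case(["dyspnea", "fever"], present)

print(is_profile("dyspnea", case["dyspnea"], present, axis))  # the true profile
print(is_profile("dyspnea", lambda t: True, present, axis))   # 'always present' is not a profile
```

The second check fails because an occurrence function that is always true disagrees with SITUATION outside the interval, so it does not describe the history of the phenomenon.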

As stated previously, the definition of the patient case concept defines the view on the real world and determines what can and cannot be reasoned about in PCRL case reports. This model of time makes clear that not all temporal trends, as described in [18], can be inferred in PCRL. For example, temporal trends concerning changes in severity cannot be expressed, neither can anatomical movements. To monitor anatomical movements, phenomena require a location attribute. The interpretation of location attributes would require a model of place. The combination of a model of time with a model of place would be an interesting topic of future research. For now, we restrict ourselves to the view on patient cases as introduced in this section.

3.3. PCRL statements

In the previous section, we introduced situations and profiles in order to define patient case. These concepts however, are not suitable to organise and declare patient specific knowledge. We need to introduce additional concepts to enable the formulation of suitable statements. A lot of basic temporal concepts can be found throughout the literature. In [7], an inventory of the most common concepts is made. These concepts were: intervals, instants, time points, properties, events, states, actions, occurrences, processes, histories, and facts.

We used phenomena to capture the concepts of properties, events, and processes. Phenomena are now to be put in a temporal context. To allow qualitative temporal descriptions, phenomena are put in a temporal context in an indirect way. Phenomena are assigned to episodes using either qualitative or quantitative temporal occurrence patterns. An episode is a specific type of time interval description. In total we distinguish two types of time interval descriptions, being episodes and periods. An episode describes a continuous closed time interval which cannot be freely chosen by the modeller. The onset and end of an episode are fully and intrinsically determined by the history of the assigned phenomenon. An episode therefore is part of the intrinsic PCRL case structure. A period in PCRL is also a continuous closed time interval, but its onset and end are chosen by the modeller in order to group patient specific data. For example, the modeller may focus on the past 2 weeks when gathering information. As a result, periods and their temporal relations determine the extrinsic PCRL case structure. Fig. 3 shows what PCRL statements are composed of and shows that phenomena in a PCRL case are always assigned to an episode.


Our goal, an unambiguous temporal context of phenomena, is realised first by introducing a well-defined assignment of phenomenon statements to episodes, and second by introducing a well-defined temporal context of episodes. Phenomena are assigned to episodes by the use of qualitative as well as quantitative occurrence patterns. Conceptually, occurrence patterns in PCRL are related to temporal trends as described in [18]. Periods and episodes are put in a temporal context using time-point expressions and interval patterns. Interval patterns enable logical quantification of time interval descriptions. The concept of quantification is based on quantification as used in first order logic. Logical quantification of time interval descriptions is a powerful mechanism, for it allows a temporal relation to be generalised to a group of episodes, not by enumerating these episodes explicitly, but by giving an essential characteristic the episodes must fulfil. In PCRL, the use of quantification enables the following information to be expressed: ‘all episodes of the past 2 weeks with the phenomenon headache didn’t last more than 2 h’. All of the concepts introduced above are defined in the following PCRL syntax definitions. We start with the ‘top-level’ definition, which defines the ‘top level’ concept, i.e. the PCRL patient case report (PCRp):

PCRp:: PCRp-Id ; Top-Level-Statement Predicate
Top-Level-Statement:: Episode | Period | Time-Point Expression | Interval-Pattern | Pattern Expression

A PCRp composed of a PCRp-Id i and a Top-Level-Statement Predicate p can be denoted as: ‘Patient [i] is known with: [p]’.

Fig. 3. A decomposition of PCRL statements.


Example 3. A PCRp example: ‘Patient A is known with: an episode of palpitations.’, where ‘A’ is a PCRp-Id, ‘an episode of palpitations’ is an elementary episode predicate, and ‘palpitations’ is a phenomenon (explained further in Appendix B, Section 3.7.1 and Section 3.4, respectively).
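The top-level denotation rule can be illustrated with a minimal Python sketch. The class name `PCRp`, its field names and the method `denote` are our own illustrative choices and are not part of PCRL; the predicate is kept as opaque concrete-syntax text rather than being parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCRp:
    """Hypothetical container for a patient case report (not part of PCRL itself)."""
    pcrp_id: str    # the PCRp-Id, e.g. "A"
    predicate: str  # top-level-statement predicate, kept as plain text here

    def denote(self) -> str:
        # Applies the denotation rule 'Patient [i] is known with: [p]'.
        return f"Patient {self.pcrp_id} is known with: {self.predicate}."


report = PCRp("A", "an episode of palpitations")
print(report.denote())
```

Applied to Example 3, the sketch reproduces the sentence given there.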

Fig. 3 illustrates the main PCRL concepts and shows that episodes, periods, and time-point expressions appear on different levels within PCR expressions. The term ‘predicate’ is reserved for logical expressions. Predicates are composed of PCRL statements and logical operators (Appendix B). The term ‘expression’ is reserved for terms combined by a comparison operator, which can be ‘equal to’, ‘less than’ or ‘greater than’. When comparing time points, we use ‘is-before’ instead of ‘less-than’ and ‘is-after’ instead of ‘greater-than’. Expressions are composed of terms described in Appendix B. If we compare the concepts in this schema with concepts used in (nonmedical) temporal logic, we can say that time point expressions already exist in temporal logic. Period statements and episode statements can be characterised as specialised time interval descriptions, and phenomena, as explained earlier, can be characterised as generalised descriptions of properties, events, and processes. Period statements organise knowledge in a way similar to natural language patient case descriptions (PCRn), which keeps formulations short for the sake of suitability.

In this section, the top-level definition of PCRL was presented. This gives an overall idea of the concepts related to a PCRp. In the following sections, however, we define PCRL using a bottom-up strategy. First, we define the bottom-level statements of Fig. 3, which are phenomena and occurrence patterns. Thereafter, the two types of time interval descriptions used in PCRL are introduced. Finally, we define the way time interval descriptions are put in a temporal context using time point expressions and interval patterns. The appendix is added to complete the PCRL definition. Its definitions are referred to by the abstract syntax definitions, and they enforce type consistent use of terms, values, variables, scales, etc. However, these definitions need not be discussed on a medical conceptual level.

3.4. Phenomena

As stated earlier, phenomena describe properties, events, and processes. A patient case may consist of several identical phenomenon statements. When making inferences on such identical phenomenon statements we could decide to strongly relate these statements. However, the more general these phenomenon statements are, the bigger the chance that they do not represent similar concepts. In that case, strongly relating them would be misplaced. For example, a phenomenon ‘headache’ used at different locations in a patient case can refer to completely different instances of headaches. If a patient explicitly related these headaches as being similar, we would have much more confidence in the inference mentioned above. We need to represent such valuable information in PCRL. This would not be a problem if the similarity were known and made explicit, but very often that is not the case. Consequently, this type of incomplete knowledge has to be dealt with.

Fig. 4. Structured qualitative temporal pattern description.

We conclude that we need a facility to refer to phenomena, as well as the possibility to distinguish identically described phenomena. This is an identification problem. We will identify individual phenomena not only by their description, but also by their episode context:

Phenomenon:: Phenomenon-Id ; Episode-Id

A phenomenon composed of a phenomenon identification i and an episode identification j can be denoted as: ‘[i] as in [j]’. Note that an Episode-Id is omitted if and only if the phenomenon is not referred to in the PCRp.

Example 4. An example of a phenomenon which is referred to: ‘an episode of palpitations as in e1’, where ‘an episode of palpitations’ is an episode, and ‘palpitations’ is a phenomenon. The episode-id e1 refers to another episode in the PCRp (episodes are explained further in Section 3.7.1).
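The identification of phenomena by description plus episode context can be sketched as follows. This is a minimal illustration under our own assumptions: the class `Phenomenon` and its field names are hypothetical, and identity is modelled simply as equality of the pair (description, episode-id).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phenomenon:
    """Hypothetical model: a phenomenon identified by description and episode context."""
    phenomenon_id: str                # the description, e.g. "headache"
    episode_id: Optional[str] = None  # omitted when the phenomenon is not referred to

    def denote(self) -> str:
        # Applies the denotation rule '[i] as in [j]'; the suffix is dropped without an Episode-Id.
        if self.episode_id is None:
            return self.phenomenon_id
        return f"{self.phenomenon_id} as in {self.episode_id}"


# Two identically described phenomena stay distinct through their episode context.
first = Phenomenon("headache", "e1")
second = Phenomenon("headache", "e2")
print(first.denote(), "|", second.denote(), "| same:", first == second)
```

With this representation, a statement that two headaches are the same instance has to be made explicitly, by giving both the same episode context.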

In the following section, we introduce the concept of occurrence patterns. Occurrence patterns will be used to describe the temporal context of a phenomenon with respect to a specific episode.

3.5. Concept of occurrence patterns

In the medical domain, the occurrence function is not suitable to describe a patient case reality as defined in Section 2.1. Patient specific knowledge is rarely assigned to absolute time points or absolute time intervals. In practice, only characteristics of an occurrence function are described. Fig. 4 is an example of an occurrence function described with a PCRL description. This PCRL description describes the change of duration of the individual occurrences in a structured way. Note that the function of the mean duration in time is not included in this figure; the PCRL description characterises it only partly.

To characterise an occurrence function, we will use three types of functions. Functions which describe:
• the frequency at which the occurrences appear;
• the duration of the occurrences; and
• the duration of the intervals between the occurrences.


The latter function is referred to as the duration-complement function. These three functions themselves can be characterised using another function, referred to as the direction function. A direction function can indicate value changes of any function, i.e. it can describe its differential function. As a result of this analysis, the following pattern types are implemented in PCRL:

PATTERN::m Frequency | Duration | Duration-Complement | Direction

In the following section, substitution of these concepts will generate four types of type consistent PCRL pattern descriptions.
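The four pattern types can be made concrete with a small Python sketch that derives them from a fully known list of occurrences. This is an illustration only: in practice such a list is rarely available, which is exactly why PCRL describes characteristics instead. The function names, the representation of occurrences as (start, end) pairs in hours, and the direction labels ‘constant’ and ‘mixed’ are our own assumptions.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) in hours, sorted and non-overlapping.
Occurrence = Tuple[float, float]


def durations(occ: List[Occurrence]) -> List[float]:
    """Duration of each individual occurrence."""
    return [end - start for start, end in occ]


def duration_complements(occ: List[Occurrence]) -> List[float]:
    """Duration of each interval between two consecutive occurrences."""
    return [nxt[0] - cur[1] for cur, nxt in zip(occ, occ[1:])]


def frequency(occ: List[Occurrence], window: float) -> float:
    """Number of occurrences per hour over an observation window of the given length."""
    return len(occ) / window


def direction(values: List[float]) -> str:
    """Qualitative sign of the successive differences (a crude 'differential')."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if not diffs or all(d == 0 for d in diffs):
        return "constant"
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "mixed"


occurrences = [(0, 1), (10, 12), (20, 23)]
print(direction(durations(occurrences)))             # durations 1, 2, 3
print(direction(duration_complements(occurrences)))  # gaps 9, 8
print(frequency(occurrences, window=24))
```

The direction function is applied to the output of the other functions, which mirrors the role of Direction as a pattern over Frequency, Duration and Duration-Complement.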

3.6. Describing occurrence patterns

In the previous section, we recognised three types of occurrence function descriptions: the frequency description describes the frequency of occurrences, and the duration description describes the duration of occurrences, as opposed to the complement-duration description, which describes the duration in between these occurrences. Each of these descriptions comes with its own values and scales (see Appendix B):

Occurrence Pattern:: Frequency Description’ ; Duration Description’ ; Complement Duration Description’

To describe the shape of the three above-mentioned functions, a direction pattern is also used. Each of these functions can be described in a similar way:

PATTERN Description:: Pattern-Aspect’ ; PATTERN Profile’ ; Direction Description’ ; Macro pattern’: Occurrence-Pattern
Pattern-Aspect:: Mean | Max | Min | Typical
PATTERN Profile:: PATTERN ; Occurrence-Pattern-Id’

A duration description of pattern aspect p, composed of pattern profile value v, a direction description d, and a macro pattern m, can be denoted as: ‘a [p] duration which (value) is [v] which (direction) is [d] which (overall pattern) is [m]’. Note that an Occurrence-Pattern-Id is omitted if and only if the pattern is not referred to in the PCRp.

Example 5. An example of a frequency description and two duration descriptions:
• ‘a minimum frequency which is daily’;
• ‘a typical duration which is persisting’;
• ‘a mean duration which value is less than 10 s which is increasing’,
where ‘minimum’, ‘typical’, and ‘mean’ are pattern aspects, ‘daily’ is a predefined frequency value, ‘persisting’ and ‘increasing’ are predefined direction values, and ‘less than 10 s’ is a duration denoted as a (relative) time point (time points are explained further in Section 3.8).


The recursive structure of the definition allows the formulation of detailed descriptions and approaches some of the expressiveness of natural language. There are two recursive elements in this complex definition. One recursive element is the direction description. A direction description can be composed of other direction descriptions, for a direction is again a PATTERN. As a result, we can now describe tendencies of tendencies, e.g. ‘decreasing increasing frequency of … ’, i.e. in PCRL ‘a frequency which direction is (a direction which is increasing which is decreasing) … ’. The other recursive element is the fact that an occurrence pattern can be composed of an occurrence pattern, which is then referred to as its macro pattern. A macro pattern describes patterns on a higher level. It allows descriptions of patterns of occurrence patterns, e.g. ‘yearly episodes of daily headaches’, i.e. in PCRL ‘a daily headache which is yearly’. The concrete syntax of recursive elements can be realised with ‘endless’ use of the term ‘which’. This is comparable with the use of the term ‘which’ in the GRAIL-formalism [26]. GRAIL uses ‘which’ to create composed terms. Pattern values can be described in PCRL either by using references to other values or by describing characteristic parameters of a temporal pattern using pattern expressions:

PATTERN::m Reference Occurrence-Pattern-Id | PATTERN Declaration
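The recursive direction description discussed above can be sketched in Python as a data type that contains an optional value of its own type. The class `Direction` and the helper functions are hypothetical names of ours; the rendering rule (an elementary direction becomes ‘which is …’, a composed one becomes ‘which direction is (…)’) is our reading of the two example phrases in the text and is an assumption, not a definition taken from PCRL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Direction:
    """Hypothetical recursive direction description: a direction may itself have a direction."""
    value: str                               # e.g. "increasing"
    direction: Optional["Direction"] = None  # the tendency of this tendency, if any

    def denote(self) -> str:
        text = f"a direction which is {self.value}"
        if self.direction is not None:
            text += " " + which_clause(self.direction)
        return text


def which_clause(d: Direction) -> str:
    # Elementary directions are attached directly; composed ones are bracketed.
    if d.direction is None:
        return f"which is {d.value}"
    return f"which direction is ({d.denote()})"


def frequency_with_direction(d: Direction) -> str:
    return "a frequency " + which_clause(d)


decreasing_increase = Direction("increasing", Direction("decreasing"))
print(frequency_with_direction(decreasing_increase))
```

For the ‘decreasing increasing frequency’ case the sketch yields the PCRL phrase quoted above, and the nesting can be continued to any depth by repeated use of ‘which’.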

The need for references to vague descriptions holds for patterns just as it holds for phenomena. For example, we can use the term ‘often’ at several places in a case report to indicate a qualitative frequency, which can be interpreted as being the opposite of ‘sometimes’. However, the vagueness of the description still allows the semantics of each instance of the term ‘often’ to vary a lot. This prevents valuable comparison. To allow comparison of patterns, an explicit pattern reference, in this case something like ‘as often as’, is to be used. In case reports, comparison, i.e. indication of similarities or differences, is often more important than the use of absolute values. For example, the patient might give the following valuable information: ‘my weight at this moment is as high as it was when the symptoms started’.

The fact that frequency regularities and duration regularities are both typified as temporal regularities makes it possible to compare a frequency regularity to a duration regularity with the use of a pattern predicate (see the top-level syntax definition). Note that the formulation of composed pattern terms (see Appendix B) within pattern expressions is type consistent. In case of computer support, consistency is always welcome. The downside, however, is that it enforces a strict separation of the various types of temporal patterns. In PCRL, there are no variables describing a duration as well as a frequency regularity at the same time, for a variable is typified as either a duration or a regularity. Maintaining type consistency is burdensome when a description such as ‘regular pulse’ is to be converted to a PCRL statement. The conversion requires the use of at least two variables: one to describe a frequency value and one to describe a regularity value. One is also forced to choose either a weak or a strong interpretation, for a ‘regular pulse’ can indicate a regular frequency, a regular duration or both. It could even be indefinite. A PCRL translation of the latter interpretation of the term ‘regular’ would be: ‘a frequency which is regular or a duration which is regular’.


3.7. Interval descriptions

In PCRL, two types of time interval descriptions are distinguished: episodes and periods. Episodes and periods are specialised time interval descriptions useful in describing patient specific temporal knowledge. The use of specialised concepts enables the formulation of short expressive descriptions. As a result, PCRL has more in common with natural language expressions than temporal logic has, which makes PCRL sentences, compared to logical formulas, easier to read and understand. In the following sections episodes and periods are defined.

3.7.1. Episodes

We now have descriptions of moments of time as well as descriptions of temporal patterns. An episode is a description of the state of a patient on a continuous closed time interval. An episode description is composed of a phenomenon predicate, an occurrence pattern predicate, and an episode identification:

Episode:: Phenomenon Predicate ; Occurrence-Pattern Predicate’ ; Episode-Id’

An episode composed of a phenomenon predicate p, an occurrence pattern predicate o, and an episode identification i can be denoted as: ‘an episode [i] of [p] which has [o]’. Note that an Episode-Id is omitted if and only if the episode is not referred to in the PCRp.

Example 6. An example of an episode:

‘an episode e1 of palpitations which has a minimum frequency which is daily’,

where e1 is the episode-id, ‘palpitations’ is an elementary phenomenon predicate, and ‘a minimum frequency which is daily’ is an elementary occurrence pattern predicate.

Example 2 consists of two different episodes of vomiting. In Fig. 9, both of these episodes are referred to as ‘vomiting’. This example shows the need for an episode identification in order to distinguish between episodes with similar phenomenon descriptions. Similar phenomena often refer to different episodes, especially when common phenomena such as ‘headache’, ‘weight loss’ and ‘fever’ are described. An occurrence pattern predicate (Section 3.6) defines the temporal context of an episode, making use of episode identifications. Each episode describes a specific continuous time interval, whose onset and end are determined by the phenomenon predicate and the temporal pattern predicate it is composed of. In other words, if an episode is described by a phenomenon predicate ‘headache’ and a temporal pattern predicate ‘weekly’, the patient can be asked ‘When did this weekly headache start?’. The answer is considered to be the onset of the episode mentioned above.

The patient can also be asked ‘When did this headache start?’. In that case, the temporal pattern is kept implicit and a default pattern is to be assumed. This default pattern in PCRL is the combination of three qualitative patterns: onset pattern, finish pattern and presence pattern. The first two patterns assume an occurrence in the first and last quarter of the described time interval, respectively. The presence pattern assumes occurrence in the second and third quarter. Note that the continuous presence of a specific phenomenon is not required. The interpretation of the default pattern is a simple workable estimation which can be mapped onto our model of time, as the meaning of this description can be formulated as a restriction on the set of possible occurrence functions. This estimation, however, is not binding, for interpretations of temporal patterns in PCRL can also be made context dependent, thanks to the use of episode contexts. In PCRL, this means that the phenomenon predicate of an episode can be drawn into the interpretation of the qualitative temporal pattern predicates of this episode. We will not pursue this problem any further, as making an inventory of qualitative patterns and their possible interpretations is not within the scope of this paper.

The patient can also be asked ‘Did you have this weekly headache before?’ and, if the answer is yes, the follow-up question ‘When did this weekly headache start for the first time?’ will have a different answer from the question mentioned earlier in this section. The answer differs because it is not the default temporal pattern that is applied.
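The default pattern (onset, finish and presence pattern) can be read as a restriction on occurrence functions, which the following Python sketch makes explicit. It is one possible reading only: we assume that ‘occurrence in the second and third quarter’ means at least one occurrence overlapping the middle half of the interval, and the function names and the (start, end) representation are our own.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) on an arbitrary numeric time axis.
Occurrence = Tuple[float, float]


def overlaps_window(occ: Occurrence, lo: float, hi: float) -> bool:
    """True if the occurrence shares more than a single point with [lo, hi]."""
    start, end = occ
    return start < hi and end > lo


def satisfies_default_pattern(onset: float, end: float,
                              occurrences: List[Occurrence]) -> bool:
    """Check the onset, presence and finish patterns on the interval [onset, end]."""
    quarter = (end - onset) / 4
    q1, q3 = onset + quarter, end - quarter

    def any_in(lo: float, hi: float) -> bool:
        return any(overlaps_window(o, lo, hi) for o in occurrences)

    onset_pattern = any_in(onset, q1)   # occurrence in the first quarter
    presence_pattern = any_in(q1, q3)   # occurrence in the second and third quarter (assumed reading)
    finish_pattern = any_in(q3, end)    # occurrence in the last quarter
    return onset_pattern and presence_pattern and finish_pattern


print(satisfies_default_pattern(0, 40, [(1, 2), (18, 19), (35, 36)]))
print(satisfies_default_pattern(0, 40, [(1, 2), (18, 19)]))
```

Three short occurrences are enough to satisfy the check, which reflects the remark that continuous presence of the phenomenon is not required.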

The defining property of episodes is that episode onset and episode end are completely determined by the combination of phenomenon and occurrence pattern. From this property, the following property can be derived: ‘two episodes with equivalent phenomenon predicates and equivalent occurrence pattern definitions will never have an overlap of any kind’.
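The derived non-overlap property lends itself to a consistency check on a set of episodes, sketched below in Python. The sketch assumes that the time intervals are already known as numbers, that equivalence of predicates can be approximated by string equality, and that closed intervals sharing an end point count as overlapping; all three are simplifying assumptions of ours, and the names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Episode:
    """Hypothetical episode record with an already resolved closed time interval."""
    episode_id: str
    phenomenon: str  # phenomenon predicate, compared as plain text
    pattern: str     # occurrence pattern predicate, compared as plain text
    onset: float
    end: float


def overlap_violations(episodes: List[Episode]) -> List[Tuple[str, str]]:
    """Return pairs of equivalent episodes whose closed intervals overlap."""
    violations = []
    for a, b in combinations(episodes, 2):
        equivalent = a.phenomenon == b.phenomenon and a.pattern == b.pattern
        if equivalent and a.onset <= b.end and b.onset <= a.end:
            violations.append((a.episode_id, b.episode_id))
    return violations


case = [
    Episode("e1", "headache", "weekly", 0, 10),
    Episode("e2", "headache", "weekly", 5, 15),   # conflicts with e1
    Episode("e3", "headache", "daily", 5, 15),    # different pattern, so allowed
    Episode("e4", "headache", "weekly", 20, 30),  # same description, disjoint interval
]
print(overlap_violations(case))
```

A reported pair suggests that the two descriptions refer to one and the same episode, or that one of the occurrence patterns was recorded incorrectly.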

3.7.2. Periods

A period is, like an episode, a description of the state of a patient on a continuous closed time interval. The clinician uses periods when he wants to direct the structuring of temporal patient specific knowledge. This means that the clinician himself can determine the beginning and end point of a newly introduced period. For example, the clinician can ask: ‘Did you have weekly headaches last year?’. In this case, the clinician himself defines a temporal scope. In natural language case reports, phrases like ‘there was a moment of vomiting’ are ambiguous. If ‘moment’ refers to an episode, we can conclude that the complete vomiting has taken place in a short time span. On the other hand, if ‘moment’ refers to a period, we will conclude that a moment of an episode of vomiting is observed, i.e. the vomiting lasted at least one moment. In PCRL, periods can help to put parts of episodes in a temporal context:

Period:: Episode Predicate; Period-Id’

A period composed of an episode predicate e and a period identification i can be denoted as: ‘a period [i] with [e]’. If e is an elementary predicate and not referred to (no identification is explicitly defined) then the prefix ‘an episode of’ can be omitted. Note that a Period-Id is omitted if and only if the period is not referred to in the PCRp.

Example 7. An example of a period:

‘a period p1 with hospital admission and fever spikes and chronic diarrhoea’,

where p1 is the period-id and ‘hospital admission and fever spikes and chronic diarrhoea’ is an episode predicate composed of episodes which are not referred to in this example. Therefore, the prefix ‘an episode of’ is omitted all three times.

The interpretation of ‘moment of vomiting’ as a period is based on the limited knowledge of an observer. In this case, the observer does not want to say anything about the vomiting before or after the observation. This notion implies that vague knowledge is to be represented, i.e. we do not want to be forced to make assumptions. Therefore, in PCRL, when a period p is composed of an episode e, the time interval of p is not considered to completely enclose the time interval of e. Instead, only an overlap between p and e is assumed. The extent of this overlap depends on the temporal pattern of episode e, i.e. the temporal pattern of e must be applicable to p. This property makes the following derivation example true:

Example 8. An interpretation example:

If period p1 consists of an episode ‘weekly headache’ then the length of p1 is at least 1 week.
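The derivation in Example 8 can be sketched as a lookup from a qualitative frequency term to the shortest interval on which that term is applicable. The table `CYCLE` and both function names are hypothetical, and the mapping of ‘daily’ and ‘weekly’ to fixed lengths is an assumption made for illustration; PCRL itself leaves the interpretation of such terms to Appendix B and to the context.

```python
from datetime import timedelta

# Hypothetical table: shortest interval on which a frequency term can be applied.
CYCLE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def minimum_period_length(frequency_term: str) -> timedelta:
    """Lower bound on the length of a period composed of an episode with this frequency."""
    return CYCLE[frequency_term]


def period_is_consistent(length: timedelta, frequency_term: str) -> bool:
    """A period shorter than one cycle cannot carry the pattern."""
    return length >= minimum_period_length(frequency_term)


print(minimum_period_length("weekly"))
print(period_is_consistent(timedelta(days=3), "weekly"))
```

A period of 3 days described as containing a ‘weekly headache’ would thus be flagged as inconsistent, whereas a period of 10 days would not.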

Example 7 can be interpreted as follows, taking into account that occurrence patterns are absent and default temporal patterns are to be presumed:

Example 9. An interpretation of example 7:

Within period p1 the patient had at least once: a moment with hospital admission, a moment with spikes, and a moment with diarrhoea.

3.8. Relating individual time intervals

With episode and period statements, we describe individual time intervals. To enable the formulation of a temporal context for each of these interval descriptions, another structure is introduced. This structure mutually relates interval descriptions. For example, episode e1 can be related to episode e2 by saying that the onset of e2 is after the end of e1. When we relate time intervals we use interval references. In PCRL, we have two types of time interval descriptions and consequently there are two types of time interval references:

Interval-Reference:: Period Reference | Episode Reference

Each time interval description implicitly represents one time interval. Consequently, a time interval can be referred to either by a complete time interval description or by an identification of a time interval description:


INTERVAL::m Period | Episode
INTERVAL Reference:: INTERVAL | INTERVAL-Id

We now reduce the problem of adding a rich temporal context to episodes and periods to the problem of temporally relating time intervals, a well-known temporal logic problem. In [2], an inventory of all temporal time interval relations is made. There is a total of 13 possible time interval relations, with names such as ‘before’, ‘meets’, ‘contains’, etc. This extensive list of interval relations is not used by clinicians. Clinicians cannot be forced to use such an extensive terminology in such a rigid way. We therefore prefer to reduce the problem of temporally relating time intervals to the problem of temporally relating time points. This does not harm expressiveness, as time intervals can be completely characterised by their extreme time points. As a result, a PCRL translation of an expression like ‘a before b’ will be ‘end of a is-before onset of b’:

Time-Point-Description:: Time-Point Declaration | Interval-Extremity
Interval-Extremity:: Onset: Interval-Reference | End: Interval-Reference

An interval extremity, onset or end, composed of interval reference i can be denoted respectively as: ‘onset of [i]’ and ‘end of [i]’. If i already exists in the case description then i has a prefix ‘the’ instead of ‘a(n)’. Note that episodes and periods used in time point descriptions need not be declared first; when used in time point descriptions they are true by definition.
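The reduction of interval relations to time point relations can be sketched as a translation table in Python. Only ‘a before b’ is translated in the text above; the entries for ‘meets’ and ‘contains’ follow the usual end point definitions of these interval relations and are added by us as assumptions, as are the names `TRANSLATIONS` and `to_time_point_expression`.

```python
# Hypothetical translation table from interval relations to time point expressions.
TRANSLATIONS = {
    # given in the text: 'a before b' becomes 'end of a is-before onset of b'
    "before": "end of {a} is-before onset of {b}",
    # assumed, following the usual end point definitions of these relations
    "meets": "end of {a} equals onset of {b}",
    "contains": "onset of {a} is-before onset of {b} and end of {b} is-before end of {a}",
}


def to_time_point_expression(relation: str, a: str, b: str) -> str:
    """Rewrite an interval relation between a and b as a PCRL-style time point expression."""
    return TRANSLATIONS[relation].format(a=a, b=b)


print(to_time_point_expression("before", "a", "b"))
print(to_time_point_expression("contains", "the episode e1", "the episode e2"))
```

Because every entry uses only ‘onset of’, ‘end of’ and a comparison, the clinician needs no vocabulary beyond time point comparisons, while the interval relations remain expressible.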

In PCRL, time points are related by the use of time point expressions (see Appendix B). Using time point expressions, ranges of time point values can be defined explicitly. However, sometimes the possible range of a value is kept implicit. These implicit value ranges of time points are referred to as margins. When the margin of a quantitative value is kept implicit it is not much of a problem, for the margin can be estimated from the level of detail of the specification of the quantitative value. For example, the quantity ‘3 years’ is less detailed than the quantity ‘3 years and 3 days’ and is therefore interpreted to be less precise. Consequently, in PCRL, the term ‘1 day’ cannot be replaced by the term ‘24 h’. In case reports, margins are also described qualitatively, using terms such as ‘precisely’, ‘around’, ‘just before’ and ‘a bit less’. Qualitative margins essentially influence the process of patient case analysis. In PCRL, the use of qualitative margins is supported. In contrast to quantitative margins, qualitative margins can only be interpreted in a context. Qualitative margins in PCRL are explicitly enclosed within a time-point context:

Time-Point:: Qualitative-Margin Declaration’; Time-Point-Description

If a time point is composed of a qualitative margin q and a time point description d then use q as a prefix of d.

Example 10. An example of a time point:


‘just before onset of an episode of hospital admission’,

where ‘just before’ is a qualitative margin and ‘onset of an episode of hospital admission’ is an interval extremity.

The following example is similar to the previous one, except that the term ‘just’ is changed to ‘3 days’. This new statement, however, is constructed in a completely different way. It is a composed time point value built with a math-operator (see Appendix B):

Example 11. An example of a composed time point value (see Appendix B):

‘3 days before onset of an episode of hospital admission’,

where ‘3 days’ is the first operand, ‘onset of an episode of hospital admission’ is the second, and ‘before’ is the operator.

The different construction of these statements is reflected in the way they are interpreted. The term ‘3 days’ is interpreted independently of ‘onset of an episode of hospital admission’, whereas the interpretation of ‘just before’ does depend on ‘onset of an episode of hospital admission’. If a qualitative margin is combined with a reference to an interval onset or end, the interpretation is based on the length of this time interval. For example, the meaning of ‘just before’ in example 10 depends on the length of the hospital admission.
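The difference between the two constructions can be sketched in Python. The composed value ‘3 days before’ is computed from the onset alone, whereas the window for ‘just before’ scales with the length of the referenced interval. The scaling rule used here (one tenth of the interval length, constant `JUST_DIVISOR`) is purely hypothetical and only serves to show the context dependence; PCRL does not prescribe it.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Hypothetical rule: 'just' covers one tenth of the length of the referenced interval.
JUST_DIVISOR = 10


def days_before(n_days: int, point: datetime) -> datetime:
    """Composed time point value: independent of the interval the point belongs to."""
    return point - timedelta(days=n_days)


def just_before_window(onset: datetime, end: datetime) -> Tuple[datetime, datetime]:
    """Qualitative margin: (earliest, latest) time points counted as 'just before onset'."""
    margin = (end - onset) / JUST_DIVISOR
    return onset - margin, onset


onset = datetime(1999, 1, 10)
long_admission = just_before_window(onset, datetime(1999, 1, 20))   # 10-day admission
short_admission = just_before_window(onset, datetime(1999, 1, 11))  # 1-day admission
print(days_before(3, onset))
print(long_admission[0], short_admission[0])
```

Under this assumed rule, ‘just before’ a 10-day admission reaches back a full day, while for a 1-day admission it reaches back only a few hours; ‘3 days before’ gives the same time point in both cases.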

3.9. Describing patterns of time intervals

Despite the fact that PCRL is already sufficient to express all of the temporal information of our nontrivial example 1, we are not yet satisfied. Our approach was to compare natural language case reports with first order temporal logic, a theoretical formal framework. We observe that first order temporal logic has a construct, referred to as quantification, which is not yet embedded in PCRL. Logical quantification is a powerful construction which enables generalisation. We already gave the following natural language example of this kind of information:

Example 12. An example of the use of quantification in natural language:

‘In the past 2 weeks, for at least three times, the headache lasted more than 2 h’

In this example, the time-point relation ‘lasting more than 2 h’ is generalised upon more than one ‘episode of headache’ without enumeration of each of these episodes. Within this construction, the episode description ‘headache’ is used to refer to more than one episode. With the introduction of interval patterns, PCRL has the ability to generalise time-point relations upon interval descriptions:


Interval-Pattern:: INTERVAL Quantification* ; Time-Point Expression Predicate
INTERVAL Quantification:: Quantifier-Margin-Id’ ; Quantifier Declaration ; INTERVAL

An interval pattern composed of interval quantification string q and time-point expression t can be denoted as: ‘For [q] holds: [t]’. If an interval quantification is composed of a quantifier margin m, a quantifier declaration d, and an interval i then use m as a prefix of d and d as a prefix of i. Declare i as plural, i.e. if i is an episode then write ‘episodes’ instead of ‘episode’ and if i is a period then write ‘periods’ instead of ‘period’. If only one interval description is quantified, every reference to this interval description is kept implicit, i.e. an interval reference is omitted (see example 15).

Example 13. Examples of quantifier margin identifications: ‘at least’, ‘at most’, ‘precisely’

Example 14. Examples of quantifier declarations: ‘all’, ‘few’, ‘half of the’, ‘most’, etc.

Example 15. A PCRL translation of natural language example 12:

‘For at least three episodes of headache holds: end minus onset is more than 2 h and onset is after 2 weeks before today’,

where ‘at least’ is a quantifier margin, ‘three’ is used as a quantifier, and ‘episodes of headache’ is an episode. The time point expression predicate ‘end minus onset is more than 2 h and onset is after 2 weeks before today’ is an and-construct, where ‘end minus onset is more than 2 h’ and ‘onset is after 2 weeks before today’ are elementary time point expression predicates (see Appendix B).

Note that within this time point expression predicate, references to the episode of headache are kept implicit. This is allowed, for this episode of headache is the only interval description being quantified.
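The evaluation of such an interval pattern against known episodes can be sketched in Python. The sketch assumes that onset and end of each headache episode are available as dates, takes the 2-hour and 2-week bounds from natural language example 12, and maps the quantifier margins of example 13 onto numeric comparisons; the function names, the fixed value of `today` and the sample data are hypothetical.

```python
import operator
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

# Hypothetical representation of an episode's time interval: (onset, end).
Interval = Tuple[datetime, datetime]

# Quantifier margins of example 13, read as comparisons on the number of matches.
MARGINS = {"at least": operator.ge, "at most": operator.le, "precisely": operator.eq}


def interval_pattern_holds(margin: str, n: int, intervals: List[Interval],
                           predicate: Callable[[Interval], bool]) -> bool:
    """'For [margin] [n] episodes holds: [predicate]' without enumerating the episodes."""
    count = sum(1 for interval in intervals if predicate(interval))
    return MARGINS[margin](count, n)


today = datetime(1999, 3, 15)  # assumed reference date


def long_and_recent(interval: Interval) -> bool:
    # 'end minus onset is more than 2 h' and onset within the past 2 weeks
    onset, end = interval
    return end - onset > timedelta(hours=2) and onset > today - timedelta(weeks=2)


headaches = [
    (datetime(1999, 3, 14, 8), datetime(1999, 3, 14, 11)),      # 3 h, recent
    (datetime(1999, 3, 10, 9), datetime(1999, 3, 10, 12, 30)),  # 3.5 h, recent
    (datetime(1999, 3, 5, 20), datetime(1999, 3, 5, 23)),       # 3 h, recent
    (datetime(1999, 3, 12, 10), datetime(1999, 3, 12, 10, 30)), # too short
    (datetime(1999, 2, 20, 8), datetime(1999, 2, 20, 12)),      # too long ago
]
print(interval_pattern_holds("at least", 3, headaches, long_and_recent))
```

The predicate refers to ‘onset’ and ‘end’ without naming an episode, which corresponds to the implicit interval reference allowed when only one interval description is quantified.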

It is clear that the quantifier declaration resembles the well-known logical quantifiers ‘for all’ and ‘there is’. In contrast to logical quantification, the quantification in PCRL also consists of a quantifier margin. Quantifier margins are added to make the quantification less precise and therefore applicable in PCR descriptions.

4. An elaborate example

In this section, we give a larger example of a PCRL case description to illustrate the relation between an original natural language case report and a PCRL case description. The example primarily illustrates the expressiveness of PCRL in this specific domain. The suitability and generality of PCRL are illustrated by the comparable extrinsic and intrinsic structure of the PCRL and the natural language descriptions. The suitability of PCRL could be improved if a more advanced concrete syntax were defined. The current concrete syntax is chosen to be a one-to-one translation of abstract syntax definitions to concrete denotations. A more advanced concrete syntax combining several abstract syntax definitions would allow shorter, more elegant formulations, but this is not within the scope of this paper. The following examples are PCRL descriptions of examples 1 and 2, respectively. The relations ‘followed-by’ and ‘contains’ are adopted from CEN/TC 251 Time ‘standards for health care specific problems’ [8]:

Example 16. Patient A is known with:
Onset of an episode of palpitations is just before end of an episode of taking tablets
The episode of taking tablets does not overlap¹ an episode of chest pain
The episode of palpitations overlaps¹ an episode of chest pain
(End of the episode of taking tablets equals onset of the episode of chest pain) or (End of the episode of taking tablets is just before onset of the episode of chest pain).

¹ ‘A overlaps B’ = onset of A is before onset of B and onset of B is before end of A

Example 17. Patient B is known with:
An episode of hospital admission is followed by a period with age of 17, massive hemoptysis and renal failure
An episode of feeling well is followed by an episode of symptoms of an upper respiratory tract infection
The symptoms of an upper respiratory tract infection is followed by an episode of headache and anorexia
Onset of the symptoms of an upper respiratory tract infection is just 3 days before admission²
The episode of headache and anorexia is followed by an episode of nausea
The episode of nausea is followed by a moment of³ vomiting
Onset of the nausea is about 36 h before admission
Onset of an episode of chest pain is the evening before admission
Onset of a period with a start of coughing up blood-tinged sputum is just after onset of the episode of chest pain
An episode of dyspnea is just after the coughing up blood-tinged sputum
The episode of dyspnea contains an episode of nausea and vomiting which has a typical frequency which is intermittent
End of the episode of dyspnea is the morning after onset of the episode of chest pain
The episode of dyspnea has a typical duration which is continuous
Onset of an episode of coughing up large quantities of blood and experience of diffuse weakness is the morning after onset of the episode of chest pain.

² ‘admission’ = ‘onset of the episode of hospital admission’
³ ‘a moment of’ can be substituted by ‘a period p with’, if the time point expression ‘end p equals onset p’ is also included in the PCRp.

Note that almost all episodes are incomplete, i.e. a temporal pattern description is absent. The original descriptions appear ambiguous when analysed in detail. For example, the precise relation between nausea and vomiting is not clear. Many questions cannot be answered, such as: ‘Is the intermittence pattern of both similar?’ and if so: ‘Are these patterns synchronous?’. The advantage of PCRL is that the user is not forced to specify things not relevant or not yet known, for PCRL enables vague qualitative descriptions. Consequently, we could choose to systematically represent interpretations of the original natural language descriptions we considered logical.


Formal analysis of the formalised examples requires the definition of the meaning of ‘just before’, ‘just after’, ‘the evening before’, ‘the morning after’, ‘intermittent’, and ‘persisting’. In PCRL, these terms can be given a sensible context dependent meaning. The phenomenon statements used in the examples cannot be decomposed. The extension of phenomenon statements with, for example, simple location or severity attributes allows the analysis of the formalised examples to be based on currently existing disease models. However, doing the case analysis itself is not within the scope of this paper. We end this section with the striking fact that the PCRp’s are indeed far less ambiguous than the corresponding PCRn’s.

5. Toward application in practice

PCRL is not easy to learn. Reading a PCRL case description requires knowledge about the specific interpretation of common terms such as episode and period. One has to be familiar with the way of modelling of PCRL, i.e. the models and model components used in the PCRL-technique (casu quo methodology) [30]. Writing a PCRL case description is even harder. It requires a far more structured approach than writing natural language case descriptions, despite the fact that the appearance of a PCRL case description resembles a natural language description, and despite the fact that the concepts are few and the descriptions are suitable. In general, when one uses a specification technique, concepts in mind are to be translated to a rigid framework. This specific process of translation is known as way of working [30]. To facilitate application of PCRL, the way of working is to be described and supported. In Fig. 5, the basic steps to achieve a PCRL case report are summarised. The figure shows that the way of working is closely related to the way of modelling as discussed in this paper.

Automated tools with incorporated strategies and procedures to arrive at specific PCRL-models facilitate the way of working a great deal. For example, the formulation of PCRL descriptions could be syntax driven in order to rule out syntactically inconsistent descriptions. At this moment, a PCRL-based consultant to acquire structured temporal descriptions is under construction; Figs. 6 and 7 show part of its interface. Note that this interface is generic and less directive than the interface shown in Fig. 1. A PCRL-based consultant can also handle

Fig. 5. Basic steps to model a PCRL patient case report.


Fig. 6. Computer aided formulation of episodes.

abbreviations, substitutions and conversions. Support for consistency checks and completeness checks is equally important. A simple check on completeness can be the answering of the following question: ‘Are all episode statements temporally related, explicitly?’. A simple check on the completeness of example 12 concludes the absence of isolated episode statements. Fig. 8 illustrates a stronger conclusion: all pairs of episode statements are explicitly temporally related. The analysis of Fig. 9 shows that in example 13 the relations between episode onsets are well documented. The figure shows that the PCR case report starts with the phenomena at the time of admission, followed by a chronological and structured description of the patient history starting from the beginning when the patient was feeling well. An analogous figure of example 13 consisting of relations between episode ends would have shown that these are almost completely neglected in this example case report, which of course is normal when the focus is on diagnosis.
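The two completeness checks mentioned above can be sketched as simple set operations. The Python fragment below is a minimal illustration, assuming that a case report has already been reduced to a set of episode identifiers and a list of explicitly related pairs; the function names and the sample data (`e1`, `e2`, `e3`) are hypothetical and do not reproduce examples 12 or 13.

```python
from itertools import combinations


def isolated_episodes(episodes, relations):
    """Weak check: episodes that take part in no explicit temporal relation."""
    related = {episode for pair in relations for episode in pair}
    return sorted(set(episodes) - related)


def unrelated_pairs(episodes, relations):
    """Strong check: pairs of episodes with no explicit relation between them."""
    explicit = {frozenset(pair) for pair in relations}
    return [
        (a, b)
        for a, b in combinations(sorted(episodes), 2)
        if frozenset((a, b)) not in explicit
    ]


# Invented data: three episodes, two explicit temporal relations.
episodes = {"e1", "e2", "e3"}
relations = [("e1", "e2"), ("e1", "e3")]

isolated = isolated_episodes(episodes, relations)  # no episode is isolated
missing = unrelated_pairs(episodes, relations)     # e2 and e3 are not related explicitly
```

In this invented case the weak check succeeds while the strong check reports one missing pair, which mirrors the difference between the two conclusions described for example 12.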

6. Conclusions and further research

In this paper, we analysed temporal knowledge as present in natural language PCR (PCRn), of which more than 70 000 are documented in MEDLINE. The project started as a study on the feasibility of developing a PCR language (PCRL) to model


Fig. 7. Computer aided formulation of episode contexts.

PCRn’s for the purpose of computer-assisted patient case analysis. Therefore, PCRL patient case reports (PCRp) were to be formal. Also, PCRp’s were to be conceptual, i.e. they were not to be driven by future interpretation and should therefore be based on generic concepts of patient specific temporal knowledge. Naturally, PCRp’s should also be expressive and suitable. The focus in this project was on analysis of natural language case reports as described by clinicians. We did not analyse requirements of all kinds of computer-assisted patient case analysis functionality. We recognised the most important temporal concepts and the way they presented themselves in natural language case reports. We managed to select

Fig. 8. Results of a simplified temporal completeness check of Example 12: ‘phenomena are related in pairs’.


Fig. 9. An ordering of episode onsets formally derived from Example 13. The episode identification numbers indicate their order of occurrence in the case report.


a few yet sufficient temporal concepts and to relate them in a relatively simple way. This exercise proved that a conceptual, expressive and suitable case report language to formally describe temporal knowledge as present in PCRn’s is feasible. The research project was concluded with a serious attempt to define a concrete PCRL. The result is presented in this paper. PCRL is characterised by its structured use of qualitative values, its type consistent use of qualitative as well as quantitative values, and its relationship with natural language.

PCRL, as introduced here, focuses on one aspect of case reports: only temporal knowledge is formalised. Currently, PCRL is being extended with structures to formalise knowledge about care co-ordination as well. In future fundamental research, PCRL will be extended with other essential models, such as a model of disease and a model of space. The addition of a model of space and a model of disease enlarges the expressiveness of PCRL to a great extent, as it enables the description and detection of courses of anatomical changes and courses of severity, which are essential elements in patient case analysis. We have already started some applied research in this area. We have developed a matching procedure to match PCRL cases with network disease course descriptions to support case based information retrieval of disease knowledge. Next to computer assistance of patient case analysis, we have started to use PCRL as a language to support the development of computerised Case Report Forms (CRF’s) for the purpose of flexible yet structured acquisition of patient data.

Appendix A. Symbols and conventions PCRL-syntax

The definitions of the abstract syntax have a Backus-Naur form [1]. Within these Backus-Naur form definitions, we make use of the following operator symbols [21]: ‘::’, ‘:’, ‘;’, ‘|’, ‘,’, and superscripts ‘*’ and ‘’’. The ‘::’ can be read as ‘is syntactically defined as’ and is used to indicate the definition of an abstract concept. The first operand specifies the abstract concept, which is to be defined. The second operand is the definition part. The ‘|’ can be read as ‘exclusive or’ and is used in a syntax definition part to separate options. The ‘;’ can be read as ‘and’ and is used in a syntax definition part to separate present syntactical parts. The ‘:’ can be read as ‘is typified as’. The first operand of ‘:’ is an abstract syntactical concept which is typified by the second operand. If a type definition is composed, its components are separated with ‘,’. The ‘*’ has a single operand and specifies a string consisting of zero or more elements of a type specified by the operand. The ‘’’ indicates an optional structure. Terms ending with ‘Id’, which stands for ‘identification’, represent terminals.

The abstract syntax is defined as a two level grammar [31], containing a low-level and a meta-level. A two level grammar is used instead of a one level grammar, as it allows for a more concise presentation of the abstract syntax. A meta-level syntax definition is recognised by the use of a subscript m appended to the syntax definition symbol. Meta-level definitions define meta-level concepts. Concepts defined on a meta-level can be recognised by the exclusive use of uppercase


characters. In the low-level syntax definitions, meta-level concepts can be substituted by the values defined on the meta-level. Each substitution results in a different low-level syntax definition. Consequently, each meta-level syntax definition generates a series of low-level syntax definitions. A substitution should be consistent, i.e. within a single substitution, each occurrence of a meta-level concept in a low-level definition is always substituted with the same value. For example, the following substitution is inconsistent (see Appendix B):

Episode Predicate:: Not: Period Predicate, Predicate-Id

The following substitution on the other hand is consistent:

Episode Predicate:: Not: Episode Predicate, Predicate-Id
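Consistent substitution can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the authors' tooling: definitions are treated as plain strings, only the ‘Not’ alternative of the predicate definition is used as template, and `META_VALUES` lists just two of the values of the meta-level concept STATEMENT. The names `expand` and `is_consistent` are hypothetical.

```python
# Two of the values that the meta-level concept STATEMENT can take (illustrative subset).
META_VALUES = ("Episode", "Period")

# Low-level definition template containing the meta-level concept twice.
TEMPLATE = "STATEMENT Predicate:: Not: STATEMENT Predicate, Predicate-Id"


def expand(template, meta_concept, values):
    """Generate one low-level definition per value.

    str.replace substitutes every occurrence with the same value,
    which is exactly the consistency requirement.
    """
    return [template.replace(meta_concept, value) for value in values]


def is_consistent(rule, template, meta_concept, values):
    """A rule is consistent if a single value explains all occurrences."""
    return rule in expand(template, meta_concept, values)


rules = expand(TEMPLATE, "STATEMENT", META_VALUES)

inconsistent = "Episode Predicate:: Not: Period Predicate, Predicate-Id"
consistent = "Episode Predicate:: Not: Episode Predicate, Predicate-Id"
```

Each meta-level definition thus yields as many low-level definitions as there are values, and a mixed rule such as `inconsistent` is never generated.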

Appendix B. Representation of predicates, terms and objects

B.1. Composition of predicates

Predicates can be composed of the following statements:

STATEMENT::m Episode | Period | TimePoint Expression | Interval-Pattern | Pattern Expression | Time Point Expression | Occurrence-Pattern | Phenomenon

Predicates are logical constructs:

STATEMENT Predicate::
And: STATEMENT Predicate, STATEMENT Predicate, Predicate-Id |
Or: STATEMENT Predicate, STATEMENT Predicate, Predicate-Id |
Not: STATEMENT Predicate, Predicate-Id |
Elementary: STATEMENT, Predicate-Id

The binary predicate ‘And’ is represented concretely with an infix operator denoted as a dash (‘-’) or ‘and’. The ‘Not’ predicate is represented concretely with a suffix operator denoted as ‘is absent’. The predicate identification is indicated with a numerated x as superscript. Note that in Section 3.2, SITUATION is defined on the set of phenomena. To support the use of predicates, this definition can easily be extended to ‘phenomenon predicates’.
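The relation between the abstract predicate syntax and its concrete denotation can be sketched as a small syntax tree with a rendering function. The Python classes below are hypothetical and simplified: the Predicate-Id is omitted, statements are plain strings, and the parenthesised rendering of ‘Or’ is an assumption based on its appearance in Example 16. The phenomenon ‘fever’ is invented for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Elementary:
    statement: str


@dataclass(frozen=True)
class Not:
    operand: "Predicate"


@dataclass(frozen=True)
class And:
    left: "Predicate"
    right: "Predicate"


@dataclass(frozen=True)
class Or:
    left: "Predicate"
    right: "Predicate"


Predicate = Union[Elementary, Not, And, Or]


def render(p: Predicate) -> str:
    """Translate an abstract predicate into a concrete denotation."""
    if isinstance(p, Elementary):
        return p.statement
    if isinstance(p, Not):
        return f"{render(p.operand)} is absent"          # suffix operator
    if isinstance(p, And):
        return f"{render(p.left)} and {render(p.right)}"  # infix operator
    if isinstance(p, Or):
        return f"( {render(p.left)} ) or ( {render(p.right)} )"  # assumed form
    raise TypeError(f"not a predicate: {p!r}")


text = render(And(Elementary("chest pain"), Not(Elementary("fever"))))
```

The sketch does not disambiguate a negated composite predicate; a fuller concrete syntax would need brackets or precedence rules for that case.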

B.2. Composition of expressions

Expressions declare comparisons of value terms. An expression can be denoted using the comparison identification as an infix operator:


VALUE Expression:: VALUE Term | VALUE Term; Comparison-Id; VALUE Term

Terms are defined on temporal patterns and time points:

VALUE::m PATTERN | Time-Point

The composition of complex terms is defined recursively. Terms can be combined endlessly using basic math operators such as division and multiplication.

VALUE Term:: Elementary: VALUE Declaration | Composed VALUE

Composed VALUE:: First Operand: VALUE Term; Operator: Math-Operator-Id; Second Operand: VALUE Term
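The recursive composition of terms can be made concrete with a short evaluator. This Python sketch assumes purely numeric (quantitative) declarations and supports only the two operators named in the text; qualitative values, scales and variables are left out. The class and function names are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Declaration:
    """Elementary term: a declared numeric value (scale omitted in this sketch)."""
    value: float


@dataclass(frozen=True)
class Composed:
    """Composed term: first operand, math operator, second operand."""
    first: "Term"
    op: str
    second: "Term"


Term = Union[Declaration, Composed]

# Only the operators mentioned in the text; others could be added the same way.
MATH_OPERATORS = {"*": operator.mul, "/": operator.truediv}


def evaluate(term: Term) -> float:
    """Evaluate a term by recursing into its operands."""
    if isinstance(term, Declaration):
        return term.value
    return MATH_OPERATORS[term.op](evaluate(term.first), evaluate(term.second))


# (36 / 24) * 2: a composed term whose first operand is itself composed.
term = Composed(Composed(Declaration(36.0), "/", Declaration(24.0)), "*", Declaration(2.0))
result = evaluate(term)
```

Because each operand is again a term, nesting depth is unbounded, which corresponds to the statement that terms can be combined endlessly.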

B.3. Declaration of values

PCRL makes use of the following types of objects, which can be valued:

OBJECT::m VALUE | Qualitative-Margin | Selection

Values, variables and scales in PCRL are explicitly typified to support consistent use of objects:

OBJECT Declaration:: OBJECT Quality-Id | OBJECT Quantity | OBJECT Variable-Id

OBJECT Quantity:: OBJECT Number-Id; OBJECT-Scale-Id

References

[1] Aho AV, Sethi R, Ullman JD. Compilers: Principles, Techniques and Tools. Reading, MA: Addison-Wesley, 1986.

[2] Allen JF. Maintaining knowledge about temporal intervals. Commun ACM 1983;26(11):832–43.

[3] Allen JF. Towards a general theory of action and time. Artif Intell 1984;23(2):123–54.

[4] Avison DE, Wood-Harper AT. Information systems development research: an exploration of ideas in practice. Comput J 1991;34(2):98–112.

[5] Brachman RJ. What IS-A Is and Isn’t: An analysis of taxonomic links in semantic networks. IEEE Comput 1983;16(10):30–6.

[6] Brachman RJ. An overview of the KL-ONE knowledge representation system. Cogn Sci 1985;9(2):171–216.

[7] Ceusters W, Buekens F. Towards a High-Level Framework Model for the Description of Temporal Models in Healthcare Information Systems. In: ten Hoopen AJ, Hofdijk WJ, Beckers WPA, editors. Ontwikkelingen in de Medische Informatica. Rotterdam, The Netherlands: VMBI/Publicon Publishing, 1992:41–50.


[8] Ceusters W, Rossi Mori A, Buekens F, Bernauer J, de Keyser L, Grebermann S, Surjan G, Olesen H. Medical Informatics: Time Standards for Healthcare Specific Problems, Comite Europeen de Normalisation/TC251, Technical Report ENV 12831, Brussels, Belgium, 1996.

[9] Cohen B. Justification of formal methods for system specification. Softw Eng J 1989;4(1):26–35.

[10] Dorda WG. WAREL; A system for retrieval of clinical data considering the course of diseases. Methods Inform Med 1989;28:133–41.

[11] Evans DA, Cimino JJ, Hersh WR, Huff SM, Bell DS. Toward a medical-concept representation language. J Am Med Inform Assoc 1994;1(3):207–17.

[12] Gangemi A, Galanti M, Galeazzi E, Rossi Mori A. Compositional semantics for medical records. In: Scherrer R, Mandil S, editors. Proceedings of MEDINFO 92. Amsterdam, The Netherlands: Elsevier Science, 1992:703–708.

[13] Grimshaw JM, Russell IT. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet 1993;342:1317–22.

[14] Hamlet L, Hunter J. A Representation of Time for Medical Expert Systems. In: Lindberg DAB, editor. Lecture Notes in Medical Informatics, vol. 33 of Lecture Notes in Medical Informatics. Berlin, Germany: Springer, 1987:112–119.

[15] ter Hofstede AHM. Information Modelling in Data Intensive Domains, University of Nijmegen, PhD Thesis, Nijmegen, The Netherlands, 1993.

[16] ter Hofstede AHM, van der Weide Th P. Formalisation of techniques: chopping down the methodology jungle. Inf Softw Technol 1992;34(1):57–65.

[17] Hripcsak G, Clayton PD, Pryor TA, Haug P, Wigertz OB, van der Lei J. The ARDEN Syntax for Medical Modules. In: Miller MRA, editor. Proceedings of the Fourteenth Annual Symposium on Computer Applications in Medical Care. New York: IEEE Computer Society Press, 1990:200–204.

[18] Keravnou ET, Washbrook J. A temporal reasoning framework used in the diagnosis of skeletal dysplasias. Artif Intell Med 1990;2:239–65.

[19] Keravnou ET. Medical temporal reasoning. Artif Intell Med 1991;3:289–90.

[20] Ledley RS, Lusted LB. Reasoning Foundations of Medical Diagnosis. In: Reggia JA, Tuhrim S, editors. Computer-Assisted Medical Decision Making, vol. 1: Computers and Medicine. New York: Springer, 1985:46–79.

[21] Meyer B. Introduction to the Theory of Programming Languages. Englewood Cliffs, New Jersey: Prentice-Hall, 1990.

[22] Miller RA, Pople HE, Myers JD. INTERNIST-1, An experimental computer-based diagnostic consultant for general internal medicine. N Engl J Med 1982;307:468–76.

[23] Ramoni M, Stefanelli M, Magnani L, Barosi G. An epistemological framework for medical knowledge-based systems. IEEE Trans Syst Man Cybern 1992;6(22):1361–74.

[24] Rector AL, Nowlan WA, Kay S. Foundations for an electronic medical record. Methods Inform Med 1991;30:179–86.

[25] Rector AL, Rogers JE, Pole P. The GALEN High-Level Ontology. In: Bender J, Christensen JP, Scherrer J-R, McNair P, editors. Proceedings Medical Informatics Europe ‘96, vol. 34 of Technology and Informatics. Amsterdam: IOS Press, 1996:174–178.

[26] Rector AL, Bechhofer SK, Goble CA, Horrocks I, Nowlan WA, Solomon WD. The GRAIL concept modelling language for medical terminology. Artif Intell Med 1997;9:139–71.

[27] Reggia JA, Tuhrim S. An Overview of Methods of Computer-Assisted Medical Decision Making. In: Reggia JA, Tuhrim S, editors. Computer-Assisted Medical Decision Making, vol. 1. New York: Springer, 1985:3–45.

[28] Scully RE, Mark EJ, McNeely WF, McNeely BF. Case records of the Massachusetts general hospital: Case 52-1993. New Engl J Med 1993;329(27):2019–26.

[29] Tuttle MS. The position of the Canon group: a reality check. J Am Med Inform Assoc 1994;1(3):298–9.

[30] Wijers GM, Heijes H. Automated Support of the Modelling Process: A view based on experiments with expert information engineers. In: Steinholz B, Solvberg A, Bergman L, editors. Proceedings of the Second Nordic Conference CAiSE ‘90 on Advanced Information Systems Engineering, vol. 436 of Lecture Notes in Computer Science. Stockholm, Sweden: Springer, 1990:88–108.

[31] van Wijngaarden A, Mailloux BJ, Peck JEL, Koster CHA, Sintzoff M, Lindsey CH, Meertens LT, Fisker RG. Revised Report on the Algorithmic Language ALGOL 68. Berlin, Germany: Springer, 1976.
