
Pergamon Expert Systems With Applications, Vol. 8, No. 1, pp. 169-176, 1995

Copyright © 1994 Elsevier Science Ltd Printed in the USA. All rights reserved

0957-4174/95 $9.50 + .00

0957-4174(94)E0007-H

Knowledge Elicitation and Knowledge Representation in a Large Domain With Multiple Experts

A. R. BARRETT AND J. S. EDWARDS

Aston Business School, Aston University, Birmingham, U.K.

Abstract--This paper describes the knowledge elicitation and knowledge representation aspects of a system being developed to help with the design and maintenance of relational data bases. The size and complexity of this domain mean that the system requires both knowledge-based and conventional algorithmic components. In addition, the domain contains multiple experts, but any given expert's knowledge of this large domain is only partial. The paper discusses the methods and techniques used for knowledge elicitation, which was based on a "broad and shallow" approach at first, moving to a "narrow and deep" one later, and describes the models used for knowledge representation, which were based on a layered "generic and variants" approach.

1. INTRODUCTION

THE WORK reported here represents the latest stage in research that has been going on for several years into the use of knowledge-based systems to support conventional computer systems development. The aim is to develop knowledge-based tools for the design and maintenance of relational data base systems; this involves collaborators from commerce, industry, and academia. The process of relational data base design will be described only in as much detail as is necessary for this paper. A suitable reference is Date (1986).

Compared with earlier data base technologies, it would be fair to say that relational data base systems replace structural problems with performance problems. Both the systems designer and the data base administrator (DBA) need to be aware of these problems and of the techniques available to alleviate them. Just which techniques are used, for example, "add an index" or "cluster tables," may depend on the application being considered, the history of the data base in use, the experience of the designer or DBA, or even their intuition. Many of the possible solutions cannot be derived algorithmically and are simply human expertise. Thus, the design and maintenance of relational data bases appears to be a fruitful area for knowledge-based support, albeit with some features that make the task far from straightforward:
• The system needs to include both knowledge-based and algorithmic components.
• There are multiple experts in the domain and their knowledge is often partial; for example, one person may be an expert in the theory of relational data base design, another an expert in using a particular relational data base management system (RDBMS) package running under a particular operating system.
• The system needs to be very large, because of the complexity of the task involved.

The work reported here forms part of a project supported by the U.K.'s Science and Engineering Research Council and Department of Trade and Industry, as part of the Information Engineering Advanced Technology Programme (project IED4/1/1429). The collaborators are BIS Information Systems (the main partner), British Steel Strip Products Group, and Aston University.

Requests for reprints should be sent to Dr. J. S. Edwards, Aston Business School, Aston University, Aston Triangle, Birmingham, B4 7ET, U.K. E-mail: [email protected]

2. THE APPROACH TO KNOWLEDGE ELICITATION

The overall approach taken was based on the BIS KBS (knowledge-based systems) method, which has been in use by BIS Information Systems for some years. One of the reasons for this choice was that the method is closely associated with the BIS/Modus method for conventional system development. It has also been influenced by a hybrid methodology called POLITE (Bader, Edwards, Harris-Jones, & Hannaford, 1988; Edwards, 1991). It was clear that a system for database design, maintenance, and management would need to include conventional as well as knowledge-based components, because of the numerical aspects of system sizing and performance measurement. Therefore, the approach to be used had to be capable of coping with both types, and this was felt to weigh against the use of more specifically KBS methods such as KADS.


2.1. An Overview of Knowledge Elicitation in the BIS KBS Method

The BIS KBS method broadly resembles a "waterfall" approach to conventional development (see for example Boehm, 1976), in that it has stages of Feasibility, Analysis, Design, Programming, Testing and Validation, and Review. However, it does not follow a rigid sequence like a conventional life-cycle and it also permits the use of prototyping within many of the stages. The result is that knowledge elicitation is carried out in two major phases (see Figure 1). The first phase is "broad and shallow," in order to survey the domain and obtain a high-level understanding. This roughly corresponds to the conventional stages of Feasibility and Analysis (but see later); it also allows for the use of prototyping within the Analysis stage. It addresses issues such as the appropriate form(s) of knowledge representation to use and which components are best developed using conventional algorithmic techniques. During this phase, the best structuring of the system into different areas or modules, for ease of development and maintenance, is also determined; this takes the phase beyond Analysis into issues that are normally considered part of Design. As indicated in Figure 1, the domain examined in this phase needs to be slightly wider than the domain of the system eventually developed, in order to define the boundary of the latter properly.

It should also be noted from Figure 1 that the "broad and shallow" phase is not merely preliminary work. Some of the knowledge elicited in this phase will appear directly in the knowledge base of the final system. This turned out to be particularly significant here in view of the way in which the knowledge was structured, as is discussed in section 3.

2.2. Knowledge Elicitation: The "Broad and Shallow" Phase

The first stage in the process of eliciting the knowledge for the system was the familiarization of the knowledge engineers with the domain. This included studying books and system documentation, informal discussions with experts, and an intensive relational database training course.

The aim of the initial phase was breadth first, as the name implies, in order to understand the domain and devise a provisional architecture for the system. It was therefore done on the basis of relatively few interviews (typically two) with each of the domain experts. Initially there were five experts available to the project team; all of them worked for one of the collaborators. Other experts became involved later, a total of 11 in all. The first phase interviews were very loosely structured. No attempt is made here to categorize the type of interview precisely, since it varied from one expert to another, covering several of the 16 types of interview identified by Neale (1988). Typical questions asked in the first interviews included:

"What method do you use for designing a database?"; "Would you describe the typical role you follow as that

of a database manager?"; "What information would you need to perform a data

analysis?"

FIGURE 1. Phases of knowledge elicitation. (The figure plots breadth against depth over time; annotations: "build prototype", "expand to full domain".)


Later interviews in this phase were still not formally structured, although the questions were at a more detailed level, such as:

"What formula do you use to calculate a table's size?"; "What is the result of over-indexing?".

This loose structure was principally chosen in order to establish a rapport between the interviewers and each of the individual experts. It appeared to work well in that respect, but had the drawback that the experts sometimes said what they wished to say, or what they thought they might be expected to say, rather than what they actually did. This is a well-known problem of knowledge elicitation (Hart, 1987; Johnson, 1983). All interviews were recorded on audio tape and transcribed afterwards. The only "technique" used in these early stages was that the interviewers prepared questionnaires for their own use, to provide a certain amount of structure to the interviews by reminding them what to ask. In practice, these were not adhered to slavishly.

The knowledge engineers worked in pairs throughout: one conducting the interview, the other taking notes and observing. This, combined with the use of a voice recorder, avoids interruptions for note-taking, and reduces the potential for the interviewer to introduce his or her own bias.

2.3. Knowledge Elicitation: The "Narrow and Deep" Phase

The modules identified in the "broad and shallow" phase are then developed on the basis of the second phase of "narrow and deep" knowledge elicitation. The stages that follow for each module may again be identified with tasks ranging from Logical Design and Physical Design through to Testing and Validation, but their actual conduct (especially for knowledge-based components) may again involve considerable use of prototyping. In addition, techniques such as data-flow diagrams and functional breakdowns from conventional systems development, with suitably extended notation where necessary, are employed to document the system development. The use of such techniques particularly eases integration and testing of the various modules.

Six main functions of the system were identified in the Feasibility part of the "broad and shallow" phase. The six were named as Designer, Tuner, Refiner, Monitor, Predictor, and Impact Analyzer. The range of these functions illustrates the point made in the Introduction about domain size and complexity. These cover all three of the highest level tasks distinguished in the KADS generic task hierarchy (Hickman, Killin, Land, Mulhall, Porter, & Taylor, 1989), namely system analysis, system modification, and system synthesis. However, after the initial Design work, it became clear that four modules would suffice to address these six functions, since those of Predictor could be subsumed in Impact Analyzer, and Tuner and Refiner were essentially performing the same tasks for the two different types of user. To explain the knowledge elicitation process in this phase, the "Tuner" module will be used as an example.

During the first phase of knowledge elicitation, many types of database tuning problems had been identified. They included:
• Excessive input/output problems
• Single table scan problems
• Batch run-time problems.

Similarly, a number of possible tuning solutions had been identified. For example, typical tuning solutions that can be employed to solve a batch run-time problem may include denormalisation; adding an index; removing an index; changing an index; clustering rows; and buffering.

This reflects the thinking of the experts, which tended to follow a sequence of

Manifestation (symptom) ⇒ Problem ⇒ Solution.

The knowledge engineers adopted a strategy of breaking this up into the two parts

Manifestation ⇒ Problem
Problem ⇒ Solution

to avoid difficulties with understanding the three-stage sequence, the most apparent of which was the expert going straight from Manifestation to Solution without specifying what the problem was.
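To make this two-stage structure concrete, a minimal sketch follows (in Python, purely as an illustration; the manifestation names and the particular mappings are assumed examples, not the project's actual rule base). It chains a manifestation-to-problem mapping with a problem-to-solution mapping, so that the intermediate problem is always stated explicitly:

# Illustrative sketch only: the two-stage rule structure, with the
# intermediate Problem made explicit between Manifestation and Solution.
MANIFESTATION_TO_PROBLEMS = {
    # assumed example manifestations (symptoms)
    "overnight batch jobs overrun their window": ["batch run-time problem"],
    "unexpectedly high disk activity": ["excessive input/output problem"],
}

PROBLEM_TO_SOLUTIONS = {
    # candidate solutions listed in the text for a batch run-time problem
    "batch run-time problem": ["denormalisation", "adding an index",
                               "removing an index", "changing an index",
                               "clustering rows", "buffering"],
}

def diagnose(manifestation):
    """Return (problem, candidate solutions) pairs for a manifestation."""
    return [(problem, PROBLEM_TO_SOLUTIONS.get(problem, []))
            for problem in MANIFESTATION_TO_PROBLEMS.get(manifestation, [])]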

At this point in this phase, all interviews were still being tape-recorded and transcribed, with the transcript sent to the expert afterwards for verification. Some protocol analysis had also been carried out as part of the interview sessions. It became apparent, however, that it was necessary to introduce more formal techniques for knowledge elicitation at this more detailed level. These included:
• Card sorts
• Repertory Grids
• Matrices.

2.3.1. Card Sorts. Card sorts were used in areas where many-to-many relationships needed to be clarified, for example, where a problem may have more than one solution and a solution may solve more than one problem. They were also found to be useful where it was difficult to distinguish between high- and low-level problems; the card sequences visibly portrayed the levels. These benefits are consistent with those identified elsewhere (Gammack, 1987). From an interaction point of view, card sorts also proved useful in that as "activity sessions" they broke the monotony of the interview format.
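As a rough sketch of how the outcome of such sorting sessions might be recorded (in Python, for illustration only; the pile labels and cards are assumed examples rather than elicited data), each sort maps a pile to the cards placed in it, and collating several sorts makes the many-to-many links visible:

# Illustrative sketch: collating repeated card sorts so that cards which
# appear under more than one pile (many-to-many links) become apparent.
from collections import defaultdict

card_sorts = [
    {"batch run-time problem": ["denormalisation", "adding an index", "clustering rows"]},
    {"excessive input/output problem": ["adding an index", "buffering"]},
]

piles_for_card = defaultdict(set)
for sort in card_sorts:
    for pile, cards in sort.items():
        for card in cards:
            piles_for_card[card].add(pile)

# "adding an index" now appears under two piles, showing that a single
# solution card may relate to more than one problem.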


2.3.2. Repertory Grids. Repertory Grids were used to elicit high-level constructs used by the experts, as in Kelly's personal construct theory (Kelly, 1955). The method adopted was the now virtually standard one of presenting elements in groups of three, and asking the expert to state a way in which two of them were similar and, thereby, different from the third. All elements were then rated against the construct so obtained, the experts being free to choose whichever rating scale they felt easiest with. This choice of scales was offered to maintain the rapport with individual experts. It did, however, preclude the use of a computer-based tool for this task, so that the analysis had to be performed manually.
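A small sketch of this elicitation step may help (in Python, for illustration only; the elements are RDBMSs named elsewhere in the paper, but the construct, scale, and ratings are assumed): elements are presented in triads, and each elicited construct is stored together with whatever rating scale the expert chose, which is why the analysis had to be carried out by hand:

# Illustrative sketch: triadic elicitation with per-construct rating scales.
from itertools import combinations

elements = ["ORACLE", "DB2", "INGRES"]  # in practice there were more elements

def triads(elems):
    """The groups of three elements presented to the expert."""
    return list(combinations(elems, 3))

grid = [
    {
        "construct": "locks at row level / locks at page level",  # assumed example
        "scale": "1 to 5",                                         # chosen by the expert
        "ratings": {"ORACLE": 1, "DB2": 4, "INGRES": 3},           # assumed values
    },
]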

2.3.3. Matrices. The use of matrices is illustrated by the "solution/characteristic matrix" form shown (in part) in Figure 2. For each of the tuning problems, such a matrix was supplied to the experts to enable them to identify the appropriate solutions that could be applied to solve the problems. Each potential solution corresponded to one row of the matrix. The columns of the matrix showed various characteristics that might make the particular solution appropriate. The elements of the matrix again used whatever rating, priority, or measurement scale the expert felt most suitable, even a descriptive (nominal) scale if so desired.
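As a sketch of the data such a form captures (solution and characteristic names are those of Figure 2; the cell values themselves are assumed), each cell may hold a value on whatever scale, including a purely descriptive one, the expert preferred:

# Illustrative sketch: a solution/characteristic matrix for one tuning
# problem, with heterogeneous, expert-chosen cell scales.
solution_characteristic = {
    ("Add index",    "Ease of Change"): "high",                  # nominal scale
    ("Add index",    "Cost"):           2,                       # expert's numeric scale
    ("De-normalise", "Ease of Change"): "low",
    ("De-normalise", "Benefit"):        "large for batch jobs",  # descriptive
}

def rating(solution, characteristic):
    """Look up the expert's entry for one cell, if any."""
    return solution_characteristic.get((solution, characteristic))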

A further advantage of these more formal techniques was that it was possible for the experts to be given "homework" in the form of grids or rating exercises to be done prior to an interview (done in at least one case on the train journey to the next interview session!). This improved the continuity of the process, as well as making more efficient use of both experts' and knowledge engineers' time.

An important point is that the techniques used with different experts varied. This was partly because of the differing natures of their expertise, but more significantly to give the flexibility to take account of the individual expert's likes and dislikes. For example, after his first session using card sorts one of the experts said that he was not comfortable with them, and so they were not used in subsequent interviews with him.

In the final, most detailed stages of knowledge elicitation, particularly once Programming and Testing had begun, the approach reverted to straight "question and answer," the tape-recorder being dispensed with because of the specific nature of the answers required. The experts were given advance notice of the area(s) to be discussed, to enable them to carry out some research (if necessary) before the interview.

2.4. Knowledge Elicitation: Overall Reflections

Hart lists seven pieces of general advice in her chapter on "Interviewing the experts" (Hart, 1989). These are: be prepared; record information; be specific, not general; empathize with the expert; try not to interrupt; do not impose alien tools; listen to the way the expert uses knowledge.

For the most part these held good here, as may be seen from the preceding subsections, but with one addition and one exception. The addition concerns Hart's advice to empathize, which she describes solely in terms of the methods and representations being used. This is certainly good advice, but in this project it was necessary to go further and actively attempt to build up a rapport between expert and knowledge engineer, in order to "oil the wheels" of the knowledge elicitation process as a whole. This may well have been because the experts were somewhat wary, in view of the partial nature of each individual's expertise compared to the domain as a whole. The exception was to the advice to be specific, not general; it was found advantageous here to keep the interviews as general as possible, at least in the "broad and shallow" phase. As well as being a conscious aim in the BIS KBS method, this was also a result of the sheer breadth of the domain. As already mentioned in section 2.3, the six functions that had been identified originally could be realized with just four modules. Becoming too specific too soon might well have led to this improvement being missed. This is a particular advantage of the BIS KBS method.

FIGURE 2. Part of a solution/characteristic matrix for Tuner. (Rows list candidate solutions: De-normalise, Add index, Remove index, Change index, Re-design program logic, Use different RDBMS. Columns list characteristics: Ease of Change, Cost, Effect, Benefit. The cells are left for the expert to complete.)


Several changes were noted between the initial "broad and shallow" phase, and the later "narrow and deep" phases. In the initial stages the interviews were relatively unstructured, whereas for the "narrow and deep" phases the interviews were tightly structured. An alternative way of describing this change is that there was a move from an interviewee-centered process to one that was centered on the interviewer. This should not be taken to imply that the role of the interviewee (the expert) became less important as knowledge elicitation continued. Rather, it was the control of the knowledge elicitation process that moved from the interviewee (expert) to the interviewer (knowledge engineer). This may be thought of as consistent with a gradual move from concentration on the domain as seen by the expert (at the Feasibility stage) toward concentration on the system as seen by the knowledge engineer (at the Design and Programming stages). In parallel with this, it is noteworthy that (for the most part) the responses given by the experts in the "broad and shallow" phase were much longer than those in the "narrow and deep" phases. Correspondingly, the questions asked by the interviewers increased in length, although this was less clear-cut; all of the early questions were short, as were many of the later ones, but some of the final, detailed questions were very long indeed.

Although many different techniques were used for knowledge elicitation during the project, the process was based on interviews throughout. This was partly because of the nature of the task, but mainly because different experts appeared to work in fundamentally different ways. Such diversity does not preclude the use of techniques that suit particular individual experts, but does rule out the use of noninterview-based approaches such as observation or questionnaires because of the lack of "common ground." Nethercote (1989) describes how a questionnaire-based approach failed in a domain with multiple experts (that of business assistance), and was replaced by one based on interviews.

Apart from Nethercote's paper, the literature contains relatively little advice on working with multiple experts whose knowledge of the domain is partial. Three different "multiple expert" cases have been identified (Scott, Clayton, & Gibson, 1991): experts who work in teams, experts who share the same area of expertise, and experts with different areas of expertise. This project most closely resembled the latter case, whereas the majority of the literature treats the second of the three, in which the problems of combining expertise may be characterised as "filling in gaps." Several systems have been devised to help in this case, such as the expertise transfer systems ETS and AQUINAS (Boose & Bradshaw, 1988). However, the only reported technique for building bridges between different areas of expertise, as was required here, is the use of a Group Decision Support environment (Liou & Nunamaker, 1990). Such facilities were not available for this project. A further point to note is that the advice of Scott et al., "[i]f the experts are not already accustomed to collaborating with each other, the knowledge engineers may need to mediate disputes that arise...", proved an over-simplification. Most of the experts here were accustomed to collaborating, but disputes still arose, and it was beyond the capabilities of the knowledge engineers to resolve them, as will be discussed later.

In the absence of any other tool or technique, the use of an intermediate representation by the knowledge engineers turned out to be essential for the purpose of combining the expertise of the various experts. In essence, this representation was a set of pseudo-rules in English. A feature of the rules in the intermediate representation was that the problem identification rules tended to be simple ones, such as:

IF the data experience unexpected growth
THEN table scan could be a contributory factor to system degradation

whereas the rules for solution identification tended to be complex (see Figure 3).

Disagreements between experts, or the question of which expertise was the most appropriate in a given set of circumstances, formed an important issue in developing the system. The solution used was a variant on the "knowledge czar," "final arbiter," or "lead expert" approach for resolving disagreements (Scott et al., 1991; Turban, 1990). There was no-one whose expertise spanned the entire domain, so no-one could carry out the "pure" knowledge czar role. Consensus methods appeared inappropriate because of the diversity of experts and expertise involved. However, one of the experts involved in the earlier stages was later appointed as manager of the project team. He then undertook to resolve all conflicts in expertise in one of two ways: either by taking the decision as the knowledge czar if the query was in his sphere of expertise, or by deciding which of the experts to "believe" if it was not.

3. KNOWLEDGE REPRESENTATION AND THE SYSTEM STRUCTURE

3.1. Nature of the Expertise in the Domain

During the "broad and shallow" phase of the knowl- edge elicitation process, the experts were found to ex- press their knowledge in terms of difference. Those conversant with a number of RDBMSs were often un- able to say what a particular RDBMS could do, but could easily recall what it could not do. The experts appeared to think in terms of an all-embracing RDBMS and to express their knowledge of a specific RDBMS in terms of differences from their conceived norm. A "generic and variants" model was therefore used to structure the knowledge, to reflect this "difference" ap- proach.


SA2 {Change Index Field by Order}
{This is used where all fields are accommodated within existing indexes but cannot be accessed
because of the order used in the index. Only suitable where the existing indexes are changeable.}
DETERMINE index requirement for table {i.e. Required Fields}
FOR each existing index on table
    IF index NOT have referential integrity defined
    AND IF index NOT unique
    AND IF index NOT clustered
    AND IF index NOT primary
    AND IF index order NOT important {see SG2}
    THEN
        IF index contains one or more Required Fields
        THEN index is Changeable
IF all Required Fields are contained within one or more Changeable Index {only Change Order is possible}
THEN FOR each index which is Changeable
    IF index contains all Required Fields
    THEN ORDER Required Fields with Incorporated Flag FALSE so that fields with the smallest range are at the front
         SET Changed Index to TRUE
IF Changed Index is FALSE {default}
THEN
    IF Required Field is Lead Field in Changeable Index {or after one or more leading Fields}
    THEN SET Required Field Incorporated Flag to TRUE
    FOR each Changeable Index which contains Required Fields with Incorporated Flag FALSE
        MARK as Order Changeable
    REPEAT
        SELECT all Order Changeable Indexes with the most Required Fields with Incorporated Flag FALSE
        IF two or more Order Changeable Indexes selected
        THEN SELECT all Order Changeable Indexes with the least number of total fields
        IF two or more Order Changeable Indexes selected
        THEN DETERMINE average priority of all transactions using Order Changeable Indexes
             SELECT Order Changeable Index with lowest priority
        ORDER Required Fields so that fields with smallest range to the front
        SET Required Field Incorporated Flag to TRUE
    UNTIL no Required Fields have Incorporated Flag set to FALSE
    SET Changed Index to TRUE

FIGURE 3. One of the rules for solution identification.

Knowledge about RDBMSs is conceived as occurring on several different levels or layers, as shown in Figure 4. The first level is "generic knowledge" in that it relates to RDBMSs in general, but does not represent any specific RDBMS that is found in the real world. This layer is used as the default knowledge in the absence of anything more specific. The next layer represents the overall specification of a real RDBMS such as ORACLE, DB2, or INGRES and is "generated" from the generic layer by deleting generic attributes that do not exist in the particular RDBMS concerned and adding any attributes peculiar to that RDBMS that are not included in the generic model.

In common with other computer software, RDBMSs are dynamic. Thus "enhanced" versions are released from time to time, so that there may be a number of different versions or releases in commercial use at any one time. By applying knowledge of the differences between versions in the same manner as before, another layer modelling a specific version of the RDBMS is generated. Two further layers represent the influence of the operating system (ORACLE under VM may not be the same as ORACLE under UNIX), and of different operating system versions.

Finally, a further layer of knowledge may be necessary to produce a working model for a specific installation, in order to cater for constraints determined by the installation. For example, it may be a policy not to use hashing algorithms in any applications.

FIGURE 4. Layered structure of domain knowledge. (Layers, from top to bottom: DEFAULT; RDBMS; RDBMS VERSION; OPERATING ENVIRONMENT; OE VERSION; USER CONSTRAINTS; COMPOSITE; WORKSPACE.)
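A minimal sketch of this layering idea (in Python, for illustration only; the attribute names and values are assumed, not taken from the real knowledge bases): each layer below DEFAULT is expressed as a difference, deleting generic attributes that the more specific level lacks and adding attributes peculiar to it, and the same mechanism carries through versions, operating environments, and installation constraints:

# Illustrative sketch of the "generic and variants" layering: each layer
# is a set of differences applied to the layer above it.
def apply_layer(model, deletions=(), additions=None):
    """Apply one layer's differences, returning the more specific model."""
    specific = {k: v for k, v in model.items() if k not in deletions}
    specific.update(additions or {})
    return specific

generic = {"hashing": True, "clustering": True, "max_index_fields": 16}    # DEFAULT

rdbms = apply_layer(generic, deletions=["clustering"],                     # RDBMS
                    additions={"bitmap_indexes": False})
version = apply_layer(rdbms, additions={"max_index_fields": 32})           # RDBMS VERSION
installation = apply_layer(version, additions={"hashing": False})          # USER CONSTRAINTS
# e.g. an installation policy of not using hashing algorithms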

3.2. System Structure

The system is structured around four "knowledge models":
• The DBA Functions model, which describes the main functions of the system. Note the DBA label here, even though it also includes the functions of designers, since it is the (human) DBA who has overall responsibility for the database.
• The Structural model, which contains the static knowledge about data bases in general and relational data bases in particular.
• The Environment model, which contains information about the physical hardware and operating environment.
• The Strategy model, which contains the strategy information that controls the problem-solving processes.

Each of these four models follows the layered structure. The content of the layers is briefly outlined in Figure 5. The "narrow and deep" parts correspond to the DBA Functions model and the "jobs" within the Strategy model that carry out tasks within those functions. The knowledge for the remainder of the Strategy model, and for the Structural and Environment models, was mainly elicited during the "broad and shallow" phase. This refers back to Figure 1, where it may be seen that the knowledge in the final system does not all come from the later "narrow and deep" phase of the BIS KBS method.

Note that the models that have been briefly outlined here are logical models, produced originally as part of the intermediate representation. The link to the final physical representation was not, however, always straightforward, for two reasons:
• a solely rule-based representation was inadequate;
• the knowledge for each module corresponding to one of the functions of the system had to be merged with other knowledge from the "broad and shallow" phase.

In practice, the physical design did not have an exact mapping of one knowledge base for each model and layer combination. For example, the DBA Functions model default layer had one knowledge base for each of the four modules from the "narrow and deep" phase, while the composite layer consisted of one knowledge base for all four models. The final representation was based on objects, in order to capture the layered structure and to facilitate the inclusion of algorithmic components in the form of methods.

The system has been written using the Nexpert Object package running under Microsoft's Windows on a PC, with additional routines and user interfaces in C and C++. Nexpert Object proved to be particularly suited to representing the layered structure of the knowledge as described above, because it allows classes in different knowledge-bases to be given the same names. When the knowledge-bases for each layer are loaded top down in the sequence shown in Figure 4, the values for later layers overwrite those of earlier layers as appropriate. Considerable use was still made of rules, however, in the final system, together with the use of Nexpert's "Hypotheses" to control the inferencing process.
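The override behaviour just described can be pictured with a small sketch (in Python rather than Nexpert Object, and with assumed attribute values): the knowledge bases for the layers of Figure 4 are applied top down, and a value set by a later, more specific layer simply overwrites the value from an earlier layer, yielding the composite:

# Illustrative sketch: loading the layers of Figure 4 top down, with later
# layers overwriting earlier ones to produce the composite.
layers = [
    ("DEFAULT",               {"hashing": True, "max_index_fields": 16}),
    ("RDBMS",                 {"bitmap_indexes": False}),
    ("RDBMS VERSION",         {"max_index_fields": 32}),
    ("OPERATING ENVIRONMENT", {}),
    ("OE VERSION",            {}),
    ("USER CONSTRAINTS",      {"hashing": False}),   # installation policy
]

composite = {}
for _layer_name, values in layers:
    composite.update(values)   # later layers overwrite earlier values

# composite is now {"hashing": False, "max_index_fields": 32, "bitmap_indexes": False}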

FIGURE 5. The layers for each of the knowledge models. (Rows are layers and columns are the four models. Default layer: Designer, Tuner, etc.; BDM etc.; Peripheral definitions etc.; Inference structures etc. Variant layers: Database & OS variations; Database & OS variations; Manufacturer & OS variations; Database & OS variations. User layer: User functions; Target configuration; User limits & options; User strategies. Composite layer: Consolidated Knowledge Base. Workspace layer: Workspace for KBs; Application data; Environmental data; Active strategies.)

The system produced is called MATURED (Monitoring and Tuning of Relational Databases), which addresses all aspects from the physical design of a system to its operation in a real working environment. It consists of the four modules Designer, Tuner, Impact Analyzer, and Monitor. Designer performs two main tasks. The first is to take a logical design and the operational constraints relating to a particular environment, to produce a legal physical design for the chosen relational data base. The second is to refine a physical design so that it meets a specified set of performance requirements. Tuner also performs two main tasks: determining the cause of a performance problem occurring when a data base system is operating, and selecting a solution or solutions to improve the system's performance. Impact Analyzer evaluates the knock-on effects of proposed changes to the physical design of a data base system, and indicates when a complete re-tune is needed. Monitor warns the user of unanticipated changes in the performance statistics of a data base system in operation, which may be manifestations of either current or future performance problems. The four modules may be used either separately or in combination; there are close logical links between Monitor and Tuner, and between Tuner, Impact Analyzer, and Designer's refining function. All four modules have been used by the industrial collaborator's systems department. The only extra work necessary was that a small routine had to be written to supply the necessary database statistics to MATURED in a suitable format for Monitor and Tuner. Designer has proved particularly popular with the users; indeed, they were eager to use the original prototype after seeing it demonstrated, even though it only covered the ORACLE RDBMS, whereas they use DB2! Comparisons of Designer's output with actual data base designs are currently being undertaken.

4. SUMMARY

This paper has discussed knowledge elicitation and knowledge representation for a large system involving multiple experts with only partial knowledge of the domain.

The main themes in knowledge elicitation were:
• The use of a variety of techniques, tailored to suit the individual expert's needs and preferences, forming part of the process of building up a rapport with each expert.
• The use of methods based on interviews throughout, with a progression from unstructured to structured interviews, and from an interviewee-centred process to an interviewer-centred one, particularly in order to counter the experts' tendency to move straight from symptoms to solutions without explicitly identifying the problem(s).
• The use of an intermediate representation (in this case pseudo-rules in a form of structured English) in order to combine the expertise from different experts.

The main themes in knowledge representation were:
• The use of a layered "default and variants" structure for the logical design of the knowledge-base.
• A further decomposition into Functions, Structural, Environment, and Strategy models.
• The use of an object-orientated physical design of the knowledge-base, because of the above structure and to facilitate the inclusion of conventional algorithmic components.
• The use of rules and hypotheses to control the inference process, at both the strategic and detailed levels.

The principal lesson from the development as a whole is that to cope with a large domain and multiple experts it was necessary to maintain flexibility. Knowledge elicitation methods and techniques had to suit the individual experts; intermediate representations and logical designs had to suit the knowledge engineers and the knowledge czar; and the final design had to match the programming facilities available.

REFERENCES

Bader, J.L., Edwards, J.S., Harris-Jones, C., & Hannaford, D. (1988). Practical engineering of knowledge-based systems. Information & Software Technology, 30, 266-277.

Boehm, B.W. (1976). Software engineering. IEEE Transactions on Computers, C-25(12), 1226-1241.

Boose, J.H., & Bradshaw, J.M. (1988). Expertise transfer and complex problems: Using AQUINAS as a knowledge-acquisition workbench for knowledge-based systems. In J.H. Boose & B.R. Gaines (Eds.), Knowledge acquisition tools for expert systems (Knowledge-Based Systems, Vol. 2, pp. 39-64). London: Academic Press.

Date, C.J. (1986). Relational database. Selected writings. Reading, MA: Addison-Wesley.

Edwards, J.S. (1991). Building knowledge-based systems: Towards a methodology. London: Pitman; New York: Halsted Press (John Wiley).

Gammack, J. (1987). Different techniques and different aspects on declarative knowledge. In A.L. Kidd (Ed.), Knowledge acquisition for expert systems: A practical handbook (1st ed., pp. 137-163). New York: Plenum.

Hart, A. (1987). Role of induction in knowledge elicitation. In A.L. Kidd (Ed.), Knowledge acquisition for expert systems: A practical handbook (1st ed., pp. 165-189). New York: Plenum.

Hart, A. (1989). Knowledge acquisition for expert systems, (2nd ed.). London: Kogan Page.

Hickman, F.R., Killin, J.L., Land, L., Mulhall, T., Porter, D., & Taylor, R.M. (1989). Analysis for knowledge-based systems: A practical guide to the KADS methodology. New York: Halsted Press (John Wiley).

Johnson, P.E. (1983). What kind of expert should a system be? Journal of Medicine and Philosophy, 8, 77-97.

Kelly, G.A. (1955). The psychology of personal constructs. New York: Norton.

Liou, Y.I., & Nunamaker, J.F. (1990, January). Using a group decision support system environment for knowledge acquisition: A field study. In Proceedings of the 23rd Annual Hawaii International Conference on System Sciences (Vol. 3, pp. 40-49), Kona, Hawaii. Los Alamitos, CA: IEEE Computer Society Press.

Neale, I.M. (1988). First generation expert systems: A review of knowledge acquisition methodologies. Knowledge Engineering Review, 3, 105-145.

Nethercote, S. (1989). The development of Victorian Business Assistance Referral System (VBARS). In J.R. Quinlan (Ed.), Applications of expert systems (Vol. 2, pp. 182-203). Reading, MA: Turing Institute/Addison-Wesley.

Scott, A.C., Clayton, J.E., & Gibson, E.L. (1991). A practical guide to knowledge acquisition. Reading, MA: Addison-Wesley.

Turban, E. (1990). Decision support and expert systems: Management support systems (2nd ed.). New York: Macmillan.