
Integrating Information from Disparate Contexts:
A Theory of Semantic Interoperability

Jacob Lee

CISL WP# 96-02
March 1996

The Sloan School of Management
Massachusetts Institute of Technology

Cambridge, MA 02142

Integrating Information from Disparate Contexts:
A Theory of Semantic Interoperability

by

Jacob L. Lee

Submitted to the Alfred P. Sloan School of Management in March 1996 in partial fulfillment of

the requirements for the degree of Doctor of Philosophy in Management

ABSTRACT

Current technology (e.g. the Internet) has provided vast opportunities to link many

information sources (e.g. databases) to many information receivers (e.g. users and applications).

This link, however, provides mainly for physical interoperability. Major difficulties still remain

in achieving semantic interoperability. Semantic interoperability means that sources

and receivers can exchange information in a meaningful manner. Ensuring semantic

interoperability is difficult when sources and receivers have different contexts. That is, receivers

have preferences, goals and assumptions about how to ask questions and interpret answers.

Sources, which have been independently created and maintained, can have a different set of

underlying assumptions, which are reflected in the way their information is organized and

presented. Receivers may therefore encounter serious difficulties when interacting with the

variety and multiplicity of distributed, autonomous sources.

In general terms, this thesis is concerned with the development of intelligent systems that

will facilitate semantic interoperability between sources and receivers, without violating their

contexts, allowing them to continue functioning "as usual". More specifically, this thesis has three

components. The first component draws upon the philosophical disciplines of Semantics and

Ontology for insights into the nature of semantic interoperability. In the second component,

these insights are integrated into a formal definition of a semantically interoperable system based

on the Context Interchange Architecture. This definition is an abstract specification which

represents a conceptualization of semantic interoperability. It is a formal statement of what a

Context Interchange system should deliver and provides a rigorous basis for an implementation.

In the third and final part of this thesis, a system, which realizes this specification, is developed to

serve as a proof of concept and to illustrate various aspects of the theory developed in this thesis.

Thesis Committee: Professor Stuart E. Madnick (Chair)

Dr. Michael D. Siegel

Professor Yair Wand

Table of Contents

Abstract

1 Introduction and Overview
   1.1 Semantic Interoperability
   1.2 The Context Interchange Approach
   1.3 Thesis Overview
      1.3.1 Ontological and Semantical Foundations
      1.3.2 Formal Specification of Context Interchange
      1.3.3 Proof of Concept
   1.4 Thesis Organization

2 Literature Review
   2.1 Obstacles to Semantic Interoperability
   2.2 Approaches to Interoperability
      2.2.1 Schema Integration
      2.2.2 A Loosely Coupled Approach
      2.2.3 Implementations that use Ontologies
   2.3 The Logic of Contexts
      2.3.1 LOC and BULLION
      2.3.2 LOC is a Formal Language
      2.3.3 LOC vs. FOL

3 Semantics and Ontology: Philosophical Foundations for Context Interchange
   3.1 Semantics and Semantic Interoperability
   3.2 Ontology and Semantic Interoperability
   3.3 Defining the Shared Ontology
      3.3.1 Defining Predicates and Arguments
      3.3.2 Defining the Deductive Relationships among Propositions
   3.4 Context Definition Rules and Integrity Constraints
   3.5 Simplifying the task of Domain Definition

4 BULLION: A Proof Theoretic Specification of Context Interchange
   4.1 The Knowledge Level: Background and Motivation
   4.2 Features and Benefits of the Specification
   4.3 The Logical View of Databases: A Review
      4.3.1 First Order Languages
      4.3.2 Relational Databases: The Model Theoretic View
      4.3.3 Relational Databases: The Proof Theoretic View
   4.4 A Logical View of a Context
   4.5 Adding the Shared Ontology and Context Definition Rules
   4.6 Discussion

5 The BULLION Prototype: Proof of Concept I
   5.1 Context Interchange for Example A
      5.1.1 Sources and Receivers
      5.1.2 The Shared Ontology and Context Definitions
      5.1.3 Sample Queries and Answers
   5.2 Context Interchange for Example B
      5.2.1 Sources and Receivers
      5.2.2 The Shared Ontology and Context Definitions
      5.2.3 Sample Queries and Answers
   5.3 Context Interchange for Example C
      5.3.1 Sources and Receivers
      5.3.2 The Shared Ontology
      5.3.3 Context Definitions
      5.3.4 Sample Queries and Answers
   5.4 Where did the Data come from? - The Logic Behind the Logic
   5.5 What does the Data mean? - Context Explication
   5.7 Discussion

6 The BULLION Prototype: Proof of Concept II
   6.1 Refinement 1
   6.2 Refinement 2
   6.3 Discussion

7 Logic as an Implementation Paradigm
   7.1 Prolog and Relational Databases
   7.2 Prolog and Semantic Query Optimization
   7.3 Prolog and Looping
   7.4 Other Logic Programming Languages
   7.5 The Logic of Contexts Revisited

8 Conclusions and Future Work
   8.1 Thesis Summary
   8.2 Future Work

References

1 Introduction and Overview

Current technology (e.g. the Internet, client-server systems) has provided vast

opportunities to link many distributed and autonomous information sources (e.g. relational

databases, legacy systems etc.) to many information receivers (e.g. applications, users etc.). This

link, however, provides mainly for physical interoperability. Physical interoperability is primarily

concerned with the connectivity of heterogeneous hardware and software platforms, different

underlying communication protocols etc. There remain, however, major difficulties in achieving

semantic interoperability.

Semantic interoperability means that sources and receivers can interact in a meaningful

manner. Ensuring semantic interoperability is difficult when sources and receivers have different

contexts. That is, receivers have preferences, goals and underlying assumptions about how to ask

questions and interpret data. Sources, which have been independently created and maintained,

also have assumptions which may be implicit and which may not be consistent with those of the

receiver. As a simple example, a source may, implicitly, represent financial information of

companies in terms of US dollars, while a receiver assumes these amounts to be in Japanese yen.

Furthermore, a receiver might, tacitly, assume that the source contains financial information

pertaining to all and only companies being traded on the New York Stock Exchange. This might

not be the case. In such a situation, the statistics computed from the source might not be

appropriate for the receiver. Other real world examples of similar problems are described in [40,

56].

Hidden assumptions arise in a particular context because of the need to efficiently store

and communicate information. Usually, there is no need to explicitly state what is generally

assumed to be true by all concerned within a particular context, and source-receiver interactions

do not suffer from conflicting assumptions. However, as businesses increasingly see the need to

integrate information from various autonomous and distributed sources to facilitate business

decisions and operations, the problem of hidden and conflicting assumptions quickly becomes a

serious problem. This situation is aptly summarized by Ventrone and Heiler:

"In current database systems, most of the data semantics reside in the applications rather

than in the DBMS. Moreover, data semantics are often not represented directly in the application

code, but rather in the assumptions which the application -- or, more correctly, the programmer --

makes about the data. This situation is tolerated in local database environments largely because

the local applications work with a shared set of assumptions. However, serious problems are

likely to occur during a database integration...effort because sets of local assumptions clash and

local applications do not have access to the semantics represented in "foreign"

applications...When semantic information that is hidden in applications is made explicit and

accessible through the database, then the semantic problem becomes...much more tractable..." [71].

To exacerbate matters, over time, the goals and assumptions of sources and receivers can

change. Furthermore, new sources and receivers can enter an existing federation while existing

sources and receivers may leave. Receivers may therefore encounter serious difficulties when

interacting with the variety and multiplicity of distributed, autonomous sources. Thus, the

increased access to and the proliferation of information resources are both a boon and a bane to

decision makers. They are a boon because of the easy availability of information required for decision

making, and a bane because decision makers must expend non-trivial cognitive effort to

filter out irrelevant information and to make sense of the relevant information that

remains. In the case where receivers are software programs, a non-trivial amount of maintenance

might be required for these programs to maintain semantic interoperability with sources in a

large scale and dynamic federation. Section 1.1 more concretely describes what we mean by

semantic interoperability between sources and receivers.

1.1 Semantic Interoperability

Consider Example A where two sources are shown in Fig. 1.1 and Fig. 1.2. Source 1

contains a set of propositions about the names of companies and the cities in which their head

offices are located. Source 2 contains a set of propositions on the names of companies and the

countries in which they are incorporated. A receiver (Receiver 1) has a view with a schema

r1_country_of_incorporation(Company_name, Country_name). This means that the

receiver sees a virtual table against which it can issue queries. Furthermore, there is a domain

associated with each column in the schemas of the sources and receiver. In this example, the

domains of Company_name for Source 1, Source 2 and Receiver 1 are {c1, c2, c3, c4}, {c5, c6}

and {c2, c3, c4, c5} respectively. Note that the domain of Company_name in the receiver's

schema can, and in this case does, differ from that of the sources. Besides domain constraints,

there can be other integrity constraints associated with a source schema and a receiver schema.

For the purpose of this example, however, we shall consider only domain constraints.

Now suppose that for all the companies X that the sources and receiver are concerned

with (i.e. c1 to c6), it is generally known, or agreed upon, that if the head office of X is located in

a city Y, and if city Y is in country Z, then the country of incorporation of company X is country Z.

Knowledge of which cities are in which countries is also available. This knowledge is shown in

Fig. 1.3 and can be used to convert information in Source 1 to information meaningful to the

s1_head_office

Company_name    City_name
c1              ny
c2              tokyo
c3              london
c4              chicago

Fig. 1.1 Source 1

s2_country_of_incorporation

Company_name    Country_name
c5              usa
c6              japan

Fig. 1.2 Source 2

receiver. Note that throughout this thesis, arguments with uppercase first letters represent

variables, otherwise, they represent constants. Thus, ideally, the receiver's view (Fig. 1.4) can be

populated with facts derived from both sources and the knowledge in Fig. 1.3. The companies c1

and c6 have been excluded from the receiver's view because they are not within the receiver's domain

of interest. The receiver may then issue the following SQL query against this view:

select Company_name
from r1_country_of_incorporation
where Country_name = "usa"

The answer returned should, ideally, be <c4> and <c5>, which is consistent with the receiver

context. Observe that the companies named c1 and c6 were correctly excluded from the answer

without requiring the receiver to make the domain restriction explicit in the query. This simple

example illustrates the desired goal of achieving meaningful information exchange while

preserving the autonomy of receiver and source alike. That is, the receiver should be allowed to

issue queries and be presented with appropriate answers in terms it expects. Furthermore, a

receiver should not be required to explicitly state its assumptions that are normally 'taken for

granted' in its local environment within a query. In this particular example, domain assumptions

are taken for granted by the receiver. This is common practice because domain assumptions, like

most context assumptions, tend to be stable over time, and explicating them within a query may

require non-trivial effort. For example, the receiver should not be required to issue a query in

which domain assumptions are explicated i.e.:

select Company_name
from r1_country_of_incorporation
where Country_name = "usa"
and (Company_name = "c2"
     or Company_name = "c3"
     or Company_name = "c4"
     or Company_name = "c5")

∀Company_name, City_name, Country_name
    head_office(Company_name, City_name) ∧ located_in(City_name, Country_name) ⇒
        country_of_incorporation(Company_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

Fig. 1.3 General Domain Knowledge

r1_country_of_incorporation

Company_name    Country_name
c2              japan
c3              uk
c4              usa
c5              usa

Fig. 1.4 View of Receiver 1
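To make the mediation in Example A concrete, the following minimal Prolog sketch shows one possible encoding of the sources, the general domain knowledge of Fig. 1.3 and the receiver context; the predicate names and layout are illustrative assumptions, not the exact encoding used in the BULLION prototype of Chapter 5.

    % Source relations (Figs. 1.1 and 1.2).
    s1_head_office(c1, ny).
    s1_head_office(c2, tokyo).
    s1_head_office(c3, london).
    s1_head_office(c4, chicago).
    s2_country_of_incorporation(c5, usa).
    s2_country_of_incorporation(c6, japan).

    % Shared ontology: the general domain knowledge of Fig. 1.3.
    located_in(ny, usa).
    located_in(tokyo, japan).
    located_in(london, uk).
    located_in(chicago, usa).
    country_of_incorporation(C, Z) :-
        s1_head_office(C, Y), located_in(Y, Z).
    country_of_incorporation(C, Z) :-
        s2_country_of_incorporation(C, Z).

    % Receiver 1's context, stated once, outside the confines of any query.
    r1_domain(c2).  r1_domain(c3).  r1_domain(c4).  r1_domain(c5).

    % The receiver's mediated view.
    r1_country_of_incorporation(C, Z) :-
        country_of_incorporation(C, Z), r1_domain(C).

The query ?- r1_country_of_incorporation(C, usa). then yields C = c4 and C = c5, excluding c1 and c6 without the receiver restating its domain assumptions.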

This does not mean, however, that the receiver need not state its context at all. Rather, we

are advocating that a receiver be allowed to state its context outside the confines of a query. As

the context tends to be stable over time, it can be reused to appropriately process queries during

the period in which the context applies. The receiver need not restate its context each time it

issues a query. Consequently, a consistent and convenient user interface is presented to the

receiver regardless of the source. Supporting receiver autonomy is the key to minimizing the

cognitive effort required by decision makers to interact with sources of data that are

independently created and maintained. And in the case where receivers are computer programs,

receiver autonomy means that the need to rewrite such programs as a result of changes to sources

is minimized, if not eliminated.

This simple example illustrates the fact that hidden assumptions can take the form of

constraints on explicit attributes. However, hidden assumptions can also take the form of constraints

on implicit attributes. As a result, it might not be obvious what information is being represented

by the source, or what information is acceptable to a receiver. This makes achieving semantic

interoperability even more difficult. To illustrate this problem, consider Example B, in which the

relation s3_closing_price of Source 3 is shown in Fig. 1.5.

s3_closing_price

Stock    Price      Exchange
stk1     55.00      nyse
stk2     40.00      nyse
stk3     6330.00    tokyo
stk4     6531.00    tokyo

Fig. 1.5 Source 3

It contains information only on stocks traded on the New York and Tokyo stock exchanges, and

their closing prices for a particular date (say t1). The context of Source 3 restricts propositions

about the closing prices such that the currency of the closing prices of stocks traded on the nyse

and tokyo exchanges must be in US Dollars (usd) and Japanese Yen (yen) respectively. These

date and currency assumptions are "understood" in the local environment, but are not explicit in

the relation. This problem has been termed by Kent [41] as domain mismatch and which he claims

to be one of the important problems in semantic integration . Note the complexity of this

problem for Source 3. Not only are the currency values implicit, but they are not the same for all

tuples in the same relation! Thus, there is heterogeneity even for a single attribute!

Now consider Receiver 2, with a different context, whose schema is

r2_closing_price(Stock, Price, Exchange). The receiver is also interested in the closing

prices only of stocks that are traded on the New York and Tokyo stock exchanges at time t1.

However, the receiver expects the currency of all closing prices to be in usd. Let us assume that

Source 3 is used to populate the receiver view. Now, suppose the receiver issued the following

query

select Stock, Price
from r2_closing_price
where Price > 50.00.

Assuming, for simplicity, that the conversion rate is 1 US dollar to 100 yen, the answer derived

from the source should be, ideally, <stk1, 55.00>, <stk3, 63.30> and <stk4, 65.31>.

These examples illustrate what we mean by semantic interoperability between sources and

receivers.
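As with Example A, a minimal Prolog sketch suggests how Example B might be encoded; the predicate names and the flat exchange rate are assumptions made for illustration.

    % Source 3 (Fig. 1.5).
    s3_closing_price(stk1, 55.00, nyse).
    s3_closing_price(stk2, 40.00, nyse).
    s3_closing_price(stk3, 6330.00, tokyo).
    s3_closing_price(stk4, 6531.00, tokyo).

    % Source 3's context: the implicit currency depends on the exchange.
    s3_currency(nyse, usd).
    s3_currency(tokyo, yen).

    % Shared ontology: conversion knowledge (1 usd = 100 yen, as assumed).
    convert(Amount, usd, usd, Amount).
    convert(Amount, yen, usd, Usd) :- Usd is Amount / 100.

    % Receiver 2's view: all prices in usd.
    r2_closing_price(Stock, PriceUsd, Exchange) :-
        s3_closing_price(Stock, Price, Exchange),
        s3_currency(Exchange, Currency),
        convert(Price, Currency, usd, PriceUsd).

The query ?- r2_closing_price(S, P, E), P > 50.00. then yields stk1 at 55.00, stk3 at 63.30 and stk4 at 65.31, as expected in the receiver's context.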

1.2 The Context Interchange Approach

The goal of the Context Interchange Approach [56, 68] is to achieve semantic

interoperability among sources and receivers. In this thesis, we focus exclusively on structured

data (e.g. relational databases). In order to better understand the Context Interchange Approach,

we shall first consider some alternative strategies which seem relatively "straightforward". One

such strategy is to insist that all sources and receivers adopt a standardized means of

representing and retrieving information. Clearly, a universal standard can result in semantic

interoperability. But achieving such a standard is difficult, if not impossible. This would mean

that existing sources and receivers must migrate to this standard, which is a potentially costly

process. Even a change due to the introduction of a single new European currency unit, the ECU,

can be expensive [30]. New systems must be designed according to this standard, reducing

flexibility and imposing more constraints on the design process. Furthermore, a single

standardized representation may not be suitable for all situations or for all concerned. For

example, it may not be practical for a geographically distributed information system to adopt a

single currency for monetary amounts. It may be more convenient to represent money amounts

in terms of Yen for information systems in Japan and in US dollars for systems in the USA.

Therefore, one of the constraints imposed on Context Interchange systems is that the autonomy of

sources and receivers must be preserved.

Could semantic interoperability be achieved, while preserving autonomy, by

constructing translation procedures between sources and receivers? Certainly, if these translation

procedures can be constructed, semantic interoperability can be achieved. Unfortunately, if we

have M sources and N receivers, this might result in M x N sets of translation procedures. This

strategy breaks down when M and N approach large numbers. Furthermore, over time, sources

might modify the way they represent data and receivers might change the format in which they

expect to receive data. This means modifications must be made to existing translation

procedures. And as new sources and receivers are introduced, new translation procedures need

to be constructed. For each new receiver, potentially M sets of translations procedures are

required. Thus, another constraint imposed on Context Interchange systems is that it must be as

scalable andflexible as possible. By scalability, we mean that a Context Interchange system must

be able to accommodate a large number of sources and receivers. By flexibility, we refer to the

ability to easily cope with changes and evolution within a dynamic federation.

The Context Interchange Architecture (Fig. 1.6), which we propose, satisfies these constraints in

two ways. First, sources and receivers are required to explicitly declare their contexts, which

include otherwise implicit assumptions (e.g. domains and currencies), with respect to a shared

ontology. Second, reusable "general knowledge" is specified at the global level, in the shared

ontology, as opposed to being encoded in local customized translation procedures that are

particular to a source-receiver pair.

Fig. 1.6 Context Interchange Architecture (sources and receivers, with their declared contexts, connected through the context mediator, which consults the shared ontology)

The shared ontology represents a shared agreement about a

particular domain of interest and forms the basis for the exchange of information among sources

and receivers with different contexts. Without a commitment to a shared ontology, information

exchange would be impossible. Knowledge in the shared ontology will be used by the context

mediator to identify relationships between the source contexts and receiver contexts. For example,

the shared ontology might contain knowledge of currency conversion rates (e.g. from US dollars

to Japanese Yen). Furthermore, since the underlying assumptions of sources and receivers are

explicitly represented, the appropriate conversion rules are automatically selected by the context

mediator to transform source data to appropriate answers for a receiver query. This conversion

knowledge need only be specified once but may be used and re-used to automatically select

conversion procedures for various sources and receivers. This reduces the effort to manually

construct and maintain translation procedures between sources and receivers. The explicit

declaration of contexts and the automatic selection of conversion procedures are, in fact, two key

distinguishing features of the Context Interchange Approach.

Preserving receiver autonomy means that a receiver issues queries as usual even

though sources and their contexts might change. The receiver is shielded from the details of the

sources used and the transformations that take place "under the hood" in delivering an answer to

a query. Therefore, a third key distinguishing feature of the Context Interchange Approach is the

ability for the system to provide an explanation of the answers delivered to the user. This

explanation should describe

1. What the original data is,

2. Where it came from (i.e. which sources) and

3. How it was transformed to the final answer.

This feature is particularly important because the receiver might be interacting with a wide

variety of unfamiliar sources. Knowing which sources the original data comes from can help

determine the reliability of the answers. Furthermore, decision makers would feel more

comfortable if they were familiar with the assumptions and rules that were used to convert the

original data to the final answer. This capability is analogous to the explanation features which

are deemed to be an integral part of expert systems. Concern about where the original data comes

from motivated the development of a source tagging theory [77, 78]. Here, information retrieved is

"tagged" with the identifiers of original and intermediate sources used to derive the final answer.

However, the user is unable to ask questions about the original data itself or the transformation

rules used to obtain the final answer.

A fourth distinguishing feature of the Context Interchange Approach is that it should

allow users to browse or ask questions about the context of particular sources. This feature is

termed context explication. Context explication enables users to understand the underlying

assumptions of particular sources and to determine the suitability of a source for a particular

application. Once again, this is a very useful feature given the variety of unfamiliar sources a

user might potentially interact with.

In this thesis, a theory of semantic interoperability for the Context Interchange

Architecture is proposed. This model, which applies to multiple sources and receivers, will be

referred to as BULLION, which stands for the BUnge-Lee's Logic of IntegratiON. This theory

draws upon Bunge's Semantics and Ontology [9, 10, 11, 12] and First-Order Logic as its

foundations. We will show how this theory of semantic interoperability realizes the goal of

achieving a high level of semantic interoperability while preserving autonomy, scalability and

flexibility, within an elegant and general theoretical framework.

1.3 Thesis Overview

This thesis is divided into three parts. Each part of the thesis is aimed at achieving a set

of goals. A summary of the thesis goals is shown in Table 1. We will now briefly discuss each of

these parts in turn.

Part 1    Ontological and Semantical Foundations
1a        Identify the proposition as a basic unit of exchange.
1b        Explicate the notion of a context.
1c        Explicate the notion of an ontology.
1d        Define conversion as deduction.
1e        Develop a framework for defining a shared ontology.

Part 2    Formal Specification of Context Interchange
2a        Integrate the notions of context, ontology, proposition and conversion into a proof theoretic specification of a Context Interchange system.
2b        Define what a "mediated" answer to a query is.

Part 3    Proof of Concept
3         Demonstrate proof of concept by means of a series of Prolog programs.

Table 1 Summary of Thesis Goals

1.3.1 Ontological and Semantical Foundations

The theory developed in this thesis proposes an approach for the formalization of semantic

interoperability within the framework of the Context Interchange Approach. Such a

formalization facilitates the identification and representation of the knowledge required to

instantiate a Context Interchange system. In the first part of this thesis, we draw upon Semantics

[2, 9, 10] and Ontology [11, 12] for insights into the nature of semantic interoperability. From

these disciplines, we

la. Identify the proposition as a basic unit of exchange.

1b. Explicate the notion of a context.

1c. Explicate the notion of an ontology.

1d. Define conversion as deduction.

1e. Develop a framework for defining a shared ontology.

A basic premise of this thesis is that ultimately, the basic unit of information exchange is

an "elementary" proposition. More specifically, propositions about a domain of interest. A source

context defines the set of propositions that are expressible by the source, while the receiver

context defines the set of propositions that are acceptable to the receiver. A context is a set of

constraints on implicit attributes as well as regular attributes. Thus, traditional integrity

constraints are subsumed under this definition. This conceptualization is particularly useful for

both source and receiver. As we shall see, a source context definition can facilitate semantic

query optimization, while a receiver context definition allows the user to specify assumptions

about implicit as well as regular attributes.

A shared ontology is a shared agreement that defines the global set of propositions for

information exchange and the deductive relationships among these propositions. Finally,

conversions are construed as a chain of deductions. Such a conceptualization allows complex

conversion procedures to be constructed automatically from a set of simpler deductive rules in

the shared ontology. For example, a conversion procedure might comprise a schema translation,

multiple currency conversions, and other arithmetic operations. The ability to automatically

construct complex conversion procedures reduces the need for labor intensive effort. Knowledge

in the shared ontology can also be used to define the context of sources and receivers. This

promotes further re-use of shared knowledge. Finally, we present a framework for defining a

shared ontology.

Each of these ideas is relatively novel with respect to some other integration approaches

(more in Chapter 2). Yet by themselves, they do not appear to offer any significant additional

insight for achieving semantic interoperability. But together, this suite of ideas coalesces into an

elegant conceptualization of semantic interoperability, which results in systems that can achieve

the goals of Context Interchange just described. For this reason, the application of these ideas to

Context Interchange is considered to be the core contribution of this thesis.

1.3.2 Formal Specification of Context Interchange

The second part of this thesis is concerned with integrating these abstract and disparate

concepts into a coherent, formal specification of a Context Interchange System (i.e. BULLION).

A formal specification of Context Interchange is motivated by criticisms raised by a number of

leading academics. For example, in a landmark paper, Reiter criticized the proliferation of

proposals for data models which were "served up without the benefit of an abstract specification"

[62]. Similar sentiments were echoed by the philosopher Bunge with respect to the field of

Artificial Intelligence (AI): "Unfortunately von Neumann's tacit advice - Start by getting hold of a

precise description of the cognitive task you want to simulate on a machine - is all too forgotten.

In fact, AI is clogged by unanalyzed concepts and metaphors, resembling now poetry, now

advertising copy" [13: p2 70]. These sentiments serve to underscore the importance of and,

ironically, the virtually non-existent emphasis on abstract specification in these respective areas.

This is also a problem in the area of interoperable systems.

Furthermore, Newell [61] suggested that logic is ideally suited as a specification

language. Reiter also proposed logic as a specification language for data models, and showed

how the Relational data model [20] may be couched in proof theoretic terms [62]. A logical

specification is precise and explicit, and will aid the comparison, understanding, design and

analysis of system implementations. Therefore, following Reiter and Newell, the specification of

a Context Interchange system will be couched in proof theoretic terms. That is, we

2a. Integrate the notions of context, ontology, proposition and conversion into a

proof theoretic specification of a Context Interchange system.

This specification is important in order to understand, in formal terms, the components of a

Context Interchange system, their relationships and what a Context Interchange system should

deliver. Without such a definition, it will be difficult to proceed in a "scientific" manner.

The goal of Context Interchange is to achieve semantic interoperability while preserving

source and receiver autonomy. An important aspect of this goal is delivering answers that are

meaningful in the receiver's context. The premise of Context Interchange is that a

straightforward query evaluation may not suffice in the case where a source and receiver have

different contexts. The answer to a query must be mediated in some sense, in order to be

meaningful to the receiver. This mediated answer is expected to be different from an unmediated

answer in general. Just what is a mediated answer? If we do not have a formal definition of a

mediated answer, we will not have a rigorous basis for an implementation. With regards to

database research, Reiter has argued that defining an answer to a query is a fundamental step

[62]. Therefore, the specification proposed in this thesis also includes the definition of what a

mediated answer to a query should be, given the various contexts and the ontology (Fig. 1.7).

Given that this specification is to be couched in first order logic, the assumptions and semantics

underlying the Context Interchange approach will be very explicit, which facilitates

understanding and theoretical analysis. Furthermore, the specification will be independent of

idiosyncratic implementation details and limitations. However, various implementations of this

specification are possible, and can differ in terms of the choice of practical trade-offs being made

(e.g. completeness vs. efficiency). Therefore, we

2b. Define what a "mediated" answer to a query is.

Fig. 1.7 Graphical View of the Specification: (1) source data, (2) source context, (3) receiver context, (4) the shared ontology and (5) a query are inputs to the Context Mediator, which produces (6) the answer.

1.3.3 Proof of Concept

The final question that will be dealt with in this thesis is "Is this an appropriate

definition?". This will be demonstrated by means of a series of Prolog programs. Prolog is a logic

programming language whose relationship to first-order proof theory is well understood [55],

and can itself serve as a specification for lower level languages. In this thesis, the Prolog

programs will be used to demonstrate that this conceptualization of semantic interoperability

meets the expectations of what an answer should be in a Context Interchange system. Sources,

contexts, the shared ontology and queries are all modeled in terms of Prolog statements. These

programs will also be used to demonstrate all the desirable features of Context Interchange

systems including the capability for explanation as well as context explication. Thus, another

goal of this thesis is to

3. Demonstrate proof of concept by means of a series of Prolog programs.

These programs will be used to demonstrate how semantic interoperability can be achieved for

Examples A and B. However, these are relatively simple examples. To further demonstrate the

usefulness and versatility of a BULLION system, we shall show how various obstacles to semantic

integration that have been identified in the literature can be solved within this framework. We

will also argue why BULLION systems are scalable and flexible, and show how sources and

receivers can be added over time with little effort.

One immediate consequence of this thesis is that it suggests logic programming as an

implementation paradigm for the semantic integration of information sources. This is a novel

application of logic programming and it provides a number of significant benefits, both

theoretical as well as practical, to the task of semantic integration. Thus, we claim that logic can

be used not only to conceptualize Context Interchange, but may be used to implement it as well.

This will be discussed in Chapter 7 of this thesis.

1.4 Thesis Organization

The thesis chapters are organized as shown in Table 2. In Chapter 2, we compare and

contrast other approaches to database integration with the Context Interchange Approach as

realized by BULLION. Chapters 3 and 4 deal with Parts 1 and 2 of Table 1 respectively. Chapter 5

and 6 correspond to Part 3. Chapter 7 discusses logic programming as an approach to the

semantic integration of information sources. The thesis concludes with Chapter 8.

Chapter   Title
1         Introduction and Overview
2         Literature Review
3         Semantics and Ontology: Philosophical Foundations for Context Interchange
4         BULLION: A Proof Theoretic Specification of Context Interchange
5         The BULLION Prototype: Proof of Concept I
6         The BULLION Prototype: Proof of Concept II
7         Logic as an Implementation Paradigm
8         Conclusions and Future Work

Table 2 Summary of Thesis Chapters

2 Literature Review

2.1 Obstacles to Semantic Interoperability

Various kinds of heterogeneities present obstacles to the achievement of semantic

interoperability. In order to characterize these heterogeneities, we will use the notions of a

symbol, a construct and a context. One of the most basic distinctions emphasized in Semantics is

the distinction between a symbol and a construct. A symbol is a physical object (e.g. a character

string) that is used to designate meaning. A construct, on the other hand, is the meaning

assigned to a symbol and has no physical existence apart from mental processes [9: pp. 21-23].

Finally, a context is a set of constructs [9].

For the purpose of this thesis, I will therefore classify heterogeneities as symbolic and

context heterogeneity. The most commonly cited forms of symbolic heterogeneity are homonyms

and synonyms which give rise to naming conflicts [8]. Homonyms refer to identical symbolic

items that designate different constructs. Synonyms refer to different symbolic items that

designate the same construct. Naming conflicts arise in databases not only at the schema level,

but also at the instance level. As an example of the former, an attribute name Revenue in a

relation might mean the same thing as an attribute name Rev in another relation (i.e. a synonym).

This is a naming conflict at the schema level. In Example A, if the company names c1 in Source 1

and c5 in Source 2 refer to the same company, then there is a naming conflict at the instance

level.

Another type of symbolic heterogeneity, known as schematic discrepancy, has been

identified by Krishnamurthy et al [44, 45]. To illustrate, they used three databases, reuter, schwab

and source.

Database reuter:
    relation r : {(date, stkCode, clsPrice), ...}

Database schwab:
    relation r : {(date, stk1, stk2, ...), ...}

Database source:
    relation stk1 : {(date, clsPrice), ...},
    relation stk2 : {(date, clsPrice), ...},
    ...

Observe that the "same" information is encoded by each database. Furthermore, there are no

naming conflicts due to homonyms and synonyms per se. For example, the stock codes (i.e.

stk1, stk2 etc.) are consistent in all three sources. However, in one source, the stock codes are

instance values, in the second the codes are attribute names and in the third they are relation

names. Although the same information is conveyed in these databases, the values that may be

retrieved by means of a query language like SQL will vary for each database. This is because SQL

can only retrieve instance values. Thus, SQL can retrieve stock codes from reuter but not from the

other two databases.
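As a preview of the rule-based resolution demonstrated in Chapter 5, the following hedged Prolog sketch (the fact and predicate names are invented for illustration) shows how ordinary first-order rules can lift all three encodings into a single propositional form:

    % Assumed encodings of the three databases as facts.
    reuter_r(d1, stk1, 55.00).
    schwab_r(d1, 55.00, 40.00).    % columns correspond to stk1, stk2
    source_stk1(d1, 55.00).
    source_stk2(d1, 40.00).

    % One rule per relation or column maps each encoding into the
    % uniform proposition closing(Date, Stock, Price).
    closing(D, S, P) :- reuter_r(D, S, P).
    closing(D, stk1, P) :- schwab_r(D, P, _).
    closing(D, stk2, P) :- schwab_r(D, _, P).
    closing(D, stk1, P) :- source_stk1(D, P).
    closing(D, stk2, P) :- source_stk2(D, P).

A query such as ?- closing(d1, Stock, Price). then retrieves stock codes from all three encodings uniformly, without a higher order query language.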

Assuming that symbolic heterogeneity has been resolved, there still remains the problem

of context heterogeneity. That is, different sources and receivers represent and accept different

sets of constructs. For example, one source is able only to represent the construct father while a

receiver might only accept parent. In this case, one construct is at a higher level of abstraction

than the other, i.e. one is a generalization of the other [25].
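In the framework developed in this thesis, such a relationship could be captured once, as a deductive rule in the shared ontology; a one-line illustrative Prolog sketch:

    % Generalization as deduction: every father is a parent.
    parent(X, Y) :- father(X, Y).

Any source proposition about father can then be converted into a proposition acceptable to a receiver that accepts only parent.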

Examples A and B of Chapter 1 also illustrate various context heterogeneities. Source 1

only represented information about the locations of head offices of various companies, whereas

Receiver 1 would like only information about the countries in which these companies are

incorporated. Another problem is presented in Example B where in Source 3, not only was the

currency of the closing prices implicit, but the currency was not the same for all tuples. The

receiver, however, expected all prices to be in a single currency! As another example, a source

might only have information about the Revenue and Expense of companies, whereas a receiver

might accept only information about the Profit of these companies.

2.2 Approaches to Interoperability

In response to these various problems, there has been a proliferation of database

integration approaches. Some of these approaches involve the construction of global schemas [4,

46, 58] or federated schemas [36]. Other approaches are considered loosely coupled [44, 45, 53]. For

a survey of these various approaches, the reader may refer to [54, 67]. However, dissatisfaction

with the various inadequacies of these approaches has led to recent work that draws upon the

insights from the Artificial Intelligence and Knowledge Representation communities [3, 21, 33].

We will now describe each of these approaches in turn.

2.2.1 Schema Integration

One approach to achieving semantic interoperability that has been prevalent in the literature

involves the integration of source schemas into a global schema or superview. Receivers are then

provided an integrated view of the underlying databases and can issue queries against the global

schema. Alternatively, local receiver views can be defined against this global view. We consider

the approaches described in [4, 25, 46, 58, 59] to be typical of schema integration approaches

although they may differ from each other in various respects. Therefore, we shall consider these

particular approaches for the purpose of our comparison.

A global schema primarily captures knowledge of aggregation and generalization as

discussed in [69]. Informally speaking, attributes A1,...,Am (as in the columns of a relation) of a

class of entities can be aggregated with attributes Am+1,...,An of another class to form a new class

with attributes A1,...,An. If every instance of a class C1 is also an instance of a class C2, then C2 is

a generalization of C1. Generalization can be used to resolve differences in levels of abstraction

[25, 48]. In [4], aggregation and generalization are described in Entity-Relationship (ER) terms as

relationships and ISA respectively.

A global schema, as described in [4, 58, 59] "does not provide enough semantics" [67] and

is therefore limited in its ability to capture relationships among concepts. For example, the kind

of knowledge described in Fig. 1.3, i.e. the rule

∀Company_name, City_name, Country_name
    head_office(Company_name, City_name) ∧ located_in(City_name, Country_name) ⇒
        country_of_incorporation(Company_name, Country_name).

cannot be encoded in a global schema as defined in [4, 58].

The conversion knowledge required to resolve various conflicts, such as naming conflicts,

scale conflicts and so on, is captured in the mappings between local schemas and the global

schema. However, the resolution of naming conflicts in schema integration, by definition, is

restricted to the schema level. Naming conflicts at the instance level are typically ignored. These

mappings are referred to as view definitions.

In an example used in the description of Multibase [46], some sources contained

HOURLYWAGE information while other sources contained YEARLYSALARY information.

However, the global schema cannot represent both at the same time. So, one attribute,

YEARLYSALARY, was chosen to be represented in the global schema. Functions to convert

HOURLYWAGE in different sources to YEARLYSALARY in the global schema are then encoded in

view definitions for the respective databases. Thus, for each database which contains

HOURLYWAGE information, one such conversion procedure is required. This conversion cannot

be expressed in the global schema itself. As another example described in [25], one source had Ht

(i.e. Height) information in inches (ins), while another had Ht information in centimeters (cms).

Only one of these can be represented, so Ht in cms was chosen as a unifying dimension. The

conversion from ins to cms is then encoded in a view definition.

This strategy presents a problem because the same conversion rule may have to be

constructed many times for different sources. Furthermore, if there is a change in the rule (e.g.

the conversion from HOURLYWAGE to YEARLYSALARY changes), all the affected view

definitions need to be changed. This may be a non-trivial task if there are many sources involved.

Furthermore, conversion procedures embedded in view definitions can obscure the semantics of

the databases.

Thus, it is not surprising that the maintenance of a global schema is difficult [36, 52, 53].

For one thing, local schemas may change over time and the literature is "largely silent" [8] as to

how to manage the evolution of local schemas. To reduce the magnitude of this problem, a

strategy was proposed in [36], which involved "a selective and controlled integration of its

components" and "represents a compromise between no (schema) integration...and total (schema)

integration" [67]. In essence, this strategy reduces the degree of information sharing in order to

reduce the difficulty of maintaining a global schema. In contrast, the Context Interchange

Approach is expected to maintain a high degree of information sharing without the

corresponding degree of difficulty found in schema integration strategies.

In summary, the schema integration approach is not as scalable or flexible because

(1) the global schema is not able to accommodate richer forms of general knowledge that

might be reusable and

(2) it does not offer a means to effectively explicate the underlying assumptions associated

with various schemas (e.g. implicit assumptions about currencies).

In contrast, the Context Interchange Architecture, as realized by the BULLION model,

allows richer forms of reusable general knowledge to be captured at the global level. Such

knowledge may be used for both conversion as well as for explicitly defining the underlying

assumptions of sources and receivers. Furthermore, as underlying assumptions of sources and

receivers are made explicit, the context mediator automatically selects the appropriate conversion

procedure if one exists. A universally agreed upon conversion procedure (e.g. ins to cms) need

only be encoded once at the global level. Moreover, this conversion can be used not only for the

attribute Ht, but also for any other attribute with a length dimension (e.g. Waist_Size). Thus,

even a single database with more than one attribute with a length dimension can benefit from this

re-use. This minimizes the load on the local database administrators in constructing conversion

procedures and in adapting to change.
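A minimal Prolog sketch of this kind of reuse (the predicate names and source facts are hypothetical):

    % The ins-to-cms conversion, encoded once at the global level.
    length_ins_to_cms(Ins, Cms) :- Cms is Ins * 2.54.

    % Reused for any attribute with a length dimension.
    ht_in_cms(P, Cms) :- src_ht_in_ins(P, Ins), length_ins_to_cms(Ins, Cms).
    waist_size_in_cms(P, Cms) :- src_waist_in_ins(P, Ins), length_ins_to_cms(Ins, Cms).

    % Hypothetical source facts.
    src_ht_in_ins(p1, 70).
    src_waist_in_ins(p1, 32).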

The work on schema integration has been further developed by assuming that a key issue

is the identification of attribute equivalence [47] in different schemas. The key idea is that there

may be various degrees of equivalences among various attributes from different databases. For

example, the attribute Company_name in Source 1 may be equivalent to the attribute

Companyname in another source if they have identical domains. Along similar lines, Sheth and

Kashyap [66] proposed the notion of semantic proximity to define the relevance of one attribute to

another. This approach involves a form of uncertain reasoning.

However, this approach is narrow and limited for the purpose of achieving a high level

of semantic interoperability. For example, it does not matter whether or not City_name in

Source 1 is equivalent or semantically close to Country_name in Receiver 1. The important

thing is whether or not the country of incorporation of a company can be derived from its head

office location. Similarly, it matters not if Revenues, Profits and Expenses are semantically

close. What matters is that knowing two of these attributes will enable us to deduce the third.

Finally, the classical techniques used in schema integration cannot solve the problem of

schematic discrepancies as pointed out in [44, 45]. It is for this reason that the loosely coupled

approach has been introduced.

2.2.2 A Loosely Coupled Approach

A radically different approach was proposed in [44, 45, 52, 53] which involved no schema

integration. It is referred to as the loosely coupled approach [67]. The motivation for this approach

is due to the existence of schematic discrepancies which cannot be resolved by schema integration.

In essence, users are provided with powerful query languages and tools that enable them to

retrieve information from various sources. Thus, even if schematic discrepancies exist, the user

can successfully retrieve information. The problem, of course, is that the burden of "integration"

is shifted to users who are then confronted with a multiplicity of autonomous sources with

varying semantics, some of which are not explicit. As the sources that users need to interact with

grow in number or change, more and more cognitive effort is demanded of users. This approach

violates the notion of receiver autonomy. Fortunately, there is a better alternative. In Chapter 5,

we will demonstrate how the problem of schematic discrepancies can be resolved by the

BULLION approach, without resorting to higher order query languages or placing the burden

on the receiver.

2.2.3 Implementations that use Ontologies

Problems with the schema integration and the loosely coupled approaches have led to

another stream of research that rely on shared ontologies. Currently, there is a great deal of interest

in the development of ontologies to facilitate knowledge sharing in general [31, 32, 60], and

database integration in particular [21, 64]. We will present more detailed discussions on the

notion of an ontology in Chapter 3. For now, we focus on integration approaches that rely on the

use of a shared ontology.

The Carnot project [21, 37, 65] is an approach that tries to reduce the effort of constructing

a global schema by using an existing ontology, the Cyc ontology [34, 49], as the global schema.

Source and receiver schemas are then mapped to the Cyc ontology by means of articulation

axioms. Articulation axioms are analogous to view definitions in Multibase. The assumption here

is that the global schema already exists and can be used for integration. The portion of the

knowledge in Cyc used by Carnot is still fundamentally based on aggregation and generalization

[21: p. 60, 37: p. 295].

There was also no specific discussion in the references [21, 37, 65] of how incompatible

units and scales are resolved in Carnot. For example, how would the Carnot approach address

the problem discussed earlier about Ht in cms and ins? One possibility is to create two Cyc

concepts Ht_in_cms and Ht_in_ins, and then define the conversion relationship between

these concepts in the Cyc ontology. Such a conversion relationship however cannot be expressed

in terms of aggregation or generalization. Another way is the Multibase approach. That is,

represent Ht_in_cms (say) in the Cyc ontology and encode the conversion function in the

articulation axioms. As pointed out earlier however, this method limits reuse of such standard

conversion procedures. Carnot also does not address the issue of explicating underlying

assumptions of sources and receivers.

The SIMS project [3] deals primarily with planning and reformulating of queries for

retrieving data from multiple databases more efficiently. The ontology, which is represented in

LOOM, contains knowledge about the information various databases have on various classes of

entities (e.g. Commercial ships, naval ships etc.). This facilitates the selection of appropriate

sources for data retrieval. As in classical schema integration, key concepts used in SIMS are

generalization and specialization as manifested in the specialize-concept and generalize-

concept operators. However, SIMS is not intended to deal with the problems of semantic

interoperability described in Chapter 1. Thus, for example, even if the SIMS query planner

selected the database with the right companies, it cannot convert head_office information to country_of_incorporation information. Furthermore, the SIMS approach does not deal

with issues of hidden semantics.

Another approach, the theory of semantic values [64], has been proposed as a means of

realizing the Context Interchange Architecture. This theory describes a means to explicitly and

non-intrusively represent meta-attribute values associated with various base (or application)

attributes. For example, consider a source with schema finances(Company_name, Revenue, Expense). The attributes Company_name, Revenue and Expense are base attributes. We

can then associate a meta-attribute Currency with base attributes Revenue and Expense for

example. Next, we can also associate conversion functions with meta-attributes. For example, let

us consider converting a Revenue of 10,000 US dollars to yen. This currency conversion is

specified as

cvtVal(10000{Currency=usd}, {Currency=yen})

and returns the value 1,000,000 (assuming a conversion factor of 100). This approach is better

than the global schema approach in that conversion functions are centralized and a means for

specifying hidden assumptions explicitly is provided.
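To make this concrete, such a meta-attribute conversion might be rendered in a logic programming language roughly as follows (a minimal sketch; the predicate name cvt_val, the exchange_rate table and the fixed conversion factor are our illustrative assumptions, not the notation of [64]):

    % Illustrative conversion-factor table (assumed rate: 1 usd = 100 yen).
    exchange_rate(usd, yen, 100).

    % cvt_val(Value, SourceMetaAttribute, TargetMetaAttribute, Converted)
    cvt_val(Value, currency(From), currency(To), Converted) :-
        exchange_rate(From, To, Rate),
        Converted is Value * Rate.

    % ?- cvt_val(10000, currency(usd), currency(yen), V).
    % V = 1000000.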

Although BULLION is based on the same architecture as the theory of semantic values,

there are significant points of departure in terms of the conceptualization of

(1) a conversion,

(2) a shared ontology and

(3) a context.

There are some limitations in the manner in which a conversion function is

conceptualized in the theory of semantic values. By definition, such conversion functions are

associated with meta-attributes. Conversions between base attributes therefore cannot be

specified. For example, we might want to derive a base attribute Profit from Revenue and

Expense. A conversion function, as defined in the theory of semantic values, does not

incorporate any knowledge of which base attributes are involved. Hence there is no way to incorporate knowledge of the relationships among base attributes in conversion functions.

In BULLION, a conversion is a more general notion and is defined by means of a

deductive law. This allows us to specify conversions between attributes such as Profit,

Revenue and Expense. A set of deductive laws can be chained together to form more complex

conversion procedures.
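To suggest what this looks like, the laws below derive Profit from Revenue and Expense and chain this with a unit-conversion law (a sketch only; the predicate shapes and the fixed rate of 100 yen per usd are illustrative assumptions):

    % Deductive law relating base attributes: profit from revenue and expense.
    profit(Co, Profit, Cur, Yr) :-
        revenue(Co, Rev, Cur, Yr),
        expense(Co, Exp, Cur, Yr),
        Profit is Rev - Exp.

    % Unit-conversion laws; chaining them with the law above yields profit in yen.
    revenue(Co, RevYen, yen, Yr) :- revenue(Co, Rev, usd, Yr), RevYen is Rev * 100.
    expense(Co, ExpYen, yen, Yr) :- expense(Co, Exp, usd, Yr), ExpYen is Exp * 100.

    % Given revenue(c1, 10000, usd, 1994) and expense(c1, 4000, usd, 1994),
    % the query ?- profit(c1, P, yen, 1994). yields P = 600000.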

Moreover, general knowledge in BULLION can be encoded at the global level in the

shared ontology, and can be used for both the definition of contexts and conversion. BULLION

does not distinguish knowledge used to define contexts from knowledge used for conversions.

This promotes greater re-use of knowledge. This is not the case for the theory of semantic values

where the shared ontology is distinct from the set of global conversions.

Finally, in the theory of semantic values, explicating hidden assumptions refers to defining the values of meta-attributes associated with base attributes. In BULLION, the notion of

a context is more general. A context definition can include constraints on the values of base

attributes as well. This allows a receiver to express other kinds of assumptions, such as domain

assumptions, as part of its context definition. Furthermore, a source context defined in this

manner can facilitate semantic query optimization which we will discuss later in the thesis.

2.3 The Logic of Contexts

The notion of a context was introduced into AI by John McCarthy in his Turing Award lecture [57] to deal with the problem of generality in AI. R. V. Guha's thesis [33], which was

supervised by McCarthy, was an in-depth study of context. Guha's research primarily centered

around the Cyc system [34, 49]. Without the notion of contexts, it would have been virtually

impossible to manage a knowledge base the size of Cyc. The notion of a context makes the task

of managing a huge knowledge base like Cyc more manageable by partitioning the knowledge

base into smaller "chunks".

Guha's thesis represents the initial work on the Logic of Contexts (LOC). Further

development of LOC can be found in [14, 15]. Essentially, LOC is an extension of first-order logic. More specifically, a second-order predicate ist(c, p) is introduced, where c is a context and p is a proposition that is true within this context c. LOC is primarily concerned with the

truth of propositions in various contexts.

Consequently, one of the strengths of LOC is the ability to express multiple, inconsistent theories in the same knowledge base. Thus, "Michael Jordan is tall" and "Michael Jordan is not

tall" are sentences that can coexist in the same knowledge base because the former sentence is

asserted in the context of people in general, while the latter is asserted in the context of basketball

players. A set of sentences associated with a context is termed a microtheory.
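For illustration, the two assertions might be written with the ist predicate as follows (a sketch; the context names are ours):

    % Truth is relativized to a microtheory.
    ist(people_in_general, tall(michael_jordan)).
    ist(basketball_players, not(tall(michael_jordan))).

The two facts coexist without contradiction because each proposition is asserted only relative to its own context.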

2.3.1 LOC and BULLION

The notion of a context defined by Guha differs from that used in BULLION. When

viewed from the perspective of LOC, the BULLION model, which includes sources, context

definitions and the shared ontology, would be considered a single consistent microtheory, i.e.

having a "single context". Thus, as far as BULLION is concerned, when a proposition is asserted

as true within a source or the shared ontology, it is viewed as true throughout the federation. To

resolve confusion with terminology, we will use the term microtheory instead of context when

referring to the concept as used in Cyc.

An in-depth treatment of a federation of sources and receivers with multiple inconsistent

theories is beyond the scope of this thesis. However, as a federation continues to grow, this

problem becomes unavoidable. Therefore, towards the end of this thesis, we will briefly consider

how the BULLION model might be extended, by means of LOC, to manage a large-scale and

dynamic federation. This is very much in the spirit of the Cyc knowledge base. Given that BULLION is specified in first-order logic (FOL), the conceptual leap to LOC, an extension of FOL, is not too difficult, as we shall see.

2.3.2 LOC is a Formal Language

Just as FOL is a formal language, so too is LOC. Therefore, just like any language, LOC

may be used to express various forms of knowledge without necessarily telling us what that

knowledge should be. In fact, Guha states that "Merely providing a syntax and semantics of a

new logic does not solve any problem. We need a better understanding of what contexts can be

used for and how they are to be used" [33:p21]. In his thesis, Guha illustrated, with examples,

various possible applications of this formalism, including the integration of databases. In fact, the

Carnot project is the result of applying this formalism to the integration of databases. However,

in Carnot, LOC was used essentially for the purpose of schema integration, not very different

from the classical global schema approaches. The BULLION model, on the other hand, is not a

formal language but a theory about the identification, organization and use of knowledge needed

to realize the goals of the Context Interchange Approach. Nothing in the LOC tells us how to do

this.

Given that LOC is a very expressive formalism, it is amenable to a wide variety of

applications [33], including schema integration (as in the Carnot project), and can even be used

for the formalization of the BULLION model as well. Indeed, it was argued by Farquhar et al that

LOC can be used to express the global schema, federated or loosely coupled approaches [27]. Thus

LOC offers a very powerful framework in which to understand and analyze various integration

strategies.

While it might be true that the LOC can be used to express these strategies, which differ

in fundamental respects from one another, Farquhar et al did not present any particular strategy

of their own. For instance, the loosely coupled approach hardly encodes any conversion

knowledge, leaving the problem for the user; the global schema approach encodes conversion

knowledge in view definitions; and BULLION advocates encoding appropriate knowledge at the

global level for the purpose of conversion as well as context explication. Farquhar et al, however,

did not say anything about how to manage and use conversion knowledge.

Furthermore, the notion of hidden assumptions described by Farquhar et al is essentially

the same as that described in the theory of semantic values, which in turn is subsumed by that

defined in BULLION. The notion of a query, described by Farquhar et al, is equated to theorem

proving. This is the same in BULLION except that we need not devise new theorem provers as

they already exist for FOL. The point is this, while LOC can say anything, Farquhar et al did not

say anything new with it.

2.3.3 LOC vs. FOL

Although LOC can also be used to formalize the BULLION model, FOL is sufficient for

our purposes and will be used to describe the BULLION model instead. The choice of FOL is due

to the fact that FOL has been around much longer and is well understood from a

theoretical standpoint, compared with the more recent theory of LOC [14, 15]. From a practical

standpoint, tools based on FOL such as logic programming languages (e.g. Prolog [19]) are more

readily accessible. Thus, a BULLION prototype can easily be built using Prolog, for example.

Moreover, logic programming languages are based on well established theoretical foundations

[35, 38, 55]. Also, FOL's relationship to databases is well understood and documented [22, 29]. In

fact, towards the end of this thesis, we discuss logic programming as an implementation

paradigm for BULLION.

Finally, there is always a fundamental trade-off between expressiveness and efficiency of

a knowledge representation [51]. Computing with such an expressive formalism as LOC might

have some adverse consequences on the efficiency of query processing. Therefore, we should not

use a computational formalism that is more expressive than required. For the purpose of this

thesis, in which we do not deal with multiple inconsistent theories, FOL is sufficient. In the

future however, more general extensions to the BULLION model that are required to manage

multiple inconsistent theories might require a tool such as LOC.

3 Semantics and Ontology: Philosophical Foundations for Context Interchange

In this chapter, we present the philosophical foundations for Context Interchange. In

particular, we draw upon the disciplines of Semantics and Ontology. Our main source of insight

is from Treatise on Basic Philosophy by philosopher Mario Bunge [9, 10, 11, 12]. Wand and Weber

were the first to draw upon Mario Bunge's Ontology [11, 12] as a formal foundation for their

work in systems analysis and design [74]. The reader may also refer to [72, 73, 74, 75, 76] as

instances of other work by Wand and Weber that relied on Bunge's Ontology as a formal

foundation.

Wand and Weber found that concepts provided by Bunge's Ontology were "rich and

complete enough" to serve as a formal foundation for the phenomena they were trying to model.

They were then able to make predictions, based on this formal model, about the strengths and

weaknesses of various design methodologies. Finally, the formal model facilitated the

construction of computerized tools to support information systems analysis and design [74].

The work described in this thesis is similar to that of Wand and Weber's in that it draws

upon concepts from Bunge's Ontology such as the notion of a thing, a property, a state and a

system. However, Wand and Weber relied almost exclusively on Bunge's Ontology. For our

purposes, we also draw upon Bunge's work on Semantics [9, 10]. In particular, Bunge discusses

the notion of a context in Semantics. Together, Bunge's Ontology and Semantics provide a suite of

ideas that are particularly relevant to the achievement of semantic interoperability. These ideas

are then integrated into a coherent model of Context Interchange, which is a key contribution of

this thesis.

Generally speaking, agents (i.e. sources and receivers) communicate by exchanging

symbols. Symbols are assigned meaning (i.e. constructs) by these agents. Communication involves

the exchange of propositions or statements about a perceived reality. Without a shared perception

of reality, no meaningful communication can take place. We refer to the explicit description of a

shared perception of reality as the shared ontology.

The study of symbols and meaning falls within the realm of Semantics, while the study of

reality is the central concern of Ontology. Bunge describes the study of Semantics as '...concerned

not only with linguistic items but also, and primarily, with the constructs such items stand for

and their eventual relation to the real world' [9: p. 2]. Also, Bunge stated that '...an ontology is not

a set of things but a philosophical theory concerning the basic traits of the world' [9: p. 38].

Semantics and Ontology therefore 'far from being mutually exclusive, are complementary' [9:

p.42].

It is therefore not surprising that these disciplines together can provide important

fundamental insights into the problem of semantic interoperability. Semantics and Ontology

provide a basis for a formal description of a shared ontology, contexts, propositions and the

exchange of information across different contexts. Bunge's Ontology can be used to highlight

important aspects of the perception of reality that must be agreed upon by sources and receivers

in order to have meaningful communication. Semantics provides a framework for defining and

representing contexts and the translation of propositions.

Relevant concepts from Semantics and Ontology are discussed in Sections 3.1 and 3.2

respectively. These ideas provide a framework for defining a shared ontology (Section 3.3) as

well as contexts (Section 3.4). A simple example is provided showing how the shared ontology

and contexts may be defined in practice.

3.1 Semantics and Semantic Interoperability

In [48], the situation was described in which groups had different local languages with no

language in common. For example, one group may speak English, another French, and yet

another Japanese. With no common language, pairwise translation rules have to be devised in

order to communicate across groups. However, the distinction between symbols and constructs

was not explicit in the analysis. Recall that a context is essentially a set of constructs. The context

of a language is the set of constructs associated with the symbols of the language. When groups

have different languages, their respective contexts might be disjoint, overlapping or in a subset-

superset relationship with one another. Without differentiating between symbols and constructs,

these conditions cannot be distinguished from one another. This prevents a finer level analysis of

the nature of "translation rules" required to achieve meaningful communication. Specifically, if

the contexts are overlapping or in a subset-superset relationship, then pairwise translation of

symbols can be useful in facilitating communication. Such symbolic translations involve

mapping different symbols that stand for the same construct to one another. If however, the

contexts are disjoint, such symbolic translations do not exist because there are no constructs in

common. What is needed is a means to resolve context heterogeneity, which implies the ability to

relate constructs in one context to "equivalent" constructs in another context. Semantics offers

important insights into the issue of context heterogeneity.

We first distinguish between two classes of constructs, namely concepts and propositions

[9: p14]. Concepts are unit constructs such as "father". Propositions, unlike concepts, are

statements or facts to which we can attach a truth value e.g. "Jack is the father of Jill".

A basic premise of the BULLION model is that semantic interoperability is based on the exchange

of basic units of information that are "elementary" propositions. Thus, with reference to a

particular domain of discourse, the expression "The closing price of stk1 traded on the New York stock exchange is 55.00 US dollars at time t1" is considered a

proposition and we can attach a truth value to it. Furthermore, this proposition is considered

"elementary" if it becomes ambiguous when any part of the proposition is omitted. This will be

the case, for example, if time is omitted from the proposition, and various times are possible in

the domain of discourse. Therefore, elementary propositions are propositions that are sufficiently

specific with respect to a universe of discourse. On the other hand, the proposition "c1's head office is in New York and c1 is incorporated in the USA" is not an elementary proposition as it says two things about c1. Thus, elementary propositions are "minimal" yet

"sufficient" in some sense. Whether or not a proposition is elementary is an empirical issue and

will depend on the particular universe of discourse.

In Semantics, there are various ways in which constructs can be related to one another.

More specifically, there are meaning relations [2] among concepts and among propositions which

can be exploited for the purposes of translation. Concepts are related to one another via meaning

inclusion. For example, the concept "parent" includes the meaning of "father". The reader will

quickly recognize this as the notion of generalization. Generalization is highlighted as an

important mechanism for achieving semantic interoperability [25], and is operationalized in [48]

as a type hierarchy. Thus the concept "father" can be translated to the more general concept "parent" if the receiver understands the concept of "parent" but not "father".

Propositions, on the other hand, are related to one another by means of entailment or

logical deduction. "Proposition A entails proposition B" is written as A => B. This means that the

truth of the proposition A necessitates the truth of proposition B. It seems logical, therefore, to

consider using entailment for the purposes of translating propositions. In fact, this was

illustrated in Example A. Furthermore, meaning inclusion can be viewed as a special case of

logical implication, i.e. ∀x father(x) => parent(x). Conversion rules based on logical deduction, therefore, afford us a wider range of translations than is possible with rules based

solely on meaning inclusion. In BULLION, the conversion of propositions is based on entailment.

However, we do not necessarily mean to suggest that a translation must be computed using logical

implication, only that it ought to be conceptualized as such.
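Viewed as a logic programming clause, the meaning-inclusion rule above is simply (a minimal sketch):

    % Meaning inclusion as logical implication: every father is a parent.
    parent(X) :- father(X).

    father(jack).
    % ?- parent(jack).   succeeds, translating "father" into "parent".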

Another notion from Semantics that is of particular importance for our purposes is the

notion of a context defined by Bunge as C = (S, P, D) [9: p57]. D, the domain of the context, is a set

of individuals. There are no other individuals in C other than those in D. P is a set of predicates

that are meaningful in C, with well-defined arity (i.e. number of arguments in its argument list).

Woods [79] drew a distinction between the intension and extension of a predicate. For example the

extension of the predicate red is the set of all red things. The intension of red is the meaning of

the notion of redness. In this case, P corresponds to a set of predicate intensions or meanings. It

is important to distinguish between the extension and the intension of a predicate because the

same predicate may have different extensions in different contexts. Thus the extension of the

predicate red in the context of fruits, is different from the extension of red in the context of cars

for example. Conversely, predicates with different intensions can have the same extensions. For

example, the predicates big and heavy might refer to the same set of people. S is a set of

statements or propositions which contain only predicates from P and arguments from D, and can be

construed as the set of statements that are allowed within the context.

This notion of a context, therefore, highlights three important components. D is an

important component because it defines the set of all the individuals that a source knows about,

or all the individuals that a receiver is interested in. D defines the scope of universal

quantification (i.e. "for all x") which may vary from context to context. In the s1_head_office relation described in Chapter 1, D contains companies and cities. Another important component of contextual knowledge is the set of predicates P. These predicates represent the attributes of

individuals in D.

The third component is S, the set of statements that is allowed within the context. Such

restrictions are an important part of the contexts of sources and receivers. This component of the

source context defines the set of propositions expressible by the source. In Example A, the source

can only express propositions pertaining to the location of the head offices of certain companies.

The corresponding component for the receiver context defines the set of propositions acceptable

to the receiver. So in Example A, the receiver accepts only propositions pertaining to the

countries in which certain companies are incorporated. As another example, in Source 3 of

Example B, the currency of closing prices must be in yen for a stock traded on tokyo. The

corresponding receiver, on the other hand, only accepts closing prices with currencies in usd.

3.2 Ontology and Semantic Interoperability

Gruber states that an ontology embodies a set of ontological commitments which is an idea

"based on the Knowledge-Level perspective" [32]. From this perspective, when an agent commits to

an ontology, its actions are consistent with the ontology. Gruber defines an ontology as a

statement of a logical theory. According to Gruber the definition of a shared ontology involves

the definition of:

"...vocabularies of representational terms - classes, relations, functions and object

constants-with agreed upon definitions ...Definitions may include restrictions on domains

and ranges, placement in subsumption hierarchies, class wide facts inherited to instances

and other axioms" [31].

Elsewhere [32], he states that:

"Ontologies are often equated with taxonomic hierarchies of classes, class definitions, the

subsumption relation, but ontologies need not be limited to these forms. Ontologies are

also not limited to conservative definitions, that is, definitions in the traditional logic sense

that only introduce terminology and do not add any knowledge about the world

[26] ...[but] one needs to state the axioms that do constrain the possible interpretations for

the defined terms".

Gruber has proposed several desirable characteristics of a representation of an ontology

[32], namely, clarity, coherence, minimal encoding bias, minimal ontological commitment and

extendibility. Clarity simply means that the shared ontology must be understandable to all

concerned. Coherence means that the ontology must be logically consistent. Minimal encoding

bias means that an ontology should contain knowledge about a universe of discourse, not of any

particular encoding scheme. Gruber gave an example of an encoding bias where a physical

quantity was constrained to represent double-float numbers. This is an encoding bias

because it reflects the precision of the encoding numbers which is an implementation detail.

Rather, an ontology should be specified "at the knowledge level" and reflect the "knowledge-level

commitments of parties to an ontology". So it is better to change the constraint from double-

float to real.

Minimal ontological commitment means that an ontology should be a basic or weak

theory of the universe of discourse. Thus, a set of agents who commit to the same ontology agree

on some basic facts but not necessarily on everything. Finally, extensions to an ontology should

be monotonic. That is, new facts added to the ontology should be consistent with the existing facts

in the ontology.

With respect to BULLION, a shared ontology defines (1) a global set of elementary

propositions and (2) the deductive relationships among these propositions, in terms of a global

language. The above guidelines given by Gruber, however, are relatively general and do not tell

us specifically how to organize a shared ontology in terms of propositions and deductive

relationships.

In this thesis, we will rely on Bunge's Ontology to provide a fundamental framework for

organizing the shared ontology. Certainly other ontological theories may be employed for such a

purpose. In fact, towards the end of [32], Gruber surveys various examples of ontologies

including Mario Bunge's. Gruber also states that "the utility of an ontology ultimately depends

on the utility of the theory it represents".

On our part, however, there are various pragmatic reasons why we chose Bunge's

Ontology in particular. First, as we have used concepts from Bunge's Semantics, it makes sense

to also use Bunge's Ontology for consistency in theories and definitions. Second, as we

mentioned in the beginning of this chapter, Bunge's Ontology has been introduced, by Wand and

Weber, into the information systems literature and was found to be useful for modeling various

information systems concepts. Finally, we find that Bunge's Ontology indeed provides the utility

needed in describing the kinds of knowledge required for the purposes of Context Interchange.

This will be clear by the end of the thesis.

In Bunge's Ontology, the world is made up of substantial individuals or things. Things

possess properties. We perceive the properties of things via attributes or predicates [11:p59]. That

is, we only know properties as attributes. A property is a feature that a thing possesses even if

we are ignorant of this fact. On the other hand, an attribute is a feature we assign to a thing [11:

p. 58]. Therefore there is a distinction between an attribute and a property of a thing. Bunge then

states that as a result, we distinguish the statement

"Substantial individual b possesses property P"

from

"Attribute A holds for b"

where A is taken to represent P. The reason why Bunge is concerned about such a distinction is

"because some attributes represent no substantial properties.. .and some properties are

represented by no attributes...". In the case where we assume that there is a one-to-one

correspondence between properties and attributes, we may use them interchangeably.

An attribute (e.g. weight) represents a property in general while an attribute value (e.g. 36

kilograms) represents a property in particular [11:pp.62-65]. A class is a set of things that possess a

common property in particular. A mutual property [11: p66] is a property of two or more things.

For example, Source 3 contains propositions about stocks and exchanges. A stock and an

exchange interact in that the former is traded on the latter. Furthermore, a stock may be traded on

more than one exchange. Moreover, the closing price of the stock depends on the exchange on

which it is traded. Thus, the closing price is construed as a mutual property between a stock and

an exchange. Since the closing price of a stock can vary depending on which exchange it is

trading on, identifying closing price as a mutual property is important for semantic

interoperability.

There are simple and complex things. A complex thing is made up of simpler things that

interact. Systems are examples of complex things. There is a part-whole relationship between a

system and its components. An example of a system is the computer which is made up of

simpler, interacting things such as the CPU, memory etc. (Fig. 3.1). This is called an aggregation

hierarchy. Systems have inherited and emergent properties, and hence, inherited and emergent

attributes. An inherited attribute is an attribute common to a system and one of its components

and share the same attribute value. For example, the clock speed of computer is an inherited

attribute because it is also the clock speed of the computer's CPU. An emergent attribute is an

attribute of a system but not of any of its components (e.g. the processing power of the

computer).

Fig. 3.1 Aggregation Hierarchy (a Computer composed of Keyboard, VDU, Memory, CPU and Storage)

The state of a thing is represented by a combination of its attribute values. These ideas

are formalized by means of a functional schema [11:p119]. A thing X of class T (e.g. Employee) is construed via a functional schema defined as Xm = <M, F>, where F = <F1, F2, ..., Fn>. Each state function Fi, defined over the set M called the manifold, corresponds to an attribute; and each value of Fi corresponds to an attribute value [11:p125]. More precisely, Fi: M → V where V is a set of values. For example, F1 and F2 may correspond to the attributes Title and Salary

respectively. The manifold M can, for example, be a set of time points e.g. dates. Obviously, over

time, the state of X can change i.e. the employee's title and salary can change.

To better understand the notion of a manifold, we first discuss the notion of a reference

frame [11:pp.264-267]. First, a reference frame is a thing. Second, for a thing to qualify as a

reference frame for another thing, the two must not influence each other i.e. they must not

interact. Third, for a thing to qualify as a reference frame for another, its states must be utilizable

for parameterizing the states of the latter. For example, a clock can serve as a reference frame in

which each clock state is a particular combination of hour, minute, seconds and am/pm

indicators. We might then associate a stock price on any particular day with each state of this

clock. In other words, we can associate time as indicated by the clock with particular states of a

stock as indicated by its stock price. Thus, a manifold is really a set of states of the reference

frame.

Things, as defined in Bunge's Ontology, are governed by state laws which restrict the

combination of attribute values for a class of things. In other words, state laws interrelate the

attributes of things [11:p78]. Thus for example, the title of an employee will determine the range

of salary values that is legal. Finally, Bunge states that "The choice of state functions is not

uniquely determined by empirical data but depends partly on our available knowledge, as well

as upon our abilities, goals and even inclinations" [11: p127].

We note two important points in Bunge's statement. First, our view of the world (i.e. our

functional schemas) depends upon "available knowledge". This means that as more knowledge is

acquired, our functional schemas can change accordingly. Second, our view of the world is also

influenced by our "abilities, goals and even inclinations". Thus, there could be pragmatic reasons

why one party chooses to view the world in one manner, while another chooses a different

perspective.

This also implies that there is no absolute "right" standard in determining suitable state

functions for the conceptualization of a thing. Bunge's Ontology is an extremely basic theory and

does not dictate the choice of state functions, what things exist or what the state laws are. By

choosing a particular set of things to focus on, a corresponding set of state functions, and a set of state laws, we commit ourselves to a more specific point of view of the world, i.e. a more specific

ontology. A set of ontological commitments is "in effect, a strong pair of glasses that determine

what we can see, bringing some part of the world into sharp focus at the expense of blurring

other parts" [23:p19].

The problem of context heterogeneity arises when sources and receivers attempt to share

information but adopt different local ontologies. In database terms, sources and receivers use

different relational schemas (which correspond to Bunge's functional schemas), use different

integrity constraints (i.e. state laws) and focus on different things. However, if sources and

receivers agree on a shared ontology which consists of a set of shared functional schemas and a set

of shared state laws that interrelate these functional schemas, information exchange is possible

without requiring sources and receivers to change their functional schemas or view of the world.

The set of state laws, in the shared ontology, which relate shared functional schemas will be used

as conversion rules or axioms. Moreover, rules are required to relate local functional schemas to the

shared set of functional schemas in the shared ontology. This is done by means of context

definition rules. We discuss this in more detail later in the chapter.

State functions, and therefore functional schemas, can alternatively be represented by

means of predicates [11:pp.116-117]. Thus for example the title of an employee a can be cast as Title(a, Sales Rep, 12/12/94) where Sales Rep is an attribute value and 12/12/94

is a point in the manifold. Bunge further proposed that arguments of predicates can be classified

according to the following types: (1) object, (2) property, (3) space, (4) time, (5) unit and (6) scale [9: p.

40]. Thus for example, Salary(a, 50, 1000, usd, 1994) means that an employee a (an

object argument) has a salary of 50 (a property argument) thousand (a scale argument) usd (a

unit argument) in 1994 (a time argument). Note that in this case, the property, scale and unit

arguments are combined to define the value of the employee's salary (i.e. 50,000 usd). The

proposition Title(a, Sales Rep, 12/12/94) is an elementary fact. So too is the

proposition Salary(a, 50, 1000, usd, 1994). As we plan to use logic to define the

BULLION model, we shall view functional schemas and state functions in terms of predicates.
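For example, the elementary facts above might be written directly as logic programming facts, with each argument's type from Bunge's classification noted in a comment (a sketch; attribute values such as Sales Rep are written as atoms):

    % title(Object, Property, Time)
    title(a, sales_rep, date(12, 12, 94)).

    % salary(Object, Property, Scale, Unit, Time)
    salary(a, 50, 1000, usd, 1994).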

One useful characteristic of space-time variables not discussed by Bunge but which will

prove useful is the concept of granularity. Consider for example the day 4 July 1987. This date

refers to a 24-hour period within the month of July 1987 which is within the year 1987. This is an

example of a granularity hierarchy of time (Fig. 3.2(a)). The smallest granularity of time is an instant.

No time periods can therefore appear below an instant in a granularity hierarchy of time.

Similarly, "Chicago is in the USA" is an example of a granularity hierarchy of space (Fig. 3.2(b)). The

smallest granularity of space is a point. No space values can appear below a point in a granularity

hierarchy of space. The concept of granularity hierarchies of space and time will prove useful for

the purpose of defining conversion knowledge as we shall see in the next section.
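A granularity hierarchy can be captured by a handful of containment facts and a transitive rule, for instance (a sketch; the predicate names within/2 and contained_in/2 are our illustrative choices):

    % Immediate containment in the time and space hierarchies.
    within(day(4, jul, 1987), month(jul, 1987)).
    within(month(jul, 1987), year(1987)).
    within(chicago, usa).

    % Transitive closure of containment.
    contained_in(X, Z) :- within(X, Z).
    contained_in(X, Z) :- within(X, Y), contained_in(Y, Z).

    % ?- contained_in(day(4, jul, 1987), year(1987)).   succeeds.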

Finally, Bunge defines an event as a change in the state of a thing and is represented as an

ordered pair of states. For example, a promotion is an event where the title of an employee

changes.

Fig 3.2 Granularity Hierarchy of (a) Time (e.g. Jul 1987 within 1987) and (b) Space (e.g. New York and Chicago within the USA)

3.3 Defining the Shared Ontology

Semantic interoperability means that sources and receivers can exchange information in a

meaningful manner. Information is made up of propositions about a world of interest.

Propositions describe things, attributes and states etc. More specifically, our knowledge about

the world can be described in terms of the truth values we attach to such propositions.

The shared ontology represents the basis for the exchange of information among sources

and receivers within a Context Interchange system. It defines a common set of propositions in

which all relevant arguments are made explicit. Propositions within the ontology can be

converted to one another by means of deduction. Tuples in sources are mapped to propositions

of the shared ontology by means of context definition rules. Context definition rules for receivers

map propositions in the shared ontology to tuples in the receiver's view.

The process of transforming tuples in sources to tuples in a receiver's view, in a Context

Interchange system, can be briefly and intuitively described in the following manner. Tuples in

sources are first transformed, by means of context definition rules, to corresponding propositions

in the shared ontology. These propositions are then mapped to tuples in the receiver's view, also

by means of context definition rules. This might not always be possible because not all

propositions in the shared ontology can be mapped to tuples in a receiver's view. In this case,

these propositions must be converted, using deductive laws, to other propositions in the shared

ontology that might map to the tuples in the receiver's view. The details of this process will be

explained over the course of the remaining chapters of this thesis. In the remainder of this

chapter, we discuss the specification of a shared ontology, context definition rules and integrity

constraints.

A key task in the construction of Context Interchange systems is the definition of such a

shared ontology. In practice, we envision this task to be a collaborative process, involving

different parties interested in exchanging information about a common domain of interest. An

important point to understand about the specification of a shared ontology is that it need not, and

probably will not, be a "once and for all" process. This is because the definition of a shared

ontology also depends on "available knowledge". We cannot anticipate in advance all future

needs.

Furthermore, the definition of a shared ontology depends on our "abilities, goals and

even inclinations". So even if we did have all available knowledge about future needs, we may

not necessarily want to incorporate all the desired features into the ontology immediately for

pragmatic reasons. For example, even if we knew that we need to incorporate the new European

currency (i.e. ECU) and the respective conversions some time in the future, we may not want to

do so now since it will not be useful until then.

For these reasons, specifying the shared ontology will probably be an ongoing process in

which new knowledge is incrementally added "as needed". For example, new predicates can be

introduced into the shared ontology over time. In Context Interchange, the ability to anticipate

the needs of the federation is not as critical as the ability to adapt to future needs in a graceful

manner.

3.3.1 Defining Predicates and Arguments

The first step in defining the shared ontology is the definition of the global set of

elementary propositions for a universe of discourse. Sources and receivers must agree on this set

of definitions for the purpose of exchanging information. Defining these propositions

essentially boils down to agreeing on attributes or predicates of things and the appropriate

argument list. As discussed above, predicate arguments can be of type (1) object, (2) property, (3)

space, (4) time, (5) unit and (6) scale.

Let us consider, as an example, the definition of the closing_price predicate in the

shared ontology so that we can make available, to the federation, information such as that

contained in Source 3. Sources and receivers must agree on the meaning (i.e. intension) of this

predicate or attribute.

Furthermore, when sources and receivers communicate, they must be communicating

about some population of things or objects. Therefore, there must also be agreement on the

things to which this attribute applies. In this case, the closing_price attribute applies to a

class of things called stocks. Without a clear agreement on the particular things that form the

subject of a communication, meaningful information exchange can be hindered.

Then, there is the attribute value of the closing price of the stock itself. Bunge's

framework suggests that this might be viewed in terms of a real number (i.e. a property value), a

currency (i.e. a unit) and a scale value. Sources and receivers must consider these argument types

as part of the definition of the closing_price attribute. If, for example, the currency argument

is not defined as an argument of the predicate, ambiguity can arise when more than one currency

value is possible. On the other hand, if the entire federation of sources and receivers agree on a

single currency for closing prices, then there is no necessity to explicitly include the currency

argument. For our example, let us assume that various currencies are possible and therefore we

require the currency argument to be an explicit part of the definition of closing_price. Let us

also assume that for the entire federation, the scale value of all closing prices is one. Therefore,

there is no need to explicitly include the scale argument. For purely pragmatic reasons, we may

therefore leave the scale argument implicit, saving us the trouble and effort of specifying

additional arguments.

Some time in the future however, we may encounter sources and receivers with different

scale assumptions. In this case, a scalable and flexible system should be able to easily

accommodate such sources and receivers. Towards the end of the thesis, we will show how two

federations with differing and implicit scale assumptions can be easily integrated. For now,

therefore, Stock, Price and Currency form part of the argument list of closing_price.

Based on the framework described in Section 3.2, there is yet another argument to

consider, Date, which is a time argument. This is because closing prices of stocks vary with time.

If various dates are involved, we need to explicitly include the Date as part of the definition of

the closing price.

Finally, we note also that the same stock can be traded on different exchanges. For

example, IBM's stock is traded on the New York Stock Exchange as well as on the Tokyo Stock

Exchange. Depending on which exchange we are talking about, the closing price of IBM's stock

can differ. In Bunge's framework, the closing price attribute actually represents a mutual

property of two things, the stock and the exchange on which it is traded. Thus, we add an

argument Exchange to the closing_price predicate.

Note that this proposition is elementary in that it represents a minimal unit of

information with respect to the domain of interest. An elementary proposition needs to be

unambiguously defined with respect to the sources and receivers within a federation. This is

done by specifying the appropriate argument list of the predicates. So, we have just seen how

Bunge's framework aided us in identifying and defining fundamental agreements necessary for

the meaningful exchange of information by means of propositions.
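Summarizing, the agreed-upon elementary proposition has the following shape, shown here with one concrete instance that will reappear in Section 3.4 (the argument order follows that later discussion):

    % closing_price(Stock/object, Price/property, Exchange/object,
    %               Currency/unit, Date/time)
    closing_price(stk1, 55.00, nyse, usd, t1).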

3.3.2 Defining the Deductive Relationships among Propositions

Agreeing on the global set of propositions, however, is not enough. This is because there

are constraints on what propositions a receiver is willing to accept and what propositions a

source can represent. For example, while a particular source might have information

representing some properties of a thing, the receiver might be interested in other properties of the

same thing. Hence, where sources and receivers with multiple contexts are involved, the

propositions represented by sources may not always be acceptable "as is" by receivers. Sources

and receivers must, as a result, also agree to some means of converting propositions. Since state

laws interrelate the attributes of things, they are the only mechanism for conversion.

In BULLION, these state laws are called conversion rules or axioms. We shall also see

later on that these conversion axioms can also facilitate the definition of contexts. There are a

number of conversions that are highlighted in the literature including unit conversions and scale

conversions. We will now present examples of conversion axioms that may be specified within

this framework. The following examples, by no means comprehensive, serve to illustrate the

range of conversion possibilities that can be expressed within this framework, and which go well

beyond what has been described in current database integration literature. For clarity, the

predicate names and arguments which are being converted are highlighted in bold. Arguments

with upper-case first letters are variables; otherwise, they are constants. Furthermore, all

variables are assumed to be universally quantified. Each argument is followed by a slash (/) and

its corresponding type (in italics) as described within Bunge's framework.

(1) Conversion Axioms based on Generalization:

E.g. If trade_price is more general than nominal_trade_price,

nominal_trade_price(CompanyName/object, Price/property, Scale/scale, Currency/unit, Time/time) => trade_price(CompanyName/object, Price/property, Scale/scale, Currency/unit, Time/time).

(2) Conversion Axioms based on Unit conversions:

E.g. If 1 US dollar is 100 Yen,

revenue(CompanyName/object, Rev/property, Scale/scale, usd/unit, Yr/time) <=> revenue(CompanyName/object, Rev*100/property, Scale/scale, yen/unit, Yr/time).

Explanation: Observe that the unit conversion rules require that the predicate, object, scale and time remain unchanged for this conversion to be valid.

(3) Conversion Axiom based on Scale conversions:

E.g. For differing scales,

revenue(CompanyName/object, Rev/property, Scale1/scale, Currency/unit, Yr/time) <=> revenue(CompanyName/object, Rev*Scale1/Scale2/property, Scale2/scale, Currency/unit, Yr/time).

Explanation: Observe that scale conversion rules require that the predicate, object, unit and time remain unchanged for this conversion to be valid.

(4) Conversion Axioms based on an Aggregation Hierarchy:

E.g. For an aggregation such as that described in Fig. 3.1,

speed(computer/object, Speed/property, Scale/scale, SpeedUnit/unit) <=> speed(cpu/object, Speed/property, Scale/scale, SpeedUnit/unit).

Explanation: The cpu is a part of the computer in this aggregation hierarchy. In this example, the speed of a computer is an inherited attribute from the cpu.

(5) Conversion Axioms going down the Time hierarchy:

E.g. For a time hierarchy such as that described in Fig. 3.2(a),

title(Employee/object, Title/property, 1987/time) => title(Employee/object, Title/property, jun 1987/time).

Explanation: "Employee possessed Title for all of 1987" means that "Employee possessed Title for all of jun 1987".

(6) Conversion Axioms for going up the Time hierarchy:

E.g. For a time hierarchy such as that described in Fig. 3.2(a),

birth(Person/object, 4 jul 1958/time, Place/space) => birth(Person/object, jul 1958/time, Place/space).

Explanation: Since Person was born at some instant within the day 4 Jul 1958, it is also true that Person was born at some instant within the month Jul 1958.

(7) Conversion Axioms for going up the Space Hierarchy:

E.g. For a space hierarchy such as that described in Fig. 3.2(b),

birth(Person/object, Date/time, chicago/space) => birth(Person/object, Date/time, usa/space).

Explanation: Since Person was born at some point within Chicago, Person was therefore born at some point in the USA.

(8) Conversion Axioms among different Quantitative Amounts:

E.g. Those involving addition and subtraction,

revenue(CompanyName/object, Rev/property, Scale/scale, Currency/unit, Yr/time) ∧ expense(CompanyName/object, Exp/property, Scale/scale, Currency/unit, Yr/time) => profit(CompanyName/object, Rev-Exp/property, Scale/scale, Currency/unit, Yr/time).

Observe that the last conversion rule not only expresses the arithmetic relationship among the

predicates revenue, expense and prof it, but also states that this relationship holds only if

these predicates refer to the same company and have the same currency, time period and scale

value. This ability to state the conditions under which a conversion holds is critical for the

context mediator to perform the correct conversions and conversion axioms allow us to do this.
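To suggest how such axioms might be operationalized, two of the one-directional axioms above can be encoded as logic programming clauses (a sketch; the handling of bidirectional axioms and of arithmetic is deferred to the formal development in later chapters):

    % Axiom (1): generalization; every nominal trade price is a trade price.
    trade_price(Co, Price, Scale, Cur, Time) :-
        nominal_trade_price(Co, Price, Scale, Cur, Time).

    % Axiom (7): going up the space hierarchy.
    birth(Person, Date, usa) :-
        birth(Person, Date, chicago).

    % Given nominal_trade_price(c1, 55.00, 1, usd, t1), the query
    % ?- trade_price(c1, P, 1, usd, t1). yields P = 55.0.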

Having defined the shared ontology, we need to relate propositions associated with

sources and receivers to propositions in the shared ontology. This is done by means of context

definition rules.

3.4 Context Definition Rules and Integrity Constraints

In contrast to the definition of a shared ontology, we envision the specification of context

definition rules to be a more localized process. That is, such a task will probably be performed by

the local database administrator. This task is analogous to the process of concept matching in

Carnot [21]. To understand, in principle, how a local database administrator would perform such

a task, let us consider defining the context of Source 3 as an example.

The local database administrator must first determine the predicates in the shared

ontology which accurately describe the information in Source 3. If such predicates do not exist,

they must then be introduced into the shared ontology. In this case, the appropriate predicate is

closing_price, which was described in Section 3.3. Recall that closing_price has five

arguments: Stock, Price, Exchange, Currency and Date. Remember also that the scale value

is implicitly assumed to be one.

We can now describe the context of Source 3 in Bunge's terms (i.e. (S,P,D)) using the

closing_price predicate of the shared ontology. The closing_price predicate belongs in P. All the possible values of the arguments of closing_price are in D (i.e. D = {stk1, stk2, ..., nyse, ..., t1, 0.00, ..., 55.00, ..., usd, yen}). Recall that all closing prices of stocks in Source 3 are for the time t1. Furthermore, stocks traded on nyse must have currency in usd. The proposition closing_price(stk1, 55.00, nyse, usd, t1) is an example of a proposition that satisfies all these restrictions. We say that this proposition is allowed within this context. S represents the set of allowable propositions that can be expressed by a source, in this case Source 3. Hence, the propositions represented as closing_price(Stock, Price, nyse, usd, t1), where the variable arguments Stock and Price range over the appropriate values in D, are allowed in this context. The variable argument Stock, for example, ranges over the values stk1, stk2 and so on.

The schema of Source 3 is s3_closing_price(Stock, Price, Exchange). The local database administrator now determines that the three arguments in the local schema correspond to the first three arguments in the closing_price predicate. However, the two remaining arguments of the closing_price predicate need to be instantiated, i.e. Currency and Date. In fact, these arguments have been implicit in Source 3 all along. The closing_price predicate alerts the local database administrator to implicit assumptions that have to be made explicit for meaningful information exchange. The context of the source can, therefore, be conveniently defined by the local database administrator in terms of the following context definition rule:

s3_closing_price(Stock, Price, nyse) => closing_price(Stock, Price, nyse, usd, t1).

A context definition rule maps tuples in sources to propositions in the shared ontology.

Similarly, the restriction that the closing prices for all stocks traded on tokyo are in yen can be

represented by another context definition rule

s3_closing_price(Stock, Price, tokyo) => closing_price(Stock, Price, tokyo, yen, t1).

Note that since conversion of propositions takes place from the left hand side of the above rules

to the right hand side, the source proposition is on the left, while the corresponding proposition

of the shared ontology is on the right.

We have also indicated that traditional integrity constraints (e.g. domain constraints) are

subsumed by the notion of a context. Integrity constraints can be easily specified as part of a

context definition rule. For example, the above context definition rule can be modified as follows

(modification in bold)

s3_closing_price(Stock, Price, nyse) ∧ ((Stock=stk1) ∨ (Stock=stk2)) => closing_price(Stock, Price, nyse, usd, t1).

This rule restricts the domain of Stock to stk1 and stk2 for stocks traded on nyse. Now

suppose a tuple that violates this domain constraint is introduced into Source 3. This tuple

wrongly states that stk3 was traded on nyse (instead of tokyo). The above conversion rule,

however, will not fire and no conversion will take place. Thus specifying constraints can prevent

erroneous deductions.

If however, the constraints in Source 3 are satisfied, then the rule becomes

s3_closing_price(Stock, Price, nyse) ∧ true => closing_price(Stock, Price, nyse, usd, t1).

which then reverts to the original rule. Thus, if the constraints of sources are always

satisfied, specifying integrity constraints as part of a context definition rule makes no difference

from the point of view of logic. We might not, therefore, bother to specify the integrity

constraints of sources in context definition rules. From the efficiency point of view however,

specifying constraints is valuable because it facilitates semantic query optimization. We will

return to this topic later in the thesis.
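Rendered as a logic programming clause, the constrained context definition rule above might read (a sketch; the disjunction in the body encodes the domain constraint):

    % Source context definition rule with an integrity constraint.
    closing_price(Stock, Price, nyse, usd, t1) :-
        s3_closing_price(Stock, Price, nyse),
        (Stock = stk1 ; Stock = stk2).   % domain constraint on nyse stocks

A tuple wrongly asserting that stk3 trades on nyse then simply fails the constraint, and no global proposition is derived from it.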

Now consider Receiver 2 with schema r2_closing_price(Stock, Price, Exchange). Furthermore, Receiver 2 expects all closing prices to be in usd and for the time period t1. The context definition rule for the receiver is

closing_price(Stock, Price, Exchange, usd, t1) => r2_closing_price(Stock, Price, Exchange).

Since we are converting global propositions to receiver propositions, global propositions appear

on the left-hand side of the rule while the receiver propositions are on the right.

Recall that Receiver 2 expects information only on stocks traded on nyse and tokyo. If

this assumption is violated, the receiver might compute erroneous statistics. For example, the

receiver might be using this information to calculate the total value of stocks on the two

exchanges. If stocks traded on other exchanges entered into the computation, the result will be

incorrect. In order to ensure that this expectation is satisfied, the above context definition rule

can be modified as follows (modification in bold)

closing_price(Stock, Price, Exchange, usd, t1) ∧ ((Exchange=nyse) ∨ (Exchange=tokyo)) => r2_closing_price(Stock, Price, Exchange).

Now, only information on closing prices of stocks traded on nyse and tokyo can be converted to

receiver propositions.
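Putting the source rules, a conversion axiom and the receiver rule together gives a small but complete chain (a sketch; the facts are the Source 3 examples, and the fixed rate of 100 yen per usd is an illustrative assumption):

    % Source 3 tuples.
    s3_closing_price(stk1, 55.00, nyse).
    s3_closing_price(stk3, 5500.00, tokyo).

    % Source context definition rules.
    closing_price(S, P, nyse, usd, t1) :- s3_closing_price(S, P, nyse).
    closing_price(S, P, tokyo, yen, t1) :- s3_closing_price(S, P, tokyo).

    % Conversion axiom from the shared ontology: yen to usd.
    closing_price(S, Pusd, E, usd, T) :-
        closing_price(S, Pyen, E, yen, T),
        Pusd is Pyen / 100.

    % Receiver context definition rule.
    r2_closing_price(S, P, E) :-
        closing_price(S, P, E, usd, t1),
        (E = nyse ; E = tokyo).

    % ?- r2_closing_price(S, P, E).
    % S = stk1, P = 55.0, E = nyse ;
    % S = stk3, P = 55.0, E = tokyo.

Note that the tokyo price reaches the receiver only because the conversion axiom mediates between the source's yen context and the receiver's usd context.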

3.5 Simplifying the task of Domain Definition

We note that these domain restrictions have been defined by enumeration. That is, we

explicitly stated which elements were allowed in a particular domain. Enumeration can be cumbersome if the list of elements is very long. However, if a domain is stable, we need only

specify it once as part of a context definition. Since the context definition is specified outside the

confines of a query, a receiver need not restate domain restrictions each time a query is issued,

saving the receiver a lot of effort.

Furthermore, if a particular domain is considered "standard" knowledge throughout the

federation, it can be defined in one place and sources and receivers can re-use this definition. For

example, if stk1 and stk3 are stocks of computer companies, we can assert this as

computer_stock(stk1).

computer_stock(stk3).

By making these facts globally available to the federation (say by placing them in the shared

ontology), these definitions can be re-used to specify domains more conveniently. For example, if

Receiver 2 changes its context definition and accepts only closing prices of stocks of computer

companies, the new context definition becomes

closing_price(Stock, Price, Exchange, usd, t1) ∧ computer_stock(Stock) => r2_closing_price(Stock, Price, Exchange).

Note that no enumeration of individual stocks is required in this case.

In sum, we have described the process of context definition. Furthermore, the ability to

specify constraints as part of context definition rules ensures a higher integrity of the answers.

For sources, specifying constraints facilitates semantic query optimization. By making general

knowledge available to the federation through the shared ontology, the task of defining

constraints can be further simplified. Defining constraints outside the confines of a query also

makes life easier for a receiver.

In the next chapter, we integrate the notion of a shared ontology and context definition

rules into a proof theoretic specification of a Context Interchange system. More specifically, we

formalize these ideas in terms of a deductive database framework [29]. In such a framework, one

or more relations share a set of deductive laws. From the point of view of the BULLION model,

these laws take on special significance in that they are the laws from the shared ontology and the

context definition rules.

4 BULLION: A Proof Theoretic Specification of Context Interchange

To date, there has been a proliferation of proposals for approaches (e.g. [3, 4, 21, 25, 36, 46, 64, 67]) to semantic interoperability. These proposals focus on implementation.

The range of such dissimilar implementations is broad, emphasizing various aspects of the

problem of semantic interoperability. To complicate matters, the range of representation schemes

within each system is potentially broad too. This is because interoperable systems can contain

sources, receivers and global media (e.g. global schemas [4] and ontologies [21]) based on

different data models and representation schemes. Important assumptions made by these varied

representation schemes can affect meaningful data retrieval, and are often buried within the

structure and procedures of these representations. The problem is that we currently lack a means

with which to understand, analyze and compare these approaches. More specifically, we lack

abstract, formal descriptions of the cognitive task of semantic interoperation, independent of

idiosyncratic implementation details.

Therefore, an abstract specification of a model of semantic interoperability, based on the

Context Interchange Architecture, is presented in this chapter. This model, which incorporates

the various insights of the previous chapter, is referred to as the BULLION model. BULLION is a

model of what Context Interchange systems should do and represents a theoretical ideal.

BULLION should not be confused with an implementation which may fall short of the theoretical

ideal because of practical limitations. For example, a specification defines the set of all answers to

be returned in response to a query. However, for efficiency reasons, a particular implementation

may choose to sacrifice the completeness of the answers returned. BULLION is a model at the

knowledge level.

4.1 The Knowledge Level: Background and Motivation

The knowledge level is a concept first introduced by Newell [61]. Newell distinguishes the

knowledge level from the symbol level. The knowledge level is concerned with what knowledge is

being represented, not how. How knowledge is to be represented is a symbol level issue. A

(knowledge) representation is a structure at the symbol level. Knowledge serves as a specification

for what a symbol structure should do. The knowledge level is useful because it permits predicting

and understanding system behavior without an operational model of the processing that is

actually being done by the system. According to Newell, logic is a structure at the symbol level

uniquely fitted to the analysis of knowledge and representation. In the area of knowledge

representation, for example, Levesque used logic to provide a knowledge level analysis of a

fundamental trade-off in various knowledge representation schemes such as databases, semantic

nets, production systems and frames [51].

In his landmark paper where he proposed a proof theoretic view of relational databases

[62], Reiter also argued for the need for specification in the context of conceptual modeling of

databases. A chief factor which motivated his argument was the proliferation of proposals for

data models "served up without benefit of an abstract specification of the assumptions made by

that data model about the world being modeled". Without a precise specification, the semantics

of database operations may be unclear and there is no basis to prove the correctness of database

operations. He argued that first order logic, specifically proof theory, is ideal as a specification

language because it has precise and unambiguous semantics. It therefore provides a rigorous

specification of meaning. Moreover, it does so at a very high level of abstraction, in the sense that

the specification is entirely nonprocedural. It tells us what knowledge is being represented. A

logical data model is transparent in that all and only the knowledge being represented is open for

inspection, including assumptions that might otherwise be buried in data model operations (e.g.

the domain closure axiom and the completion axioms are embedded in the division and set difference

operator respectively of the relational model). First order proof theory also provides a conceptual

advantage in that it possesses representational and operational uniformity. Proof theory has

representational uniformity in that queries, integrity constraints and facts are represented by the

same first order language. Proof theory has operational uniformity in that first order proof

theory is the sole mechanism for query evaluation and the satisfaction of integrity constraints.

Proof theory provides a specification which can be realized by a variety of procedurally oriented

data models. Conversely, nonlogical data models can be given precise semantics by translating

them into logical terms. Furthermore, because of the uniformity of logic, different data models

can be compared and non proof theoretic query evaluation algorithms may be proven correct

with respect to the logical semantics of queries. The generality of proof theory as a specification

language is based on the observation that the sort of real world semantics data models attempt to

capture are first order definable. That is, various kinds of databases (e.g. those that contain

incomplete information and incorporate more real world semantics) can be characterized as

special types of first order theories.

One of Reiter's chief criticisms of the proliferation of data models concerns the definition

of an answer to a query. "Insofar as the concept of an answer to a query is defined at all, it is

defined operationally, for example by a generalization of the relational algebra, or by some set of

retrieval routines which may or may not perform inferences. Now these data models are

complicated. Therefore these operational definitions for answers to queries are also complicated.

Why should one believe that these definitions are correct...(and) complete?" [62: p221]. Reiter's

main point is that "no matter how one extends data models...the definition of an answer to a query

remains the same...". Therefore, given an abstract definition of an answer, say in proof theoretic

terms, "we can prove the correctness of proposed query evaluation algorithms". However, this does not

mean that a query evaluation algorithm must resemble proof procedures. In fact, the relational

algebra is one such algorithm that has been proven complete and correct and yet it "looks nothing

like proof theory".

Levesque [50], referring to the proof theoretic view of relational databases, states that:

"The point of this translation exercise is not that we can sometimes reduce first order logic to

relational databases, but rather that we can explain the information content of simple databases

as a certain restricted type of first order theory in such a way that answers to queries are

implicit in the theory (i.e. logical consequences of it). This is not to suggest that we might want to

answer a query by doing theorem proving of some sort; that would be like cracking an egg with a

hammer. Rather, it is precisely because first order theory is restricted in a certain way (namely,

that it is in what I have called "DB form") that something like resolution is not necessary, and

simple database retrieval is sufficient for sound and complete logical inference. What the first

order account gives us is what the answers to queries should be, not how to compute them (at least

in this case). In other words, we are interested in using first order logic at the knowledge level,

not at the symbol level. ...What counts here is using a sufficiently general knowledge

representation language. ...What we get from first order logic (and the only thing that concerns

us here) is a reasonably general and precise language of sentences and a clear way of understanding

the propositions expressed by complex sentences..."

The need for specification and the value of logic as a specification language clearly apply

to the goal of achieving semantic interoperability. The current proliferation of database

integration implementations, not unlike the "embarrassing" proliferation of data models observed

by Reiter, suffers from a lack of a rigorous specification. Unfortunately, a specification is often

equated or confused with the operational model of the computation itself. A specification is

especially needed and particularly difficult to achieve in complex and heterogeneous

environments where sources, receivers and global media are based on different representation

schemes. Knowledge within disparate sources is embodied not only by the data model

structure but also by data model operations. Meaningful integration of the knowledge of

disparate sources requires first an understanding of the knowledge contained within these

sources. Brachman and Levesque [7] observed that "A knowledge level analysis is especially

important when considering combining representation schemes. Indeed one can imagine

wanting to combine schemes with wildly different and incomparable implementation structures

on the one hand and organizational facilities on the other. In this case, the only common ground

between the schemes might be that they represent related kinds of knowledge about the world.

This might be true even for the integration of two databases, arguably the simplest kinds of

knowledge bases". Proof theory provides a general, uniform, explicit and precise means of

specifying varied elements of interoperable systems. Specifying interoperable systems in proof

theoretic terms will greatly aid understanding, evaluation, comparison and subsequent design of

such systems.

4.2 Features and Benefits of the Specification

Our specification of the BULLION model is cast in proof theoretic terms. More

specifically, we formalize BULLION within the theoretical framework of deductive databases [28,

29, 55, 62]. Within such a framework, one or more relations share a set of deductive laws. From

the point of view of BULLION, these deductive laws take on special significance. That is, these

laws are made up of the laws of the shared ontology (such as those described in Chapter 3), as

well as context definition rules. For each relation, there is an associated set of context definition

rules. Similarly, each receiver has its own set of context definition rules.

BULLION has a number of noteworthy features. First, it draws upon theories from philosophy [9], AI [61] and Reiter's work on logic and databases [62]. The second feature is the inclusion of the notion of a context as defined in [9]. This notion of a context incorporates domains and traditional integrity constraints as integral components. Third, our model allows

for heterogeneous contexts, in particular, different relations and receivers can have different

contexts. Fourth, our model incorporates the notion of a shared ontology thus clearly defining its

role in facilitating semantic interoperability. Fifth, our specification does not require the source

or receiver to sacrifice autonomy. Finally, this specification provides a formal definition of an

answer to a query in a Context Interchange system. As shown in Fig. 1.7, this specification defines

the relationship between the output answer (6) and the inputs (1-5). As this specification is at the

knowledge level, it ignores issues at the symbol level. That is, this specification defines the

relationship between the knowledge embodied by the inputs and the knowledge embodied by the

output.

There are several key benefits of this work. The first is of course a specification which

serves to formalize and communicate the key ideas behind the Context Interchange Approach,

unencumbered by details, limitations, compromises and artifacts of idiosyncratic

implementations. Such a specification is valuable because it provides a rigorous basis and

reference for alternative implementations of Context Interchange systems by defining a

correctness criterion, as well as the semantics of such systems. In particular, it provides clear

definitions of what we mean by a context and an ontology and their roles in constructing answers

to a query.

Furthermore, because BULLION is couched in the formal framework of logic, we can

draw upon research developed by the logic and database community for the theoretical

understanding and analysis of BULLION systems. Gallaire et al. claimed that "Logic, we believe,

provides a firm theoretical basis upon which one can pursue database theory in general." [29].

Finally, this work also contributes in two ways to database integration efforts in general.

First, we have provided an exemplar of how a logical specification for interoperable systems may

be achieved. Similar specifications may also be developed for other types of interoperable

systems. As a result, we have a rigorous means of understanding, comparing and evaluating

these systems. Second, we have provided precise, high-level definitions of certain key elements, such as context and ontology. These concepts can subsequently be more easily

understood and adopted by other database integration approaches if desired and implemented

accordingly.

4.3 The Logical View of Databases: A Review

We now briefly review Reiter's reformulation [62] of a relational database in proof

theoretic terms. This reformulation is the foundation for the field of deductive databases. We

begin with a discussion of first order languages. Then, the model theoretic view of relational

databases, which is the view initially associated with the Relational data model, is presented. Finally, we present Reiter's reformulation of a relational database in proof theoretic terms.

4.3.1 First Order Languages

Following Reiter [62], let F = (A, W) be a first order language, where A is an alphabet

containing constant symbols, predicate symbols, variable symbols, function symbols, punctuation

signs and logical constant symbols. For a relational database however, function symbols are not

permitted. W is a set of well-formed formulas (wffs) which obey the grammatical rules of first

order predicate calculus. Wffs are simply syntactically legal first-order logic sentences. A ground

formula is a wff that contains no variables.

There is a distinguished predicate symbol, =, in A which will function as the equality predicate. There is also a distinguished subset of A, possibly empty, of predicate symbols that stand for special unary predicates called simple types. For example, a constant such as stk1 is of type stock, i.e. stock(stk1).

Furthermore, if τ1 and τ2 are types, then (τ1 ∧ τ2), (τ1 ∨ τ2) and ¬τ1 are also types. ∀x is a universal quantifier and ∃x is an existential quantifier. For a type restricted universal quantifier ∀x/τ and a wff ω, (∀x/τ)(ω) should be read as "For all x of type τ, ω is the case". For a type restricted existential quantifier ∃x/τ, (∃x/τ)(ω) should be read as "There exists an x of type τ such that ω is the case".

The interpretation I for the first order language F is (D, K, E). K is a one-to-one and onto mapping from the constant symbols of A into D. This is a logician's way of saying that the constant symbols in A are assigned meanings, by the mapping K, from the domain D. Since K is a one-to-one and onto mapping, each and every constant symbol has a unique meaning. E is a mapping from the predicate symbols of A into sets of tuples of elements of D. That is, for an n-ary predicate symbol p, E(p) ⊆ Dⁿ is the extension of p in I.

4.3.2 Relational Databases: The Model Theoretic View

In the model theoretic view, a relational database DB is defined as (F, IC, I). The alphabet A, of F, does not contain function symbols. IC ⊆ W is a set of integrity constraints. In particular, it is required that for each n-ary predicate symbol p (distinct from = and the type predicate symbols), IC must contain a wff of the form

∀x1...∀xn [p(x1,...,xn) => τ1(x1) ∧ ... ∧ τn(xn)].

Essentially, this wff defines the domains of the predicate p. The set of integrity constraints IC is

said to be satisfied if I is a model of IC, i.e. IC is true in I. Fig. 4.1 shows the corresponding model

theoretic view of Source 3.

Queries are defined with respect to a given language F. Specifically,

Definition: A query Q for F = (A, W) is any expression of the form <x1/τ1,...,xn/τn | W(x1,...,xn)> where:

1. x1,..., xn are variables of A.
2. Each τi is a type of A.
3. W(x1,...,xn) ∈ W.
4. The only free variables are among x1,...,xn.
5. All quantifiers occurring in W(x1,...,xn) are type restricted.

Alphabet A
Predicate symbols: s3_closing_price(,,), stock(), price(), exchange().
Simple Types: stock(), price(), exchange().
Constant symbols: stk1, stk2, ..., nyse, tokyo, ...

Integrity Constraints
(∀x1∀x2∀x3) s3_closing_price(x1,x2,x3) => stock(x1) ∧ price(x2) ∧ exchange(x3);

Extension of type predicates
stock: stk1, ..., stk4.
price: 40.00, 55.00, ...
exchange: nyse, tokyo.

Extension of s3_closing_price
(stk1, 55.00, nyse),
(stk2, 40.00, nyse),
(stk3, 6330.00, tokyo),
(stk4, 6531.00, tokyo);

Fig. 4.1 Model Theoretic View of Source 3

For example, the query against Source 3

<Stock/stock | s3_closing_price(Stock, Price, Exchange) ∧ (Exchange=nyse)>

is equivalent to the SQL query

select Stock
from s3_closing_price
where Exchange = "nyse".

Definition: A tuple of constants <c1,...,cn> of F's alphabet is an answer to Q with respect to DB = (F, IC, I) iff

1. τi(ci) is true in I for i=1,...,n.
2. W(c1,...,cn) is true in I.

<stk1> and <stk2> are answers to the above query.

4.3.3 Relational Databases: The Proof Theoretic View

Unfortunately, the model theoretic view is limited in modeling databases such as those

that have incomplete information and those that incorporate more real world knowledge [62].

However, in the proof theoretic paradigm, which we are about to review, such databases can be

viewed as special kinds of first order theories. The prime advantage of the proof theoretic view,

over the model theoretic one, is therefore its capacity for generalization. The advantages of the

proof theoretic view are discussed in [55: p. 147; 62]. For a relatively simple and useful tutorial on the model theoretic versus the proof theoretic view of databases, the reader may refer to [22: pp. 641-682].

In the proof theoretic paradigm, there are two basic rules of inference: modus ponens and

generalization. In this paradigm, a relational database is defined as (F, IC, T) where F = (A, W) is a first order language as defined above, IC is a set of integrity constraints and T ⊆ W is a first order theory containing the following axioms:

1. Domain Closure Axiom (DCA):
∀x[=(x,c1) ∨ ... ∨ =(x,cn)] where c1,...,cn are all the constant symbols in A.
The domain closure axiom simply defines the set of all constant symbols that may be used to instantiate a particular database. Equivalently, it defines the scope of universal quantification. Thus, when we say "for all x", we are referring to all the constant symbols in the domain closure.

2. Unique Name Axioms (UNA):
¬=(c1,c2) ∧ ... ∧ ¬=(cn-1,cn).
These axioms state that different symbols have different meanings, i.e. the constant symbols in the alphabet A are pairwise distinct. This is consistent with the one-to-one mapping K defined in the model theoretic view.

3. Equality Axioms (EA): Reflexivity, commutativity, transitivity and the principle of substitution of equal terms. These axioms give the equality predicate its usual meaning.

4. The set of all ground atomic axioms corresponding to the extensions of the predicates (excluding =). Each tuple in an extension of a predicate corresponds to a ground fact in proof theory.

5. Completion Axioms (CA) for each predicate symbol (excluding =): Suppose the extension of an m-ary predicate p is {(c1(1),...,cm(1)), ..., (c1(r),...,cm(r))}; then the completion axiom for p is

∀x1...∀xm [p(x1,...,xm) => [=(x1,c1(1)) ∧ ... ∧ =(xm,cm(1))] ∨ ... ∨ [=(x1,c1(r)) ∧ ... ∧ =(xm,cm(r))]].

The set of completion axioms is a formalization of the closed world assumption. It allows us to represent negative facts in a compact fashion. As a simple example, let p be a unary predicate and the domain closure axiom be ∀x[=(x,a1) ∨ =(x,a2) ∨ =(x,a3)]. Furthermore, let the positive ground axioms of p be p(a1) and p(a2). The completion axiom for p is therefore ∀x[p(x) => =(x,a1) ∨ =(x,a2)]. Note that a3 is in the domain closure. By substituting a3 for x in the completion axiom (i.e. by instantiating the universally quantified variable), we get

p(a3) => =(a3,a1) ∨ =(a3,a2)

Then, by the contrapositive rule¹, we get

¬=(a3,a1) ∧ ¬=(a3,a2) => ¬p(a3)

Then, using modus ponens and the unique name axioms, we can prove ¬p(a3). Thus, since a3 is in the domain closure but p(a3) is not asserted as true, we can show ¬p(a3).
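The same inference is operationalized by negation as failure in a logic program; a minimal Prolog sketch:

% Positive ground axioms of p.
p(a1).
p(a2).

% a3 is in the domain closure but p(a3) is not derivable, so the
% query \+ p(a3) succeeds, mirroring the proof of ¬p(a3) above.
% ?- \+ p(a3).
% true.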

The completion axiom for Source 3 is

∀Stock ∀Price ∀Exchange [s3_closing_price(Stock, Price, Exchange) =>
[=(Stock,stk1) ∧ =(Price,55.00) ∧ =(Exchange,nyse)] ∨
[=(Stock,stk2) ∧ =(Price,40.00) ∧ =(Exchange,nyse)] ∨
[=(Stock,stk3) ∧ =(Price,6330.00) ∧ =(Exchange,tokyo)] ∨
[=(Stock,stk4) ∧ =(Price,6531.00) ∧ =(Exchange,tokyo)]].

The set of integrity constraints IC is satisfied iff it is provable from T (i.e. T ⊢ IC).

It is useful to note another advantage of the proof theoretic view over the model theoretic view: it makes explicit the assumptions implicit in the model theoretic view (i.e. axioms 1, 2 and 5 above). Fig. 4.2 shows the corresponding proof theoretic view of Source 3.

The definition of a query for the proof theoretic view remains unchanged. The definition of an answer to a query, however, is modified and is given by:

Definition: A tuple of constants <c1,...,cn> of F's alphabet is an answer to query Q with respect to DB = (F, IC, T) iff

1. T ⊢ τi(ci) for i=1,...,n.
2. T ⊢ W(c1,...,cn).

For the purpose of our specification, we will adopt the proof theoretic view of relational

databases.
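To make the definition concrete, the earlier query against Source 3 can be rendered in Prolog (anticipating Chapter 5; a sketch with the type axioms omitted for brevity):

% Ground axioms of s3_closing_price as Prolog facts.
s3_closing_price(stk1, 55.00, nyse).
s3_closing_price(stk2, 40.00, nyse).
s3_closing_price(stk3, 6330.00, tokyo).
s3_closing_price(stk4, 6531.00, tokyo).

% The query <Stock/stock | s3_closing_price(Stock, Price, Exchange)
% ∧ (Exchange=nyse)> becomes:
% ?- s3_closing_price(Stock, _Price, nyse).
% Stock = stk1 ;
% Stock = stk2.

The tuples provable from the theory are exactly <stk1> and <stk2>, as in the model theoretic case.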

4.4 A Logical View of a Context

We now present a formalization of Bunge's notion of a context, (S,P,D), in logical terms.

Let us first consider the context of Source 3. Let Cs3 = (Ss3, Ps3, Ds3) be the context of Source 3.

¹ The contrapositive rule is a rule derived from modus ponens and generalization. It states that we can infer ¬B => ¬A from A => B.

Alphabet A
Predicate symbols: s3_closing_price(,,), stock(), price(), exchange().
Simple Types: stock(), price(), exchange().
Constant symbols: stk1, stk2, ..., nyse, tokyo, ...

Integrity Constraints
(∀x1∀x2∀x3) s3_closing_price(x1,x2,x3) => stock(x1) ∧ price(x2) ∧ exchange(x3);

Domain Closure Axiom
∀x[=(x,stk1) ∨ ... ∨ =(x,tokyo)]

Unique Name Axioms
¬=(stk1,stk2) ∧ ... ∧ ¬=(nyse,tokyo)

Equality Axioms

Ground Axioms of type predicates
price(0.00), ..., stock(stk1), stock(stk2), stock(stk3), stock(stk4), exchange(nyse), exchange(tokyo);

Ground Axioms of s3_closing_price
s3_closing_price(stk1, 55.00, nyse),
s3_closing_price(stk2, 40.00, nyse),
s3_closing_price(stk3, 6330.00, tokyo),
s3_closing_price(stk4, 6531.00, tokyo);

Completion Axioms
(As described in the text of Section 4.3.3.)

Fig. 4.2 Proof Theoretic View of Source 3

Cs3 is specified in terms of a first order language Fs3. The symbols of Fs3 are a subset of the symbols used in the language of the shared ontology, called Fg. To formalize this, we introduce the following

Definition: Let F1 = (A1, W1) and F2 = (A2, W2) be two first order languages. Then F1 subsumes F2 iff A2 ⊆ A1.

We therefore say that Fg subsumes Fs3.

The closing_price predicate as described in Chapter 3 is an element of Ps3. Recall also that closing_price had five arguments, i.e. Stock, Price, Exchange, Currency and Date. These are represented by the type predicates stock, price, exchange, currency and date respectively. These type predicates also belong in Ps3.

All the possible constants that can be represented in this context are in Ds3. This is

formalized by means of the domain closure axiom. Furthermore, there are no naming conflicts

among the constant symbols of Fs3. That is, different constant symbols stand for different

meanings and a constant symbol always has one meaning. This is formalized by the unique

name axioms and the equality axioms.

Moreover, each constant symbol is associated with a type predicate. Thus, stk1 is a constant symbol of type stock. This is formalized by means of ground axioms of the type predicates, e.g. stock(stk1). The only constants that are of type stock are stk1, stk2, stk3 and stk4. Other constants are not of type stock. This is formalized by the completion axioms for the stock predicate.

Ss3 is the set of propositions that are allowed within this context. The restrictions on Ss3 are formalized by means of a set of integrity constraints. For example, the restriction on the types of arguments the closing_price predicate can take is represented as

(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,x3,x4,x5) => stock(x1) ∧ price(x2) ∧ exchange(x3) ∧ currency(x4) ∧ date(x5);

Recall also that the currency of the closing prices depends on the exchange on which the stocks are traded. This restriction is represented by the following two constraints:

(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,nyse,x4,x5) => =(x4,usd);
(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,tokyo,x4,x5) => =(x4,yen);

The context of Source 3 defined in logical terms is shown in Fig. 4.3.

In general, a context is defined in logical terms by specifying a first order language Fc, and the following axioms in terms of this language:

1. Domain closure axiom (DCA).

2. Unique name axioms (UNA).

3. Equality axioms (EA).

4. Ground axioms of type predicates (TA).

5. Completion axioms of the type predicates (TCA).

6. A set of integrity constraints (IC).

Any fact asserted in terms of Fc within this context must be consistent with the above axioms. For example, the fact closing_price(stk1, 55, nyse, yen, t1) violates one of the constraints and is therefore not allowed in this context. This is because all closing prices of stocks traded on nyse must be in usd.

Formally, a context can be defined in logical terms as follows

Definition: Let Fc be a first order language. A context C is defined as (Fc, Tc) where Tc = DCA ∪ UNA ∪ EA ∪ TA ∪ TCA ∪ IC.

Next, we define what we mean when we say that a proposition is allowed in a particular context.

Definition: Let C = (Fc, Tc) be a context. Furthermore, let ω be a wff of Fc. ω is allowed in the context C iff Tc ∪ {ω} is a consistent theory. Otherwise, ω is not allowed in context C.

Context definition rules for a source must map tuples in the source to corresponding propositions that are allowed in the source context. In specifying context definition rules, we must ensure that the context concerned is not violated. Thus, the following context definition rule for Source 3 from Chapter 3

∀Stock ∀Price s3_closing_price(Stock, Price, nyse) => closing_price(Stock, Price, nyse, usd, t1)

will ensure that closing prices of stocks traded on nyse are associated with the appropriate currency, i.e. usd.

Alphabet of Fs3
Predicate symbols: closing_price(,,,,), stock(), price(), exchange(), currency(), date();
Simple Types: stock(), price(), exchange(), currency(), date();
Constant symbols: stk1, ..., stk4, nyse, tokyo, usd, yen, ...;

Domain Closure Axiom
Unique Name Axioms
Equality Axioms

Ground Axioms of type predicates
price(0.00), ..., price(10000.00), date(t1), stock(stk1), stock(stk2), stock(stk3), stock(stk4), currency(usd), currency(yen), exchange(nyse), exchange(tokyo);

Completion Axioms of type Predicates

Integrity Constraints
(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,x3,x4,x5) => stock(x1) ∧ price(x2) ∧ exchange(x3) ∧ currency(x4) ∧ date(x5);
(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,nyse,x4,x5) => =(x4,usd);
(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,tokyo,x4,x5) => =(x4,yen);

Fig. 4.3 A Logical View of the Context of Source 3

The context for a receiver can be similarly defined. As an example, the logical formulation of the context of Receiver 2, defined in terms of the language Fr2, is shown in Fig. 4.5. A receiver will only accept propositions that are allowed in its context. The proposition closing_price(stk1, 55, nyse, yen, t1), for example, is not allowed in the context of the receiver because the only currency in its context is usd. The following context definition rule for Receiver 2 will ensure that only closing prices in usd will be converted to tuples in the receiver's view.

∀Stock ∀Price ∀Exchange closing_price(Stock, Price, Exchange, usd, t1) => r2_closing_price(Stock, Price, Exchange).

In sum, a context defines a set of allowable propositions in terms of the language of the

shared ontology. Context definition rules for a source should map tuples in a source only to

corresponding propositions allowed within its context. Context definition rules for a receiver

should only map propositions allowed within its context to a receiver's view.

4.5 Adding the Shared Ontology and Context Definition Rules

Reiter argued that the value of the proof theoretic view of a relational database is its capacity for generalization. Reiter then showed how such a view can be generalized to describe databases with null values and disjunctive information. Furthermore, by adding deductive laws, we get more general relational theories that can incorporate more real world knowledge, such as the representation of events, hierarchies and the inheritance of properties. Adding laws to a relational database results in a deductive database. In BULLION, sources are relations. The laws that are then added are those from the shared ontology and the context definition rules, such as those described in Chapter 3. For convenience, we will refer to such theories as BULLION theories. We will now proceed to formalize this notion of a BULLION theory.

Alphabet of Fr2
Predicate symbols: closing_price(,,,,), stock(), price(), exchange(), currency(), date();
Simple Types: stock(), price(), exchange(), currency(), date();
Constant symbols: stk1, ..., stk4, nyse, tokyo, usd, yen, ...;

Domain Closure Axiom
Unique Name Axioms
Equality Axioms

Ground Axioms of type predicates
price(0.00), ..., price(10000.00), date(t1), stock(stk1), stock(stk2), stock(stk3), stock(stk4), currency(usd), exchange(nyse), exchange(tokyo);

Completion Axioms of type Predicates

Integrity Constraints
(∀x1∀x2∀x3∀x4∀x5) closing_price(x1,x2,x3,x4,x5) => stock(x1) ∧ price(x2) ∧ exchange(x3) ∧ currency(x4) ∧ date(x5);

Fig. 4.5 A Logical View of the Context of Receiver 2

We note that the language of the shared ontology is, in general, distinct from the languages of sources and receivers. For example, Source 3 has a predicate s3_closing_price that is not in the language of the shared ontology. Therefore, the language used to express a BULLION theory must subsume the languages of the shared ontology, sources and receivers. That is,

Definition: Let Fg be the language of the shared ontology. Furthermore, let F1,...,Fn be the languages used by the various sources and receivers within the BULLION federation. Then the language Fb of the BULLION federation subsumes Fg, F1,..., and Fn.

A BULLION theory Tb, specified in language Fb, will contain the following axioms:

1. Domain closure axiom (DCA).

2. Unique name axioms (UNA).

3. Equality axioms (EA).

4. Ground axioms of type predicates (TA).

5. Ground axioms of non-type predicates (GA).

6. Axioms for the closed world assumption (CWA).

7. Context definition rules for sources and receivers (CD).

8. Laws from the shared ontology (SO).

More formally,

Definition: A BULLION federation is defined as (Fb, Tb) where Fb is the language of the federation and Tb is a BULLION theory where Tb = DCA ∪ UNA ∪ EA ∪ TA ∪ GA ∪ CWA ∪ CD ∪ SO.
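Operationally, the components of Tb line up with the parts of a logic program; a rough Prolog sketch, anticipating the correspondences noted at the start of Chapter 5:

% GA: a ground fact from a source.
s3_closing_price(stk1, 55.00, nyse).
% SO: a deductive law from the shared ontology.
closing_price(Stock, USprice, Exchange, usd, Date) :-
    closing_price(Stock, Yenprice, Exchange, yen, Date),
    USprice = Yenprice/100.
% CD: a context definition rule for Source 3.
closing_price(Stock, Price, nyse, usd, t1) :-
    s3_closing_price(Stock, Price, nyse).
% UNA is built into unification, CWA is realized by negation as
% failure, and DCA is the Herbrand domain of the program.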

We now reexamine the significance of the various axioms in a BULLION theory. The

domain closure axiom now represents the set of all constant symbols within the BULLION

federation. This means that the scope of universal quantification applies to all constants within

the federation. These constant symbols include those in the language of the various sources and

receivers, as well as that of the shared ontology.

The unique name axioms state that all constant symbols that are not syntactically equal

are also not semantically equal. Furthermore, the equality axioms state that all constant symbols

are equal to themselves. In other words, there are no naming conflicts among constants in a

BULLION federation. This is an example of how proof theory makes explicit the assumptions of

the BULLION model. Obviously, naming conflicts will arise in practice. Such conflicts need to be

resolved before context mediation can be performed.

The completion axioms as defined previously are no longer appropriate as a

formalization of the closed world assumption when general rules are introduced to a relational

database. More general formalizations of the closed world assumption have been developed for

relational theories that contain deductive laws [29]. A detailed discussion of the formalization of

the closed world assumption, however, is not critical to the understanding of the BULLION model

and is therefore omitted.

The ground axioms of the BULLION theory include ground facts of the individual sources

as well as ground facts from the shared ontology. A BULLION theory can contain context

definition rules for multiple sources and receivers. Finally, a BULLION theory contains deductive

laws from the shared ontology.

Another observation we make is that BULLION is a consistent theory. Therefore, no

contradictory sentences are allowed. Furthermore, a statement that is asserted as true within a

source or the shared ontology is true throughout the BULLION federation. Dealing with multiple

inconsistent theories is beyond the scope of this thesis. However, any large scale federation of

sources and receivers must expect to deal with such an eventuality. The Logic of Contexts,

described in Chapter 2, is designed with such a problem in mind. LOC is an extension of first-

order logic. Since we have just defined the BULLION model in terms of first-order logic, its

relationship to LOC will not be difficult to understand. We will return to this matter in Chapter

7.

Recall that a query Q in terms of Fb is defined as

Definition: A query Q for Fb = (A, W) is any expression of the form <x1/τ1,...,xn/τn | W(x1,...,xn)> where:

1. x1,..., xn are variables of A.
2. Each τi is a type of A.

The definition of an answer to a query issued against a BULLION theory Tb is now relatively straightforward:

Definition: A tuple of constants <c1,...,cn> of Fb's alphabet is an answer to query Q with respect to Tb, a BULLION theory, iff

1. Tb ⊢ τi(ci) for i=1,...,n.
2. Tb ⊢ W(c1,...,cn).

An answer to a query in a BULLION system, therefore, represents propositions that are provable

from multiple sources augmented by context definition rules and laws of the shared ontology.

These rules transform data in multiple sources to an answer that is consistent with the context of

the receiver.

4.6 Discussion

We have provided a formal specification of the Context Interchange Architecture from

the perspective of the knowledge level. Its nonprocedural and declarative nature provides a very

explicit specification of meaning, providing a formal semantics for what we are computing. It

tells us, for example, what knowledge is being represented in the ontology and in the contexts. It

also tells us what an answer should be, given the ontology and the contexts of source and receiver.

However, its nonprocedural character also means that our specification does not indicate how to

perform query operations in a Context Interchange Architecture. Of course various

implementations are possible, keeping in mind the usual trade-offs at the symbol level: namely

soundness, completeness, expressiveness, efficiency, degree of autonomy, functionality, etc.

However, such a description provides a rigorous basis for understanding, analyzing and implementing Context Interchange systems.

From the point of view of this specification, query processing is theorem proving. A

query defines the set of propositions to be proven from a first order theory made up of sources, a

shared ontology and the context definitions of sources and receivers. The propositions that are

provable make up the answer to a query. This specification can thus serve to verify various

implementations of Context Interchange systems. Therefore, in principle, if the sources, contexts

and shared ontology of a Context Interchange implementation are described in proof theoretic

terms, the answers that can be proven should be identical to that retrieved from the

implementation. This specification thus provides a basis for proving the correctness of an

implementation.

Also, from the point of view of this specification, multiple proofs might exist for an

answer to a query. This means that there is more than one possible way to perform a conversion

for a particular answer. Our specification allows us to express all possible proofs of an answer.

This raises some interesting theoretical and practical issues. For example, the efficiency of a

Context Interchange system might be increased by using only the most economical conversion

path available. On the other hand, the system might be highly inefficient if it were to prove the

same answer in all possible ways. This redundancy might however have a side benefit in that if

some conversion paths are "broken" due to missing deductive rules, the correct answer can still

be derived by means of an alternative proof. Also, a general theorem prover for our specification

can reveal certain conversion paths which are more viable but are missed by other, more specialized, "hand-crafted" query processing mechanisms. Our notion of an answer, therefore,

provides a framework in which to conceptualize and discuss such issues.

Furthermore, a proof can be used to explain an answer. That is, a proof can tell us which

original data sources were used, and how the original data was transformed to the final answer.

Thus, our specification, very naturally, provides a mechanism for explanation, which is one of the

key features of a Context Interchange system.

Therefore, our definition of an answer to a query is useful for various conceptual reasons.

However, a precise and formal model of what a Context Interchange system should deliver is not

enough. How do we know if this model itself makes sense? Does this definition of a mediated

answer conform to our intuitive notion of what a mediated answer should be? In the following

chapter, we describe a prototype written in Prolog based on BULLION. This prototype is

intended as a proof of concept to demonstrate that this model makes sense. Further, it will serve

to concretely illustrate the ideas presented up until now.

5 The BULLION Prototype: Proof of Concept I

To make the ideas presented in the previous chapters more concrete, we now describe

these ideas in terms of a series of Prolog programs. Prolog, a logic programming language, was

chosen for a number of reasons. From the theoretical point of view, Prolog's theoretical

relationship to proof theory is relatively straightforward and well documented [55]. For example,

the unique name axioms are incorporated into Robinson's unification algorithm [63], which is the

heart of Prolog's inference engine. The domain closure of a Prolog program is the Herbrand

domain, which is simply the set of constants in the program. Furthermore, Prolog's relationship

to relational databases is also well understood and researched [17]. From a practical point of

view, Prolog readily facilitates the construction of a prototype system that can be demonstrated.

These Prolog programs will be used to illustrate as well as to verify the theory underlying the

BULLION model. For an introduction to Prolog, the reader can refer to the following texts [19,

70].

5.1 Context Interchange for Example A

5.1.1 Sources and Receivers

We first describe the Prolog prototype for a Context Interchange system for Example A.

The tuples in Source 1 and Source 2 will be modeled as Prolog statements, where the relation

name is the predicate name and the cell values of each tuple correspond to predicate arguments.

So Source 1 (Fig. 1.1) appears in a Prolog program as:

s1_head_office(c1, ny).
s1_head_office(c4, ny).

Similarly, Source 2 (Fig. 1.2) appears as

s2_country_of_incorporation(c5, usa).
s2_country_of_incorporation(c6, japan).

As usual, arguments whose first letter is lowercase are constants; otherwise, the arguments are variables.

The schema of Receiver 1 is

r1_country_of_incorporation(Company_name, Country_name).

Recall that Receiver 1 is interested only in companies c2-c5.

5.1.2 The Shared Ontology and Context Definitions

We now consider the construction of the shared ontology and the respective context definition rules. We first determine the appropriate predicates of the shared ontology. They are head_office(Company_name, City_name), country_of_incorporation(Company_name, Country_name) and located_in(City_name, Country_name). Now we incorporate the knowledge shown in Fig. 1.3 into the shared ontology, i.e.

country_of_incorporation(Company_name, Country_name) :-
    head_office(Company_name, City_name),
    located_in(City_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

We also introduce the domain predicate, which specifies the domain of a variable and is defined in Prolog as

domain(X, [X|_]).
domain(X, [_|Y]) :- domain(X, Y).
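This is simply the standard list-membership predicate (member/2 in most Prolog systems); for example:

% ?- domain(c2, [c1, c2, c3]).
% true.
% ?- domain(X, [c1, c2]).
% X = c1 ;
% X = c2.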

The context definition of Source 1 is given by the following context definition rule

head_office(Company_name, City_name) :-
    domain(Company_name, [c1,c2,c3,c4]),
    s1_head_office(Company_name, City_name).

Note that in this particular case, we did not choose to specify a restriction on the domain of City_name for this source. Similarly, the context definition rule for Source 2 is

country_of_incorporation(Company_name, Country_name) :-
    domain(Company_name, [c5,c6]),
    s2_country_of_incorporation(Company_name, Country_name).

The context definition rule for Receiver 1 is

r1_country_of_incorporation(Company_name, Country_name) :-
    country_of_incorporation(Company_name, Country_name),
    domain(Company_name, [c2,c3,c4,c5]).

Observe the simplicity of this context definition rule in that it need not contain the knowledge of the relationship between head_office and country_of_incorporation. Knowledge of this relationship is in the shared ontology and can be shared and reused, lowering the cost of construction and maintenance. This rule also specifies the domain of companies that Receiver 1 is willing to accept. The entire Prolog program is shown in Fig. 5.1.

5.1.3 Sample Queries and Answers

Thus, the SQL query

select Company_name
from r1_country_of_incorporation
where Country_name = "usa"

is equivalent to the Prolog query

?- r1_country_of_incorporation(Company_name, usa).

The answer to this query is

Company_name = c4 ;
Company_name = c5 ;

which is the expected result: c4 is derived from Source 1 via the shared ontology's located_in facts, while c5 comes directly from Source 2. Observe that the receiver issues a query against its own schema "as usual" and was not required to make the domain assumption of Company_name explicit within the query. Furthermore, despite the different schemas of Source 1 and Receiver 1, knowledge in the shared ontology was used to translate across heterogeneous schemas to deliver answers that are meaningful to the receiver. The autonomy of the sources and receiver has also been preserved. We now consider Example B, where we deal with the problem of implicit arguments.

% Definition of the domain predicate
domain(X, [X|_]).
domain(X, [_|Y]) :- domain(X, Y).

% Context Definition Rule for Receiver 1
r1_country_of_incorporation(Company_name, Country_name) :-
    country_of_incorporation(Company_name, Country_name),
    domain(Company_name, [c2,c3,c4,c5]).

% The Shared Ontology
country_of_incorporation(Company_name, Country_name) :-
    head_office(Company_name, City_name),
    located_in(City_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

% Context Definition Rule for Source 1
head_office(Company_name, City_name) :-
    domain(Company_name, [c1,c2,c3,c4]),
    s1_head_office(Company_name, City_name).

% Source 1 Data
s1_head_office(c1, ny).
s1_head_office(c2, tokyo).
s1_head_office(c3, london).
s1_head_office(c4, chicago).

% Context Definition Rule for Source 2
country_of_incorporation(Company_name, Country_name) :-
    domain(Company_name, [c5,c6]),
    s2_country_of_incorporation(Company_name, Country_name).

% Source 2 Data
s2_country_of_incorporation(c5, usa).
s2_country_of_incorporation(c6, japan).

Fig. 5.1 Context Interchange System for Example A

5.2 Context Interchange for Example B

5.2.1 Sources and Receivers

For Example B, Source 3 is modeled as

s3_closing_price(stk1, 55, nyse).
s3_closing_price(stk2, 40, nyse).
s3_closing_price(stk3, 6330, tokyo).
s3_closing_price(stk4, 6531, tokyo).

The schema of Receiver 2 is

r2_closing_price(Stock, Price, Exchange).

5.2.2 The Shared Ontology and Context Definitions

We will use the predicate closing_price with arguments Stock, Price, Exchange, Currency and Date, based on the framework described in Chapter 3. Essentially, closing_price is an attribute of Stock, a thing. Price and Currency represent the value of the attribute closing_price. Date is a time argument which functions as the manifold. Recall that the scale value of monetary amounts assumed by all sources and receivers is one. Thus, the scale argument was not specified.

The deductive law needed for this example is the conversion of closing prices in yen to usd. This is expressed as

closing_price(Stock, USprice, Exchange, usd, Date) :-
    USprice = Yenprice/100,
    closing_price(Stock, Yenprice, Exchange, yen, Date).

The context definition rules for Source 3 are

closing_price(Stock, Price, nyse, usd, t1) :-
    s3_closing_price(Stock, Price, nyse).

and

closing_price(Stock, Price, tokyo, yen, t1) :-
    s3_closing_price(Stock, Price, tokyo).

% Context Definition Rule for Receiver 2
r2_closing_price(Stock, Price, Exchange) :-
    closing_price(Stock, Price, Exchange, usd, t1).

% The Shared Ontology
closing_price(Stock, USprice, Exchange, usd, Date) :-
    closing_price(Stock, Yenprice, Exchange, yen, Date),
    USprice = Yenprice/100.

% Context Definition Rules for Source 3
closing_price(Stock, Price, nyse, usd, t1) :-
    s3_closing_price(Stock, Price, nyse).
closing_price(Stock, Price, tokyo, yen, t1) :-
    s3_closing_price(Stock, Price, tokyo).

% Source 3 Data
s3_closing_price(stk1, 55, nyse).
s3_closing_price(stk2, 40, nyse).
s3_closing_price(stk3, 6330, tokyo).
s3_closing_price(stk4, 6531, tokyo).

Fig. 5.2 Context Interchange System for Example B

The first rule says that the closing prices for all stocks traded on the nyse exchange have currencies in usd, and the corresponding date is t1. Similarly, the second rule says that the closing prices for all stocks traded on the tokyo exchange have currencies in yen, and the corresponding date is t1.

The context definition rule for Receiver 2 is

r2_closing_price(Stock, Price, Exchange) :-
    closing_price(Stock, Price, Exchange, usd, t1).

The entire program is shown in Fig. 5.2.

5.2.3 Sample Queries and Answers

The SQL query

select Stock, Price
from r2_closing_price
where Price > 50

is equivalent to the Prolog query

?- r2_closing_price(Stock, Price, _), Price > 50.

Observe that the underscore for the third argument, which is the exchange on which the stock is traded, means "don't care". The answer returned to this query has three tuples

Stock = stk3
Price = 6330/100 ;

Stock = stk4
Price = 6531/100 ;

Stock = stk1
Price = 55 ;

which is the desired result. Observe that the Prolog program did not evaluate the closing prices

of stocks stk3 and stk4. This is because the equality predicate "=" was used in the currency

conversion rule. To evaluate the closing prices, we should use "is" in place of "=".
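A variant of the conversion law using is/2 is sketched below; note that the arithmetic subgoal must then follow the recursive call, so that Yenprice is already bound when the division is evaluated:

% Yen-to-usd law with evaluated arithmetic (a sketch).
closing_price(Stock, USprice, Exchange, usd, Date) :-
    closing_price(Stock, Yenprice, Exchange, yen, Date),
    USprice is Yenprice / 100.

% ?- r2_closing_price(stk3, Price, _).
% Price = 63.3   (the exact numeric form may vary across Prolog systems)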

Observe how the implicit currency and time arguments for both source and receiver were explicated. The receiver need not explicitly declare its currency or time assumptions within the query. Note also how we can represent the assumption that the currency of the closing price of a stock, in Source 3, depends on the exchange on which that stock is traded. Both source and receiver autonomy are preserved. We now describe an even more challenging scenario in Example C and demonstrate how the BULLION model resolves it.

5.3 Context Interchange for Example C

5.3.1 Sources and Receivers

In Example C, there are two sources, Source 4 and Source 5. Source 4 has two relations, shown in Fig. 5.3 and Fig. 5.4. The relation s4_head_office contains information about the head offices of various companies, much like Source 1. The relation s4_revenue contains information on the Revenue of the companies for the years t1 and t2. The currency of the Revenue of a company depends on the country in which that company is incorporated. In this case, the currencies of Revenue are usd, yen and pound for companies incorporated in the usa, japan and the uk respectively.

s4_head_office
Company_name   City_name
c1             ny
c2             london
c3             tokyo
c4             chicago
c5             tokyo

Fig. 5.3 A Source 4 relation

s4_revenue
Company_name   Revenue   Yr
c1             1100      t1
c2             400       t1
c3             80000     t1
c4             1000      t1
c5             90000     t1
c1             1000      t2
c2             300       t2
c3             70000     t2
c4             900       t2
c5             95000     t2

Fig. 5.4 Another Source 4 relation

Source 5 is shown in Fig. 5.5 and contains information on the Expense of various

companies. The implicit assumptions are that all currencies are in yen and this information

pertains to the year t1.

s5_expense
Company_name   Expense
c1             10000
c2             9000
c3             8500
c4             9500
c5             11000

Fig. 5.5 Source 5

There are two receivers in this example: Receiver 3 and Receiver 4. Receiver 3 has the schema

rs3(Company_name, Revenue, Profit, Country_name)

Country_name is a variable that stands for the country of incorporation of a company. Furthermore, the domain of Company_name is the set of worldscope companies, defined in the shared ontology as c1-c4. The currency and year of Revenue and Profit are usd and t1 respectively.

Receiver 4 has the following schema

c2(Revenue, Yr).

Note that in this case, the name of the schema (i.e. c2) is actually a company name in Sources 4 and 5. That is, we have schematic discrepancies (Chapter 2) within this federation! The currency of Revenue expected by Receiver 4 is usd.

5.3.2 The Shared Ontology

For the shared ontology, we will use the head_of f ice, country-of_incorporation

and located_in predicates of Example A. We will also use the predicates revenue, prof it

and expense, each of which have the following argument types Company-name, Amount,

Currency and Yr. These arguments correspond to thing, property, unit and time as described

by the framework presented in Chapter 3. Once again, we assume that the scale values for all

monetary amounts is one.

We will also require the conversion rates for yen to usd (100 yen to 1 usd) and for

pound to usd (2 pound to 1 usd) for revenue i.e.

revenue(Company-name, USamount, usd, Yr):-

USamount=Yenamount /100, revenue (Companyname, Yenamount, yen, Yr).

and

revenue(Companyname, USamount, usd, Yr):-

USamount=UKamount *2 , revenue (Company-name, UKamount, pound,

Yr).

Similar currency conversion rules are written for profit and expense. Next, the relationship among revenue, profit and expense is specified as

profit(Company_name, Profit, Currency, Yr) :-
    revenue(Company_name, Revenue, Currency, Yr),
    expense(Company_name, Expense, Currency, Yr),
    Profit = Revenue-Expense.

Observe that this rule states that to compute profit from revenue and expense, these amounts must refer to the same company, year and currency. This enables the context mediator to select the appropriate conversions. We also require the knowledge from the shared ontology of Example A. Finally, we define the domain of worldscope companies as

worldscope(Company_name) :-
    domain(Company_name, [c1,c2,c3,c4]).

Defining universally known domains within the shared ontology allows for reuse by sources and receivers, as discussed in Chapter 3.
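As an illustration of how these laws chain, consider querying the ontology directly once the program of Fig. 5.6 is loaded; the derived term below mirrors the answer returned in Section 5.3.4:

% ?- profit(c2, Profit, usd, t1).
% Profit = 400*2-9000/100
% (revenue converted from pound to usd, expense from yen to usd,
%  then the subtraction rule applied)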

5.3.3 Context Definitions

The context definition rules for Source 4 are

head_office(Company_name, City_name) :-
    s4_head_office(Company_name, City_name).

revenue(Company_name, Amount, usd, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, usa).

revenue(Company_name, Amount, yen, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, japan).

revenue(Company_name, Amount, pound, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, uk).

The last three rules simply state that the currencies of the revenue are usd, yen and pound for companies incorporated in the countries of usa, japan and uk respectively. The context definition rule for Source 5 is

expense(Company_name, Expense, yen, t1) :-
    s5_expense(Company_name, Expense).

This rule says that all expenses are in yen for the year t1. The context definition rule for Receiver 3 is

rs3(Company_name, Revenue, Profit, Country_name) :-
    revenue(Company_name, Revenue, usd, t1),
    profit(Company_name, Profit, usd, t1),
    country_of_incorporation(Company_name, Country_name),
    worldscope(Company_name).

The context definition rule for Receiver 4 is

c2(Revenue, Yr) :-
    revenue(c2, Revenue, usd, Yr).

The entire program is shown in Fig. 5.6.

domain(X, [X|_]).
domain(X, [_|Y]) :- domain(X, Y).

%*********** Receiver 3: Context Definition Rule ***********
rs3(Company_name, Revenue, Profit, Country_name) :-
    revenue(Company_name, Revenue, usd, t1),
    profit(Company_name, Profit, usd, t1),
    country_of_incorporation(Company_name, Country_name),
    worldscope(Company_name).

%*********** Receiver 4: Context Definition Rule ***********
c2(Revenue, Yr) :-
    revenue(c2, Revenue, usd, Yr).

%*********** The Shared Ontology ***********
country_of_incorporation(Company_name, Country_name) :-
    head_office(Company_name, City_name),
    located_in(City_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

revenue(Company_name, USamount, usd, Yr) :-
    USamount = UKamount*2,
    revenue(Company_name, UKamount, pound, Yr).
revenue(Company_name, USamount, usd, Yr) :-
    USamount = Yenamount/100,
    revenue(Company_name, Yenamount, yen, Yr).
expense(Company_name, USamount, usd, Yr) :-
    USamount = UKamount*2,
    expense(Company_name, UKamount, pound, Yr).
expense(Company_name, USamount, usd, Yr) :-
    USamount = Yenamount/100,
    expense(Company_name, Yenamount, yen, Yr).

Fig. 5.6 Context Interchange System for Example C

5.3.4 Sample Queries and Answers

For Receiver 3, consider a query which requests all tuples in the receiver's view to be returned, i.e.

?- rs3(Company_name, Revenue, Profit, Country_name).

The following answer is returned:

%*********** The Shared Ontology (cont'd) ***********
profit(Company_name, USamount, usd, Yr) :-
    USamount = UKamount*2,
    profit(Company_name, UKamount, pound, Yr).
profit(Company_name, USamount, usd, Yr) :-
    USamount = Yenamount/100,
    profit(Company_name, Yenamount, yen, Yr).
profit(Company_name, Profit, Currency, Yr) :-
    revenue(Company_name, Revenue, Currency, Yr),
    expense(Company_name, Expense, Currency, Yr),
    Profit = Revenue-Expense.

worldscope(Company_name) :-
    domain(Company_name, [c1,c2,c3,c4]).

%******* Source 4 Context Definition Rules ***********
head_office(Company_name, City_name) :-
    s4_head_office(Company_name, City_name).
revenue(Company_name, Amount, usd, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, usa).
revenue(Company_name, Amount, yen, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, japan).
revenue(Company_name, Amount, pound, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, uk).

Fig. 5.6 Context Interchange System for Example C (cont'd)

%*********Source 4 Data********

s4_revenue(c1,1100,t1).
s4_revenue(c2,400,t1).
s4_revenue(c3,80000,t1).
s4_revenue(c4,1000,t1).
s4_revenue(c5,90000,t1).
s4_revenue(c1,1000,t2).
s4_revenue(c2,300,t2).
s4_revenue(c3,70000,t2).
s4_revenue(c4,900,t2).
s4_revenue(c5,95000,t2).

s4_head_office(c1,ny).
s4_head_office(c2,london).
s4_head_office(c3,tokyo).
s4_head_office(c4,chicago).
s4_head_office(c5,tokyo).

%*******Source 5 Context Definition Rules***********

expense(Company_name, Expense, yen, t1) :-
    s5_expense(Company_name, Expense).

%*********Source 5 Data********

s5_expense(c1,10000).
s5_expense(c2,9000).
s5_expense(c3,8500).
s5_expense(c4,9500).
s5_expense(c5,11000).

Fig. 5.6. Context Interchange System for Example C

5.3.4 Sample Queries and Answers

For Receiver 3, consider a query which requests all tuples in the receiver's view to be returned, i.e.

? rs3(Company_name, Revenue, Profit, Country_name).

The following answer is returned:

Company_name = c2
Revenue = 400*2
Profit = 400*2-9000/100
Country_name = uk

Company_name = c3
Revenue = 80000/100
Profit = (80000-8500)/100
Country_name = japan

Company_name = c3
Revenue = 80000/100
Profit = 80000/100-8500/100
Country_name = japan

Company_name = c1
Revenue = 1100
Profit = 1100-10000/100
Country_name = usa

Company_name = c4
Revenue = 1000
Profit = 1000-9500/100
Country_name = usa ;

Note that there are multiple responses for company c3. To understand why, observe the corresponding values for Profit. We see that the two responses calculated Profit in different ways, although the final result is still the same. In one answer tuple for c3, subtraction occurred first, followed by the conversion from yen to usd. In the other tuple for c3, Revenue and Expense were both converted from yen to usd before subtraction. The reason for this is that Prolog will look for all possible ways to derive an answer, giving rise to this redundancy [70: pp.136-139].

Another query

? rs3(Company_name, Revenue, Profit, Country_name), Profit > 900.

which asks for all tuples but with Profit values greater than 900 usd, will retrieve the answers

Company_name = c1
Revenue = 1100
Profit = 1100-10000/100
Country_name = usa

Company_name = c4
Revenue = 1000
Profit = 1000-9500/100
Country_name = usa ;

There are a number of noteworthy features in the computation of these queries. First, source and receiver autonomy were preserved. Second, the knowledge about head_office and its relationship to country_of_incorporation was used for translation as well as to define the currencies of revenue. This seems reasonable because there is no reason why general knowledge cannot be used for both purposes. Third, there was a two-step conversion (e.g. for c1) which involved a currency conversion (yen to usd for Expense of c1), followed by a conversion from Revenue and Expense to Profit. Obviously, in a richer ontology, longer conversion chains are possible. Fourth, just as in Example A, multiple sources were involved. Fifth, the receiver's domain for Company_name can be specified in its context definition rule.

Finally, we now demonstrate how a relation and a receiver with discrepant schemas can

semantically interoperate. Consider the query issued by Receiver 4

? c2(Revenue,Yr).

which then receives the following answers

Revenue = 400*2

Yr = t1

Revenue = 300*2

Yr = t2

Thus, we have just shown an example of how schematic discrepancy can be resolved within the

Context Interchange framework without resorting to higher order languages, and without

sacrificing autonomy. Observe also how the currency conversion rule (from pound to usd) is reused for this other receiver.

5.4 Where did the Data come from? - The Logic Behind the Logic

To give further insight into the reasoning process of the context mediator, i.e. the Prolog

engine, we can construct a proof tree. Let us consider the tuple rs3(c2, 400*2, 400*2-9000/100, uk), which is part of an answer returned to the query

? rs3(Company_name, Revenue, Profit, Country_name).

issued by Receiver 3. A proof tree can explain how the Prolog engine proved this answer. The

program for constructing a proof tree is taken from [70: p.329] and is shown in Fig. 5.7. To instruct the program to construct a proof tree for rs3(c2, 400*2, 400*2-9000/100, uk), we submit

the following query

? explain(rs3(c2,400*2,400*2-9000/100,uk),How).

The proof tree returned is

How = rs3(c2,400*2,400*2-9000/100,uk):-(revenue(c2,400*2,usd,t1):-
(400*2=400*2:-builtin),(revenue(c2,400,pound,t1):-
(s4_revenue(c2,400,t1):-true),(country_of_incorporation(c2,uk):-
(head_office(c2,london):-(s4_head_office(c2,london):-true)),
(located_in(london,uk):-true)))),(profit(c2,400*2-9000/100,usd,t1):-
(revenue(c2,400*2,usd,t1):-(400*2=400*2:-builtin),
(revenue(c2,400,pound,t1):-(s4_revenue(c2,400,t1):-true),
(country_of_incorporation(c2,uk):-(head_office(c2,london):-
(s4_head_office(c2,london):-true)),(located_in(london,uk):-true)))),
(expense(c2,9000/100,usd,t1):-(9000/100=9000/100:-builtin),
(expense(c2,9000,yen,t1):-(s5_expense(c2,9000):-true))),
(400*2-9000/100=400*2-9000/100:-builtin)),
(country_of_incorporation(c2,uk):-(head_office(c2,london):-
(s4_head_office(c2,london):-true)),(located_in(london,uk):-true)),
(worldscope(c2):-(domain(c2,[c1,c2,c3,c4]):-(domain(c2,[c2,c3,c4]):-true)))

%The basic meta-interpreter

builtin(builtin(X)).

explain(true, true) :- !.

explain((A,B), (ProofA,ProofB)) :-
    !, explain(A, ProofA), explain(B, ProofB).

explain(A, (A:-builtin)) :-
    builtin(A), !, A.

explain(A, (A:-Proof)) :-
    clause(A, B), explain(B, Proof).

%A portion of the table of built-in predicates, which
%depend on the particular Prolog implementation

builtin(A=B).
builtin(clause(A,B)).
builtin(A>B).

Fig. 5.7 A meta-interpreter for constructing proof trees

This is somewhat difficult to read though! So with a little reformatting, the refurbished

proof tree is shown in Fig. 5.8. Essentially, we have to prove (1). In order to prove (1), we need

to prove (1.1)-(1.4). But in order to prove (1.1), we need to prove (1.1.1)-(1.1.2) and so on.

Expression (1.1.1) is true because of the definition of the built-in predicate =, which is true if both

its arguments are syntactically equal. Expression (1.1.2.1) is true because it is asserted in Source 4.

Similarly, expression (1.1.2.2.2) is true because it is asserted in the shared ontology. The proof

tree shows the individual facts used from the various sources and the transformations that were

employed to derive the tuple.

A proof tree suggests a means of explaining to the receiver not only where the original

data came from, but what transformations took place in arriving at the final result. This

explanation capability might be very useful particularly since the entire query processing is

hidden from the receiver i.e. the receiver was not required to know what sources were accessed,

what rules in the ontology to use etc. The need to explain to the user "Where did the data come

from?" has been identified in [77, 78]. Wang and Madnick developed an algebra that produced

source tags as the explanation for which original and intermediate sources of data were used. A

proof tree, however, is more detailed and informative, showing which particular facts were used

and traces through the transformation steps. This is particularly useful for decision makers who

want to understand how reliable the answers returned are and what the assumptions made were.

5.5 What does the data mean? - Context Explication

The BULLION prototype also allows us to ask questions about the underlying semantics of data sources. This capability complements that of explaining the derivation of answers. For example, the user might ask "What is the currency of the closing price of stk1 in Source 3?". This

question can be expressed as a query

? closing_price(stk1,_,_,Currency,_).

against Source 3 and its context definition rules, which is the following program

closing_price(Stock, Price, nyse, usd, t1) :-
    s3_closing_price(Stock, Price, nyse).

closing_price(Stock, Price, tokyo, yen, t1) :-
    s3_closing_price(Stock, Price, tokyo).

s3_closing_price(stk1, 55, nyse).
s3_closing_price(stk2, 40, nyse).
s3_closing_price(stk3, 6330, tokyo).
s3_closing_price(stk4, 6531, tokyo).

How = rs3(c2,400*2,400*2-9000/100,uk):-                      (1)
  revenue(c2,400*2,usd,t1):-                                 (1.1)
    400*2=400*2:-builtin                                     (1.1.1)
    revenue(c2,400,pound,t1):-                               (1.1.2)
      s4_revenue(c2,400,t1):-true                            (1.1.2.1)
      country_of_incorporation(c2,uk):-                      (1.1.2.2)
        head_office(c2,london):-                             (1.1.2.2.1)
          s4_head_office(c2,london):-true                    (1.1.2.2.1.1)
        located_in(london,uk):-true                          (1.1.2.2.2)
  profit(c2,400*2-9000/100,usd,t1):-                         (1.2)
    revenue(c2,400*2,usd,t1):-                               (1.2.1)
      400*2=400*2:-builtin                                   (1.2.1.1)
      revenue(c2,400,pound,t1):-                             (1.2.1.2)
        s4_revenue(c2,400,t1):-true                          (1.2.1.2.1)
        country_of_incorporation(c2,uk):-                    (1.2.1.2.2)
          head_office(c2,london):-                           (1.2.1.2.2.1)
            s4_head_office(c2,london):-true                  (1.2.1.2.2.1.1)
          located_in(london,uk):-true                        (1.2.1.2.2.2)
    expense(c2,9000/100,usd,t1):-                            (1.2.2)
      9000/100=9000/100:-builtin                             (1.2.2.1)
      expense(c2,9000,yen,t1):-                              (1.2.2.2)
        s5_expense(c2,9000):-true                            (1.2.2.2.1)
    400*2-9000/100=400*2-9000/100:-builtin                   (1.2.3)
  country_of_incorporation(c2,uk):-                          (1.3)
    head_office(c2,london):-                                 (1.3.1)
      s4_head_office(c2,london):-true                        (1.3.1.1)
    located_in(london,uk):-true                              (1.3.2)
  worldscope(c2):-                                           (1.4)
    domain(c2,[c1,c2,c3,c4]):-                               (1.4.1)
      domain(c2,[c2,c3,c4]):-true                            (1.4.1.1)

Fig. 5.8 A Prolog Proof Tree for Example C

The answer returned is

Currency = usd

signifying that the currency of the closing price of stk1 in Source 3 is usd. The next query

? closing_price(_, _, Exchange, Currency, _).

retrieves the following answers

Exchange = nyse
Currency = usd ;

Exchange = nyse
Currency = usd ;

Exchange = tokyo
Currency = yen ;

Exchange = tokyo
Currency = yen ;

This answer reveals that nyse corresponds to usd and tokyo corresponds to yen.

5.6 Discussion

This series of Prolog programs served to illustrate the philosophy, principles and

capabilities that Context Interchange systems should possess. We have demonstrated how

various heterogeneities identified in the literature can be resolved within the Context Interchange

framework while preserving the autonomy of sources and receivers. We also showed how

knowledge in the shared ontology can be re-used by various sources and receivers for the

purposes of conversion and context definition. Furthermore, we also showed how conversion

procedures can be automatically constructed from simpler deductive rules. Finally, we

illustrated the notion of explanation and context explication.

These Prolog programs might also serve as specifications for Context Interchange

systems. They can be used to verify various implementations of Context Interchange systems.

For example, Prolog programs may be used to verify that the answers returned by a Context Interchange system are complete and correct with respect to the Prolog specification. For instance,

the implementation may be considered complete if the set of answers to a query that are provable

by the Prolog specification can be returned by the implementation. An implementation is correct

if the set of answers returned by the implementation are provable from the Prolog specification.
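Such a check could itself be scripted in Prolog. The sketch below is our own illustration, not part of the thesis; it assumes hypothetical predicates spec_answer/1 and impl_answer/1 that enumerate, respectively, the answers provable from the Prolog specification and the answers returned by the implementation under test.

% A minimal verification sketch (our illustration), assuming the
% hypothetical wrappers spec_answer/1 and impl_answer/1 exist.
implementation_complete :-
    forall(spec_answer(A), impl_answer(A)).   % every provable answer is returned

implementation_correct :-
    forall(impl_answer(A), spec_answer(A)).   % every returned answer is provable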

The next chapter concerns other aspects of scalability and flexibility. Specifically, we aim

to show that given a particular knowledge representation language, there will be more than one

way of encoding a BULLION theory. Each way will have a different impact on the scalability and

flexibility of the system. This dimension of scalability and flexibility has less to do with the

BULLION model itself and more with the way the model is encoded.

6 The BULLION Prototype: Proof of Concept II

We have already argued how global knowledge and its re-use allows for interoperable

systems to achieve greater scalability and flexibility. We now discuss other aspects of scalability

and flexibility of the BULLION prototype. Specifically, given a particular knowledge

representation language, there are different ways of encoding a BULLION theory with it.

Although these programs might be logically equivalent, they will have different impacts on the

scalability and flexibility of the system. Therefore, this particular dimension of scalability and

flexibility is independent of the BULLION model. Rather, it is related to how the BULLION model

is programmed in a particular knowledge representation language. In this chapter, we will

discuss various difficulties with scalability and flexibility of the Prolog program for Example C of

the previous chapter. We then show how refinements of the program lead to an equivalent but

more desirable system.

6.1 Refinement 1

Introducing new knowledge into the shared ontology of the Prolog program of Example

C might be cumbersome. For example, if a new currency conversion factor were introduced (say

US dollars to Singapore dollars with a conversion factor of 7/5), we will need to add another

three rules, one for each monetary amount i.e.

profit(Company_name, SINamount, sgd, Yr) :-
    SINamount = USamount*7/5, profit(Company_name, USamount, usd, Yr).

revenue(Company_name, SINamount, sgd, Yr) :-
    SINamount = USamount*7/5, revenue(Company_name, USamount, usd, Yr).

expense(Company_name, SINamount, sgd, Yr) :-
    SINamount = USamount*7/5, expense(Company_name, USamount, usd, Yr).

Thus, if there were n monetary amounts, we would need to add n additional rules for each new

currency conversion. This is very cumbersome and certainly violates the requirement for Context

Interchange systems to be scalable and flexible.

domain(X, [X|_]).
domain(X, [_|Y]) :- domain(X, Y).

%***********Receiver 3: Context Definition Rules********

rs3(Company_name, Revenue, Profit, Country_name) :-
    revenue(Company_name, Revenue, usd, t1),
    profit(Company_name, Profit, usd, t1),
    country_of_incorporation(Company_name, Country_name),
    worldscope(Company_name).

%***********Receiver 4 Context Definition Rule********

c2(Revenue, Yr) :- revenue(c2, Revenue, usd, Yr).

%********The Shared Ontology***************

country_of_incorporation(Company_name, Country_name) :-
    head_office(Company_name, City_name),
    located_in(City_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

cur_cvt(yen, usd, 1/100).
cur_cvt(pound, usd, 2).

revenue(Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    revenue(Company_name, AmtOrg, Curorg, Yr).

profit(Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    profit(Company_name, AmtOrg, Curorg, Yr).

expense(Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    expense(Company_name, AmtOrg, Curorg, Yr).

profit(Company_name, Profit, Currency, Yr) :-
    revenue(Company_name, Revenue, Currency, Yr),
    expense(Company_name, Expense, Currency, Yr),
    Profit = Revenue-Expense.

worldscope(Company_name) :-
    domain(Company_name, [c1,c2,c3,c4]).

%*******Source 4 Context Definition Rules***********

head_office(Company_name, City_name) :-
    s4_head_office(Company_name, City_name).

revenue(Company_name, Amount, usd, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, usa).

revenue(Company_name, Amount, yen, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, japan).

revenue(Company_name, Amount, pound, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, uk).

%*********Source 4 Data********

s4_revenue(c1,1100,t1).   s4_revenue(c2,400,t1).
s4_revenue(c3,80000,t1).  s4_revenue(c4,1000,t1).
s4_revenue(c5,90000,t1).  s4_revenue(c1,1000,t2).
s4_revenue(c2,300,t2).    s4_revenue(c3,70000,t2).
s4_revenue(c4,900,t2).    s4_revenue(c5,95000,t2).

s4_head_office(c1,ny).    s4_head_office(c2,london).
s4_head_office(c3,tokyo). s4_head_office(c4,chicago).
s4_head_office(c5,tokyo).

%*******Source 5 Context Definition Rules***********

expense(Company_name, Expense, yen, t1) :-
    s5_expense(Company_name, Expense).

%*********Source 5 Data********

s5_expense(c1,10000). s5_expense(c2,9000). s5_expense(c3,8500).
s5_expense(c4,9500).  s5_expense(c5,11000).

Fig. 6.1 Example D

This problem, however, can be eliminated by introducing the currency conversion

predicate cur_cvt(Curorg, Curtar, CFact), where Curorg is the original currency, Curtar is the target currency and CFact is the conversion factor. Then, the currency conversion rules of Example C can be rewritten as

revenue(Comp, Revenue*CFact, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, CFact),
    revenue(Comp, Revenue, Curorg, Yr).

expense(Comp, Expense*CFact, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, CFact),
    expense(Comp, Expense, Curorg, Yr).

profit(Comp, Profit*CFact, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, CFact),
    profit(Comp, Profit, Curorg, Yr).

cur_cvt(yen, usd, 1/100).
cur_cvt(pound, usd, 2).

This new program, Example D, is shown in Fig. 6.1. This program is logically equivalent to that

of Example C in that identical results will be obtained for the same queries. Now, however, the

addition of a currency conversion factor requires the addition of only a single statement, e.g.

cur_cvt(usd, sgd, 7/5).

6.2 Refinement 2

There is, however, yet another problem. If we were to introduce a new monetary

amount, say tax, we would have to write one rule for currency conversion for this quantity.

That is,

tax(Comp, Tax*CFact, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, CFact),
    tax(Comp, Tax, Curorg, Yr).

Thus, if we were to introduce n such amounts into the shared ontology, n such additional rules

for currency conversion need to be written, one for each amount. Once again, this problem can

be eliminated by introducing a new predicate money_amt and replacing the previous currency

conversion rules with the following more general currency conversion rule for all monetary

amounts

money_amt(MoneyAmt, Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact),
    AmtTar = AmtOrg*Cfact,
    money_amt(MoneyAmt, Company_name, AmtOrg, Curorg, Yr).

The variable argument MoneyAmt can take the values of revenue, expense and profit or any

other monetary amount yet to be introduced. Fig. 6.2 graphically illustrates this refinement. The

new program, Example E, is shown in Fig. 6.3. This program is equivalent to that of Example C

and D in that identical results will be obtained for the same queries. Now however, the addition

of a new monetary amount, say tax, does not require the addition of any currency conversion

rules.

Let us illustrate. Suppose we add a new source and receiver. The new source, Source 6,

contains tax information i.e.

s6_tax(c1,11000).
s6_tax(c2,10000).
s6_tax(c3,9000).
s6_tax(c4,9700).
s6_tax(c5,10000).

The currency and year assumed by this source are yen and t1 respectively. The context definition rule for this source is

money_amt(tax, Company_name, Tax, yen, t1) :-
    s6_tax(Company_name, Tax).

This context definition rule defines tax as a monetary amount. By doing so, all the currency

conversions for monetary amounts automatically apply to tax. So additional rules for currency

conversion specifically for tax are not required.
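As a quick illustration (our own example, computed from the data above rather than taken from the thesis), querying tax in US dollars now triggers the general conversion rule even though no tax-specific conversion rule was ever written:

? money_amt(tax, c1, Tax, usd, t1).

Tax = 11000*(1/100)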

The new receiver, Receiver 5, has the schema rs5(Company_name, Profit). The argument Profit actually refers to profits after tax. Furthermore, the receiver expects the information to be in Singapore dollars (sgd) for the time period t1. The name of the monetary amount profit after tax as described in the shared ontology is pft_at. The context definition rule for the new receiver is therefore

rs5(Company_name, Profit) :-
    money_amt(pft_at, Company_name, Profit, sgd, t1).

revenue(Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    revenue(Company_name, AmtOrg, Curorg, Yr).

profit(Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    profit(Company_name, AmtOrg, Curorg, Yr).

expense(Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    expense(Company_name, AmtOrg, Curorg, Yr).

money_amt(MoneyAmt, Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact), AmtTar = AmtOrg*Cfact,
    money_amt(MoneyAmt, Company_name, AmtOrg, Curorg, Yr).

Fig. 6.2 Refinement 2: Concepts revenue, profit and expense transformed from predicates to argument values

domain(X, [X|_]).
domain(X, [_|Y]) :- domain(X, Y).

%***********Receiver 3: Context Definition Rules********

rs3(Company_name, Revenue, Profit, Country_name) :-
    money_amt(revenue, Company_name, Revenue, usd, t1),
    money_amt(profit, Company_name, Profit, usd, t1),
    country_of_incorporation(Company_name, Country_name),
    worldscope(Company_name).

%***********Receiver 4 Context Definition Rule********

c2(Revenue, Yr) :- money_amt(revenue, c2, Revenue, usd, Yr).

%********The Shared Ontology***************

country_of_incorporation(Company_name, Country_name) :-
    head_office(Company_name, City_name),
    located_in(City_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

cur_cvt(yen, usd, 1/100).
cur_cvt(pound, usd, 2).

money_amt(MoneyAmt, Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact),
    AmtTar = AmtOrg*Cfact,
    money_amt(MoneyAmt, Company_name, AmtOrg, Curorg, Yr).

money_amt(profit, Company_name, Profit, Currency, Yr) :-
    Profit = Revenue-Expense,
    money_amt(revenue, Company_name, Revenue, Currency, Yr),
    money_amt(expense, Company_name, Expense, Currency, Yr).

worldscope(Company_name) :-
    domain(Company_name, [c1,c2,c3,c4]).

%*******Source 4 Context Definition Rules***********

head_office(Company_name, City_name) :-
    s4_head_office(Company_name, City_name).

money_amt(revenue, Company_name, Amount, usd, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, usa).

money_amt(revenue, Company_name, Amount, yen, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, japan).

money_amt(revenue, Company_name, Amount, pound, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, uk).

%*********Source 4 Data********

s4_revenue(c1,1100,t1).   s4_revenue(c2,400,t1).
s4_revenue(c3,80000,t1).  s4_revenue(c4,1000,t1).
s4_revenue(c5,90000,t1).  s4_revenue(c1,1000,t2).
s4_revenue(c2,300,t2).    s4_revenue(c3,70000,t2).
s4_revenue(c4,900,t2).    s4_revenue(c5,95000,t2).

s4_head_office(c1,ny).    s4_head_office(c2,london).
s4_head_office(c3,tokyo). s4_head_office(c4,chicago).
s4_head_office(c5,tokyo).

%*******Source 5 Context Definition Rules***********

money_amt(expense, Company_name, Expense, yen, t1) :-
    s5_expense(Company_name, Expense).

%*********Source 5 Data********

s5_expense(c1,10000). s5_expense(c2,9000). s5_expense(c3,8500).
s5_expense(c4,9500).  s5_expense(c5,11000).

Fig. 6.3 Example E


We also need the currency conversion factor from US dollars to Singapore dollars. We therefore

add the statement

cur_cvt(usd, sgd, 7/5).

Finally, we need to add a rule that states the relationship among profit, tax and pft_at, i.e.

money_amt(pft_at, Company_name, Pft_at, Currency, Yr) :-
    Pft_at = Profit-Tax,
    money_amt(profit, Company_name, Profit, Currency, Yr),
    money_amt(tax, Company_name, Tax, Currency, Yr).

The new program, Example F, is shown in Fig. 6.4.

domain(X, [X|_]).
domain(X, [_|Y]) :- domain(X, Y).

%***********Receiver 3: Context Definition Rules********

rs3(Company_name, Revenue, Profit, Country_name) :-
    money_amt(revenue, Company_name, Revenue, usd, t1),
    money_amt(profit, Company_name, Profit, usd, t1),
    country_of_incorporation(Company_name, Country_name),
    worldscope(Company_name).

%***********Receiver 4 Context Definition Rule********

c2(Revenue, Yr) :- money_amt(revenue, c2, Revenue, usd, Yr).

%***********Receiver 5: Context Definition Rules********

rs5(Company_name, Profit) :-
    money_amt(pft_at, Company_name, Profit, sgd, t1).

%********The Shared Ontology***************

country_of_incorporation(Company_name, Country_name) :-
    head_office(Company_name, City_name),
    located_in(City_name, Country_name).

located_in(ny, usa).
located_in(tokyo, japan).
located_in(london, uk).
located_in(chicago, usa).

cur_cvt(yen, usd, 1/100).
cur_cvt(pound, usd, 2).
cur_cvt(usd, sgd, 7/5).

money_amt(MoneyAmt, Company_name, AmtTar, Curtar, Yr) :-
    cur_cvt(Curorg, Curtar, Cfact),
    AmtTar = AmtOrg*Cfact,
    money_amt(MoneyAmt, Company_name, AmtOrg, Curorg, Yr).

money_amt(profit, Company_name, Profit, Currency, Yr) :-
    Profit = Revenue-Expense,
    money_amt(revenue, Company_name, Revenue, Currency, Yr),
    money_amt(expense, Company_name, Expense, Currency, Yr).

money_amt(pft_at, Company_name, Pft_at, Currency, Yr) :-
    Pft_at = Profit-Tax,
    money_amt(profit, Company_name, Profit, Currency, Yr),
    money_amt(tax, Company_name, Tax, Currency, Yr).

worldscope(Company_name) :-
    domain(Company_name, [c1,c2,c3,c4]).

%*******Source 4 Context Definition Rules***********

head_office(Company_name, City_name) :-
    s4_head_office(Company_name, City_name).

money_amt(revenue, Company_name, Amount, usd, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, usa).

money_amt(revenue, Company_name, Amount, yen, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, japan).

money_amt(revenue, Company_name, Amount, pound, Yr) :-
    s4_revenue(Company_name, Amount, Yr),
    country_of_incorporation(Company_name, uk).

%*********Source 4 Data********

s4_revenue(c1,1100,t1).   s4_revenue(c2,400,t1).
s4_revenue(c3,80000,t1).  s4_revenue(c4,1000,t1).
s4_revenue(c5,90000,t1).  s4_revenue(c1,1000,t2).
s4_revenue(c2,300,t2).    s4_revenue(c3,70000,t2).
s4_revenue(c4,900,t2).    s4_revenue(c5,95000,t2).

s4_head_office(c1,ny).    s4_head_office(c2,london).
s4_head_office(c3,tokyo). s4_head_office(c4,chicago).
s4_head_office(c5,tokyo).

%*******Source 5 Context Definition Rules***********

money_amt(expense, Company_name, Expense, yen, t1) :-
    s5_expense(Company_name, Expense).

%*********Source 5 Data********

s5_expense(c1,10000). s5_expense(c2,9000). s5_expense(c3,8500).
s5_expense(c4,9500).  s5_expense(c5,11000).

%*******Source 6 Context Definition Rules***********

money_amt(tax, Company_name, Tax, yen, t1) :-
    s6_tax(Company_name, Tax).

%*********Source 6 Data********

s6_tax(c1,11000). s6_tax(c2,10000). s6_tax(c3,9000).
s6_tax(c4,9700).  s6_tax(c5,10000).

Fig. 6.4 Example F

The query

? rs5(Company_name,Profit).

will then retrieve the following answers

Company_name = c3
Profit = (80000-8500-9000)*(1/100)*(7/5) ;

Company_name = c5
Profit = (90000-11000-10000)*(1/100)*(7/5)

Company_name = c3
Profit = ((80000-8500)*(1/100)-9000*(1/100))*(7/5) ;

Company_name = c5
Profit = ((90000-11000)*(1/100)-10000*(1/100))*(7/5)

Company_name = c3
Profit = (80000*(1/100)-8500*(1/100)-9000*(1/100))*(7/5)


Company_name = c5
Profit = (90000*(1/100)-11000*(1/100)-10000*(1/100))*(7/5)

Company_name = c2
Profit = (400*2-9000*(1/100)-10000*(1/100))*(7/5) ;

Company_name = c1
Profit = (1100-10000*(1/100)-11000*(1/100))*(7/5) ;

Company_name = c4
Profit = (1000-9500*(1/100)-9700*(1/100))*(7/5)

Company_name = c3
Profit = (80000-8500)*(1/100)*(7/5)-9000*(1/100)*(7/5)

Company_name = c5
Profit = (90000-11000)*(1/100)*(7/5)-10000*(1/100)*(7/5)

Company_name = c3
Profit = (80000*(1/100)-8500*(1/100))*(7/5)-9000*(1/100)*(7/5)

Company_name = c5
Profit = (90000*(1/100)-11000*(1/100))*(7/5)-10000*(1/100)*(7/5)

Company_name = c2
Profit = (400*2-9000*(1/100))*(7/5)-10000*(1/100)*(7/5) ;

Company_name = c1
Profit = (1100-10000*(1/100))*(7/5)-11000*(1/100)*(7/5) ;

Company_name = c4
Profit = (1000-9500*(1/100))*(7/5)-9700*(1/100)*(7/5)

Company_name = c3
Profit = 80000*(1/100)*(7/5)-8500*(1/100)*(7/5)-9000*(1/100)*(7/5)

Company_name = c5
Profit = 90000*(1/100)*(7/5)-11000*(1/100)*(7/5)-10000*(1/100)*(7/5)

Company_name = c2
Profit = 400*2*(7/5)-9000*(1/100)*(7/5)-10000*(1/100)*(7/5) ;

Company_name = c1
Profit = 1100*(7/5)-10000*(1/100)*(7/5)-11000*(1/100)*(7/5) ;

Company_name = c4
Profit = 1000*(7/5)-9500*(1/100)*(7/5)-9700*(1/100)*(7/5)

Once again, note that Prolog returns some redundant solutions, corresponding to different ways

of computing the same answer.


6.3 Discussion

In this chapter, we showed that the scalability and flexibility of a Context Interchange

system is also affected by the way in which knowledge in a shared ontology is encoded, given a

particular knowledge representation language. The program of Example C was refined to that of

Example E. Now, adding knowledge of a currency conversion (in this case US dollars to

Singapore dollars) to the shared ontology required adding one simple statement to Program E, as

opposed to introducing three more complex statements into Program C. Furthermore, adding a new monetary amount such as tax to Program E did not require any accompanying rules to

define the various currency conversions. In the case of Program C, three additional rules would

have been needed to define the currency conversions for tax (i.e. yen to usd, pound to usd, usd

to sgd). This difference in impact on scalability and flexibility has little to do with the BULLION

model itself, but more with the way the model is encoded. Thus, given a particular

representation language in which to represent a shared ontology, care must be taken to

appropriately encode it.

In the next chapter, we consider logic as an implementation paradigm for Context

Interchange systems. We discuss potential obstacles and solutions to making logic a viable

implementation approach for Context Interchange. The discussion will include the strengths and

weaknesses of Prolog as an implementation tool, and consider how these various weaknesses

may be resolved by more powerful logic languages. Some of these languages can add further

scalability and flexibility to the system.


7 Logic as an Implementation Paradigm

In this chapter, we will consider the use of Prolog, and more generally, logic

programming and the Logic of Contexts, in implementing Context Interchange Systems. There

are a number of benefits, both theoretical and practical, in doing so. First, the declarative nature

of logic programming tools makes them desirable for specifying shared ontologies as well as for

making context definitions explicit. Second, the theoretical base for logic and logic programming

is rich and well established [26, 35, 38, 55]. Many of the resources developed by the logic

programming community can be drawn upon to implement Context Interchange systems. We

have already seen in Chapter 5 how a meta-interpreter for explanation, a contribution from the

logic programming literature, has been used to explain answers for the purpose of Context

Interchange systems.

One of the first observations that one can make is that most of the sources that are likely

to be integrated are not Prolog databases, but will probably be relational databases instead. Thus,

a means of coupling Prolog to relational database systems is needed. We discuss this in Section

7.1. Next, we highlight some difficulties in Prolog with semantic query optimization (Section 7.2)

and looping (Section 7.3). These are well known weaknesses of Prolog. Then, we point to more

advanced logic programming languages aimed at overcoming various weaknesses of Prolog

(Section 7.4). These may be used to implement Context Interchange systems instead. Finally, we

shall revisit the Logic of Contexts and consider how it can be used to extend the BULLION model

to deal with various other issues pertaining to scalability and flexibility as a federation of sources

and receivers grows in size (Section 7.5).

7.1 Prolog and Relational Databases

A survey of systems for Coupling Prolog to Relational databases (i.e. CPR systems) is

described in [17]. A key focus of this stream of research is aimed at providing efficient data

access.

There are four major components of CPR systems. They are the Prolog engine, the Prolog

interface, the database interface and the database engine. The Prolog engine may have enhanced

features in order to adapt it to the proposed database environment. For example, some Prolog

engines provide fast access methods in main memory for searching and accessing facts. These

techniques include sophisticated indexing and hashing mechanisms. Another modification to the


Prolog engine concerns returning the set of all answers to a user who issued a query, as opposed

to returning them a tuple at a time, as was the case in the last chapter.

The Prolog interface is concerned with specifying and recognizing database predicates in

the Prolog program. There are various levels of transparency for doing this. With full

transparency, database predicates are recognized by the interface with no user support. In this

case, a program analyzer accesses database directories in order to recognize database predicates.

With no transparency, the Prolog programmer is forced to write explicit queries in the language

of the database engine, e.g. SQL.

Next, the database interface is specifically designed for interacting with a database and extracting tuples from it. This component is concerned with the types of queries that are submitted to

the database. For example, queries issued by the Prolog engine to the database can be single-

predicate (selection) or multiple-predicate (join). Another concern is how the tuples are extracted

from the database i.e. tuple at a time versus the entire set. Note that this is a different issue from

returning answers to the end-user, which is the concern of the Prolog interface. The final

component is the database engine itself, which is taken to be a given in the BULLION model.

Examples of CPR system prototypes described in [17] include PRO-SQL [18], EDUCE [5] and the CGW approach by Ceri, Gottlob and Wiederhold [16], among others. The point is that

this stream of work represents a resource upon which we can draw to construct Context

Interchange systems using logic programming.
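To make the coupling idea concrete, here is a rough sketch of our own (an assumption, not the API of any particular CPR system): a database predicate such as s4_revenue/3 is defined not by local facts but by a clause that ships an SQL query to the database engine, where db_query/2 is a hypothetical database-interface predicate that backtracks over the rows of the result set.

% A hypothetical CPR-style mapping (db_query/2 is assumed, not real):
% the predicate is no longer stored as Prolog facts but fetched from a
% relational table, one row per solution on backtracking.
s4_revenue(Company_name, Amount, Yr) :-
    db_query('SELECT company, amount, yr FROM s4_revenue',
             row(Company_name, Amount, Yr)).

The rest of the program is unchanged; the context definition rules cannot tell whether s4_revenue/3 is stored locally or fetched from a relational source.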

7.2 Prolog and Semantic Query Optimization

The context definition rules of the BULLION model allow us to incorporate traditional

integrity constraints as part of the context definition. This facilitates semantic query optimization

[43]. For example, a semantic query optimizer checks for constraints to see if the result to a query

is empty. If it is, then the query need not be executed at all. This can minimize unnecessary

access to databases, and can potentially save a lot of time. Essentially, semantic query

optimization involves the intelligent pruning of a potentially large search space.

As a concrete example, consider Source 5 and its context definition rules as shown below

(from Example C). The context definition rule is modified by adding the constraint

Expense < 12000 to the body of the context definition rule. This means that all values in the Expense column are less than 12000. Now, consider the query ? expense(Company_name, 13000, yen, t1). A negative answer will be returned without even having to check the database Source 5. This is because Prolog tests the literals in the body of a clause from left to right. Thus, the query will fail (i.e. return a negative answer) when Prolog encounters Expense < 12000.

%*******Source 5 Context Definition Rules***********

expense(Company_name, Expense, yen, t1) :-
    Expense < 12000,
    s5_expense(Company_name, Expense).

%*********Source 5 Data********

s5_expense(c1,10000).
s5_expense(c2,9000).
s5_expense(c3,8500).
s5_expense(c4,9500).
s5_expense(c5,11000).

Unfortunately, other queries such as ? expense(Company_name, Expense, yen, t1). cannot be correctly processed by the above Prolog program. This is because in order to test the condition Expense < 12000, the variable Expense has to be instantiated. This is not the case for this

particular query and the execution will be aborted.

Fortunately, however, BULLION does not restrict the implementation of Context

Interchange systems to Prolog, which is a limited logic language for the sorts of things we would

like to do. More powerful and efficient logic languages are being developed that have virtually

the same syntax as Prolog. These languages could be used instead. We will briefly describe some

of these languages in Section 7.4.

7.3 Prolog and Looping

Prolog employs a depth-first search strategy and might enter into an infinite branch of an

SLD tree and continue looping indefinitely. Such a situation can occur, for example, when we

want to specify a two-way currency conversion. Since function inversion is not available to us in

Prolog, introducing a reverse currency conversion from Singapore dollars to US dollars to the

program of Example F means adding the Prolog statement

cur_cvt(sgd,usd,5/7).

We will, as a result, obtain an SLD-tree with an infinite branch in which Prolog continues to loop

by converting US dollars to Singapore dollars and back again. The problem of detecting an

infinite branch however is undecidable as Prolog has the full power of recursion theory.


This problem is being addressed by the logic programming community. One approach

to this problem is based on modifying the underlying computation mechanism that searches the

SLD-tree by adding the capability of pruning [6]. Such mechanisms are also called loop checking.

The purpose of a loop check is to prune every infinite SLD-tree to a finite subtree that still

contains the root of the original tree. Informally, a loop check is considered sound when no

solutions are lost due to the pruning. A loop check is complete if every infinite SLD-tree can be

pruned to a finite sub-tree. Besides soundness and completeness, another interesting property of

a loop checking mechanism is shortening. That is, if there are multiple derivations of an answer, a

shortening loop check mechanism will prune off the longer derivations and retain only the

shorter ones. This will certainly enhance the efficiency of a Prolog computation.

In [6], Bol et al. introduced a class of such pruning mechanisms called simple loop checks. They went on to prove, however, that no sound and complete simple loop check exists even for

Prolog programs without function symbols. To date, we are not aware of any loop checking

mechanisms that are sound and complete for our purposes.

A possible, but crude, alternative solution to this problem is to prune branches of an SLD-

tree such that no branches exceed a certain finite depth, say d. This pruning mechanism is

obviously complete. This is because any SLD-tree with infinite branches can be pruned with this

mechanism to give a tree in which all branches are finite (i.e. with depth less or equal to d).

However, we run the risk of pruning off branches that lead to potential solutions to a query. In

this respect, this strategy is not sound. This is the classical trade-off between tractability and the

correctness of a computational strategy. That is, if we want tractability, we have to live with the

fact that a computational strategy may not be sound.

Ideally, we would like to have a value of d such that no potential solutions are lost. If, for

a particular federation of sources and receivers, we do not expect the longest finite conversion

chain to exceed a depth of h, then d can be safely set to h. We might also consider setting d<h.

This might be the case, for example, if we knew that all potential solutions pruned off in this

manner have shorter derivations in the newly pruned SLD-tree. If we are very concerned about

efficiency and not very worried about obtaining all solutions, we can make d even smaller.
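To make the depth-d strategy concrete, here is a minimal sketch of our own, written as a Prolog meta-interpreter (a standard logic programming idiom, not a mechanism taken from the thesis). It reuses the builtin/1 table of Fig. 5.7 and abandons any derivation whose depth budget is exhausted.

% A minimal depth-bounded meta-interpreter (our sketch): solve(Goal, D)
% proves Goal as Prolog would, but refuses to expand a clause once the
% depth budget D reaches zero, crudely pruning infinite branches.
solve(true, _) :- !.
solve((A, B), D) :-
    !, solve(A, D), solve(B, D).
solve(A, _) :-
    builtin(A), !, call(A).      % built-ins are executed directly
solve(A, D) :-
    D > 0, D1 is D - 1,          % spend one unit of depth per clause expansion
    clause(A, B), solve(B, D1).

A query such as ? solve(rs3(C, R, P, N), 15). would then terminate even in the presence of two-way cur_cvt rules, at the cost of possibly missing answers whose derivations exceed depth 15.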

We could also use a less expressive language, in which case we give up the ability to

express certain kinds of knowledge. Yet another possibility is to use more specialized, "hand-

crafted", inference strategies that take special advantage of domain knowledge to avoid such

looping. But once again, there is a trade-off in that we give up the generality of the inference

mechanism.


Another approach, yet to be explored by us, is to employ more powerful programming

languages that do functional logic programming (Section 7.4). Such logic languages make

available to us the capability for function inversion. This feature might enable us to introduce

two-way currency conversions without having to write programs that go into infinite loops.

7.4 Other Logic Programming Languages

Various limitations of Prolog have led researchers to develop more powerful logic

programming languages aimed at addressing some of these weaknesses. One class of such

languages is referred to as constraint logic programming languages [38]. Constraint logic

programming, as the name suggests, involves the merger of two paradigms: constraint solving

and logic programming. When viewed from a constraint solving point of view, the unification

algorithm used by Prolog can be considered to be a special case of constraint solving where the

constraint is such that two terms are semantically equivalent if and only if they are syntactically

identical. This is the reason for Prolog's failure to correctly process the last query in Section 7.2.

A constraint logic programming language such as CLP(R) [39] , which takes a more general view

of constraints, can easily perform semantic query optimization without such problems. In fact,

constraint logic programming is considered more efficient than Prolog because constraint solving

involves the intelligent pruning of a search space.
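As an illustration (our sketch, using the {}/1 constraint notation of some clp(r)/clp(q) libraries; the exact syntax varies across systems), the Source 5 rule of Section 7.2 could be recast so that the constraint is posted to the solver rather than tested:

% Because Expense < 12000 is handed to the constraint solver instead of
% being evaluated, the query ? expense(Company_name, Expense, yen, t1).
% no longer aborts on an uninstantiated variable: the solver simply
% carries the constraint along with each answer.
expense(Company_name, Expense, yen, t1) :-
    { Expense < 12000 },
    s5_expense(Company_name, Expense).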

Another class of logic programs combine logic programming and object-oriented

concepts such as inheritance and default values [24, 42]. Inheritance and default values are

powerful concepts that can be exploited in Context Interchange. For example, we can explicitly

declare that all currencies are in US dollars unless stated otherwise. This makes specifying

knowledge in the shared ontology easier.
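A rough sketch of our own of such a default, written in plain Prolog using negation as failure (the object-oriented logic languages cited above express the same idea more directly through inheritance and defaults):

% stated_currency/2 is a hypothetical predicate recording explicit
% declarations; any source without one defaults to US dollars.
currency_of(Source, Currency) :-
    stated_currency(Source, Currency).
currency_of(Source, usd) :-
    \+ stated_currency(Source, _).   % default: usd unless stated otherwise

stated_currency(source5, yen).       % hypothetical explicit declaration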

Yet another class of logic programming languages, called functional logic programming

languages [35], combines functional programming with logic programming. Such languages allow

us to define functions using the "=" predicate, something that Prolog does not allow us to do.

One of the consequences is that functional logic programming languages have features such as

function inversion, which is not available in Prolog. Thus, specifying a two-way currency

conversion in Prolog requires additional statements that would not have been needed had we

used a functional logic programming language. Functional logic programming languages also

have more efficient operational behavior compared with pure logic languages since functions

allow more deterministic evaluations than predicates.


Finally, some logic languages such as LIFE (which stands for Logic, Inheritance,

Functions and Equations) [1] combine various aspects of constraint solving, object-orientation

and functional programming into one language! The point is this: there are powerful tools that

are available for implementing Context Interchange systems within a paradigm based on logic.

We are not just limited to Prolog. These tools would be useless, however, if we did not have a

good "blueprint" of what a Context Interchange system is. The key contribution of this thesis is

just such a blueprint, in the form of a formal specification based on logic.

7.5 The Logic of Contexts Revisited

In Examples C to F, we have been considering a federation of sources and receivers in

which monetary amounts were assumed to have a scale of one. Suppose we had another

federation of sources and receivers in which all monetary amounts are assumed to be in tens (i.e. with a scale of 10). Thus the statement

money_amt(revenue, c1, 1100, usd, t1).

is true in the first federation but false in the second federation. Integrating the two federations

"as is" will lead to problems because of inconsistency in the scale values. One approach to solving

this problem is to require that all existing predicates be modified to include explicit scale

arguments. This, however, can be cumbersome and violates the Context Interchange objective of

being scalable and flexible.

Another approach is to use the Logic of Contexts (LOC) described in Chapter 2. LOC

was designed specifically with such problems in mind. In building the knowledge base the size

of Cyc, problems of inconsistent theories are bound to occur. LOC alleviates this problem in Cyc

by partitioning the knowledge base into multiple, internally consistent, microtheories. These

microtheories need not, however, be consistent with one another. This "chunking" reduces the

cognitive effort required in managing such a huge knowledge base because we can focus on

particular microtheories at a time, as opposed to the whole knowledge base at once. Links among

microtheories are provided by means of lifting axioms.

Since BULLION is specified in first order logic, making the link to LOC is not very

difficult. Each BULLION model can be considered to be a microtheory. Thus, the federation

described in Example F represents a single consistent microtheory. Let us call it FedOne. Any proposition p asserted within this federation can be expressed as ist(FedOne, p) (e.g. ist(FedOne, money_amt(revenue, c1, 1100, usd, t1))). Let the second federation be

FedTwo. Lifting axioms can be used to make scale assumptions explicit by linking these two


microtheories to a more specific microtheory called FedOne&Two. The following lifting axiom

links FedOne to FedOne&Two:

ist(FedOne, money_amt(Money_amt, Company_name, Amount, Currency, Year)) →
    ist(FedOne&Two, money_amt(Money_amt, Company_name, Amount, Currency, Year, 1)).

The last argument of the predicate money-amt in FedOne&Two is the scale argument. The next

lifting axiom links FedTwo to FedOne&Two:

ist(FedTwo, money_amt(Money_amt, Company_name, Amount, Currency, Year)) →
    ist(FedOne&Two, money_amt(Money_amt, Company_name, Amount, Currency, Year, 10)).

A scale conversion rule can then be declared in FedOne&Two:

ist(FedOne&Two, money_amt(Money_amt, Company_name, Amount1, Currency, Year, Scale1) ↔
    Amount1 = Amount2*Scale2/Scale1 ∧
    money_amt(Money_amt, Company_name, Amount2, Currency, Year, Scale2)).

This approach eliminates the need to modify existing predicates in both federations while

allowing both to interoperate meaningfully. A graphical depiction of this arrangement is shown

in Fig. 7.1.

Each microtheory can also simply be an ontology without sources or receivers. Thus, we

can combine ontologies in this fashion as well. Using LOC, we can more easily manage the

growth of the size of a federation of sources and receivers.
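A rough Prolog prototype of this lifting (our sketch; LOC itself is a first-order framework rather than a Prolog library): propositions are tagged with their microtheory via ist/2, and the lifting axioms become ordinary clauses that attach the explicit scale argument.

% Propositions tagged by microtheory; the FedTwo datum is hypothetical.
ist(fedOne, money_amt(revenue, c1, 1100, usd, t1)).
ist(fedTwo, money_amt(revenue, c9, 120, usd, t1)).   % hypothetical, i.e. 1200 at scale 1

% Lifting axioms as clauses: FedOne amounts carry scale 1, FedTwo scale 10.
ist(fedOneAndTwo, money_amt(M, C, Amt, Cur, Yr, 1)) :-
    ist(fedOne, money_amt(M, C, Amt, Cur, Yr)).
ist(fedOneAndTwo, money_amt(M, C, Amt, Cur, Yr, 10)) :-
    ist(fedTwo, money_amt(M, C, Amt, Cur, Yr)).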

In sum, Prolog is ideally suited to demonstrate proof of concept in this thesis because it is

a common language and its relationship to logic and databases is well understood. Its declarative

style is also in keeping with the Context Interchange philosophy of explicit representation of

knowledge. However, a number of weaknesses may prove to be a serious hindrance in using

Prolog as a full-fledged implementation platform. Fortunately, more advanced logic programming systems, which retain many of the benefits of Prolog, have been or are being developed, aimed at making logic programming a more viable implementation solution. The choice of such a language for a full-fledged implementation of a Context Interchange system, however, is a matter

for further investigation.

[Fig. 7.1 depicts the two federations, FedOne and FedTwo, each with its own shared ontology, source and receiver contexts, and context mediator, linked by articulation axioms to the combined microtheory FedOne&Two, which carries the scale conversion rule given above.]

Fig. 7.1 Using LOC to integrate two federations

8 Conclusions and Future Work

8.1 Thesis Summary

The overarching goal of this thesis is the development of a formal model of semantic

interoperability, called BULLION, that realizes the philosophy of the Context Interchange

Approach. The aim of the Context Interchange Approach is to achieve a high level of semantic

interoperability while preserving source and receiver autonomy. We have demonstrated how

various important semantic heterogeneities identified in the literature can be solved by the

BULLION model. Furthermore, the autonomy of receivers and sources were preserved while

doing so.

We also described how Context Interchange systems based on BULLION are scalable and

flexible, as compared with current interoperable systems, which fare poorly in achieving this goal.

This is possible in BULLION because general knowledge is encoded at the global level in the

shared ontology. Sources and receivers then use this ontology to explicitly declare their contexts.

The context mediator then uses the knowledge in the shared ontology and the context definitions

to automatically select the appropriate conversion procedures to execute. General knowledge is

globalized and is specified once. It can then be used and reused for context explication and

conversions. All this adds up to greater ease of construction and maintenance of a federation,

compared with some traditional approaches. We also demonstrated how explanation of answers

and context explication can be incorporated in the BULLION model. We also discussed how other

tools such as the Logic of Contexts can be used to extend the BULLION model to manage other

aspects of scalability and flexibility.

The core concepts that underlie the BULLION model are drawn from Bunge's Semantics

and Ontology. These concepts center around the notion of an "elementary" proposition as a basic

unit of exchange. A context is a constraint on what propositions are allowed. Conversion of

propositions is equated with logical deduction. The shared ontology contains the definition and

deductive relationships among propositions. Context mediation is then equivalent to

transforming propositions from a source, through deduction, to propositions that satisfy the

requirements of the receiver's context.

These core ideas are integrated into BULLION, a proof theoretic model of Context

Interchange. There are several reasons for using proof theory. For one thing, the semantics of

proof theory are more widely understood than idiosyncratic implementations. There is also a


wider and more established theory base for logic. Furthermore, since proof theory is abstract and

nonprocedural, underlying assumptions of the model are very explicit.

BULLION not only serves to formalize concepts such as conversion, context and ontology,

it also defines the notion of a context mediated answer to a query. Such a definition is critical to

the enterprise of delivering a context mediated answer, which is suppose to be more

"meaningful" than an unmediated one.

But how do we know if this is the "right" definition, i.e. one that matches our intuitive notions of a "meaningful" answer? As proof of concept, these ideas were concretely illustrated by

means of a series of Prolog programs. In the Prolog programs, we showed what is meant by

context definition and a shared ontology. We showed how the mediated answers returned in

response to queries satisfy the context requirements of the receiver and matched our intuitions as

to what a suitable answer should be. Thus, we can conclude that BULLION is a formal model that

faithfully realizes the goals of Context Interchange.

8.2 Future Work

This thesis also suggests that logic is useful not only for conceptualizing BULLION, but

also for implementing it. Logic offers a potential new paradigm for the semantic integration of

databases. This has benefit from both theoretical and practical perspectives. The rigor and

declarativeness of logic as a language are valuable features to the enterprise of semantic

integration. We highlighted several logic programming technologies that may be used to

effectively construct a Context Interchange system. Therefore as future research, we intend to

evaluate the use of various logic programming languages for the purposes of building a full-fledged Context Interchange prototype. This prototype should possess the functionality described

in this thesis.

From a theoretical perspective, we intend to investigate how multiple inconsistent

theories may be meaningfully integrated. The problem of multiple inconsistent theories cannot

be avoided as the federation grows in size. This might involve integrating different ontologies

and even different federations. We discussed this briefly already in Section 7.5. The most

promising tool for this purpose, at least from the perspective of BULLION, is the Logic of

Contexts.

We also have to consider the notions of "default" and "overriding" conversion rules. That

is, some receivers may have preference for their own local conversions that may override the

corresponding conversion in the shared ontology. For example, a receiver might have preference


for a different currency conversion than the one specified in the shared ontology. Then for this

specific receiver, its local conversion rule overrides that in the shared ontology. Other receivers

will use the rule in the shared ontology as a default.
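A rough sketch of our own (an illustration, not the thesis's design) of how such overriding could be encoded in Prolog: a receiver-specific conversion factor is tried first, and the shared ontology's cur_cvt/3 serves as the default when no local rule exists.

% local_cur_cvt/4 records a receiver-specific rate; the rate shown is
% hypothetical. The cut commits to the local rule when one applies.
effective_cur_cvt(Receiver, Curorg, Curtar, Cfact) :-
    local_cur_cvt(Receiver, Curorg, Curtar, Cfact), !.
effective_cur_cvt(_Receiver, Curorg, Curtar, Cfact) :-
    cur_cvt(Curorg, Curtar, Cfact).            % shared-ontology default

local_cur_cvt(rs5, usd, sgd, 141/100).         % hypothetical receiver-specific rate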


References

[1] Ait-Kaci, H., & Podelski, A. Towards a meaning of LIFE. In Proceeding of the ThirdInternational Symposium on Programming Language Implementation and Logic Programming:255-274, Passau, Germany, 1991.

[2] Akmajian, A., Demers, R. A., Farmer, A. K., & Harnish, R. M. Linguistics, anintroduction to language and communication (3rd ed.). MIT Press, Cambridge, MA, 1993.

[3] Arens, Y., & Knoblock, C. A. Planning and Reformulation Queries for Semantically-Modeled Multidatabase Systems. In Proceeding of the Conference in Information andKnowledge Management, Baltimore, MD, 1992.

[4] Batini, C., Lenzirini, M., & Navathe, S. A comparative analysis of methodologies fordatabase schema integration. ACM Computing Surveys, 18, 4: 323 - 364, 1986.

[5] Bocca, J. On the Evaluation Strategy of EDUCE. In Proceedings of the ACM-SIGMODConference, Washington, D.C., 1986.

[6] Bol, R. N., Apt, K. R., & Klop, J. W. An Analysis of loop checking mechanisms for logicprograms. Theoretical Computer Science, 86, 1: pp.35-79, 1991.

[7] Brachman, R., & Levesque, H. J. The Knowledge Level of a KBMS. In M. L. Brodie & J.Mylopolous (Eds.), On Knowledge Base Management Systems: IntegratingArtificialIntelligence and Database Technologies: 63-69, Springer-Verlag, New York, 1986.

[8] Bright, M. W., Hurson, A. R., & Pakzad, S. H. A Taxonomy and Current Issues inMultidatabase Systems. IEEE Computer: 50-60, 1992.

[9] Bunge, M. Semantics I: Sense and Reference D. Reidel Publishing Company, Boston,1974.

[10] Bunge, M. Semantics II: Interpretation and Truth D. Reidel Publishing Company, Boston,1974.

[11] Bunge, M. Ontology I: The Furniture of the World D. Reidel Publishing Company,Boston, 1977.

[12] Bunge, M. Ontology II: A World of Systems D. Reidel Publishing Company, Boston,1979.

[13] Bunge, M. Epistemology & Methodology III: Philosophy of Science and Technology D.Reidel Publishing Company, Boston, 1985.

[14] Buvac, S., Buvac, V., & Mason, I. A. The Semantics of Propositional Contexts. InProceeding of the Eighth International Symposium on Methodologies for Intelligent Systems,1994.

[15] Buvac, S., & Mason, I. A. Propositional Logic of Context. In Proceedings of the EleventhNational Conference on Artificial Intelligence, 1993.

[16] Ceri, S., Gottlob, G., & Wiederhold, G. Efficient database access through Prolog. IEEETransactions on Software Engineering, Feb 1989.

[17] Ceri, S., & Tanca, G. L. Logic Programming and Databases Springer Verlag, BerlinHeidelberg, 1990.

[18] Chang, C. L. PROSQL: A Prolog Programming Interface with SQL/DS. In Proceedings ofthe First Workshop on Expert Database Systems, Kiawah Island, SC, 1984.

[19] Clocksin, W. F., & Mellish, C. S. Programming in Prolog (2nd ed.). Springer-Verlag, NewYork, 1984.

[20] Codd, E. F. A relational model of data for large shared data banks. Communications of theACM, 13, 6: 377-387, 1970.

115

[21] Collett, C., Huhns, M. N., & Shen, W. Resource Integration Using a Large KnowledgeBase in Carnot. IEEE Computer, 24, 12: 55-63, 1991.

[22] Date, C. J. An Introduction to Database Systems (5th ed.). Addison-Wesley, Reading,MA, 1990.

[23] Davis, R., Shrobe, H., & Szolovits, P. What is a Knowledge Representation? AI Magazine:17-33, 1993.

[24] Davison, A. A Survey of Logic Programming-based Object Oriented Languages.Technical Report No. 92/3. Dept. of Computer Science, University of Melbourne, 1992.

[25] Dayal, U., & Hwang, K. View definition and generalization for database integration inmultidatabase system. IEEE Transactions on Software Engineering, SE-10: 628-644, 1984.

[26] Enderton, H. B. A Mathematical Introduction to Logic Academic Press, San Diego, CA,1972.

[27] Farquhar, A., Dappert, A., Fikes, R., & Pratt, W. Integrating Information Sources UsingContext Logic. Technical Report No. KSL-95-12. Stanford University, 1995.

[28] Gallaire, H., & Minker, J. Logic and Databases Plenum Press, New York and London,1978.

[29] Gallaire, H., Minker, J., & Nicolas, J. M. Logic and Databases: A Deductive Approach.ACM Computing Surveys, 16,2: 153-185, 1984.

[30] Graham, G. (1995, December 13). Single currency's computer debit. Financial Times, p.

[31] Gruber, T. R. The role of a common ontology in achieving sharable, reusable knowledge bases. In Principles of Knowledge Representation and Reasoning: Proceedings of the 2nd International Conference: 601-602, Cambridge, MA, 1991.

[32] Gruber, T. R. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. In Guarino & Poli (Eds.), to appear in Formal Ontology in Conceptual Analysis and Knowledge Representation, Kluwer, 1993.

[33] Guha, R. V. Contexts: A Formalization and Some Applications. Ph.D. thesis, STAN-CS-91-1399, Stanford University, 1991.

[34] Guha, R. V., & Lenat, D. B. Cyc: A Midterm Report. AI Magazine: 32-59, 1990.

[35] Hanus, M. The Integration of Functions into Logic Programming: From Theory to Practice. The Journal of Logic Programming, 19/20: 583-628, 1994.

[36] Heimbigner, D., & McLeod, D. A Federated architecture for information management. ACM Transactions on Office Information Systems, 3: 253-278, 1985.

[37] Huhns, M. N., Jacobs, N., Ksiezyk, T., Shen, W.-M., Singh, M. P., & Cannata, P. E. Enterprise Information Modeling and Model Integration in Carnot. In Proceedings of the 1st International Conference on Enterprise Integration Modeling, 1992.

[38] Jaffar, J., & Maher, M. J. Constraint Logic Programming: A Survey. The Journal of Logic Programming, 19/20: 503-581, 1994.

[39] Jaffar, J., Michaylov, S., Stuckey, P., & Yap, R. The CLP(R) Language and System. ACM Transactions on Programming Languages and Systems, 14, 3: 339-395, 1992.

[40] Kay, R. L. (1994, October 17). What's the meaning of this? Computerworld, pp. 89-93.

[41] Kent, W. Solving Domain Mismatch and Schema Mismatch Problems with an Object-Oriented Database Programming Language. In Proceedings of the 17th International Conference on Very Large Data Bases, Barcelona, Spain, 1991.

[42] Kifer, M., Lausen, G., & Wu, J. Logical Foundations of Object-Oriented and Frame-Based Languages. Journal of the ACM, 42, 4: 741-843, 1995.

[43] King, J. QUIST: A System for Semantic Query Optimization for Relational Databases. In Proceedings of the Conference on Very Large Databases, 1981.

[44] Krishnamurthy, R., Litwin, W., & Kent, W. Interoperability of Heterogeneous Databases with Schematic Discrepancies. In Proceedings of the First International Workshop on Interoperability in Multidatabase Systems: 144-151, 1991.

[45] Krishnamurthy, R., Litwin, W., & Kent, W. Language Features for Interoperability of Databases with Schematic Discrepancies. In Proceedings of the 1991 ACM SIGMOD International Conference on Management of Data: 40-49, Denver, Colorado, 1991.

[46] Landers, T., & Rosenberg, R. An Overview of Multibase. In Proceedings of the Second International Symposium for Distributed Databases: 153-183, 1982.

[47] Larson, J. A., Navathe, S. B., & ElMasri, R. A Theory of Attribute Equivalence in Databases with Application to Schema Integration. IEEE Transactions on Software Engineering, 15, 4, 1989.

[48] Lee, J., & Malone, T. W. Partially Shared Views: A Scheme for Communicating among Groups that Use Different Type Hierarchies. ACM Transactions on Information Systems, 8, 1, 1990.

[49] Lenat, D. B., & Guha, R. V. Building Large Knowledge-Based Systems. Addison-Wesley, Reading, Mass., 1990.

[50] Levesque, H. J. A View of Knowledge Representation. In M. L. Brodie & J. Mylopoulos (Eds.), On Knowledge Base Management Systems: Integrating Artificial Intelligence and Database Technologies: 63-69, Springer-Verlag, New York, 1986.

[51] Levesque, H. J., & Brachman, R. J. A Fundamental Tradeoff in Knowledge Representation. In H. J. Levesque & R. J. Brachman (Eds.), Readings in Knowledge Representation, Morgan Kaufmann Publishers Inc., Los Altos, Calif., 1985.

[52] Litwin, W. An Overview of the Multidatabase System MRDSM, Denver, CO, 1985.

[53] Litwin, W., & Abdellatif, A. Multidatabase interoperability. IEEE Computer: 10-18, 1986.

[54] Litwin, W., Mark, L., & Roussopoulos, N. Interoperability of Multiple Autonomous Databases. ACM Computing Surveys, 22, 3: 267-293, 1990.

[55] Lloyd, J. W. Foundations of Logic Programming (2nd ed.). Springer-Verlag, 1987.

[56] Madnick, S. E. From VLDB to VMLDB (Very MANY Large Data Bases): Dealing with Large Scale Semantic Heterogeneity. In Proceedings of the 21st VLDB Conference: 11-16, Zurich, Switzerland, 1995.

[57] McCarthy, J. Generality in Artificial Intelligence. Communications of the ACM, 30, 12: 1030-1035, 1987.

[58] Motro, A. Superviews: Virtual Integration of Multiple Databases. IEEE Transactions on Software Engineering, 13, 7: 785-798, 1987.

[59] Motro, A., & Buneman, P. Constructing Superviews. In Proceedings of SIGMOD: 56-64, 1981.

[60] Neches, R., Fikes, R., Finin, T., Gruber, T., Patil, R., Senator, T., & Swartout, W. R. Enabling Technology for Knowledge Sharing. AI Magazine, 12, 3: 16-36, 1991.

[61] Newell, A. The Knowledge Level. Artificial Intelligence, 18, 1: 87-127, 1982.

[62] Reiter, R. Towards a Logical Reconstruction of Relational Database Theory. In M. Brodie, J. Mylopoulos, & J. W. Schmidt (Eds.), On Conceptual Modeling: Perspectives from Artificial Intelligence, Databases and Programming Languages, Springer-Verlag, Berlin and New York, 1984.

[63] Robinson, J. A. A Machine-Oriented Logic Based on the Resolution Principle. Journal of the ACM, 12: 23-41, 1965.

[64] Sciore, E., Siegel, M., & Rosenthal, A. Using Semantic Values to Facilitate Interoperability Among Heterogeneous Information Systems. ACM Transactions on Database Systems, 19, 2: 254-290, 1994.

[65] Shen, W.-M., Huhns, M. N., & Collet, C. Resource Integration without Application Modification. Technical Report No. ACT-OODS-214-91. MCC, 1991.

[66] Sheth, A., & Kashyap, V. So Far (Schematically) yet So Near (Semantically). In IFIP TC2/WG2.6 Conference on Semantics of Interoperable Database Systems, DS-5, Lorne, Victoria, Australia, 1992.

[67] Sheth, A. P., & Larson, J. A. Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases. ACM Computing Surveys, 22, 3: 183-236, 1990.

[68] Siegel, M., & Madnick, S. Context Interchange: Sharing the Meaning of Data. SIGMOD Record, 20, 4: 77-79, 1991.

[69] Smith, J. M., & Smith, D. C. P. Database Abstractions: Aggregation and Generalization. ACM Transactions on Database Systems, 2, 2, 1977.

[70] Sterling, L., & Shapiro, E. The Art of Prolog (2nd ed.). MIT Press, Cambridge, MA, 1994.

[71] Ventrone, V., & Heiler, S. Semantic Heterogeneity as a Result of Domain Evolution. SIGMOD Record, 20, 4: 16-20, 1991.

[72] Wand, Y. A Proposal for a Formal Model of Objects. In W. Kim & F. Lochovsky (Eds.), Object-Oriented Concepts, Databases, and Applications: 602, ACM Press, New York, N.Y., 1989.

[73] Wand, Y., & Weber, R. An Ontological Analysis of Some Fundamental Information Systems Concepts. In Proceedings of the Ninth International Conference on Information Systems, Minneapolis, Minnesota, USA, 1988.

[74] Wand, Y., & Weber, R. Mario Bunge's Ontology as a Formal Foundation for Information Systems Concepts. In P. Weingartner & G. J. W. Dorn (Eds.), Studies on Mario Bunge's Treatise, Rodopi, Amsterdam, 1990.

[75] Wand, Y., & Weber, R. An Ontological Model of an Information System. IEEE Transactions on Software Engineering, 16, 11: 1282-1292, 1990.

[76] Wand, Y., & Weber, R. Toward a Theory of the Deep Structure of Information Systems. In Proceedings of the Twelfth International Conference on Information Systems, 1991.

[77] Wang, Y. R., & Madnick, S. E. A Polygen Model for Heterogeneous Database Systems: The Source Tagging Perspective. In Proceedings of the 16th International Conference on Very Large Data Bases (VLDB): 519-538, Brisbane, Australia, 1990.

[78] Wang, Y. R., & Madnick, S. E. A Source Tagging Theory for Heterogeneous Database Systems. In International Conference on Information Systems: 243-256, Copenhagen, Denmark, 1990.

[79] Woods, W. A. What's in a Link: Foundations for Semantic Networks. In H. J. Levesque & R. J. Brachman (Eds.), Readings in Knowledge Representation, Morgan Kaufmann Publishers Inc., Los Altos, Calif., 1985.
