
CIS/TK: An Object-Oriented Approach to Information Systems Integration

by Maria Elena Neira and Stuart E. Madnick

Composite Information Systems Laboratory
Working Paper WP # CIS-92-08
December 1992

Abstract

The Composite Information System/Tool Kit (CIS/TK) is a prototype heterogeneous database management system (HDBMS) with real-time access and retrieval capabilities to data residing on disparate autonomous information systems. An HDBMS must reconcile differences in DBMS structure and data semantics. Within CIS/TK, the means of representing, retrieving and integrating data from those multiple sources are provided by a common data model, a common retrieval language, a global schema and metadata for semantics reconciliation; the object-oriented environment provides the system with the required representation and reasoning capabilities. The CIS/TK data model, three-layer software architecture, interfaces and data translation facilities are the tools we envisioned to provide physical and logical connectivity (data connectivity as well as semantic connectivity) among information systems. We describe the features of the system and evaluate its capabilities as an integration engine.

Keywords [data model, data semantics, distributed database management systems]


List of Acronyms

AQL abstract local query

AT attribute catalogue

CIS composite information system

CIS/TK composite information systems / tool kit

CS communication server

DDL data definition language

DBMS database management system

DML data manipulation language

DTC data type catalogue

ER entity-relationship

ERM entity-relationship model

GQP global query processor

GRL global retrieval language

GS global schema

HDBMS heterogeneous database management system

IPC inter-process communication

KOREL knowledge-object representation language

LQP local query processor

LQPM local query processor manager

OODB object oriented database

OOP object oriented programming

SDM semantic data model


Table of Contents

Abstract
List of Acronyms
Table of Contents

1- Introduction
  1.1 Background
  1.2 Scope of this Paper

2- CIS/TK Overview
  2.1 CIS/TK Architecture
    2.1.1 Databases
    2.1.2 LQP
      2.1.2.1 The Communication Server
    2.1.3 LQPM
    2.1.4 GQP
  2.2 Conceptual Design
    2.2.1 Global Schema
    2.2.2 Semantics
    2.2.3 Global Retrieval Language
  2.3 OOP Approach
  2.4 Query Execution Flow

3- Global Schema: Entity-Relationship Model
  3.1 Characteristics of CIS/TK ERM
  3.2 Implementation of ERM
    3.2.1 ERM Objects and Hierarchy
      3.2.1.1 Global Schema Manager Object
      3.2.1.2 Entity Objects
      3.2.1.3 Relation Objects
  3.3 Mapping the Global Schema to the Underlying Systems

4- The Local Query Processor
  4.1 The LQP Concept
    4.1.1 Objects Needed to Access a Database
    4.1.2 Examples of LQP Objects and Instances
  4.2 The LQPM
    4.2.1 The lqp-set Object
  4.3 Local Database: Auxiliary Tables, Pseudo Tables and System Tables
  4.4 LQP Hierarchy
    4.4.1 LQP Top Class Object
    4.4.2 LQP Class Object
    4.4.3 LQP Instance Object

5- Semantic Modelling
  5.1 Data Translation
  5.2 Translation Tables
    5.2.1 Synonym Translation
    5.2.2 Application Conversion Tables
    5.2.3 Map Tables
    5.2.4 Interdata Type Translation Table
    5.2.5 Table-Based Translation Example
    5.2.6 datln schema Translation Table
  5.3 Representation Translation Facility
    5.3.1 Object-Based Representation: Data Types
    5.3.2 Data Type Description: Example
    5.3.3 Data Type and Data Type Components Objects Hierarchy
    5.3.4 Variable Data Types
    5.3.5 Catalogues
      5.3.5.1 Data Type Catalogues
      5.3.5.2 Attributes Catalogues

6- Conclusions
  6.1 Assumptions
  6.2 Design and Implementation
  6.3 Implementation Independent Evaluation
  6.4 Future Work

Bibliography

Appendix No. 1: CIS/TK "tables"

Appendix No. 2: Global Schema for the PAS Application

Appendix No. 3: Sample Query


1- Introduction

CIS/TK is a prototype HDBMS with real-time access and retrieval capabilities to data residing on multiple disparate DBMS. It provides connectivity among federated databases, allowing users to access and combine data from various dissimilar sources as if the data came from a single virtual database [SMG91]. In addition, it provides the environment and support to reconcile differences in data semantics.

The means of representing, retrieving and integrating information are provided within CIS/TK by a common data model, a common retrieval language and a data semantics model for reconciling data disparities.

1.1 Background

The first CIS/TK prototype defined the basic architecture of the system [MaW88a, Wo189]. The goal was to derive solutions to queries where the data reside on multiple disparate systems, with a non-intrusive approach. Overall, the designed features were:

- a global schema (GS) to provide an integrated view of the component DBMS;
- global querying facilities (GQP);
- interfaces to the DBMS provided by the LQPs; and
- data catalogues for attribute naming management.

The architecture and modules that we describe have been implemented on UNIX System V¹ and coded in Common Lisp. The kernel was developed using KOREL, an extension of Common Lisp adapted to the particular needs of CIS/TK.

After this first prototype, the system went through extensions [Tun90, Mak90, Ger89] that enhanced its modularity while maintaining the original conceptual architecture (LQP-GQP-GS).

In the last two years, major extensions to CIS/TK have been undertaken in the semantic modelling arena [Rig90, SMG91], to the point of questioning the suitability of the current overall design and architecture for such purposes [KaS90].

1.2 Scope of this Paper

The current CIS/TK data model (fig. 1.1), built upon the assumption that data needs to be modelled as having structural and semantic properties, has the following components:

- Structural data model
It focuses on data connectivity: where the data is, how to access it and how to integrate it. The global schema (Section 3), the LQPs (Section 4) and the GQP are the CIS/TK modules that were created for this purpose;

¹ Minicomputer AT&T 3B2/500.


[Figure 1.1: Data Model Components. Recoverable labels: data; structural properties (global schema, LQP); querying facilities (GQP, AQP).]

- Semantic data model
In addition to the previous schema-level integration, it is necessary to perform translation at the instance level. For example, one might find different data formats (e.g. 5/25/92 vs. 25/5/92) that need the appropriate translation procedures in order to be compared and integrated. CIS/TK copes with this problem of different data representations by attaching the notion of data semantics to the data itself. Our approach to data semantics modelling can be broken into two tasks:

- identifying a set of semantic concepts (metadata) suitable for capturing the different aspects of a data element's representation. This process has led to the definition of each data element as a combination of three characteristics:

  - name [SiM89];
  - representation [Rig90]; and
  - interpretation [SMG91];

- defining a set of symbolic/formal constructs to represent these semantic concepts and the way to store them within the system environment.
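As an illustration of instance-level translation, the sketch below tags a value with the three characteristics listed above and converts a date between the two formats from the example. It is written in Python rather than KOREL, and it is only a sketch under stated assumptions: the class `DataElement`, its field contents and the function `translate_date` are hypothetical names introduced here, not CIS/TK constructs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataElement:
    """A value tagged with the three semantic characteristics (hypothetical layout)."""
    value: str
    name: str            # what the attribute is called
    representation: str  # how the value is written, here a strptime format
    interpretation: str  # what the value means


def translate_date(element: DataElement, target_format: str) -> DataElement:
    """Re-express a date element in another representation."""
    parsed = datetime.strptime(element.value, element.representation)
    # Build the output by hand so that day and month are not zero-padded.
    fields = {
        "%m": str(parsed.month),
        "%d": str(parsed.day),
        "%y": f"{parsed.year % 100:02d}",
    }
    text = target_format
    for code, replacement in fields.items():
        text = text.replace(code, replacement)
    return DataElement(text, element.name, target_format, element.interpretation)


us_date = DataElement("5/25/92", "date", "%m/%d/%y", "calendar date")
eu_date = translate_date(us_date, "%d/%m/%y")
print(eu_date.value)  # 25/5/92
```

Keeping the representation next to the value means that two elements can be compared only after both have been brought to the same format, which is the point of the example in the text.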

The paper focuses on the design and implementation of all the above modelling issues. The CIS/TK modules and their functionalities are described, with emphasis on their object implementations. In the last section of the paper, we present the conclusions of evaluating CIS/TK's capabilities as an integration engine and position it in the 5-dimensional space of integration parameters defined by Heiler, Siegel and Zdonik.

2- CIS/TK Overview

2.1 CIS/TK Architecture

The goal of CIS/TK is to develop tools to support the entire spectrum of connectivity, with a focus on semantic connectivity. The CIS/TK approach [MaW88a] explicitly allows for the coexistence and usage of a variety of information systems while preserving their local autonomy. These information systems are typically independently developed, hard to modify, and contain data that is dynamically changing. Our prototype's layered architecture (fig. 2.2) is meant to demonstrate how to build the necessary tools towards a systematic approach to dealing with such problems in a federated DBMS environment.

The notions underlying our design are largely based on a variety of previous research [Dat86, LaR85, BLN86, BaL84]. Modular architecture, extensibility and knowledge-based techniques are the key concepts for understanding the different parts of the system that we will describe.

2.1.1 Databases

The lowest level of the CIS/TK architecture is any DBMS which processes the queries sent by CIS/TK. These DBMS can be of any type (relational, menu-driven, network, etc.) and they may reside either in the system supporting CIS/TK or in remote host machines.

2.1.2 LQP²

The Local Query Processor (LQP) acts as the link between CIS/TK and the DBMS where the information is stored. This software layer, using knowledge that is specific to particular applications, isolates the system from the idiosyncrasies of the different DBMS and their environments. It also transforms the semantics and syntax of an ALQ³ into the appropriate executable program for the remote system, calls the communication server (which transmits the data request), and gets and sends the results, in standard syntax and standard semantic format, back to the GQP. The LQP algorithm is given in Figure 2.1.

² For further description and details on its object implementation, see Section 4.

Figure 2.1: LQP Algorithm

This module is the key to the extensibility of the system. When CIS/TK is extended to access a new database, a new LQP must be designed.

2.1.2.1 The Communication Server

The communication server creates and maintains links to remote systems (logon, data access and disconnection). The design of this module is based on the UNIX Interprocess Communication (IPC) facility (i.e. FIFO message queues) and other UNIX resident functions. It transmits to the remote sites the script files containing remote executable queries, waits for the query results and sends them back to the LQP.
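The request/reply pattern described for the communication server can be sketched as follows. This is not the CIS/TK implementation: it swaps the UNIX IPC message queues for Python's in-process `queue.Queue`, and the names `communication_server` and `fake_remote` are hypothetical stand-ins introduced for illustration.

```python
import queue
import threading


def communication_server(requests, execute):
    """Serve script requests in FIFO order until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        script, reply_queue = item
        # Send the remote output back to the requesting LQP.
        reply_queue.put(execute(script))


def fake_remote(script):
    # Stand-in for logon, remote execution and disconnection.
    return f"result of <{script}>"


request_queue = queue.Queue()
server = threading.Thread(
    target=communication_server, args=(request_queue, fake_remote)
)
server.start()

reply = queue.Queue()
request_queue.put(("select name from companytbl", reply))
answer_text = reply.get(timeout=5)  # the LQP waits for the query results
request_queue.put(None)             # shut the server down
server.join()
print(answer_text)  # result of <select name from companytbl>
```

The queue decouples the LQP from the remote session: the LQP only enqueues a script and waits on its own reply queue, so the connection details stay inside the server.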

[Figure 2.1 (LQP Algorithm), recoverable steps: send data request to the CS queue; read data; translate data.]

2.1.3 LQPM

In CIS/TK all structurally identical LQPs⁴ are grouped under what is called an LQP set. The LQPM directs every ALQ to the LQP, within a given LQP set, that the system administrator/user has previously selected; thus, it allows the user or system administrator to select the sources to be accessed. At the same time, it tracks the status of the data sources to which CIS/TK has access and, based on that, performs the query routing. In addition, it provides data caching for the databases managed by CIS/TK.

³ Generic GRL query sent by the GQP to any LQP object.
⁴ LQPs accessing the same data from different sources.
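The routing behaviour described for the LQPM (a preselected LQP per set, a status check, a fallback to another LQP of the same set as listed in Section 2.4, and caching) can be sketched as below. This is a minimal Python illustration, not the KOREL implementation; the class `LqpManager`, its attributes and the set and LQP names are hypothetical.

```python
class LqpManager:
    """Hypothetical sketch of LQP-set routing with a selected source and a cache."""

    def __init__(self):
        self.lqp_sets = {}   # set name -> {lqp name: callable(query) -> rows}
        self.selected = {}   # set name -> lqp name chosen by the user/administrator
        self.available = {}  # lqp name -> status flag
        self.cache = {}      # (set name, query) -> rows already retrieved

    def register(self, set_name, lqp_name, lqp, available=True):
        self.lqp_sets.setdefault(set_name, {})[lqp_name] = lqp
        self.available[lqp_name] = available
        # Assumption: the first LQP registered in a set is the default selection.
        self.selected.setdefault(set_name, lqp_name)

    def route(self, set_name, query):
        key = (set_name, query)
        if key in self.cache:
            return self.cache[key]
        members = self.lqp_sets[set_name]
        first = self.selected[set_name]
        # Try the selected LQP first, then the others in the same set.
        order = [first] + [name for name in members if name != first]
        for name in order:
            if self.available[name]:
                rows = members[name](query)
                self.cache[key] = rows
                return rows
        raise RuntimeError(f"no available LQP in set {set_name!r}")


manager = LqpManager()
manager.register("company-set", "remote-lqp", lambda q: [("remote", q)], available=False)
manager.register("company-set", "pseudo-lqp", lambda q: [("pseudo", q)])
print(manager.route("company-set", "select name"))  # [('pseudo', 'select name')]
```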

2.1.4 GQP

The GQP is the engine for executing global queries. Its major functions are query parsing, query routing, and construction of the query solutions. Data returned by the LQPs are assembled by the GQP to provide the answer to the query. The GQP may also need to access the data catalogue in order to perform semantic conversions before integrating the data that have been returned.

2.2 Conceptual Design

In an inter-database configuration, the problem of representing the underlying data in some consistent way, so that it appears to come from a single database, becomes critical. The three tools used to achieve this are the global schema, the translation package and the global retrieval language.

2.2.1 Global Schema

To define the structural abstraction from the local schemas to the global data model, we use the ER constructs. The present scheme adopts the ERM for the conceptual design of a global schema and reverts to the relational model for the logical design.

2.2.2 Semantics⁵

Once a unified view of the underlying data has been achieved through the global schema, we face the problem of merging data coming from heterogeneous environments. For example, suppose that we want to combine two money amounts, expressed as "1,000.00 USD" and "2000$" respectively. Our data conversion package⁶ is meant to cope with that kind of problem, providing the translation and/or mapping facilities needed for each case.
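A minimal sketch of the money example follows, assuming that "$" and "USD" denote the same currency (an assumption made here for illustration; the paper's actual translation tables are described in Section 5). The names `CURRENCY_SYNONYMS`, `parse_amount` and `add_amounts` are hypothetical.

```python
from decimal import Decimal
import re

# Assumed synonym table: which currency code each written symbol stands for.
CURRENCY_SYNONYMS = {"USD": "USD", "$": "USD"}


def parse_amount(text):
    """Split a written amount into (Decimal value, currency code)."""
    match = re.fullmatch(r"\s*([\d,]+(?:\.\d+)?)\s*(\S+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised amount: {text!r}")
    number, symbol = match.groups()
    return Decimal(number.replace(",", "")), CURRENCY_SYNONYMS[symbol]


def add_amounts(first, second):
    """Add two written amounts after translating both to one representation."""
    value_a, currency_a = parse_amount(first)
    value_b, currency_b = parse_amount(second)
    if currency_a != currency_b:
        raise ValueError("currency conversion is outside this sketch")
    return value_a + value_b, currency_a


total, currency = add_amounts("1,000.00 USD", "2000$")
print(f"{total:,.2f} {currency}")  # 3,000.00 USD
```

The two inputs differ in thousands separator, decimal places and currency notation; only after each is reduced to a (value, currency) pair can they be combined.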

2.2.3 Global Retrieval Language

The user formulates queries against the global schema using an SQL-like global retrieval language (GRL). This language supports retrieval-only capabilities. For a complete reference on GRL syntax, the reader is referred to [Wo189].

2.3 OOP Approach

The object-oriented programming methodology was adopted for CIS/TK. It allows us to store and build, as objects, complex representations of the real world, including not only the attributes of the object but also the processes associated with it.

⁵ Throughout the paper we refer to this concept as both data conversion and semantic mapping.
⁶ To see where semantic mappings are invoked we refer the reader to Figure 2.1 and, for further details, to Section 5 of this paper.

[Figure 2.2: CIS/TK Architecture. Recoverable labels, top to bottom: input (GS query) and output (data); GRL query, translation tables, query results; parser, router and data conversion (GQP); LQP Manager; AQL queries and query results; LQP1 ... LQPn, each with data conversion; script file and unformatted data; communication server; executable local query and query results.]

An object may be a data model of any of the entities described in the system (ER entity object, LQP object, catalogue object, ...). In this way, we encapsulate different modules as objects and have all the communications handled through the standard message passing system.

For example, any database is thought of as a set of attributes with a set of methods that allow us to perform retrieval operations. These objects are called Local Query Processors (LQP). After an LQP is implemented for a database, it handles all queries to the database using a generic LQP query language.

Artificial intelligence techniques are also an integral part of CIS/TK. Three strategies in particular are instrumental in its operation: table look-up, functional mapping and heuristic reasoning. Overall, these techniques are used to construct and implement the prototype.

2.4 Query Execution Flow

In this section we present how the modules described above operate and interact under a global query request; the flow is controlled by the message passing mechanism:

MODULE: ACTION

GS:
- a global query is issued against the global schema;
- query instance values are translated according to the synonym tables;

GQP:
- parse the global query into sub-queries;
- map global query attributes to local attribute names;
- route sub-queries (AQLs) to the LQPM;

LQPM:
- select an LQP from the LQP set and send it the AQL;
- if the selected LQP is not available, select another one within the set;

LQP:
- parse the AQL sent by the LQPM:
  a) remove from the AQL any predicate that cannot be applied at the remote site;
  b) translate query values (representation) to local database semantics;
- convert the AQL syntax:
  a) translate the query structure into the query language used by the local DBMS;
  b) generate a script that is used by the CS to request the data;

CS:
- the data request is put into the CS queue;
- the communication server writes the output of the database to a file;

LQP:
- read the data file:
  a) the file is opened by the LQP;
  b) the data read is put into a table-like data structure within the CIS/TK environment;
- translate data back to GQP standards;
- reselect the data;

LQPM:
- pass the data back to the GQP;
- cache the data in local tables;

GQP:
- join/merge the data obtained in the ALQ;
- give back the query results.
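One plausible reading of the LQP steps "remove any predicate that cannot be applied at the remote site" and "reselect the data" is that the removed predicates are reapplied locally once the data has been read. The sketch below illustrates that reading only; the predicate format `(column, operator, value)`, the function names and the sample rows are all hypothetical.

```python
import operator

OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def split_predicates(predicates, remote_operators):
    """Separate predicates the remote site can apply from those it cannot."""
    remote = [p for p in predicates if p[1] in remote_operators]
    local = [p for p in predicates if p[1] not in remote_operators]
    return remote, local


def select(rows, predicates):
    """Keep the rows (dicts) that satisfy every (column, operator, value) predicate."""
    return [
        row for row in rows
        if all(OPERATORS[op](row[column], value) for column, op, value in predicates)
    ]


# Made-up remote table and a remote system that only supports equality tests.
remote_table = [
    {"name": "Acme", "state": "MA", "sales": 120},
    {"name": "Bolt", "state": "MA", "sales": 40},
    {"name": "Core", "state": "NY", "sales": 300},
]
query = [("state", "=", "MA"), ("sales", ">", 100)]

remote_part, local_part = split_predicates(query, remote_operators={"="})
fetched = select(remote_table, remote_part)  # stands in for the remote execution
answer = select(fetched, local_part)         # the "reselect" step inside the LQP
print([row["name"] for row in answer])       # ['Acme']
```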


3- The Global Schema: Entity-Relationship Model

The global schema, one of the components of the CIS/TK data model, provides a logical integrated view of the underlying data. It was developed using an extension of the entity-relationship model (ERM) [Che76]. The extension gives our model the capability to aggregate data from multiple sources to produce single entities and relationships [Teo90].

3.1 Characteristics of CIS/TK's ERM

CIS/TK supports a binary entity-relationship model (BERM) [Che82] and allows attributes only for entities; no attributes are allowed in the relationships. As we will see in Section 3.2, all the constructs of the CIS/TK ERM have been implemented as objects. These basic constructs are:

- entities, the principal data objects about which information is to be collected. Only one class of "entity" is defined in the system; that is, CIS/TK does not distinguish between "strong" and "weak" entities. Given the mappings to the relational model that we perform, we have no need for them.

- relationships, which represent entity associations. The role that an entity plays in a relationship is placed in the "entity-to" and "entity-from" slots of the relation class object. A relationship is characterized by:

  - degree: number of entities associated in a relationship. CIS/TK has a BERM, where only two entities are involved in a relationship;
  - connectivity: mapping of the associated entities (i.e. 1:1, 1:N and N:N)⁷;
  - cardinality: number of entity occurrences in the relationship;

- attributes, characteristics of entities that provide descriptive detail about them. A particular occurrence of an attribute within an entity is called an "attribute value". There is one type of attribute called the "join-key", which is used to join the attributes of two entities; this is the attribute that P. Chen [Che82] calls the "identifier". The rest of the attributes within that entity are called "descriptors" (nonkeys).

3.2 Implementation of the ERM

3.2.1 ERM Objects and Hierarchy

The ERM is implemented in CIS/TK with three classes of KOREL objects:

- global schema manager object;
- entity objects;
- relation objects.

⁷ Connectivity and cardinality cannot be expressed using the current ERM implementation. Documents from previous versions (CIS/TK V1/2) show the relation class object as having a multiple-value "constrains" slot to capture these concepts, which is not implemented in the current version.


[Figure 3.1: ERM Class and Instance Objects Hierarchy. Recoverable labels: mit-placement; entity instances company, position, portfolio, finance; relation instances same-state, finance-info, contains, recruiting-for, works-in, works-for.]

The object hierarchy for the ERM⁸ is shown in Figure 3.1. The entity and relation objects are both subordinated to the GSM class. The GSM object maintains a list of all known entities and also handles all external messages to the GS.

3.2.1.1 Global Schema Manager Object (GSM)

The GSM object contains a list of all the entities and relations in the global schema. It is the interface for external messages to the global schema. It is implemented as a KOREL object (fig. 3.2).

gsm:
  subordinates: (multiple-value-f t)
  entities: (multiple-value-f t)
  relations: (multiple-value-f t)
  instances: (multiple-value-f t)
  methods: (get-lqp-location
            get-foreign-key
            get-translation-key)

Figure 3.2: gsm CLASS OBJECT Frame

⁸ The instances of the ERM hierarchy correspond to the application currently running with CIS/TK: the Placement Assistant System (PAS).


3.2.1.2 Entity Objects

The "entity" class object (fig. 3.3) is created as an instance of gsm and represents any of the ERM entities; its slots store the following information:

- "attributes": contains a list of the attributes for that entity; and
- "table-rels": contains a list of all the relation objects that the entity is related to.

entity:
  superiors: (multiple-value-f t) (value gsm)
  attributes: (multiple-value-f t)
  relations: (multiple-value-f t)
  table-rels: (multiple-value-f t)
  instances: (multiple-value-f t) (value alumni company position finance industry portfolio)

Figure 3.3: ENTITY CLASS OBJECT Frame

position:
  superiors: (multiple-value-f t) (value entity)
  instance-of: (multiple-value-f t) (value entity)
  title: (multiple-value-f t) (value (recruitset companytbl position))
  attributes: (value title status num-position)
  status: (multiple-value-f t) (value (recruitset companytbl status))
  num-position: (multiple-value-f t) (value (recruitset companytbl schedule))
  relations: (value recruiting-for)

Figure 3.4: Example of ENTITY INSTANCE "position" OBJECT Frame


The entity object is created as an instance of the entity class using the global schema definition language [Wo189]. Figure 3.4 shows an example of an entity instance object (position).

Each attribute slot represents one attribute of that entity. It stores the actual lqp-set, table and column locations for that attribute. For instance, the "title" slot contains the value (recruitset companytbl position), which is a pointer to the actual location of the attribute: "recruitset" is the name of the lqp-set object that can access the database, and "companytbl" is the name of the table where the column "position" can be found. Notice that this slot is multiple-valued in order to handle the situation where the same attribute can be found in more than one place.
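The way an attribute slot points to its physical location can be illustrated with a small Python sketch. The dict below transcribes the "position" frame of Figure 3.4 into an assumed Python layout; the functions `locate` and `columns_by_table` are hypothetical helpers, not part of CIS/TK.

```python
# The "position" entity of Figure 3.4, transcribed into a Python dict (assumed layout).
# Each attribute maps to a list of (lqp-set, table, column) locations.
POSITION = {
    "attributes": ["title", "status", "num-position"],
    "title": [("recruitset", "companytbl", "position")],
    "status": [("recruitset", "companytbl", "status")],
    "num-position": [("recruitset", "companytbl", "schedule")],
    "relations": ["recruiting-for"],
}


def locate(entity, attribute):
    """Return every (lqp-set, table, column) where a global attribute is stored."""
    if attribute not in entity["attributes"]:
        raise KeyError(f"unknown attribute: {attribute}")
    return entity[attribute]


def columns_by_table(entity, attributes):
    """Group requested attributes by (lqp-set, table), using the first location of each."""
    grouped = {}
    for attribute in attributes:
        lqp_set, table, column = locate(entity, attribute)[0]
        grouped.setdefault((lqp_set, table), []).append(column)
    return grouped


print(columns_by_table(POSITION, ["title", "num-position"]))
# {('recruitset', 'companytbl'): ['position', 'schedule']}
```

Because each slot holds a list, an attribute stored in several places would simply carry several triples; this sketch always takes the first one.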

3.2.1.3 Relation Objects

The association that links two ER entities is implemented through the "relation" object class (fig. 3.5). It also stores the "join" key that can be used to relate two entities (to join the data returned from each of the entities). The relation object is implemented as a KOREL object and is a subordinate of the relation object class. A standard relation object, called "contains", created for the relationship between the portfolio and company entities is shown below (fig. 3.6).

relation:
  superiors: (multiple-value-f t) (value gsm)
  entity-to: (multiple-value-f t)
  entity-from: (multiple-value-f t)
  join: (multiple-value-f t)
  instances: (multiple-value-f t) (value works-for finance-info recruiting-for same-state contains works-in)

Figure 3.5: RELATION CLASS OBJECT Frame

contains:
  superiors: (multiple-value-f t) (value relation)
  entity-to: (value company)
  entity-from: (value portfolio)
  join: (value (= (portfolio company-name) (company name)))

Figure 3.6: Example of RELATION INSTANCE OBJECT Frame
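To show how a join key of this form could be used to combine the data returned for two entities, the following Python sketch performs a simple equi-join driven by the "contains" relation of Figure 3.6. The dict layout, the function `join_entities` and the sample rows are assumptions made for illustration; the paper does not specify the join algorithm used by the GQP.

```python
# The "contains" relation of Figure 3.6 as a Python dict (assumed layout).
CONTAINS = {
    "entity-to": "company",
    "entity-from": "portfolio",
    "join": (("portfolio", "company-name"), ("company", "name")),
}


def join_entities(relation, rows_by_entity):
    """Equi-join the rows of the two entities named in the relation's join key."""
    (left_entity, left_attr), (right_entity, right_attr) = relation["join"]
    # Index the right-hand rows by their join attribute for a simple hash join.
    index = {}
    for row in rows_by_entity[right_entity]:
        index.setdefault(row[right_attr], []).append(row)
    joined = []
    for left_row in rows_by_entity[left_entity]:
        for right_row in index.get(left_row[left_attr], []):
            merged = {f"{left_entity}.{k}": v for k, v in left_row.items()}
            merged.update({f"{right_entity}.{k}": v for k, v in right_row.items()})
            joined.append(merged)
    return joined


# Made-up sample rows, for illustration only.
rows = {
    "portfolio": [
        {"company-name": "Acme", "shares": 10},
        {"company-name": "Zed", "shares": 5},
    ],
    "company": [{"name": "Acme", "state": "MA"}],
}
result = join_entities(CONTAINS, rows)
print(result)
```

Only the portfolio row whose company-name matches a company name survives the join; attribute names are prefixed with the entity name to keep the two sides distinguishable.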


3.3 Mapping the Global Schema to the Underlying Systems

After the conceptualization of the underlying data using the ERM, CIS/TK reverts to the relational model for the logical design. The conversion from ERM to RM is done in terms of attribute organization⁹. Attributes are mapped to particular LQP sets according to the following notation:

(<global-attribute>
  (<LQP1-set-name> <table-name> <LQP1-attribute-name>)
  ...
  (<LQPn-set-name> <table-name> <LQPn-attribute-name>))

Thus, every global attribute (<global-attribute>) is mapped to one or more local attributes (<LQP1-attribute-name>, ...) in particular tables (<table-name>).

4- The Local Query Processor¹⁰

4.1 The LQP Concept

Within our OOP environment, one can encapsulate the knowledge needed to access a database as a set of attributes with a set of methods to perform retrieval operations. This object is called an LQP (fig. 4.1).

A database, whatever its form (menu, SQL, ...), is thought of by CIS/TK as a set of tables. There is one LQP per database¹¹; thus, one LQP can lead to more than one table. Indeed, the code sections used to access two different tables in the same database share most of the routines.
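The black-box view of an LQP (Figure 4.1: a get-data request naming a table and its columns goes in, a header row followed by data rows comes out) can be mimicked with a small Python class. This is a sketch under stated assumptions: the class `Lqp`, the method `get_data` and the table contents are hypothetical, and the data-type-catalogue argument of the original message is omitted.

```python
class Lqp:
    """Hypothetical stand-in for an LQP: one object per database, many tables."""

    def __init__(self, tables):
        self._tables = tables  # table name -> list of row dicts

    def get_data(self, table, columns):
        """Return the header followed by one tuple per row, as in Figure 4.1."""
        rows = self._tables[table]
        header = tuple(columns)
        return [header] + [tuple(row[c] for c in columns) for row in rows]


# Made-up contents; a real LQP would translate the request for the remote DBMS.
demo_lqp = Lqp({
    "companytbl": [
        {"position": "analyst", "status": "open", "schedule": 3},
        {"position": "trader", "status": "filled", "schedule": 1},
    ]
})
print(demo_lqp.get_data("companytbl", ["position", "status"]))
# [('position', 'status'), ('analyst', 'open'), ('trader', 'filled')]
```

The caller never sees how the table is stored or queried, which is what lets the same request be served by different databases.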

4.1.1 Objects and Instances Needed to Access a DatabaseAlso, for extensibility purposes, the LQPs [Hor88] are the "key" modules in oursystem.These objects encapsulate all knowledge on the host protocols, querylanguages, and semantics and syntax of the systems that CIS/TK is able to access.These knowledge and the functionalities that CIS/TK needs to access a database areencapsulated within the following objects:

- a KOREL LQP class object containing information to access the database 12;- a KOREL LQP instance object for each logical table in the same database 13;

9The implementation of the LQPM [Tun 901 changed the mappings of the entity attributes from specificLQPs to LQP sets.10 All the examples given in this section are taken from the I. P. Sharp Database and its LQP.1 1There is one exception to this rule: The I. P. Sharp database is accessed by two LQP classes"disclosure" and "ipsharp-currency": a different LQP for the Currency table in I. P. Sharp database.12For the I. P. Sharp database the object is called "disclosure".131n the I. P. Sharp database, the object "discstatic" (an LQP instance) was created in order to obtainthe static data from that database because, in the case of this source, one cannot mix static data with


Input:   (send-message <lqp-name> :get-data data-type-catalogue '(table (col1, col2, ..., coln)))

                 LQP

Output:  ((col1, col2, ..., coln)
          (dat11, dat12, ..., dat1n)
          ...
          (datm1, datm2, ..., datmn))

Figure 4.1: LQP as a Black Box

In addition to the above objects, if a database is remote, then the LQP uses a set of two local tables, also implemented as KOREL objects, per remote table in the database:

- an auxiliary table, implemented as an LQP instance object;
- a pseudo-table, implemented as an LQP instance object.

4.1.2 Examples of LQP Objects and Instances

The LQP modules contain a file declaring all the LQP related objects (/usr/cistk/demo/v3.1/lqp/databases/<database-name>/<table-name>-obj.lsp 14). They are:

- the lqp set instance objects: <table-name>_set 15
- the LQP class object: <lqp-class-name> 16
- the LQP instance 17 object(s): <lqp-instance-name(s)> 18

timeseries data in the same query; for this reason, Disclosure was implemented as two separate tables which cannot be joined remotely, but can be locally.
14 Example: "~/lqp/databases/ipsharp/disc-obj.lsp".
15 Example: "disctimeseriesset" (timeseries table) and "discstatic-set" (static table).
16 Example: "disclosure".
17 This object is a child of the above defined LQP class.
18 Example: "disc-timeseries" and "disc-static".


- the "informix_2e" instance 19 objects <aux-lqp-instance-name> and <pseudo-lqp-instance-name> 20

After this overview of all the objects logically associated with an LQP, we will look in detail at each of the terms in Section 4.4, examining their slots and their content.

4.2 The LQPM

Each table accessed by CIS/TK (real or pseudo) belongs to what is called an LQP set. The LQPs within the same LQP set have the same attributes 21. The key idea behind the grouping of LQPs into LQP sets is that different places may contain the same information. The best example of this is a pseudo table and its corresponding real table. There are other examples such as Disclosure and Dataline, whose intersection is not empty. In general, we can say that the tables that contain the same information form an LQP set 22.

The Local Query Processor Manager (LQPM) is the CIS/TK module designed to deal with the situations described above. Its functionalities can be described in terms of the advantages that it brings to the overall performance of the system. The first advantage comes from the fact that the GQP sees only LQP sets: since the information can be obtained from any table of the set, the GQP maps its global attributes to LQP sets and not to particular LQP tables. Thus, the LQP sets provide an abstraction layer between the global schema and the underlying databases. The second advantage is that, if the access to a given table fails, the LQPM chooses another table in the corresponding set from which to get the information. Hence, the LQPM acts as an interface between the GQP and the LQPs, giving the GQP the impression that LQP data access remains possible even under connection difficulties. In short, the LQPM extends data availability.
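The fallback behaviour described above can be illustrated with a minimal Python sketch. It is not the KOREL/Lisp implementation: the names AccessError, make_lqp and lqpm_get_data, and the sample row, are hypothetical and chosen only for illustration. The sketch assumes that an LQP set is an ordered list of LQPs and that a failed access is signalled by an exception.

```python
class AccessError(Exception):
    """Raised by a (hypothetical) LQP when its table cannot be reached."""


def make_lqp(name, rows=None):
    """Return a toy LQP: a callable that yields rows or raises AccessError."""
    def get_data(columns):
        if rows is None:
            raise AccessError(f"{name} is unreachable")
        # Keep only the requested columns, in the requested order.
        return [tuple(row[c] for c in columns) for row in rows]
    get_data.lqp_name = name
    return get_data


def lqpm_get_data(lqp_set, columns):
    """Try each LQP of the set in turn and return the first successful answer."""
    errors = []
    for lqp in lqp_set:
        try:
            return lqp.lqp_name, lqp(columns)
        except AccessError as exc:
            errors.append(str(exc))
    raise AccessError("no table of the set could be reached: " + "; ".join(errors))


# The remote table is down here, so the pseudo table of the same set answers.
disc_static_set = [
    make_lqp("disc-static"),  # remote table, unreachable in this example
    make_lqp("pseudo-disc-static", [{"co": "REUTERS", "country": "UK"}]),
]
source, data = lqpm_get_data(disc_static_set, ["co"])
```

The caller only names the set; which member actually answered is an internal matter, which is what lets the GQP map global attributes to LQP sets rather than to individual tables.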

lqp-set
  subordinates:
    (value alumnitbset countrytb-set positiontbset siccodetbset
           sicnumtbset portfolioset recruitset disctimeseriesset discstaticset
           ip-currencyset datlnincome_set datlnbalanceset
           datlnfinancing-set dat_ln_ratiosset datlncodeset tabtbset
           porftb_set)

Figure 4.2: lqp-set Class Frame

19 These are the child LQP instances of "informix_2e" that implement the auxiliary and pseudo tables (they exist only when the LQP class is meant to access a remote database).
20 Example: to "aux-disc-timeseries" and "aux-disc-static" correspond "pseudo-disc-timeseri" and "pseudo-disc-static".
21 Listed in the LQP attributes catalogue.
22 The decision of which LQPs to group under the same LQP set is a design issue. For instance, do you put Dataline and Disclosure Timeseries in the same set? The answer is yes if you consider that their intersection is not empty, and no if you think that each one has information not available in the other.


4.2.1 lqp-set Object

An object exists for each LQP set (fig. 4.2) because the GQP sends the ALQs to LQP sets, and does so by sending the "GET-DATA" message to an object. An LQP set instance object (fig. 4.3) contains only one slot, "methods", whose unique value is the pair "(:GET-DATA LISP-FUNCTION)".

disc_static_set
  superiors:
    (multiple-value-f t)
    (value lqp-set)
  methods:
    (value (get-data get-data-lqpm-disc-static-data))

Figure 4.3: disc_static_set Instance Object Frame

4.3 Local Database: Auxiliary Tables, Pseudo Tables and System Tables

CIS/TK has an internal DBMS 23 (Informix/SQL). It is used by the system:

- to store metadata;
- to cache data (pseudo tables);
- to download data (auxiliary tables).

All the local tables are considered part of a single database; thus, there is one single KOREL LQP class to access the Informix tables, "informix_2e", and all the tables in Informix are accessible through instances of this object 24.

This internal database is also used by the system to store additional data (metadata) needed to resolve data semantics incompatibilities.

The pseudo-tables store data that was previously obtained by querying the corresponding remote tables. This facility allows CIS/TK to fill these local pseudo-tables with data obtained at run time from the remote databases. The reason for this is that when a remote database cannot be accessed for one reason or another, the pseudo-table can be queried instead. Ideally, a pseudo-table should be an accurate local image of the remote system.

The auxiliary tables should be seen as a mere tool used by the system and completely invisible to the GQP and to the final user. They are used to download data obtained from the remote site. After downloading the data obtained, the predicates that were not applied remotely are applied to the auxiliary table and the data obtained is what

23 See Appendix No. 1: CIS/TK "tables".
24 In the case of the "disclosure" LQP, the auxiliary and pseudo tables for both the timeseries and the static tables.


the remote system would have returned, providing the capability to process all predicates. Right after it is used, the auxiliary table is emptied. An auxiliary table is built with Informix on the local workstation (mit2e in our case). Informix is used here to exploit the fact that an SQL request may embody complex predicates.
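The life cycle of an auxiliary table (download, apply the remaining predicates locally, empty) can be sketched as follows. This is only an illustration in Python: it uses the standard sqlite3 module as a stand-in for the local Informix database, and the function name, the table layout (co, sales) and the sample rows are assumptions made for the example, not part of CIS/TK.

```python
import sqlite3


def query_through_auxiliary_table(remote_rows, local_predicate, parameters=()):
    """Load remote rows into an auxiliary table, filter locally, then empty it."""
    conn = sqlite3.connect(":memory:")  # stand-in for the local Informix DBMS
    conn.execute("CREATE TABLE aux_table (co TEXT, sales REAL)")

    # Download step: store the rows returned by the remote site.
    conn.executemany("INSERT INTO aux_table VALUES (?, ?)", remote_rows)

    # Apply, with local SQL, the predicate that was not applied remotely.
    rows = conn.execute(
        "SELECT co, sales FROM aux_table WHERE " + local_predicate + " ORDER BY co",
        parameters,
    ).fetchall()

    # The auxiliary table is emptied right after it is used.
    conn.execute("DELETE FROM aux_table")
    remaining = conn.execute("SELECT COUNT(*) FROM aux_table").fetchone()[0]
    conn.close()
    return rows, remaining


# Made-up rows standing in for data fetched from a remote source that
# cannot evaluate "sales > 200" itself.
remote_rows = [("ACME", 150.0), ("REUTERS", 950.0), ("HONDA MOTOR CO", 300.0)]
rows, remaining = query_through_auxiliary_table(remote_rows, "sales > ?", (200,))
```

The predicate text is concatenated into the SQL statement only to keep the sketch short; the value is still passed as a bound parameter.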

4.4 LQP Hierarchy

So far, we have explained the concept of the LQP and its object implementation. We now turn to the LQP hierarchy and how this hierarchical organization serves the purposes of CIS/TK.

4.4.1 LQP Top Class Object

The LQP top class object (fig. 4.4) is the top of a tree-like structure containing all KOREL LQP classes and instances. It has a methods slot with a list of general purpose methods that are used by all LQP objects (fig. 4.5). Each of those functions corresponds to a specific phase in the LQP algorithm.

LQP
  methods:
    (multiple-value-f t)
    (value (built-lqp-data-conversion-strategy built-lqp-data-conversion-strategy)
           (get-lqp-data-types get-lqp-data-types)
           (lqp-convert-data-list lqp-convert-data-list)
           (lqp-parse-query lqp-parse-query)
           (lqp-fetch-data lqp-fetch-data)
           (lqp-data-filter lqp-data-filter))
  subordinates:
    (value informix_2e informix_2a informix_2c oracle-rt disclosure
           ipsharp-currency dataline finsbury-code)

Figure 4.4: LQP Class Object Frame

main-get-data 25                        main-get-data-with-conversion 26
main-get-data-without-conversion 27     built-lqp-data-conversion-strategy
lqp-parse-query                         lqp-fetch-data 28
lqp-data-filter                         lqp-convert-data-list

Figure 4.5: General Purpose lqp Methods

25 In file "~/lqp/util/get-data.lsp".
26 In file "~/lqp/util/get-data.lsp".
27 In file "~/lqp/util/get-data.lsp".
28 In file "~/lqp/util/fetch-data.lsp".


4.4.2 LQP Class Object

The LQP object represents a database. It contains all database-wide information needed to access the tables in the database. Its frame structure and slots are as follows:

- "superiors": the value is always LQP (symbol); it stands for the object that is at the top of the LQP hierarchy;
- "description": short description of the database (string);
- "type-of-DBMS": type of the database, e.g., SQL, menu, etc. (string);
- "local-DBMS": flag to indicate whether the database is local or remote (T/NIL). It is necessary because different communication protocols are required if the database is on a host machine different from the one running CIS/TK;
- "database": name of the database (string). When this slot has "irrelevant" as its value, it is because the database being accessed does not support the logical concept of database names;
- "lqp-directory": directory where the LQP code resides (string);
- "script-writer-invoker": name of the Lisp function that generates the communication script used by the Communication Server (symbol). This function calls the C program that typically does most of the job;
- "script-writer": name of the C program that generates the script, as mentioned in the previous slot (string);
- "data-reader": name of the Lisp function that extracts the data from the file returned by the Communication Server (string). A typical cleaning is, for example, to remove all extra ^M, ^G and other obscure characters;
- "data-file-cleaner": name of the C or UNIX shell program that does some basic filtering beforehand;
- "switch-conversion-facility": flag to indicate whether the Data Translation Facility is used by the LQP (T/NIL). If this flag is set to "NIL", the system will not go through the query parsing and data conversion steps when executing a query (v3.0 behavior);
- "predicate-filter": name of the Lisp function that examines a predicate and indicates whether it can be processed at the remote site (symbol). A predicate filter uses the attributes catalogue of that particular LQP: if the predicate is "(<operand> <attribute> <value>)", the function obtains the data type of <attribute> from the attributes catalogue 29. It may happen that the predicate being examined cannot be applied at the remote site (e.g., some DBMSs do not accept predicates such as "> 200"); it is then also the function of the predicate filter to remove the predicate from the query;
- "comm-server": gives the name of the C program that writes the connection request into the Communication Server message queue;
- "database-directory": gives the directory of the real remote database, only when the database resides on a UNIX system;
- "methods": maps the fundamental LQP methods :self-info, :get-tables, :get-columns and :get-data to the Lisp functions that are invoked when a corresponding

29 Details on the flow control in a predicate filter can be found in [Rig90].


message is sent to the LQP object. Any LQP must be able to respond to the four methods listed above.
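The way a "methods" slot maps a message to a function can be pictured with the following Python sketch. It is a simplified stand-in for KOREL frames, not their actual semantics: the Frame class, the send_message function and the rule that an instance inherits missing slots from its superior are assumptions made for this illustration. The slot values are taken from the "disclosure" frame described in this section.

```python
class Frame:
    """A toy frame: a name, an optional superior, and a dictionary of slots."""

    def __init__(self, name, superior=None, **slots):
        self.name = name
        self.superior = superior
        self.slots = slots

    def get_slot(self, slot):
        """Return a slot value, inheriting from superiors when it is missing here."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.superior
        raise KeyError(slot)


def send_message(frame, message, *args):
    """Find the function mapped to `message` in a "methods" slot and call it."""
    methods = frame.get_slot("methods")
    if message not in methods:
        raise LookupError(f"{frame.name} does not respond to {message}")
    return methods[message](frame, *args)


def display_self_info(lqp):
    return lqp.get_slot("description")


def get_tables(lqp):
    return list(lqp.get_slot("instances"))


disclosure = Frame(
    "disclosure",
    description="disclosure database in the I. P. Sharp on-line information service",
    instances=["disc-timeseries", "disc-static"],
    methods={":self-info": display_self_info, ":get-tables": get_tables},
)
# The instance declares no methods of its own and falls back on its class.
disc_static = Frame("disc-static", superior=disclosure, table="discstatic")

tables = send_message(disc_static, ":get-tables")
```

Only two of the four required methods are wired up here, so sending :get-data to this toy frame raises LookupError.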

disclosure
  superiors:
    (multiple-value-f t)
    (value LQP)
  description:
    (value disclosure database in the I. P. Sharp on-line information service)
  type-of-DBMS:
    (value APL)
  local-DBMS?
    (value nil)
  database
    (value ipsharp)
  lqp-directory:
    (value /usr/cistk/demo/v3.2/lqp/databases/ipsharp/disclosure/)
  script-writer-invoker:
    (value disclosure-script-writer)
  script-writer:
    (value disc-script2)
  data-reader:
    (value reader-disc)
  data-file-cleaner:
    (value ipcleaner)
  switch-conversion-facility:
    (value on)
  predicate-filter:
    (value disc-predicate-filter)
  comm-server:
    (value startlqp)
  database-directory:
    (default irrelevant)
  methods:
    (multiple-value-f t)
    (value (self-info display-ipsharp-self-info)
           (get-tables get-ipsharp-tables)
           (get-columns get-ipsharp-columns)
           (get-data get-ipsharp-data))
  instances:
    (multiple-value-f t)
    (value disc-timeseries disc-static)

Figure 4.6: LQP class Frame (disclosure)

4.4.3 LQP Instance Object

In addition to the slots described for the LQP class (fig. 4.6), an instance also has the following slots:

- "table": name of the table (string). If the table the LQP is leading to is a logical construction, i.e., there is no physical table at the remote site in the relational


sense 30, then this slot contains a name that the LQP developer made up; besides, if the remote site does not contain a table, this slot is not used;
- "attributes-catalogue": name of the KOREL object that represents the attributes catalogue for that LQP instance;
- "aux-table-lqp-object": name of the LQP that is used to access the auxiliary table for this LQP 31;
- "pseudo-table-lqp-object": name of the LQP that is used to access the pseudo table of an LQP. Like the previous slot, it exists only for LQPs that are not attached to pseudo tables; and
- "description": a condensed explanation of the LQP contents.

disc-static
  superiors:
    (multiple-value-f t) (value disclosure)
  instance-of:
    (multiple-value-f t) (value disclosure)
  table:
    (value discstatic)
  attributes-catalogue:
    (value disc-static-attributes-catalogue)
  aux-table-lqp-object:
    (value aux-disc-static)
  pseudo-table-lqp-object:
    (value pseudo-disc-static)
  description: (static facts table in the disclosure database)

Figure 4.7: LQP INSTANCE object (disc-static)

5 Semantic Modelling

In an HDBMS environment one must deal with incompatibilities in data coming from disparate environments. Data semantics is the CIS/TK approach to overcoming such disparities. A first step towards data integration was taken with the design of the global schema; as we described in Section 3 of this paper, the ERM allows us to join and merge the data coming from different sources. A second step is to design a data translation facility for semantics reconciliation that will allow us to perform "joins" between tables from different databases with different modes of representing data (e.g., zip code vs. state name; sales expressed in USD vs. sales expressed in PST).

In this chapter, we present the tools developed for CIS/TK in the domain of data semantics reconciliation at instance level. They are based on two important concepts: the use of metadata to define the data [SiM89], and the use of data

30 Not only is there no table in the relational sense, but also this slot is not used for anything.
31 This slot exists only for the LQPs that are not attached to auxiliary tables. Notice that an auxiliary table does not have an auxiliary table.


[Tree diagram; the recoverable labels are listed below.]
LQP (top class) methods: get-lqp-data-types, built-lqp-data-conversion-strategy, lqp-convert-data-list, lqp-parse-query, lqp-fetch-data, lqp-data-filter
Method lists attached to the LQP classes:
- self-info, get-tables, get-columns, get-data, insert-data, insert-batch, download batch
- self-info, get-tables, get-columns, get-data, get-number-rows
- self-info, get-tables, get-columns
LQP instances: ALUMNITB-MIT2E, COUNTRYTB-MIT2E, POSITIONTB-MIT2E, SICCODETB-MIT2E, SICNUMTB-MIT2E, PORTFOLIODB, PSEUDO-RECRUIT-MIT2E, AUX_DISC_TIMESERIES, AUX_DISC_STATIC, PSEUDO_DISC_TIMESERI, PSEUDO_DISC_STATIC, AUX_IP_CURRENCY, PSEUDO_IP_CURRENCY, AUX_DATLN_INCOME, AUX_DATLN_BALANCE, AUX_DATLN_FINANCING, AUX_DATLN_RATIOS, PSEUDO_DATLN_INCOME, PSEUDO_DATLN_BALANCE, PSEUDO_DATLN_FINANCING, PSEUDO_DATLN_RATIOS, COLUMNS_ATT, PSEUDO_DATLN_CODE, PORTFTB-MIT2E, TATB-MIT2E, INIT-STATUS, INIT-ACTIVES, WORKING-STATUS, WORKING-ACTIVE, RECRUIT-IBMRT, DISC-TIMESERIES, DISC-STATIC, DATLNNAMES, IPCURRENCY, DATLNINCOME, DATLNBALANCE, DATLNRATIOS, DATLNFINANCING

Figure 4.8: LQP Objects Hierarchy


catalogues to store metadata [Rig90, Wie87] 32. First we will describe data semantics modelling as our solution to the data disparities that we encountered; then, we will explain how semantic representation is done in CIS/TK, mainly through translation tables and the data conversion package.

5.1 Data Translation

Data translation is required whenever there are semantic incompatibilities in data among information systems [SMG91]. These incompatibilities are due to discrepancies in data representation, data naming and format conventions across the various databases. In order to overcome such disparities, CIS/TK performs:

- instance representation translation (e.g., from 02139, a zip code, to Massachusetts, a state);
- instance name translation, also called synonyms translation (e.g., from "IBM" to "International Business Machines");
- data format translations (e.g., from 1,000,000 to 1000000);
- units translations (e.g., currency conversion); and
- scale translations (e.g., from thousands to units of US dollars).

Currently the system uses different schemes to cope with semantic heterogeneities. The rest of this chapter describes all of them, with emphasis on the semantics mapping facility [Rig90].

5.2 Translation Tables

5.2.1 Synonym Translation

The CIS/TK synonyms translation facility is based on a global synonyms table (maintained by the GQP) and a set of local synonyms tables; there is one translation table for each logical table (LQP instance). The synonym tables are meant for the storage of synonyms found in the databases supported by CIS/TK.

The GQP global synonym table associates the global attribute with a table of allowed names of that attribute. When writing a query against the global schema, the user can quickly search through these tables to find the names and insert values in the query that will be understood by the GQP without any ambiguity. The tables are implemented as LISP lists with the following format:

((syn1 syn1' ...) (syn2 syn2' ...)) 33

For each LQP, the synonym table maps global names to local names (e.g., the global name "REUTERS", the name understood and used by the GQP-level user, is mapped to

32 Wiederhold defined data as the factual, observable and verifiable descriptions of real-world events, and metadata as the higher level concepts used to structure, classify and relate such real-world events.
33 Example: (("AT&T" "AT&T Corp" "American Telephone & Telegraph"))

"Reuters Intl." and "Reuters Ltd." in the Alumni database). A global name may be mapped to more than one local name in a given LQP because the latter may have synonyms. To check whether a global attribute value has a synonym, the procedure syn-for-conds? can be used 34.

The synonym facility is meant to demonstrate the feasibility of using synonym tables. As the synonym tables become larger, it would be better to implement the table in a database system for faster access and easy maintenance.
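The two directions of synonym translation (global name to local names when sending a query, local name back to global name when data returns) can be sketched in Python as below. The list-of-rows layout mirrors the ((syn1 ...) (syn2 ...)) format; the convention that the first name of each row is the global name, and the function names local_names and global_name, are assumptions of this sketch rather than documented CIS/TK behaviour.

```python
# Each row groups names that denote the same thing. By assumption, the first
# name in a row is the global name and the others are local names.
ALUMNI_SYNONYMS = [
    ("REUTERS", "Reuters Intl.", "Reuters Ltd."),
    ("AT&T", "AT&T Corp", "American Telephone & Telegraph"),
]


def local_names(table, global_value):
    """Return the local names a global name is mapped to (empty if none)."""
    for row in table:
        if row[0] == global_value:
            return list(row[1:])
    return []


def global_name(table, local_value):
    """Translate a local name back to its global name; unknown names pass through."""
    for row in table:
        if local_value in row:
            return row[0]
    return local_value


reuters_locals = local_names(ALUMNI_SYNONYMS, "REUTERS")
standardized = [global_name(ALUMNI_SYNONYMS, n) for n in ("Reuters Ltd.", "HONDA MOTOR CO")]
```

A linear scan is adequate for a demonstration table; as the text notes, larger tables would be better served by a database system.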

5.2.2 Application Conversion Tables 35

Before CIS/TK had a data type conversion facility, the system used additional translation tables for conversion purposes. For instance, there is one that matches century+year (yyyy) to year (yy) 36.

5.2.3 Map Tables

They are used as a labeling mechanism. The three-key tables are used for data source tagging. The two-key tables are used to identify where to send the data request. These tables are created at run time and deleted after the query is over. Their names are:

- *gs->lqp* (2key)
- *lqp->gs* (3key)

5.2.4 Inter-Data Type Translation Table

CIS/TK still uses a single table of this kind: the one mapping state names to zip codes 37. The data conversion facility does not deal with this kind of translation; it only deals with conversion within a given data type, from a first representation to a second representation of the same data type. In the case of state name to zip code conversion, an inter-data type conversion (an extension of the current one) would be needed. Indeed, this table is the only way in which CIS/TK can perform instance representation translation.

5.2.5 Table-Based Translation Example

The translation table specifies whether or not data obtained from a given LQP should be translated according to the synonym tables (e.g., "Reuters Intl." and "Reuters Ltd." are translated to "REUTERS"). If we take one of the entries of this translation table:

(discstaticset discstatic co standarize name)

34 For example, to check for "IBM" using that procedure: (syn-for-conds? "IBM") --> ("IBM" "International Business Machine" "IBM Corp.")

35 See "~/app/pas/pas-syns.lsp" and "~/app/pas/pas-trans.lsp".
36 See "~/app/pas/pas-syns.lsp".
37 See "~/app/pas/pas-syns.lsp".


The entry tells the system to translate the "co" data obtained from any LQP that is a member of the discstaticset LQP set. The entry for this LQP set and the LQP attribute "co" is as follows 38:

(discstaticset discstatic co compnames-syntb3)

This entry tells the system to look up the table "compnames-syntb3" (fig. 5.1) to perform the translation. Thus, if data is obtained from the LQP, the name "REUTERS HOLDINGS PCL" will be translated to "REUTERS".

(local-syntb compnames-syntb3
  ("REUTERS" "REUTERS HOLDINGS PCL")
  ("HONDA MOTOR CO" "HONDA MOTOR CO LTD")
  ......... )

Figure 5.1: compnames-syntb3 Table

5.2.6 datln_schema Translation Table

This table maps the LQP attributes present in the Dataline attributes catalogue to the real attribute names used in the database. Such an ISQL table is not used by any other LQP.

For Dataline, we faced the following problem: Dataline supports multiple-word attribute names (e.g., FIRST NAME). The way the attributes catalogues are built does not allow a multiple-word KOREL attribute name 39; hence, single-word attribute names were created for the attributes catalogue. The ISQL table "datln_schema" is the mapping tool between the LQP level names and the real ones.

5.3 Data Conversion Facility

The CIS/TK data conversion facility is a general purpose semantics mapping tool [Rig90]. It is based on the assumption that every piece of data can be classified within a given set of data types. It models the representation of a data type by a set of representation facets, and carries out intra-data-type representation translation functions. The module comes with a set of data types and translation procedures to convert from one representation to another.

38 See "~/app/pas/pas-syns.lsp".
39 In the case of Dataline, where one does not query the database based on attribute names but on report names and categories, the real attribute names are not useful at first glance; they are, however, used by the data reader LQP routine to know which entries in the formatted output need to be extracted, hence it needs the real names.


5.3.1 Standardization of Data Elements: Data Types

We use a representation scheme that classifies every data element by its type-class. In CIS/TK, a data type class has been modeled as having different components, each one with a particular representation 40. It is important to notice that the conversion facility does not impose either the number of data types or the number of components of a data type. In addition, a new data type can be added to the system at any time 41.

All data types currently supported by CIS/TK are implemented as objects and thus stored in a common hierarchy whose parent is the "data-type" object class (fig. 5.2). Let's take a closer look at these concepts using an example.

The data type and representation taxonomy is implemented through objects (see Figures 5.2 and 5.3). A facet object's arguments slot gives the list of the arguments that the translation procedures for that facet need. The leaves of the object hierarchy constitute the interface of the data translator to the exterior. Upon reception of data from the remote system, data translations are performed data type by data type and, for each data type, facet by facet.

5.3.2 Data Type Description: Example

Let's analyze a data type description:

('figure 'FIGURE-PLAIN 'format)

('figure '(get-unit-disc IN) 'unit)

('figure 'SCALE-1000 'scale)

In this example, we observe that the "figure" data type has three components that fully define the type description:

- "format" describes the way the figure is written (with commas, dots, etc.);
- "unit" gives the money unit of the figure; and
- "scale" gives the scale of the amount.

The symbols "FIGURE-PLAIN" or "SCALE-1000" are particular data type component representations. For example, "SCALE-1000" represents a scale of a thousand.

40 For example, the data type "figure" is composed of "format", "unit" and "scale".
41 Assuming that the necessary conversion routines are written and loaded in the system. This consideration is very important when adding to CIS/TK a new LQP that has a data type not supported at that moment by the system.


data-type (get-components)
  figure
    figure-format (convert): figure-plain, figure-float, figure-disc, figure-datln
    figure-unit (convert): usd, jpy, gbp, gbpe, frf
    figure-scale (convert): scale-1, scale-1000, scale-0.01
  year
    year-format (convert): year-yyyy, year-yy
  date
    date-format (convert): date-mm/dd/yyyy, date-dd-mm-yy, date-ddsmmsyy, date-yyyymmdd, date-month-day
  name

Figure 5.2: data-type Objects Hierarchy

5.3.3 Data Type and Data Type Components Objects Hierarchy

Data types and data type components are hosted inside the system as separate hierarchies of KOREL objects 42. This way, when a piece of data must be converted, let's say from "SCALE-1000" to "SCALE-1", a message is sent to the object "SCALE-1"; this message includes the piece of data to be converted and the information about the current semantics of that piece of data. In our scale conversion example, the message would specify that the current scale is "SCALE-1000" 43.
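The scale conversion just described, where the target representation object receives the value together with its current representation, might look like the following Python sketch. The ScaleFacet class, its convert method and the SCALE_FACTORS table are hypothetical names introduced for the illustration; only the representation symbols "SCALE-1" and "SCALE-1000" come from the text.

```python
# Multiplier that turns a value expressed in a given scale into plain units.
SCALE_FACTORS = {"SCALE-1": 1.0, "SCALE-1000": 1000.0}


class ScaleFacet:
    """A target scale representation that can receive a conversion message."""

    def __init__(self, name):
        self.name = name
        self.factor = SCALE_FACTORS[name]

    def convert(self, value, current_representation):
        """Return `value`, currently expressed in `current_representation`, in this scale."""
        current_factor = SCALE_FACTORS[current_representation]
        return value * current_factor / self.factor


scale_1 = ScaleFacet("SCALE-1")
scale_1000 = ScaleFacet("SCALE-1000")

# 12.5 (in thousands) sent to SCALE-1; 2500 (in units) sent to SCALE-1000.
in_units = scale_1.convert(12.5, "SCALE-1000")
in_thousands = scale_1000.convert(2500, "SCALE-1")
```

Addressing the message to the target object means that each representation only needs to know how to produce itself from any declared source representation.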

42 See file "~/lqp/conversion/config/conv-obj.lsp".
43 See also conversion functions.


5.3.4 Variable Data Types

Data types are built so that all data elements belonging to a given data type have the same representation. In some databases, it is not possible to have a constant data type component representation across all data. For example, in the case of Disclosure, company information is given in the currency of the country in which the company was incorporated. Hence, the "unit" component of the "figure" data type in the Disclosure LQP data type catalogue cannot be given a constant representation (e.g., USD). However, the currency can be inferred from the country of incorporation 44. It is important to understand that the component representation is determined at run time, when the data is obtained from the database. In addition, this component representation may differ from one record to another.
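Resolving a variable component representation per record can be sketched as follows. This Python illustration assumes that a component is described either by a constant symbol or by a (function, attribute, ...) pair, in the spirit of ('figure '(get-unit-disc IN) 'unit). The country-to-currency table, the reading of "IN" as the country-of-incorporation attribute, and the sample records are all assumptions made for the example.

```python
# Hypothetical country-to-currency table; the real mapping is not given in the text.
CURRENCY_BY_COUNTRY = {"USA": "USD", "JAPAN": "JPY", "UK": "GBP", "FRANCE": "FRF"}


def get_unit_disc(country_of_incorporation):
    """Infer the money unit of a record from its country of incorporation."""
    return CURRENCY_BY_COUNTRY[country_of_incorporation]


# A component is either a constant representation or (function, attribute names...).
FIGURE_DESCRIPTION = {
    "format": "FIGURE-PLAIN",
    "unit": (get_unit_disc, "IN"),  # "IN" is assumed to hold the country
    "scale": "SCALE-1000",
}


def resolve_components(description, record):
    """Return the concrete representation of every component for one record."""
    resolved = {}
    for component, spec in description.items():
        if isinstance(spec, tuple):
            function, *attributes = spec
            resolved[component] = function(*(record[a] for a in attributes))
        else:
            resolved[component] = spec
    return resolved


records = [
    {"co": "REUTERS", "IN": "UK", "sales": 950},
    {"co": "HONDA MOTOR CO", "IN": "JAPAN", "sales": 300},
]
units = [resolve_components(FIGURE_DESCRIPTION, r)["unit"] for r in records]
```

The constant components resolve to the same symbol for every record, while "unit" is recomputed each time, which is the per-record variation the text describes.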

catalogue
  methods: get-data-type, get-data-type-semantics, get-attributes-of-data-type,
           get-attributes-lqp-name, get-attributes-catalogue
  subordinates:
    aux-table-data-type-catalogue
    figure-unit-catalogue
    all-name-data-type-catalogue

Figure 5.3 Catalogues Objects Hierarchy

5.3.5 Catalogues

The system catalogue is the place to store the metadata (data about data). It contains information concerning the various objects that the system must know about. Examples of such objects are tables, views, indexes, etc. [Dat86]. We use this idea to support the storage of the metadata generated in the previous section. Here, the catalogue structure is presented (fig. 5.4).

All catalogues 45 (attributes and data types) are implemented as KOREL objects. They constitute a separate hierarchy whose top level object is "catalogue" (fig. 5.3), so that both kinds of catalogue share some common methods.

In the data-type-catalogue, the slot "data-types" contains a list of all the data types that can be found in the database, such as (figure date year name). There is a slot for each data type to describe its semantics in that table.

44 The notation in this particular situation, (<lisp-function-name> attribute1, attribute2, ... attributeN), gives the Lisp function that determines the component representation from the values.
45 The code for its creation is in "~/lqp/conversion/config/catalogues.lsp".


catalogue
  methods:
    (multiple-value-f t)
    (value (get-data-type get-database-data-types)
           (get-data-type-semantics get-data-type-semantics)
           (get-attributes-of-data-type get-attributes-of-data-type)
           (get-attribute-lqp-name get-attribute-lqp-name)
           (get-attributes-catalogue get-attributes-catalogue)
           (get-lqp-attribute-name get-lqp-attribute-name))
  subordinates:
    (value aux-table-data-type-catalogue all-name-data-type-catalogue
           figure-unit-catalogue)

Figure 5.4: CATALOGUE Class Object (top of the catalogues hierarchy)

Every LQP and the GQP must have an attributes catalogue attached to it. Some databases have one single data type catalogue (and not one per table). In the case of the "disclosure" or "dataline" LQP, the LQP instances attached to the tables in that database share the same data type catalogue. For example, the "disc-timeseries" and the "disc-static" LQP instances share the same data type catalogue, which is the "disc-data-type-catalogue"; this is possible because both tables have indeed the same data semantics. Notice that this is not necessarily true in other databases, in which case a separate data type catalogue would be created for each table (e.g., a first table with the date type expressed as "dd/mm/yyyy" and a second table with that same type expressed as "mm/dd/yyyy").

The data semantics attached to an object can be changed by changing the values in the data type catalogue of that object. For example, if you want to have the money amount in pesetas at the GQP level, go to the KOREL object "global-data-type-catalogue" (declared in the file /gqp/gqp-cat.lsp) and change the slot "figure" facet "unit" from USD to PST.

5.3.5.1 Data Type Catalogues

A data type catalogue describes the data type semantics for a given entity (GQP or LQP). For example, the data type catalogue indicates that dates are expressed, in a given entity, in the format CCYYMMDD or that money amounts are in dollars. CIS/TK has the following kinds of data type catalogues:

- LQP(s) data type catalogue describes the data semantics (formatconventions) of each data type present in the table (local system) to which the LQP islogically attached;

- GQP data type catalogue: the KOREL object that represents the data type catalogue for the GQP. It indicates the data semantics chosen by the GQP for reporting or joining purposes. To change the data semantics attached to the GQP, only the values in its attributes catalogue need to be changed; for example, if you want to have the money amount in pesetas at the GQP level, go to the KOREL object⁴⁶ and change the slot "figure", facet "unit", from USD to PST. This catalogue is passed as an argument when sending a message to an LQP set. As we saw earlier in the discussion, a local system may not have a uniform data format across the same type; the data translator deals with this issue by exploiting the fact that data semantics can be inferred from other data;

- global schema (GS) data types: the GS has neither an attributes catalogue nor a data type catalogue. The GS gives the data type of its global attributes implicitly: the data type of each global attribute is the data type of the local attribute(s) it is mapped to, provided that all these local attributes have the same data type. That constitutes an intrinsic assumption of the CIS/TK design;

- auxiliary table data type catalogue: describes the data type representation used by a local auxiliary table. All the auxiliary table LQPs share the same data type catalogue.
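The rule for global schema data types (a global attribute takes the data type of its local attributes, provided they all agree) can be sketched as a small check. The function name, the dictionary layout and the data type names "name" and "date" are assumptions for illustration; the table and column names are borrowed from the PAS sample query in Appendix 3.

```python
def global_attribute_type(local_attributes, attributes_catalogues):
    """Infer a global attribute's data type from the local attributes it maps to.

    local_attributes: list of (table, local_attribute) pairs.
    attributes_catalogues: table -> {local_attribute: data_type}.
    Raises ValueError when the local attributes do not share one data type,
    which is the situation the design assumes never happens.
    """
    types = {attributes_catalogues[table][attribute]
             for table, attribute in local_attributes}
    if len(types) != 1:
        raise ValueError(f"local attributes disagree on data type: {sorted(types)}")
    return types.pop()


# Hypothetical attributes catalogues for three tables that supply a company name.
catalogues = {
    "NAMES": {"COMPANYNAME": "name"},
    "DISCSTATIC": {"CO": "name"},
    "companytbl": {"COMPANYNAME": "name", "VISITDAY": "date"},
}

company_name_type = global_attribute_type(
    [("NAMES", "COMPANYNAME"), ("DISCSTATIC", "CO"), ("companytbl", "COMPANYNAME")],
    catalogues,
)
print(company_name_type)
```

A mapping that mixed, say, a "name" column with a "date" column would be rejected by this check rather than silently given one of the two types.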

5.3.5.2 Attributes Catalogue
An attributes catalogue is a list of pairs <local attribute, data type>. Its function is to map each attribute of the entity the catalogue is attached to, to its data type.

Every attributes catalogue is implemented as an instance object, child of the data type catalogue object. This object contains the following slots:

- "primary-key" contains a list of the primary key attributes for the database;
- "attributes" contains a list of all the attributes in the database;
- finally, there is a slot for each attribute holding its data type.

In addition to these slots, an attributes catalogue can have additional slots to store the particular needs of the attributes' data types. In the attributes catalogue for the timeseries table (disc-timeseries-attributes-catalogue), the slot for each attribute has, in addition to the data type attribute value, a "fact" facet that indicates whether the attribute is static or timeseries⁴⁷.

In some tables, some of the attribute slots contain a facet called "aux-attribute". The value stored there is the name of the corresponding attribute in the auxiliary table⁴⁸.
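The slot and facet layout described above can be pictured with the following sketch. It uses nested Python dictionaries in place of a KOREL instance object; the attribute names, data type names and the auxiliary name "in_value" are invented for the example, and only the slot names "primary-key" and "attributes" and the facet names "fact" and "aux-attribute" come from the text.

```python
# Hypothetical rendering of an attributes catalogue: slot -> facet -> value.
disc_timeseries_attributes_catalogue = {
    "primary-key": ["company", "period"],
    "attributes": ["company", "period", "in"],
    "company": {"value": "name", "fact": "static"},
    "period": {"value": "date", "fact": "timeseries"},
    # "in" stands for a column whose name clashes with an SQL keyword, so the
    # auxiliary table stores it under another (made-up) name.
    "in": {"value": "figure", "fact": "timeseries", "aux-attribute": "in_value"},
}


def data_type_of(catalogue, attribute):
    """Return the data type held in the attribute's slot."""
    return catalogue[attribute]["value"]


def auxiliary_name(catalogue, attribute):
    """Return the attribute's name in the auxiliary table, if it differs."""
    return catalogue[attribute].get("aux-attribute", attribute)


def attributes_with_fact(catalogue, fact):
    """List the attributes whose "fact" facet has the given value."""
    return [name for name in catalogue["attributes"]
            if catalogue[name].get("fact") == fact]


print(data_type_of(disc_timeseries_attributes_catalogue, "period"))
print(auxiliary_name(disc_timeseries_attributes_catalogue, "in"))
print(attributes_with_fact(disc_timeseries_attributes_catalogue, "timeseries"))
```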

⁴⁶ Declared in the file "~/gqp/gqp-cat.lsp".
⁴⁷ This information is used by the data file reader functions located in the "~/lqp/databases/ipsharp/disclosure/disc-read-d2.lsp" file.
⁴⁸ This scheme of naming one single attribute with two different names (one name in the real table and another in the auxiliary one) was implemented in CIS/TK to deal with situations when an attribute name in a database is a keyword in SQL. For example, the string "IN" is a reserved key word in the

The attributes catalogue must contain all the attributes attached to a given entity. If an attributes catalogue is "missing" an attribute, the following two KOREL instructions are required in the object creation:

('attributes <new-attribute-name>)
('<new-attribute-name> <data-type-for-new-attribute-name>)

The data type of the new attribute has to be chosen from the set of data types supported by the data type catalogue of that entity. If it is not in that set, the work should start by adding the new data type to the data type catalogue:

('data-type '<new-data-type>)

And, for each component of the data type semantics:

('<new-data-type> <component-1> <representation-of-component-1>)
...
('<new-data-type> <component-n> <representation-of-component-n>)

If the data type of the attribute is not supported by CIS/TK, then the new data type must first be installed, which requires much more work.
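The order of operations described here (register the data type first, then the attribute) can be sketched as follows. This is an analogy in Python, not KOREL: the catalogues are dictionaries, and the function names, the attribute "revenue" and the component names "unit" and "scale" are assumptions for the example.

```python
def add_data_type(data_type_catalogue, name, components):
    """Register a data type and the representation of each of its components."""
    if name not in data_type_catalogue["data-type"]:
        data_type_catalogue["data-type"].append(name)
    data_type_catalogue[name] = dict(components)


def add_attribute(attributes_catalogue, data_type_catalogue, name, data_type):
    """Add a missing attribute, refusing data types the catalogue does not know."""
    if data_type not in data_type_catalogue["data-type"]:
        raise ValueError(f"data type {data_type!r} is not in the data type catalogue")
    if name not in attributes_catalogue["attributes"]:
        attributes_catalogue["attributes"].append(name)
    attributes_catalogue[name] = data_type


# Hypothetical starting catalogues for one entity.
data_types = {
    "data-type": ["name", "date"],
    "name": {"case": "upper"},
    "date": {"format": "CCYYMMDD"},
}
attributes = {"primary-key": ["company"], "attributes": ["company"], "company": "name"}

# Adding the attribute first fails, because "figure" is not yet a known data type.
try:
    add_attribute(attributes, data_types, "revenue", "figure")
    rejected = False
except ValueError:
    rejected = True

# Registering the data type first makes the same call succeed.
add_data_type(data_types, "figure", {"unit": "USD", "scale": "1000"})
add_attribute(attributes, data_types, "revenue", "figure")
print(rejected, attributes["revenue"])
```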

6- Conclusions

CIS/TK is a prototype HDBMS developed at the Composite Information Systems Laboratory; it is our implementation strategy to achieve information systems connectivity in a decentralized organization [BPG87, MaW88a, MaW88b].

The system, built as a general purpose prototype for information systems integration, provides physical and logical connectivity among disparate information systems.

Throughout the paper, we covered its major design and implementation issues, with emphasis on its data model, its querying capabilities and its OOP and knowledge-based environment. We also described the Placement Assistant System (PAS), the application built to demonstrate the feasibility of our approach.

6.1 Assumptions
Nowadays, we are faced with the problem of integrating information systems that have been developed independently. CIS/TK offers a solution to the problem of providing connectivity among such information systems. As an integration engine, CIS/TK makes it feasible to access, retrieve and integrate information residing in disparate and geographically distributed information systems. The design was undertaken based on the following assumptions:

- non-intrusive approach to data integration, to address the need to explicitly allow for the coexistence and usage of a variety of components (different DBMSs, models and applications);

- access to and retrieval of data from any information system, regardless of its DBMS and host type; flexibility is the key idea behind this assumption;

- centralized integration engine, allowing for the development of multiple intra-organizational subsystems while still being able to combine all the sources of information available within an organization;

- support for structural and semantic integration as the basis for the development of a powerful data model that can represent and resolve heterogeneities from both fields;

- support for different views of the underlying systems, which gives us the flexibility to serve multiple purposes. Moreover, new types of information and relationships can be added to the database easily as the component databases evolve or increase;

- modularity and extensibility, to respond to changes in the different environments that CIS/TK aims to integrate;

- OOP and knowledge-based approach to data integration.

6.2 Design and Implementation
Guided by these principles, we modeled, designed and implemented the following tools for CIS/TK:

GLOBAL DATA MODEL:
- structural data model: the global schema, designed as an extended ERM, and the schema definition language to represent an integrated view of the underlying databases;
- semantic data model: the data translation facilities, mainly translation tables and an object-based data representation module to model data semantic disparities;

DATA MANIPULATION LANGUAGE: development of a standard global data manipulation language (GRL) to query the global schema;

MAPPINGS TO THE LOCAL SYSTEMS: the mappings from the global schema to the local schemas (component systems) are done through the relational logical tables;

QUERY PROCESSING: queries submitted against the global schema are routed to the appropriate sources by the GQP. The access plan created by the router retrieves the information by accessing the minimum number of sources;

KNOWLEDGE-BASED PROCESSING TECHNIQUES: the LQPs are the modules that encapsulate the knowledge necessary to access and query the remote systems. The LQP is the module where the knowledge about the local DBMS, local query language and local host machine is stored. Based on that knowledge, it translates (syntax and semantics) a global query into a query executable at the remote site;

COMMUNICATION SERVICES: the network services are provided by the Communication Server. It manages and controls the routines for connection, access and disconnection to the remote databases, the communication protocols and the data communications services;

DATA CACHING: data caching is done and managed by the LQPM module. Data caching is done in CIS/TK for two different reasons: first, for long-term storage of information obtained in queries to the remote systems; second, as a mechanism for run-time storage of the part of a query (predicate) that cannot be processed due to constraints on the remote system's DQL;

SOURCE SELECTION: when the same data can be obtained from different sources, the user/system administrator has the option to select which one to use;

KOREL: OOP environment suited to the particular needs of CIS/TK.
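The second use of caching listed above, holding data at run time so that a predicate the remote system cannot evaluate is applied locally, can be sketched as follows. This is one plausible reading of the mechanism rather than the LQPM implementation: predicates are modeled as (operator, column, value) triples, the remote system is simulated by a function, and every name in the block is hypothetical.

```python
import operator

OPERATORS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def matches(row, predicate):
    """Evaluate one (operator, column, value) predicate against a row."""
    op, column, value = predicate
    return OPERATORS[op](row[column], value)


def query_with_local_fallback(fetch, predicates, remote_operators):
    """Send the supported predicates to the remote system, apply the rest locally."""
    remote = [p for p in predicates if p[0] in remote_operators]
    deferred = [p for p in predicates if p[0] not in remote_operators]
    cached_rows = fetch(remote)  # rows held locally at run time
    return [row for row in cached_rows
            if all(matches(row, p) for p in deferred)]


# Simulated remote table and a remote system that only understands "=".
TABLE = [
    {"company": "A", "revenue": 120},
    {"company": "B", "revenue": 80},
    {"company": "A", "revenue": 40},
]


def fake_remote_fetch(predicates):
    return [row for row in TABLE if all(matches(row, p) for p in predicates)]


result = query_with_local_fallback(
    fake_remote_fetch,
    [("=", "company", "A"), (">", "revenue", 100)],
    remote_operators={"="},
)
print(result)
```

The trade-off is that more rows cross the network than the final answer needs, in exchange for not depending on the expressiveness of the remote query language.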

6.3 Implementation Independent Evaluation
Thus far, we have evaluated CIS/TK from a design and an implementation point of view. Now we turn to a different perspective that gives us an evaluation of our integration engine from an implementation-independent point of view.

Heiler, Siegel, and Zdonik have defined integration as the means for combining or interfacing data and functions from different systems into a cohesive set. From their point of view, the goals of integration are to provide access to information that is stored in different forms and managed by different systems.

According to these definitions, they describe the aspects of information systems integration (ISI) as a multidimensional space, in which each axis represents some aspect in which heterogeneities are hidden or visible.

The parameters identify the types of heterogeneity that are addressed by different integration frameworks. They are implementation-independent parameters; that is to say, no consideration is given to global data model, manipulation language or network services. Each variable represents one aspect of information integration, and its range varies from transparent (no integration capabilities for that parameter) to hidden (total integration for that parameter: all the heterogeneities of that parameter are hidden from the user's point of view). The position of a given integration system with respect to these ISI axes indicates what services can or cannot be provided by that system.

The five variables they proposed for evaluating integration engines, and the position of our system in the integration space, are as follows:

(1) DATA MODEL TRANSPARENCY
this parameter looks at how terms in one model are mapped to the terms in the other model. The CIS/TK approach is to automatically provide a global model with mappings from each of the heterogeneous data models to our single ERM. Thus, our system is positioned in the hidden region of the data model axis;

(2) SCHEMA TRANSPARENCY
schema transparency measures the heterogeneity in the structure of semantics of data elements. CIS/TK automatically generates and invokes the mappings that, for example, allow the user to perform arithmetic operations on numeric data that are stored in different units. Thus, the system is in the hidden region of this axis;

(3) SOURCE TRANSPARENCY
this parameter measures the degree to which invocation and control of execution, and the steps needed to capture results, affect the user. In CIS/TK all those procedures are hidden from the user; thus, the system is in the hidden region of this axis;

(4) DEPENDENCY/CONSTRAINT TRANSPARENCY
n/a (no transactions, updates, ... capabilities were provided for CIS/TK);

(5) TIME/CONTEXT TRANSPARENCY
this parameter measures how well the time/context property of data is hidden. CIS/TK hides it by providing automatic selection of the correct version based on user status.

We have seen that all the ISI parameters that apply to CIS/TK fall in the hidden region of their axes. Therefore, from the user's point of view, CIS/TK is a transparent ISI tool.

6.4 Future Work
CIS/TK was tested in our Laboratory running the PAS application. It has also been used by organizations as an integration engine running their own applications. Drawing on this experience, we now address some issues that, with further research, would enhance the integration capabilities and efficiency of the system.

With respect to the data model, we designed it as having two different parts, a structural model and a semantic model. That scheme allows us to separate, for the component systems, system-specific knowledge from facts about the data contained in those systems. The advantage of this approach is that we can use the same procedures to access two different databases if they share a common DBMS and/or host type. Besides that, the strong support of our model for data semantics promotes data representation standards in a decentralized organization.

Also, the CIS/TK ontology, defined by the application's entities and attributes, assumes a homogeneous data type representation for a given concept across databases, which limits inferencing capabilities for IBDII (only intra-datatype 1:1 translations can be performed).

Another structural limitation of the model is that there is no tool for automatic generation of the global schema; this can be a problem when integrating a large number of systems.

With respect to querying capabilities, the optimization of source access is now based on an algorithm that generates an access plan with a single criterion: to access the minimum number of databases.
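The paper does not spell out the router's algorithm, so the following is only a sketch of what "access the minimum number of databases" can mean: an exhaustive search for the smallest group of sources that together supply every requested attribute. The function name, the source names and the attribute sets are hypothetical, and the search is exponential in the number of sources, which is acceptable only for a handful of databases.

```python
from itertools import combinations


def minimum_source_plan(requested, sources):
    """Return the smallest list of sources covering all requested attributes.

    sources maps a source name to the set of attributes it can supply.
    Returns None when the requested attributes cannot all be obtained.
    """
    requested = set(requested)
    if not requested:
        return []
    names = sorted(sources)
    # Try plans of increasing size so the first hit uses the fewest sources.
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            covered = set().union(*(sources[name] for name in combo))
            if requested <= covered:
                return list(combo)
    return None


# Hypothetical sources and the attributes each one can supply.
SOURCES = {
    "recruit": {"name", "position", "visiting-date"},
    "disc-static": {"name", "address"},
    "datln-code": {"name"},
}

print(minimum_source_plan({"name", "position"}, SOURCES))
print(minimum_source_plan({"address", "position"}, SOURCES))
print(minimum_source_plan({"telephone"}, SOURCES))
```

A single criterion of this kind ignores access cost, response time and data quality, which is why it is listed here as an area for further work.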

With respect to extensibility, every time the system needs to access a new database, a new LQP must be written. We need to develop tools for the acquisition of system knowledge such that LQP generation could be automated.

Bibliography

[Bey90] Beyon, D., "Information and Data Modelling." Blackwell Scientific Publications, 1990.

[BaL84] Batini, C., and Lenzerini, M., "A Methodology for Data Schema Integration in the Entity Relationship Model." IEEE Transactions on Software Engineering, Vol. SE-10, No. 4, November 1984.

[BLN86] Batini, C., Lenzerini, M., and Navathe, S., "A Comparative Analysis of Methodologies for Database Schema Integration." ACM Computing Surveys, Vol. 18, No. 4, December 1986.

[BPG87] Bhalla, S., Prasad, B., Gupta, A., and Madnick, S., "A Framework and Comparative Study of Heterogeneous Distributed Database Management Systems." in Knowledge-Based Integrated Information System Engineering (KBIISE) Project, Volume 3. Ed. Amar Gupta and Stuart Madnick. Cambridge: Massachusetts Institute of Technology, 1987.

[Cha88] Champlin, A., "Interfacing Multiple Remote Databases in an Object-Oriented Framework." B. S. Thesis, Massachusetts Institute of Technology, 1988.

[Che76] Chen, P., "The Entity-Relationship Model: Toward a Unified View of Data." ACM Transactions on Database Systems, 1(1), 1976.

[Che81] Chen, P., "Entity-Relationship Approach to Information Modelling and Analysis." Elsevier Science Publishers, 1981.

[Dat86] Date, C. J., "An Introduction to Database Systems." Volume I, Fourth Edition. Reading, Massachusetts: Addison-Wesley, 1986.

[Ger89] Gerber, H., "Optimizing Information Retrieval for Disparate Menu Driven Database Systems." B. S. Thesis, Massachusetts Institute of Technology, 1989.

[Gre87] Gref, W., "Distributed Homogeneous Database Systems: A Comparison Between Oracle and Ingres." in Knowledge-Based Integrated Information System Engineering (KBIISE) Project, Volume 3. Ed. Amar Gupta and Stuart Madnick. Cambridge: Massachusetts Institute of Technology, 1987.

[Hor88] Horton, D. C., "An Object Oriented Approach towards Enhancing Logical Connectivity in a Distributed Database Environment." M. S. Thesis, Massachusetts Institute of Technology, 1988.

[KaS90] Kao, R., and Sim, T., "An Enhanced Architecture for Composite Information Systems." B. S. Thesis, Massachusetts Institute of Technology, 1990.

[LaR85] Landers, T., and Rosenberg, R., "An Overview of Multibase." Computer Corporation of America, 575 Technology Square, Cambridge, MA 02139, U.S.A., 1985.

[MaW88a] Madnick, S., and Wang, R., "A Framework of Composite Information Systems for Strategic Advantage." in Connectivity Among Information Systems, Volume I. Ed. Richard Wang and Stuart Madnick. Cambridge: Massachusetts Institute of Technology, 1988.

[MaW88b] Madnick, S., and Wang, R., "Evolution Towards Strategic Applications of Databases Through Composite Information Systems." in Connectivity Among Information Systems, Volume I. Ed. Richard Wang and Stuart Madnick. Cambridge: Massachusetts Institute of Technology, 1988.

[Mak90] Makatiani, B., "Enhancement of the Communication Server in the Composite Information System Tool Kit." B. S. Thesis, Massachusetts Institute of Technology, 1990.

[Rig90] Rigaldies, B., "Technologies and Policies for the Development of Composite Information Systems in Decentralized Organizations." CISL Working Paper 90-10, Massachusetts Institute of Technology, 1990.

[Ros81] Ross, R., "Data Dictionaries and Data Administration: Concepts and Practices for Data Resource Management." New York: AMACOM, 1981.

[RuS90] Rumble, J., and Smith, F., "Database Systems in Science and Engineering." Adam Hilger, 1990.

[SiM89] Siegel, M., and Madnick, S., "Identification and Reconciliation of Semantic Conflicts Using Metadata." CISL Working Paper No. 89-09, Massachusetts Institute of Technology, 1989.

[SMG91] Siegel, M., Madnick, S., and Gupta, A., "Composite Information Systems: Resolving Semantic Heterogeneities." CISL Working Paper No. 91-21, Massachusetts Institute of Technology, 1991.

[SyT82] Symons, C., and Tijmas, P., "A Systematic and Practical Approach to the Definition of Data." The Computer Journal, British Computer Society, Vol. 25, No. 4, 1982.

[Teo90] Teorey, T., "Database Modelling and Design. The Entity-Relationship Approach." Morgan Kaufmann Publishers, Inc., 1990.

[Tun90] Tung, M., "Local Query Processor Manager for the Composite Information System / Tool Kit." CISL Working Paper 90-18, Massachusetts Institute of Technology, 1990.

[Wie87] Wiederhold, G., "KSYS: An Architecture for Integrating Databases and Knowledge Bases." 1987.

[Wo189] Won, T., "Data Connectivity for the Composite Information Systems / Tool Kit." B. S. Thesis, Massachusetts Institute of Technology, 1989.

[Wo290] Won, K., "Introduction to Object-Oriented Databases." The MIT Press, 1990.

Appendix No. 1:

CIS/TK Tables

ALUMNI DATABASE

-TABLE DRIVERS (on mit2e, alumnidb):
ALUMNITB
SICCODETB
POSITIONTB
SICNUMTB
COUNTRYTB

PORTFOLIO DATABASE

-TABLE DRIVERS (on mit2e, portfoliodb):
CISTKPORTF

RECRUIT DATABASE

-REMOTE TABLE DRIVERS (instances of 'ORACLE-RT):
"COMPANYTBL"

-PSEUDO TABLE DRIVERS (on mit2e, recruitdb):
"PSEUDOCOMPTB"

I. P. SHARP DATABASE

-TABLE DRIVERS:
DISC TIMESERIES
DISC STATIC
IPCURRENCY

-AUXILIARY TABLE DRIVERS (on mit2e, discdb and auxiliary):
AUX TIMESERIES
AUX STATIC
CURRAUX

-PSEUDO TABLE DRIVERS (on mit2e, discdb and currencydb):
PSEUDOTIMESERIES
PSEUDO STATIC
PSEUDO CURRENCY

FINSBURY DATALINE/CODE DATABASE

-REMOTE TABLE DRIVERS (instances of 'DATALINE):
DATLN INCOME : to access the income statement
DATLN BALANCE : to access the balance statement
DATLNFINANCING : to access the financing table statement
DATLN-RATIOS : to access the financial ratios statement

-REMOTE TABLE DRIVER (instance of 'FINSBURY-CODE):
DATLNCODE : to access the code

-AUXILIARY TABLE DRIVERS (on mit2e, datalinedb):
AUXINCOME : to access the auxiliary table for the income statement
AUXBALANCE : to access the auxiliary table for the balance sheet
AUXFINANCING : to access the auxiliary table for the financing table
AUXRATIOS : to access the auxiliary table for the financial ratios

-PSEUDO TABLE DRIVERS (on mit2e, datalinedb):
PSEUDOINCOME : to access the pseudo table for the income statement
PSEUDOBALANCE : to access the pseudo table for the balance sheet
PSEUDOFINANCING : to access the pseudo table for the financing table
PSEUDO RATIOS : to access the pseudo table for the financial ratios
PSEUDONAMES : to access the pseudo table for the dataline code

SYSTEM TABLES

LQP MANAGER TABLES (on mit2e, lqpmdb):
ACTIVE LQP
INIT ACTIVE
INIT STATUS
LQPSTATUS

APPLICATION SYNONYM AND TRANSLATION TABLES (USED BY THE PAS)

SYNONYM TABLES FOR THE PAS:
GLOBAL SYNONYM TABLE
LOCAL SYNONYM TABLE FOR RECRUIT
LOCAL SYNONYM TABLE FOR ALUMNI
LOCAL SYNONYM TABLE FOR I. P. SHARP
LOCAL SYNONYM TABLE FOR FINSBURY
LOCAL SYNONYM TABLE FOR ZIPCODES

TRANSLATION TABLE FOR THE PAS

TRANS-TB

SEMANTICS MAPPING FACILITY TABLES:

DATA TYPE AND ATTRIBUTES CATALOGUES

- GQP:
GLOBAL DATA-TYPE CATALOGUE

- I. P. Sharp:
DISC-DATA-TYPE-CATALOGUE

DISC-ATTRIBUTES-CATALOGUE
DISC-TIMESERIES-ATTRIBUTES-CATALOGUE
DISC-STATIC-ATTRIBUTES-CATALOGUE
DISC-AUX-DATA-TYPE-CATALOGUE
DISC-AUX-ATTRIBUTES-CATALOGUE
AUX-DISC-TIMESERIES-ATTRIBUTES-CATALOGUE
AUX-DISC-STATIC-ATTRIBUTES-CATALOGUE

- Finsbury dataline:
DATLN-DATA-TYPE-CATALOGUE
DATLN-ATTRIBUTES-CATALOGUE

- Finsbury code:
DATLN-DATA-TYPE-CATALOGUE (same as above)
DATLN-NAMES-ATTRIBUTES-CATALOGUE
DATLN-INCOME-ATTRIBUTES-CATALOGUE
DATLN-BALANCE-ATTRIBUTES-CATALOGUE
DATLN-FINANCING-ATTRIBUTES-CATALOGUE
DATLN-RATIOS-ATTRIBUTES-CATALOGUE
COLUMNS-ATT-ATTRIBUTES-CATALOGUE
AUX-DATLN-INCOME-ATTRIBUTES-CATALOGUE
AUX-DATLN-BALANCE-ATTRIBUTES-CATALOGUE
AUX-DATLN-FINANCING-ATTRIBUTES-CATALOGUE
AUX-DATLN-RATIOS-ATTRIBUTES-CATALOGUE
DATLNSCHEMA (mapping local one-word to remote multiple-word att.)

- I. P. Sharp Currency:
CURRENCY-DATA-TYPE-CATALOGUE
CURRENCY-ATTRIBUTES-CATALOGUE
AUX-CURRENCY-ATTRIBUTES-CATALOGUE

- Recruit:
RECRUIT-DATA-TYPE-CATALOGUE
RECRUIT-ATTRIBUTES-CATALOGUE

- Alumni:
ALUMNITB-ATTRIBUTES-CATALOGUE
COUNTRYTB-ATTRIBUTES-CATALOGUE
POSITIONTB-ATTRIBUTES-CATALOGUE
SICCODETB-ATTRIBUTES-CATALOGUE
SICNUMTB-ATTRIBUTES-CATALOGUE

- Portfolio:
PORTFOLIODB-ATTRIBUTES-CATALOGUE

Appendix No. 2:

Global Schema for the PAS Application

[Figure: Global Schema, Entity-Relation Representation (global schema and underlying data structure). The diagram shows the global entities, including ALUMNI, FINANCE, POSITION and PORTFOLIO, with their attributes; each attribute is annotated with the codes of the local tables that supply it.]

ALUMNI attributes and their source tables: first-name (A/A), last-name (A/A), degree (A/A), address-1 (A/A), address-2 (A/A), address-3 (A/A), zip-code (A/A), country-name (A/C), company (A/A), position (A/P), position-code (A/A, A/P), country-code (A/A, A/C), sequence-num (A/A, A/N), sic-code (A/N).

Local Databases and Tables:

A: ALUMNI
A/A: ALUMNITB
A/C: COUNTRYTB
A/N: SICNUMTB
A/P: POSITIONTB
A/S: SICCODETB

F: FINSBURY
F/B: BALANCE
F/E: NAME
F/M: INCOME

I: I. P. SHARP
I/I: DISC-STATIC
I/D: DISC-TIMESERIES
I/Y: IP-CURRENCY

P: PORTFOLIO
P/K: CISTKPORTF

R: RECRUIT
R/I: COMPANYTBL

Appendix No. 3:

Sample Query

DRIBBLE FOR THE FIRST QUERY (QU1)

This is query one: find the recruiting data of companies from the "automotive" industry that have Sloan alumni.

The query deals with three entities and several tables, defined in the various databases it will go to in order to get the pieces of data needed.

    ENTITY     LQP-SET                     TABLE                                  LQP/COMMENT

(1) industry   siccodetbset                siccodetb (local database)             siccodetb-mit2e / all info comes from one table.
(2) alumni     sicnumtbset & alumnitbset   sicnumtb & alumnitb (local database)   sicnumtb-mit2e & alumnitb-mit2e / here the info comes from 2 tables so the data is "merged".
(3) company    recruitset                  companytbl (Oracle-RT database)        recruit-ibmrt / all info comes from one table.
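The routing information summarized in this table is what the parser hands back as two mappings: schema to locations, and locations to schema. The sketch below shows how the second can be derived from the first. It is an illustration in Python rather than the LISP used by CIS/TK; the entries are copied from the parser output of this query as printed in the dribble, while the dictionary layout and the function name `invert_mapping` are assumptions.

```python
# Schema-to-location mapping: entity -> attribute -> list of
# (LQP set, table, local column). Entries follow the dribble for query one.
SCHEMA_TO_LOCATION = {
    "COMPANY": {
        "VISITINGDATE": [("RECRUITSET", "companytbl", "VISITDAY")],
        "POSITION": [("RECRUITSET", "companytbl", "POSITION")],
        "NAME": [
            ("DATLNCODESET", "NAMES", "COMPANYNAME"),
            ("DISCSTATICSET", "DISCSTATIC", "CO"),
            ("RECRUITSET", "companytbl", "COMPANYNAME"),
        ],
    },
    "INDUSTRY": {
        "SICCODE": [("SICCODETBSET", "SICCODETB", "SICCODE")],
        "INDUSTRY-NAME": [("SICCODETBSET", "SICCODETB", "INDUSTRY")],
    },
}


def invert_mapping(schema_to_location):
    """Build LQP set -> table -> local column -> (entity, attribute)."""
    location_to_schema = {}
    for entity, attributes in schema_to_location.items():
        for attribute, locations in attributes.items():
            for lqp_set, table, column in locations:
                tables = location_to_schema.setdefault(lqp_set, {})
                tables.setdefault(table, {})[column] = (entity, attribute)
    return location_to_schema


location_to_schema = invert_mapping(SCHEMA_TO_LOCATION)
print(location_to_schema["RECRUITSET"]["companytbl"])
```

Keeping both directions lets the router go from a global attribute to the tables that hold it, and lets returned columns be renamed back into global terms.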

The following is the LISP output for the functions executed in the query execution.
Here is the query itself in the global query language.

>qu1
(JOIN (SELECT INDUSTRY NIL WHERE (= INDUSTRY-NAME "Automotive"))
      (JOIN (SELECT ALUMNI (COMPANY SICCODE))
            (SELECT COMPANY (POSITION VISITINGDATE))))

Let's run the query...

>(gqp qu1)

1> (PARSER (JOIN (SELECT INDUSTRY NIL WHERE
                   (= INDUSTRY-NAME "Automotive"))
                 (JOIN (SELECT ALUMNI (COMPANY SICCODE))
                       (SELECT COMPANY (POSITION VISITINGDATE)))))

PARSING QUERY...

The parser returns three things.

(1) It has figured out where each sub-query should go and has indicated where synonyms and translations will be applied.

<1 (PARSER (JOIN (GET-TABLE SICCODETB_SET SICCODETB
                   (INDUSTRY SICCODE) WHERE
                   (= INDUSTRY "Automotive"))
                 (JOIN (MERGE (GET-TABLE SICNUMTBSET SICNUMTB
                                (SICCODE SEQUENCENUM))
                              (GET-TABLE ALUMNITBSET ALUMNITB
                                (COMPANYNAME SEQUENCE_NUM)
                                :SYNS
                                ((COMPANYNAME COMPNAMESSYNTB2))
                                :TRANS
                                ((COMPANYNAME STANDARDIZENAME)))
                              ON
                              (= (SICNUMTB_SET SICNUMTB SEQUENCE_NUM)
                                 (ALUMNITBSET ALUMNITB SEQUENCENUM)))
                       (GET-TABLE RECRUITSET "companytbl"
                         (COMPANYNAME POSITION VISITDAY)
                         :SYNS

                         ((COMPANYNAME COMPNAMESSYNTB1))
                         :TRANS
                         ((COMPANYNAME STANDARDIZENAME)))
                       ON (= (ALUMNI COMPANY) (COMPANY NAME)))
                 ON (= (ALUMNI SICCODE) (INDUSTRY SICCODE)))

(2) This is the info. about mappings between the g-schema entities (from the query) and the locations (sets and tables). This one is the schema to locations mapping.

           (TABLE (COMPANY (VISITINGDATE
                             (RECRUITSET "companytbl" VISITDAY))
                           (POSITION
                             (RECRUITSET "companytbl" POSITION))
                           (NAME (DATLNCODESET NAMES COMPANYNAME)
                                 (DISCSTATICSET DISCSTATIC CO)
                                 (RECRUITSET "companytbl"
                                   COMPANYNAME)))
                  (ALUMNI (COMPANY (ALUMNITBSET ALUMNITB
                                     COMPANYNAME))
                          (SIC_CODE (SICNUMTBSET SICNUMTB SICCODE)))
                  (INDUSTRY
                    (SICCODE (SICCODETBSET SICCODETB SICCODE))
                    (INDUSTRY-NAME
                      (SICCODETBSET SICCODETB INDUSTRY))))

(3) This is the locations to schema mapping. The schema here pertains only to those portions relevant to the query.

           (TABLE (DATLNCODESET
                    (NAMES (COMPANYNAME (COMPANY NAME))))
                  (DISC_STATICSET (DISCSTATIC (CO (COMPANY NAME))))
                  (RECRUITSET
                    ("companytbl"
                      (VISITDAY (COMPANY VISITINGDATE))
                      (POSITION (COMPANY POSITION))
                      (COMPANYNAME (COMPANY NAME))))
                  (ALUMNITBSET
                    (ALUMNITB (COMPANYNAME (ALUMNI COMPANY))))
                  (SICNUMTBSET
                    (SICNUMTB (SICCODE (ALUMNI SICCODE))))
                  (SICCODETBSET
                    (SICCODETB (SICCODE (INDUSTRY SICCODE))
                               (INDUSTRY (INDUSTRY INDUSTRY-NAME))))))

Entering the router... which applies sub-queries to component databases.

1> (ROUTER (JOIN (GET-TABLE SICCODETB_SET SICCODETB
                   (INDUSTRY SICCODE) WHERE
                   (= INDUSTRY "Automotive"))
                 (JOIN (MERGE (GET-TABLE SICNUMTBSET SICNUMTB
                                (SICCODE SEQUENCENUM))
                              (GET-TABLE ALUMNITBSET ALUMNITB
                                (COMPANYNAME SEQUENCE_NUM)
                                :SYNS
                                ((COMPANYNAME COMPNAMESSYNTB2))
                                :TRANS
                                ((COMPANYNAME STANDARDIZENAME)))
                              ON
                              (= (SICNUMTBSET SICNUMTB SEQUENCENUM)
                                 (ALUMNITBSET ALUMNITB SEQUENCENUM)))
                       (GET-TABLE RECRUITSET "companytbl"
                         (COMPANYNAME POSITION VISITDAY)
                         :SYNS
                         ((COMPANYNAME COMPNAMESSYNTB1))
                         :TRANS
                         ((COMPANYNAME STANDARDIZE_NAME)))
                       ON (= (ALUMNI COMPANY) (COMPANY NAME)))
                 ON (= (ALUMNI SICCODE) (INDUSTRY SICCODE)))

This part had the mappings, the same info. that we got from the parser.

2> (QUERYROUTER
     (GET-TABLE SICCODETB_SET SICCODETB (INDUSTRY SICCODE) WHERE
       (= INDUSTRY "Automotive")))

routetable receives a sub-query defined over 1 table. Here it is siccodetb_set.

3> (ROUTETABLE
     (GET-TABLE SICCODETB_SET SICCODETB (INDUSTRY SICCODE)
       WHERE (= INDUSTRY "Automotive")))

sendmessage... to siccodetb-object-set

4> (SENDMESSAGE SICCODETBSET :GET-DATA
     GLOBAL-DATA-TYPE-CATALOGUE
     (SICCODETB (INDUSTRY SICCODE)
       (= INDUSTRY "Automotive")))

Entering entity number 1, i.e. industry.

5> (GET-LQPM-SICCODETB-DATA GLOBAL-DATA-TYPE-CATALOGUE
     (SICCODETB (INDUSTRY SICCODE)
       (= INDUSTRY "Automotive")))

First get the selected LQP.

6> (GET-LQP "siccodetbset")
7> (GET-INFORMIX2E-DATA NIL
     (ACTIVE_LQP (SELECTEDLQP)
       (= LQPSET "siccodetb_set")))
8> (MAIN-GET-DATA-WITHOUT-CONVERSION WORKING-ACTIVE
     (ACTIVE_LQP (SELECTED_LQP)
       (= LQP_SET "siccodetbset")))
9> (LQP-FETCH-DATA GET-DATA
     (ACTIVELQP (SELECTED_LQP)
       (= LQP_SET "siccodetb_set")))

1 row(s) unloaded.

Selected LQP has been fetched.

<9 (LQP-FETCH-DATA
     (("SELECTEDLQP") ("siccodetb-mit2e")))
<8 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("SELECTEDLQP") ("siccodetb-mit2e"))
     (("SELECTEDLQP") ("siccodetb-mit2e")))
<7 (GET-INFORMIX2E-DATA
     (("SELECTEDLQP") ("siccodetb-mit2e")))
<6 (GET-LQP SICCODETB-MIT2E 15)

Now get the actual data (industry & sic_code).

6> (GET-INFORMIX2E-DATA GLOBAL-DATA-TYPE-CATALOGUE
     ("siccodetb" (INDUSTRY SICCODE)
       (= INDUSTRY "Automotive"))
     :DOWNLOAD YES)
7> (MAIN-GET-DATA-WITHOUT-CONVERSION SICCODETB-MIT2E
     ("siccodetb" (INDUSTRY SICCODE)
       (= INDUSTRY "Automotive")))
8> (LQP-FETCH-DATA GET-DATA
     ("siccodetb" (INDUSTRY SICCODE)
       (= INDUSTRY "Automotive")))

CONNECTING TO LOCAL INFORMIX2E DATABASE.....

1 row(s) unloaded.


<8 (LQP-FETCH-DATA             ; Data has been fetched.
     (("INDUSTRY" "SIC_CODE") ("Automotive" "371")))
<7 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("INDUSTRY" "SIC_CODE") ("Automotive" "371"))
     (("INDUSTRY" "SIC_CODE") ("Automotive" "371")))
<6 (GET-INFORMIX2E-DATA
     ((:FORMATTED (("INDUSTRY" "SIC_CODE") ("Automotive" "371")))
      (:UNFORMATTED (("INDUSTRY" "SIC_CODE") ("Automotive" "371")))))
<5 (GET-LQPM-SICCODETB-DATA    ; siccodetb-data has been obtained.
     (("INDUSTRY" "SIC_CODE") ("Automotive" "371")))
<4 (SENDMESSAGE
     (("INDUSTRY" "SIC_CODE") ("Automotive" "371")))
<3 (ROUTETABLE
     (((SICCODETB_SET SICCODETB INDUSTRY) (SICCODETB_SET SICCODETB SIC_CODE))
      ("Automotive" "371")))
<2 (QUERYROUTER                ; Back to query-router.
     (((SICCODETB_SET SICCODETB INDUSTRY) (SICCODETB_SET SICCODETB SIC_CODE))
      ("Automotive" "371")))
2> (QUERYROUTER                ; query-router goes out again, but this time the data to be fetched comes from 2 tables, so a "merge" will be used.
     (MERGE (GET-TABLE SICNUMTB_SET SICNUMTB (SIC_CODE SEQUENCE_NUM)
             WHERE (= SIC_CODE "371"))
            (GET-TABLE ALUMNITB_SET ALUMNITB (COMPANY_NAME SEQUENCE_NUM)
             :SYNS ((COMPANY_NAME COMPNAMES_SYNTB2))
             :TRANS ((COMPANY_NAME STANDARDIZE_NAME)))
      ON (= (SICNUMTB_SET SICNUMTB SEQUENCE_NUM)
            (ALUMNITB_SET ALUMNITB SEQUENCE_NUM))))
3> (ROUTEMERGE                 ; routemerge.... getting into entity number two (alumni).
     (MERGE (GET-TABLE SICNUMTB_SET SICNUMTB (SIC_CODE SEQUENCE_NUM)
             WHERE (= SIC_CODE "371"))
            (GET-TABLE ALUMNITB_SET ALUMNITB (COMPANY_NAME SEQUENCE_NUM)
             :SYNS ((COMPANY_NAME COMPNAMES_SYNTB2))
             :TRANS ((COMPANY_NAME STANDARDIZE_NAME)))
      ON (= (SICNUMTB_SET SICNUMTB SEQUENCE_NUM)
            (ALUMNITB_SET ALUMNITB SEQUENCE_NUM))))
4> (ROUTETABLE                 ; First part of the merge.
     (GET-TABLE SICNUMTB_SET SICNUMTB (SIC_CODE SEQUENCE_NUM)
      WHERE (= SIC_CODE "371")))
5> (SENDMESSAGE SICNUMTB_SET :GET-DATA GLOBAL-DATA-TYPE-CATALOGUE   ; sendmessage... to sicnumtb-object-set.
     (SICNUMTB (SIC_CODE SEQUENCE_NUM) (= SIC_CODE "371")))
6> (GET-LQPM-SICNUMTB-DATA GLOBAL-DATA-TYPE-CATALOGUE
     (SICNUMTB (SIC_CODE SEQUENCE_NUM) (= SIC_CODE "371")))
7> (GET-LQP "sicnumtb_set")    ; Get the selected LQP first.


8> (GET-INFORMIX2E-DATA NIL
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "sicnumtb_set")))
9> (MAIN-GET-DATA-WITHOUT-CONVERSION WORKING-ACTIVE
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "sicnumtb_set")))
10> (LQP-FETCH-DATA GET-DATA
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "sicnumtb_set")))

1 row(s) unloaded.             ; Here, the actual fetch takes place from the local tables.

<10 (LQP-FETCH-DATA            ; Selected LQP has been fetched.
     (("SELECTED_LQP") ("sicnumtb-mit2e")))
<9 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("SELECTED_LQP") ("sicnumtb-mit2e"))
     (("SELECTED_LQP") ("sicnumtb-mit2e")))
<8 (GET-INFORMIX2E-DATA
     (("SELECTED_LQP") ("sicnumtb-mit2e")))
<7 (GET-LQP SICNUMTB-MIT2E 14)
7> (GET-INFORMIX2E-DATA GLOBAL-DATA-TYPE-CATALOGUE   ; Now get the actual data.
     ("sicnumtb" (SIC_CODE SEQUENCE_NUM) (= SIC_CODE "371"))
     :DOWNLOAD YES)
8> (MAIN-GET-DATA-WITHOUT-CONVERSION SICNUMTB-MIT2E
     ("sicnumtb" (SIC_CODE SEQUENCE_NUM) (= SIC_CODE "371")))
9> (LQP-FETCH-DATA GET-DATA
     ("sicnumtb" (SIC_CODE SEQUENCE_NUM) (= SIC_CODE "371")))

CONNECTING TO LOCAL INFORMIX2E DATABASE....

17 row(s) unloaded.

<9 (LQP-FETCH-DATA             ; The data has been fetched.
     (("SIC_CODE" "SEQUENCE_NUM")
      ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
      ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
      ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
      ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
      ("371" "1965028635")))
<8 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("SIC_CODE" "SEQUENCE_NUM")
      ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
      ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
      ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
      ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
      ("371" "1965028635"))
     (("SIC_CODE" "SEQUENCE_NUM")
      ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
      ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
      ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
      ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
      ("371" "1965028635")))
<7 (GET-INFORMIX2E-DATA
     ((:FORMATTED
       (("SIC_CODE" "SEQUENCE_NUM")
        ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
        ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
        ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
        ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
        ("371" "1965028635")))
      (:UNFORMATTED
       (("SIC_CODE" "SEQUENCE_NUM")
        ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
        ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
        ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
        ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
        ("371" "1965028635")))))
<6 (GET-LQPM-SICNUMTB-DATA     ; Data has been fetched from sicnumtb (sic-code & sequence-number).
     (("SIC_CODE" "SEQUENCE_NUM")
      ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
      ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
      ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
      ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
      ("371" "1965028635")))
<5 (SENDMESSAGE                ; send-message returning...
     (("SIC_CODE" "SEQUENCE_NUM")
      ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
      ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
      ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
      ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
      ("371" "1965028635")))
<4 (ROUTETABLE                 ; First part of the merge has been done.
     (((SICNUMTB_SET SICNUMTB SIC_CODE) (SICNUMTB_SET SICNUMTB SEQUENCE_NUM))
      ("371" "1969052618") ("371" "1974294041") ("371" "1971795708") ("371" "1978214160")
      ("371" "1979710825") ("371" "1979608047") ("371" "1976748856") ("371" "1967646543")
      ("371" "1973511675") ("371" "1966576820") ("371" "1978432928") ("371" "1971295643")
      ("371" "1971273197") ("371" "1969617345") ("371" "1965616543") ("371" "1965388889")
      ("371" "1965028635")))
4> (QUERYROUTER                ; query-router goes out again.

     (GET-TABLE ALUMNITB_SET ALUMNITB (COMPANY_NAME SEQUENCE_NUM)
      WHERE
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635")))))))))))))))))
      :SYNS ((COMPANY_NAME COMPNAMES_SYNTB2))
      :TRANS ((COMPANY_NAME STANDARDIZE_NAME))))

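The step above turns the 17 sequence numbers returned by the first half of the merge into a right-nested OR predicate for the second table. The following Python sketch illustrates that construction only; the function names `values_for` and `build_or_chain` and the list-of-lists result layout are assumptions made for illustration, and CIS/TK itself is not written this way.

```python
def values_for(result, column):
    """Pull one column out of a result laid out as [header, row, row, ...]."""
    header, *rows = result
    idx = header.index(column)
    return [row[idx] for row in rows]


def build_or_chain(attribute, values):
    """Nest binary ORs to the right, as in the trace: (OR t1 (OR t2 t3))."""
    if not values:
        raise ValueError("at least one value is required")
    terms = ['(= {} "{}")'.format(attribute, v) for v in values]
    expr = terms[-1]
    # Wrap from the second-to-last term back to the first.
    for term in reversed(terms[:-1]):
        expr = "(OR {} {})".format(term, expr)
    return expr


# First three rows of the sicnumtb result shown in the trace.
first_part = [
    ["SIC_CODE", "SEQUENCE_NUM"],
    ["371", "1969052618"],
    ["371", "1974294041"],
    ["371", "1971795708"],
]
predicate = build_or_chain("SEQUENCE_NUM", values_for(first_part, "SEQUENCE_NUM"))
print(predicate)
```

With all 17 rows the same function yields 16 nested ORs, which matches the depth of the predicate in the trace.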
5> (ROUTETABLE                 ; Second part of the merge....
     (GET-TABLE ALUMNITB_SET ALUMNITB (COMPANY_NAME SEQUENCE_NUM)
      WHERE
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635")))))))))))))))))
      :SYNS ((COMPANY_NAME COMPNAMES_SYNTB2))
      :TRANS ((COMPANY_NAME STANDARDIZE_NAME))))

6> (SEND_MESSAGE ALUMNITB_SET :GET-DATA GLOBAL-DATA-TYPE-CATALOGUE   ; sendmessage ... alumni-object-set.
     (ALUMNITB (COMPANY_NAME SEQUENCE_NUM)
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635")))))))))))))))))))
7> (GET-LQPM-ALUMNITB-DATA GLOBAL-DATA-TYPE-CATALOGUE
     (ALUMNITB (COMPANY_NAME SEQUENCE_NUM)
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635")))))))))))))))))))
8> (GET-LQP "alumnitb_set")

9> (GET-INFORMIX2E-DATA NIL    ; First get the selected LQP.
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "alumnitb_set")))
10> (MAIN-GET-DATA-WITHOUT-CONVERSION WORKING-ACTIVE
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "alumnitb_set")))
11> (LQP-FETCH-DATA GET-DATA
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "alumnitb_set")))

1 row(s) unloaded.

<11 (LQP-FETCH-DATA            ; Selected LQP has been fetched. It is "alumnitb-mit2e".
     (("SELECTED_LQP") ("alumnitb-mit2e")))
<10 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("SELECTED_LQP") ("alumnitb-mit2e"))
     (("SELECTED_LQP") ("alumnitb-mit2e")))
<9 (GET-INFORMIX2E-DATA
     (("SELECTED_LQP") ("alumnitb-mit2e")))
<8 (GET-LQP ALUMNITB-MIT2E 14)
8> (GET-INFORMIX2E-DATA GLOBAL-DATA-TYPE-CATALOGUE   ; Now get the actual data.
     ("alumnitb" (COMPANY_NAME SEQUENCE_NUM)
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635"))))))))))))))))))
     :DOWNLOAD YES)
9> (MAIN-GET-DATA-WITHOUT-CONVERSION ALUMNITB-MIT2E
     ("alumnitb" (COMPANY_NAME SEQUENCE_NUM)
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635")))))))))))))))))))
10> (LQP-FETCH-DATA GET-DATA

     ("alumnitb" (COMPANY_NAME SEQUENCE_NUM)
      (OR (= SEQUENCE_NUM "1969052618") (OR (= SEQUENCE_NUM "1974294041") (OR (= SEQUENCE_NUM "1971795708")
      (OR (= SEQUENCE_NUM "1978214160") (OR (= SEQUENCE_NUM "1979710825") (OR (= SEQUENCE_NUM "1979608047")
      (OR (= SEQUENCE_NUM "1976748856") (OR (= SEQUENCE_NUM "1967646543") (OR (= SEQUENCE_NUM "1973511675")
      (OR (= SEQUENCE_NUM "1966576820") (OR (= SEQUENCE_NUM "1978432928") (OR (= SEQUENCE_NUM "1971295643")
      (OR (= SEQUENCE_NUM "1971273197") (OR (= SEQUENCE_NUM "1969617345") (OR (= SEQUENCE_NUM "1965616543")
      (OR (= SEQUENCE_NUM "1965388889") (= SEQUENCE_NUM "1965028635")))))))))))))))))))

; Now fetch the data from the local tables.

CONNECTING TO LOCAL INFORMIX2E DATABASE....

17 row(s) unloaded.

<10 (LQP-FETCH-DATA            ; The fetching of the data takes place here. The data has been fetched.
     (("COMPANY_NAME" "SEQUENCE_NUM")

      ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
      ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
      ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
      ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
      ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820")))
<9 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("COMPANY_NAME" "SEQUENCE_NUM")
      ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
      ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
      ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
      ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
      ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820"))
     (("COMPANY_NAME" "SEQUENCE_NUM")
      ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
      ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
      ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
      ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
      ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820")))
<8 (GET-INFORMIX2E-DATA
     ((:FORMATTED
       (("COMPANY_NAME" "SEQUENCE_NUM")
        ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
        ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
        ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
        ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
        ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
        ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
        ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820")))
      (:UNFORMATTED
       (("COMPANY_NAME" "SEQUENCE_NUM")
        ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
        ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
        ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
        ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
        ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
        ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
        ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820")))))
<7 (GET-LQPM-ALUMNITB-DATA     ; alumnitb-data has been obtained.
     (("COMPANY_NAME" "SEQUENCE_NUM")
      ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
      ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
      ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
      ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
      ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820")))
<6 (SENDMESSAGE                ; send message returning..
     (("COMPANY_NAME" "SEQUENCE_NUM")
      ("Ford Motor Co" "1965028635") ("Ford Motor Co" "1965388889") ("Ford Motor Co" "1965616543")
      ("General Motors Corp" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("Ford Motor Co" "1969617345") ("Ford Motor Co" "1971273197") ("Ford Motor Co" "1971295643")
      ("General Motors Corp" "1976748856") ("Ford Motor Co" "1978432928")
      ("General Motors Corp" "1979608047") ("Genl Motors Corp" "1979710825")
      ("Ford Espana SA" "1973511675") ("General Motors Corp" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("Ford Motor Co" "1966576820")))
<5 (ROUTETABLE                 ; Second part of the merge has been done.
     (((ALUMNITB_SET ALUMNITB COMPANY_NAME) (ALUMNITB_SET ALUMNITB SEQUENCE_NUM))
      ("FORD MOTOR CO" "1965028635") ("FORD MOTOR CO" "1965388889") ("FORD MOTOR CO" "1965616543")
      ("GENERAL MOTORS CORP" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("FORD MOTOR CO" "1969617345") ("FORD MOTOR CO" "1971273197") ("FORD MOTOR CO" "1971295643")
      ("GENERAL MOTORS CORP" "1976748856") ("FORD MOTOR CO" "1978432928")
      ("GENERAL MOTORS CORP" "1979608047") ("GENERAL MOTORS CORP" "1979710825")
      ("FORD MOTOR CO" "1973511675") ("GENERAL MOTORS CORP" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("FORD MOTOR CO" "1966576820")))
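Between the SENDMESSAGE return and the ROUTETABLE return above, local company names such as "Ford Motor Co", "Genl Motors Corp" and "Ford Espana SA" come back as "FORD MOTOR CO" and "GENERAL MOTORS CORP", while names such as "Valeo SA" pass through unchanged. A minimal Python sketch of that effect follows, assuming a plain dictionary lookup; the table contents are taken from the rows in the trace, but the dictionary representation and the helper names `standardize_name` and `translate_column` are illustrative assumptions, not the actual COMPNAMES_SYNTB2 mechanism.

```python
# Hypothetical synonym table built from the name pairs visible in the trace.
SYNONYMS = {
    "Ford Motor Co": "FORD MOTOR CO",
    "Ford Espana SA": "FORD MOTOR CO",
    "General Motors Corp": "GENERAL MOTORS CORP",
    "Genl Motors Corp": "GENERAL MOTORS CORP",
}


def standardize_name(name, synonyms=SYNONYMS):
    """Return the standard name, or the name itself when no synonym is known."""
    return synonyms.get(name, name)


def translate_column(result, column, translate):
    """Apply `translate` to one column of a [header, row, ...] result."""
    header, *rows = result
    idx = header.index(column)
    out = [list(header)]
    for row in rows:
        new_row = list(row)
        new_row[idx] = translate(new_row[idx])
        out.append(new_row)
    return out


local = [
    ["COMPANY_NAME", "SEQUENCE_NUM"],
    ["Genl Motors Corp", "1979710825"],
    ["Ford Espana SA", "1973511675"],
    ["Valeo SA", "1974294041"],
]
standardized = translate_column(local, "COMPANY_NAME", standardize_name)
print(standardized)
```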

<4 (QUERYROUTER
     (((ALUMNITB_SET ALUMNITB COMPANY_NAME) (ALUMNITB_SET ALUMNITB SEQUENCE_NUM))
      ("FORD MOTOR CO" "1965028635") ("FORD MOTOR CO" "1965388889") ("FORD MOTOR CO" "1965616543")
      ("GENERAL MOTORS CORP" "1967646543") ("Tartan Transportation Sys" "1969052618")
      ("FORD MOTOR CO" "1969617345") ("FORD MOTOR CO" "1971273197") ("FORD MOTOR CO" "1971295643")
      ("GENERAL MOTORS CORP" "1976748856") ("FORD MOTOR CO" "1978432928")
      ("GENERAL MOTORS CORP" "1979608047") ("GENERAL MOTORS CORP" "1979710825")
      ("FORD MOTOR CO" "1973511675") ("GENERAL MOTORS CORP" "1978214160")
      ("Holman Enterprises" "1971795708") ("Valeo SA" "1974294041") ("FORD MOTOR CO" "1966576820")))
<3 (ROUTEMERGE                 ; Now merge the data from the 2 tables. We have finished with entity number two: alumni.
     (((SICNUMTB_SET SICNUMTB SIC_CODE) (SICNUMTB_SET SICNUMTB SEQUENCE_NUM)
       (ALUMNITB_SET ALUMNITB COMPANY_NAME) (ALUMNITB_SET ALUMNITB SEQUENCE_NUM))
      ("371" "1969052618" "Tartan Transportation Sys" "1969052618")
      ("371" "1974294041" "Valeo SA" "1974294041")
      ("371" "1971795708" "Holman Enterprises" "1971795708")
      ("371" "1978214160" "GENERAL MOTORS CORP" "1978214160")
      ("371" "1979710825" "GENERAL MOTORS CORP" "1979710825")
      ("371" "1979608047" "GENERAL MOTORS CORP" "1979608047")
      ("371" "1976748856" "GENERAL MOTORS CORP" "1976748856")
      ("371" "1967646543" "GENERAL MOTORS CORP" "1967646543")
      ("371" "1973511675" "FORD MOTOR CO" "1973511675")
      ("371" "1966576820" "FORD MOTOR CO" "1966576820")
      ("371" "1978432928" "FORD MOTOR CO" "1978432928")
      ("371" "1971295643" "FORD MOTOR CO" "1971295643")
      ("371" "1971273197" "FORD MOTOR CO" "1971273197")
      ("371" "1969617345" "FORD MOTOR CO" "1969617345")
      ("371" "1965616543" "FORD MOTOR CO" "1965616543")
      ("371" "1965388889" "FORD MOTOR CO" "1965388889")
      ("371" "1965028635" "FORD MOTOR CO" "1965028635")))
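The ROUTEMERGE result above pairs each sicnumtb row with the alumnitb row that carries the same sequence number, keeps the order of the first table, and retains both key columns. The Python sketch below reproduces that behaviour on a small sample. It uses a dictionary index on the second table, which is an implementation choice assumed here for illustration; the trace does not show how CIS/TK performs the join internally, and `merge_on` is a hypothetical name.

```python
def merge_on(left, right, left_key, right_key):
    """Equi-join two [header, row, ...] results, preserving left-row order."""
    lheader, *lrows = left
    rheader, *rrows = right
    li = lheader.index(left_key)
    ri = rheader.index(right_key)

    # Index the right-hand rows by their join value.
    index = {}
    for row in rrows:
        index.setdefault(row[ri], []).append(row)

    merged = [list(lheader) + list(rheader)]
    for lrow in lrows:
        for rrow in index.get(lrow[li], []):
            merged.append(list(lrow) + list(rrow))
    return merged


sicnum = [
    ["SIC_CODE", "SEQUENCE_NUM"],
    ["371", "1969052618"],
    ["371", "1974294041"],
]
alumni = [
    ["COMPANY_NAME", "SEQUENCE_NUM"],
    ["Valeo SA", "1974294041"],
    ["Tartan Transportation Sys", "1969052618"],
]
for row in merge_on(sicnum, alumni, "SEQUENCE_NUM", "SEQUENCE_NUM"):
    print(row)
```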

<2 (QUERYROUTER                ; Returning the data for the sub-query to the router.
     (((SICNUMTB_SET SICNUMTB SIC_CODE) (SICNUMTB_SET SICNUMTB SEQUENCE_NUM)
       (ALUMNITB_SET ALUMNITB COMPANY_NAME) (ALUMNITB_SET ALUMNITB SEQUENCE_NUM))
      ("371" "1969052618" "Tartan Transportation Sys" "1969052618")
      ("371" "1974294041" "Valeo SA" "1974294041")
      ("371" "1971795708" "Holman Enterprises" "1971795708")
      ("371" "1978214160" "GENERAL MOTORS CORP" "1978214160")
      ("371" "1979710825" "GENERAL MOTORS CORP" "1979710825")
      ("371" "1979608047" "GENERAL MOTORS CORP" "1979608047")
      ("371" "1976748856" "GENERAL MOTORS CORP" "1976748856")
      ("371" "1967646543" "GENERAL MOTORS CORP" "1967646543")
      ("371" "1973511675" "FORD MOTOR CO" "1973511675")
      ("371" "1966576820" "FORD MOTOR CO" "1966576820")
      ("371" "1978432928" "FORD MOTOR CO" "1978432928")
      ("371" "1971295643" "FORD MOTOR CO" "1971295643")
      ("371" "1971273197" "FORD MOTOR CO" "1971273197")
      ("371" "1969617345" "FORD MOTOR CO" "1969617345")
      ("371" "1965616543" "FORD MOTOR CO" "1965616543")
      ("371" "1965388889" "FORD MOTOR CO" "1965388889")
      ("371" "1965028635" "FORD MOTOR CO" "1965028635")))
2> (QUERYROUTER                ; query_router sends out another sub-query. This is entity 3 (company).
     (GET-TABLE RECRUITSET "companytbl" (COMPANY_NAME POSITION VISIT_DAY)
      WHERE
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "GENERAL MOTORS CORP")
          (= COMPANY_NAME "FORD MOTOR CO")))))
      :SYNS ((COMPANY_NAME COMPNAMES_SYNTB1))
      :TRANS ((COMPANY_NAME STANDARDIZE_NAME))))
3> (ROUTETABLE
     (GET-TABLE RECRUITSET "companytbl" (COMPANY_NAME POSITION VISIT_DAY)
      WHERE
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "GENERAL MOTORS CORP")
          (= COMPANY_NAME "FORD MOTOR CO")))))
      :SYNS ((COMPANY_NAME COMPNAMES_SYNTB1))
      :TRANS ((COMPANY_NAME STANDARDIZE_NAME))))
4> (SENDMESSAGE RECRUITSET :GET-DATA GLOBAL-DATA-TYPE-CATALOGUE   ; sendmessage.. to recruit-object_set.
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company")))))))
5> (GET-LQPM-RECRUIT-DATA GLOBAL-DATA-TYPE-CATALOGUE
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company")))))))
6> (GET-LQP "recruitset")      ; First get the selected LQP.
7> (GET-INFORMIX2E-DATA NIL
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "recruitset")))
8> (MAIN-GET-DATA-WITHOUT-CONVERSION WORKING-ACTIVE
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "recruitset")))
9> (LQP-FETCH-DATA GET-DATA    ; Fetching the selected LQP from the local tables.
     (ACTIVE_LQP (SELECTED_LQP) (= LQP_SET "recruitset")))

1 row(s) unloaded.


<9 (LQP-FETCH-DATA             ; Selected LQP has been fetched, it is "recruit-ibmrt".
     (("SELECTED_LQP") ("recruit-ibmrt")))
<8 (MAIN-GET-DATA-WITHOUT-CONVERSION
     (("SELECTED_LQP") ("recruit-ibmrt"))
     (("SELECTED_LQP") ("recruit-ibmrt")))
<7 (GET-INFORMIX2E-DATA
     (("SELECTED_LQP") ("recruit-ibmrt")))
<6 (GET-LQP RECRUIT-IBMRT 13)
6> (GET-ORACLERT-DATA GLOBAL-DATA-TYPE-CATALOGUE   ; Now get the actual data from the remote ORACLE RT machine.
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY SCHEDULE)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company"))))))
     :DOWNLOAD YES)
7> (MAIN-GET-DATA-WITH-CONVERSION RECRUIT-IBMRT GLOBAL-DATA-TYPE-CATALOGUE
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY SCHEDULE)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company")))))))
8> (BUILD-LQP-DATA-CONVERSION-STRATEGY GLOBAL-DATA-TYPE-CATALOGUE)   ; Building the conversion strategy.
<8 (BUILD-LQP-DATA-CONVERSION-STRATEGY
     ((DATE ((SENDER-SEMANTICS ((FORMAT DATE-MM/DD/YYYY)))
             (RECEIVER-SEMANTICS ((FORMAT DATE-MONTH-DAY)))))))
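The conversion strategy returned above names a sender format (DATE-MM/DD/YYYY) and a receiver format (DATE-MONTH-DAY) for the DATE type. The Python sketch below shows one way such a strategy could drive a value conversion. It reads the two format names literally and follows the sender-to-receiver direction as written; the `CONVERTERS` registry, the dictionary form of the strategy, and the sample year are all assumptions for illustration, since the trace does not show the conversion routine itself.

```python
from datetime import datetime

# Month names are spelled out so the output does not depend on the locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def mmddyyyy_to_month_day(text):
    """'01/28/1992' -> 'January 28' (the year is dropped by the target format)."""
    d = datetime.strptime(text, "%m/%d/%Y").date()
    return "{} {}".format(MONTHS[d.month - 1], d.day)


# Hypothetical registry keyed by (sender format, receiver format).
CONVERTERS = {
    ("DATE-MM/DD/YYYY", "DATE-MONTH-DAY"): mmddyyyy_to_month_day,
}


def convert(value, strategy):
    """Convert one value according to a {'sender': ..., 'receiver': ...} strategy."""
    key = (strategy["sender"], strategy["receiver"])
    if key not in CONVERTERS:
        raise ValueError("no converter registered for {}".format(key))
    return CONVERTERS[key](value)


date_strategy = {"sender": "DATE-MM/DD/YYYY", "receiver": "DATE-MONTH-DAY"}
print(convert("01/28/1992", date_strategy))  # the year 1992 is an arbitrary sample
```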

8> (LQP-FETCH-DATA GET-DATA    ; A call to fetch the data... but first some info is needed from the remote database about the no. of rows and column lengths of the attribute headings.
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY SCHEDULE)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company")))))))

CONNECTING TO REMOTE ORACLE RT MACHINE....


9> (GET-ORACLERT-NUMBER-ROWS   ; Get the no. of rows.
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY SCHEDULE)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company")))))))
10> (LQP-FETCH-DATA GET-NUMBER-ROWS
     ("companytbl" (COMPANY_NAME POSITION VISIT_DAY SCHEDULE)
      (OR (= COMPANY_NAME "Tartan Transportation Sys")
      (OR (= COMPANY_NAME "Valeo SA")
      (OR (= COMPANY_NAME "Holman Enterprises")
      (OR (= COMPANY_NAME "General Motors Corporation")
          (= COMPANY_NAME "The Ford Motor Company")))))))

Escape character is '-'.
Trying...
Connected to donner.
IBM RT PC Advanced Interactive Executive Operating System
74X9995 (c) Copyright IBM Corp. 1985, 1986
(/dev/pts0)
login:rwangrwang
Password:cis15579

* Please keep your use to non-peak hours.
* Peak hours are 9:00-18:00 local time.
* MASSACHUSETTS INSTITUTE OF TECHNOLOGY - AIX *
* WELCOME TO 15.579 and the IBM RT!
* COMMUNICATIONS & CONNECTIVITY

RT $cd /u/class2
cd /u/class2
RT $sqlcmd
sqlcmd


SQL/RT Command Line Interface
74X9997 (C) COPYRIGHT IBM CORP. 1985, 1986
(C) COPYRIGHT ORACLE CORP., CALIFORNIA, USA 1985, 1986
ALL RIGHTS RESERVED
LICENSED MATERIALS - PROPERTY OF IBM AND ORACLE CORP.
>set pagesize 55
set pagesize 55

>SELECT count(company_name)
SELECT count(company_name)
  2 FROM companytbl WHERE (COMPANY_NAME = 'Tartan Transportation Sys')
FROM companytbl WHERE (COMPANY_NAME = 'Tartan Transportation Sys')
  3 OR ((COMPANY_NAME = 'Valeo SA')
OR ((COMPANY_NAME = 'Valeo SA')
  4 OR ((COMPANY_NAME = 'Holman Enterprises')
OR ((COMPANY_NAME = 'Holman Enterprises')
  5 OR ((COMPANY_NAME = 'General Motors Corporation')
OR ((COMPANY_NAME = 'General Motors Corporation')
  6 OR (COMPANY_NAME = 'The Ford Motor Company'))));
OR (COMPANY_NAME = 'The Ford Motor Company'))));

COUNT(COMPANY_NAME)
5

1 record selected.
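The SQL sent to the remote machine above has the same shape as the prefix predicate in the LQP-FETCH-DATA call: each `(= column "value")` becomes `(column = 'value')`, and each nested OR becomes an infix `OR` with the nested right-hand side in parentheses. The Python sketch below illustrates that mapping only. The tuple representation of the predicate and the name `to_sql` are assumptions; the trace does not show how the ORACLE RT query processor builds its SQL.

```python
def to_sql(pred):
    """Translate a nested ('OR', a, b) / ('=', column, value) predicate to SQL."""
    op = pred[0]
    if op == "=":
        _, column, value = pred
        # Double any single quote so the literal stays well formed.
        return "({} = '{}')".format(column, str(value).replace("'", "''"))
    if op == "OR":
        _, left, right = pred
        right_sql = to_sql(right)
        if right[0] == "OR":
            right_sql = "(" + right_sql + ")"
        return "{} OR {}".format(to_sql(left), right_sql)
    raise ValueError("unsupported operator: {!r}".format(op))


predicate = ("OR", ("=", "COMPANY_NAME", "Valeo SA"),
             ("OR", ("=", "COMPANY_NAME", "Holman Enterprises"),
              ("=", "COMPANY_NAME", "The Ford Motor Company")))
where_clause = to_sql(predicate)
print("SELECT count(company_name) FROM companytbl WHERE " + where_clause + ";")
```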

<10 (LQP-FETCH-DATA            ; Number of rows = "5".
     (("COUNT(COMPANY_NAME)") ("5")))
<9 (GET-ORACLERT-NUMBER-ROWS
     (("COUNT(COMPANY_NAME)") ("5")))

>set pagesize 55
set pagesize 55

>SELECT NVL(company_name, 'N/A')
SELECT NVL(company_name, 'N/A')
  2 company_name, NVL(position, 'N/A')
company_name, NVL(position, 'N/A')
  3 position, NVL(visit_day, 'N/A')
position, NVL(visit_day, 'N/A')
  4 visit_day, NVL(schedule, 'N/A')
visit_day, NVL(schedule, 'N/A')
  5 schedule FROM companytbl WHERE (COMPANY_NAME = 'Tartan Transportation Sys')
schedule FROM companytbl WHERE (COMPANY_NAME = 'Tartan Transportation Sys')
  6 OR ((COMPANY_NAME = 'Valeo SA')
OR ((COMPANY_NAME = 'Valeo SA')
  7 OR ((COMPANY_NAME = 'Holman Enterprises')
OR ((COMPANY_NAME = 'Holman Enterprises')
  8 OR ((COMPANY_NAME = 'General Motors Corporation')
OR ((COMPANY_NAME = 'General Motors Corporation')
  9 OR (COMPANY_NAME = 'The Ford Motor Company'))));
OR (COMPANY_NAME = 'The Ford Motor Company'))));

COMPANY_NAME                POSITION             VISIT_DAY    SCH
The Ford Motor Company      finance              January 28   1
The Ford Motor Company      corporate planning   January 28   1
The Ford Motor Company      corporate planning   February 11  1
The Ford Motor Company      operations research  February 18  1
General Motors Corporation  manager              February 9   1

5 records selected.

9> (GET-ORACLERT-COLUMNS "companytbl")   ; Now get column lengths.
10> (LQP-FETCH-DATA GET-COLUMNS "companytbl")

>set pagesize 55
set pagesize 55

>select col$name, col$length from columns [companytbl];
select col$name, col$length from columns [companytbl];

COL$NAME      COL$LENGTH
COMPANY_NAME  30
SCHEDULE      2
STATUS        1
VISIT_DAY     12
POSITION      20
SIC_CODE      5

6 records selected.

<10 (LQP-FETCH-DATA
     (("col$name" "col$length")
      ("COMPANY_NAME" "30") ("SCHEDULE" "2") ("STATUS" "1")
      ("VISIT_DAY" "12") ("POSITION" "20") ("SIC_CODE" "5")))
<9 (GET-ORACLERT-COLUMNS       ; Column length info. obtained.
     (("col$name" "col$length")
      ("COMPANY_NAME" "30") ("SCHEDULE" "2") ("STATUS" "1")
      ("VISIT_DAY" "12") ("POSITION" "20") ("SIC_CODE" "5")))
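The trace collects the row count and the column lengths before it returns the data rows. One plausible use of the lengths, assumed here and not confirmed by the trace, is to cut the fixed-width terminal output into fields, because values such as "corporate planning" contain spaces and cannot be split on whitespace. The Python sketch below shows that idea; `split_fixed_width`, the one-character gap between columns, and the padded sample line are illustrative assumptions.

```python
def split_fixed_width(line, widths, gap=1):
    """Slice a fixed-width line into stripped fields, given each column's length."""
    fields = []
    pos = 0
    for width in widths:
        fields.append(line[pos:pos + width].strip())
        pos += width + gap  # skip the assumed separator between columns
    return fields


# Lengths reported in the trace for COMPANY_NAME, POSITION, VISIT_DAY, SCHEDULE.
widths = [30, 20, 12, 2]

# A sample output line rebuilt with padding, since the transcript lost the spacing.
sample = " ".join([
    "The Ford Motor Company".ljust(30),
    "corporate planning".ljust(20),
    "February 11".ljust(12),
    "1".ljust(2),
])
print(split_fixed_width(sample, widths))
```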

<8 (LQP-FETCH-DATA    Data has been fetched.
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "February 11" "1")
      ("The Ford Motor Company" "operations research" "February 18" "1")
      ("General Motors Corporation" "manager" "February 9" "1")))

8> (LQP-DATA-FILTER    Convert the data list, and see if data needs to be downloaded.
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "February 11" "1")
      ("The Ford Motor Company" "operations research" "February 18" "1")
      ("General Motors Corporation" "manager" "February 9" "1"))
     ("companytbl"
      (COMPANYNAME POSITION VISITDAY SCHEDULE)
      (OR (= COMPANYNAME "Tartan Transportation Sys")
          (OR (= COMPANYNAME "Valeo SA")
              (OR (= COMPANYNAME "Holman Enterprises")
                  (OR (= COMPANYNAME "General Motors Corporation")
                      (= COMPANYNAME "The Ford Motor Company"))))))
     ("companytbl"
      (COMPANYNAME POSITION VISITDAY SCHEDULE)
      (OR (= COMPANYNAME "Tartan Transportation Sys")
          (OR (= COMPANYNAME "Valeo SA")
              (OR (= COMPANYNAME "Holman Enterprises")
                  (OR (= COMPANYNAME "General Motors Corporation")
                      (= COMPANYNAME "The Ford Motor Company"))))))
     ("companytbl"
      (COMPANYNAME POSITION VISITDAY SCHEDULE)
      (OR (= COMPANYNAME "Tartan Transportation Sys")
          (OR (= COMPANYNAME "Valeo SA")
              (OR (= COMPANYNAME "Holman Enterprises")
                  (OR (= COMPANYNAME "General Motors Corporation")
                      (= COMPANYNAME "The Ford Motor Company"))))))
     NIL NIL GLOBAL-DATA-TYPE-CATALOGUE    Data will not be downloaded because predicates 1 & 2 are NIL.
     ((DATE ((SENDER-SEMANTICS ((FORMAT DATE-MM/DD/YYYY)))
             (RECEIVER-SEMANTICS ((FORMAT DATE-MONTH-DAY)))))))

9> (LQP-CONVERT-DATA-LIST    Convert the data-list into sender-semantics.
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "February 11" "1")
      ("The Ford Motor Company" "operations research" "February 18" "1")
      ("General Motors Corporation" "manager" "February 9" "1"))
     :TARGET SENDER-SEMANTICS
     :ATTRIBUTES (COMPANYNAME POSITION VISITDAY SCHEDULE)
     :STRATEGY
     ((DATE ((SENDER-SEMANTICS ((FORMAT DATE-MM/DD/YYYY)))
             (RECEIVER-SEMANTICS ((FORMAT DATE-MONTH-DAY)))))))

10> (CONVERT-DATE-FORMAT-OF-DATA-LIST    Here the date format is being converted from month, day to MM/DD/YYYY.
      (((DATE ((VALUE "January 28") (FORMAT DATE-MONTH-DAY))))
       ((DATE ((VALUE "January 28") (FORMAT DATE-MONTH-DAY))))
       ((DATE ((VALUE "February 11") (FORMAT DATE-MONTH-DAY))))
       ((DATE ((VALUE "February 18") (FORMAT DATE-MONTH-DAY))))
       ((DATE ((VALUE "February 9") (FORMAT DATE-MONTH-DAY))))))

<10 (CONVERT-DATE-FORMAT-OF-DATA-LIST
      (((DATE ((VALUE "01/28/1991") (FORMAT DATE-MM/DD/YYYY))))
       ((DATE ((VALUE "01/28/1991") (FORMAT DATE-MM/DD/YYYY))))
       ((DATE ((VALUE "02/11/1991") (FORMAT DATE-MM/DD/YYYY))))
       ((DATE ((VALUE "02/18/1991") (FORMAT DATE-MM/DD/YYYY))))
       ((DATE ((VALUE "02/09/1991") (FORMAT DATE-MM/DD/YYYY))))))

<9 (LQP-CONVERT-DATA-LIST    Converted data-list returning.
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "01/28/1991" "1")
      ("The Ford Motor Company" "corporate planning" "01/28/1991" "1")
      ("The Ford Motor Company" "corporate planning" "02/11/1991" "1")
      ("The Ford Motor Company" "operations research" "02/18/1991" "1")
      ("General Motors Corporation" "manager" "02/09/1991" "1")))
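The conversion traced above, from DATE-MONTH-DAY values such as "January 28" to DATE-MM/DD/YYYY values such as "01/28/1991", can be sketched in a few lines of Python. This is only an illustration, not the CIS/TK routine: the function name `month_day_to_mmddyyyy` is hypothetical, and the year is passed in explicitly because the trace does not show where 1991 comes from.

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
# Map each month name to its number, starting from 1 for January.
MONTH_NUMBER = {name: index for index, name in enumerate(MONTH_NAMES, start=1)}


def month_day_to_mmddyyyy(value, year):
    """Convert text like 'January 28' to 'MM/DD/YYYY' for the given year."""
    month_name, day = value.split()
    return f"{MONTH_NUMBER[month_name]:02d}/{int(day):02d}/{year:04d}"


# VISITDAY values in the order they appear in the trace.
visit_days = ["January 28", "January 28", "February 11", "February 18", "February 9"]

# Assumption: the missing year is 1991, as in the converted values of the trace.
converted = [month_day_to_mmddyyyy(day, 1991) for day in visit_days]
```

A lookup table is used instead of locale-dependent date parsing so that the month names are matched exactly as they appear in the source data.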

<8 (LQP-DATA-FILTER
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "01/28/1991" "1")
      ("The Ford Motor Company" "corporate planning" "01/28/1991" "1")
      ("The Ford Motor Company" "corporate planning" "02/11/1991" "1")
      ("The Ford Motor Company" "operations research" "02/18/1991" "1")
      ("General Motors Corporation" "manager" "02/09/1991" "1")))

<7 (MAIN-GET-DATA-WITH-CONVERSION
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "01/28/1991" "1")
      ("The Ford Motor Company" "corporate planning" "01/28/1991" "1")
      ("The Ford Motor Company" "corporate planning" "02/11/1991" "1")
      ("The Ford Motor Company" "operations research" "02/18/1991" "1")
      ("General Motors Corporation" "manager" "02/09/1991" "1"))
     (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
      ("The Ford Motor Company" "finance" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "January 28" "1")
      ("The Ford Motor Company" "corporate planning" "February 11" "1")
      ("The Ford Motor Company" "operations research" "February 18" "1")
      ("General Motors Corporation" "manager" "February 9" "1")))

<6 (GET-ORACLERT-DATA
     ((:FORMATTED
       (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
        ("The Ford Motor Company" "finance" "01/28/1991" "1")
        ("The Ford Motor Company" "corporate planning" "01/28/1991" "1")
        ("The Ford Motor Company" "corporate planning" "02/11/1991" "1")
        ("The Ford Motor Company" "operations research" "02/18/1991" "1")
        ("General Motors Corporation" "manager" "02/09/1991" "1")))
      (:UNFORMATTED
       (("COMPANYNAME" "POSITION" "VISITDAY" "SCHEDULE")
        ("The Ford Motor Company" "finance" "January 28" "1")
        ("The Ford Motor Company" "corporate planning" "January 28" "1")
        ("The Ford Motor Company" "corporate planning" "February 11" "1")
        ("The Ford Motor Company" "operations research" "February 18" "1")
        ("General Motors Corporation" "manager" "February 9" "1")))))

<5 (GET-LQPM-RECRUIT-DATA    Recruit data that has been obtained and converted is returned.
     (("COMPANYNAME" "POSITION" "VISITDAY")
      ("The Ford Motor Company" "finance" "01/28/1991")
      ("The Ford Motor Company" "corporate planning" "01/28/1991")
      ("The Ford Motor Company" "corporate planning" "02/11/1991")
      ("The Ford Motor Company" "operations research" "02/18/1991")
      ("General Motors Corporation" "manager" "02/09/1991")))

<4 (SENDMESSAGE    send_message returning.
     (("COMPANYNAME" "POSITION" "VISITDAY")
      ("The Ford Motor Company" "finance" "01/28/1991")
      ("The Ford Motor Company" "corporate planning" "01/28/1991")
      ("The Ford Motor Company" "corporate planning" "02/11/1991")
      ("The Ford Motor Company" "operations research" "02/18/1991")
      ("General Motors Corporation" "manager" "02/09/1991")))

<3 (ROUTETABLE    We have finished with entity number three.
     (((RECRUITSET "companytbl" COMPANYNAME)
       (RECRUITSET "companytbl" POSITION)
       (RECRUITSET "companytbl" VISITDAY))
      ("FORD MOTOR CO" "finance" "01/28/1991")
      ("FORD MOTOR CO" "corporate planning" "01/28/1991")
      ("FORD MOTOR CO" "corporate planning" "02/11/1991")
      ("FORD MOTOR CO" "operations research" "02/18/1991")
      ("GENERAL MOTORS CORP" "manager" "02/09/1991")))
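Between the SENDMESSAGE and ROUTETABLE returns the company names change from their local spelling ("The Ford Motor Company") to a global one ("FORD MOTOR CO"). The trace does not show the mechanism, so the Python sketch below simply assumes a lookup table. `GLOBAL_NAMES` and `to_global_names` are hypothetical names, and the table holds only the two pairs visible in the trace.

```python
# Hypothetical synonym table: local company name -> global company name.
GLOBAL_NAMES = {
    "The Ford Motor Company": "FORD MOTOR CO",
    "General Motors Corporation": "GENERAL MOTORS CORP",
}


def to_global_names(rows, column=0):
    """Replace the local company name in each row with its global form."""
    result = []
    for row in rows:
        # Names without an entry in the table are passed through unchanged.
        name = GLOBAL_NAMES.get(row[column], row[column])
        result.append(row[:column] + (name,) + row[column + 1:])
    return result


local_rows = [
    ("The Ford Motor Company", "finance", "01/28/1991"),
    ("General Motors Corporation", "manager", "02/09/1991"),
]

global_rows = to_global_names(local_rows)
```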

<2 (QUERYROUTER    Back to query-router.
     (((RECRUITSET "companytbl" COMPANYNAME)
       (RECRUITSET "companytbl" POSITION)
       (RECRUITSET "companytbl" VISITDAY))
      ("FORD MOTOR CO" "finance" "01/28/1991")
      ("FORD MOTOR CO" "corporate planning" "01/28/1991")
      ("FORD MOTOR CO" "corporate planning" "02/11/1991")
      ("FORD MOTOR CO" "operations research" "02/18/1991")
      ("GENERAL MOTORS CORP" "manager" "02/09/1991")))

<1 (ROUTER
     (((ALUMNI COMPANY) (ALUMNI SICCODE) (COMPANY POSITION) (COMPANY VISITINGDATE))
      ("GENERAL MOTORS CORP" "371" "manager" "02/09/1991")
      ("FORD MOTOR CO" "371" "finance" "01/28/1991")
      ("FORD MOTOR CO" "371" "corporate planning" "01/28/1991")
      ("FORD MOTOR CO" "371" "corporate planning" "02/11/1991")
      ("FORD MOTOR CO" "371" "operations research" "02/18/1991")))

(((ALUMNI COMPANY) (ALUMNI SICCODE) (COMPANY POSITION) (COMPANY VISITINGDATE))
 ("GENERAL MOTORS CORP" "371" "manager" "02/09/1991")
 ("FORD MOTOR CO" "371" "finance" "01/28/1991")
 ("FORD MOTOR CO" "371" "corporate planning" "01/28/1991")
 ("FORD MOTOR CO" "371" "corporate planning" "02/11/1991")
 ("FORD MOTOR CO" "371" "operations research" "02/18/1991"))

Data from all the sub-queries has been obtained and combined.
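As a rough picture of that final combination step, the Python sketch below joins the recruiting rows with alumni rows on the global company name. It is a sketch under stated assumptions rather than the router's actual logic: `join_on_company` is a hypothetical name, and the alumni rows are reconstructed from the combined result, since that sub-query's own output is not shown in this part of the trace.

```python
def join_on_company(alumni_rows, recruit_rows):
    """Pair every alumni row with the recruiting rows for the same company."""
    joined = []
    for company, siccode in alumni_rows:
        for recruit_company, position, visit_date in recruit_rows:
            if recruit_company == company:
                joined.append((company, siccode, position, visit_date))
    return joined


# Assumed alumni sub-query result, reconstructed from the combined rows.
alumni_rows = [
    ("GENERAL MOTORS CORP", "371"),
    ("FORD MOTOR CO", "371"),
]

# Recruiting sub-query result as returned by QUERYROUTER in the trace.
recruit_rows = [
    ("FORD MOTOR CO", "finance", "01/28/1991"),
    ("FORD MOTOR CO", "corporate planning", "01/28/1991"),
    ("FORD MOTOR CO", "corporate planning", "02/11/1991"),
    ("FORD MOTOR CO", "operations research", "02/18/1991"),
    ("GENERAL MOTORS CORP", "manager", "02/09/1991"),
]

combined = join_on_company(alumni_rows, recruit_rows)
```

With the alumni rows in this order, the nested loop yields the General Motors row first and the four Ford rows after it, which is the order of the combined result in the trace.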