database modeling in uml

16

Click here to load reader

Upload: danilodsp

Post on 31-Dec-2015

20 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Database Modeling in UML

Introduction

When it comes to flexible and efficienfor software systemsand architects are choices. From tperspective, the cbetween pure ObjecRelational hybrids, pcustom solutions bproprietary file foOLE structured stvendor aspect OraclPOET and others often-incompatible s

This article is aboutchoices, that is thobject-oriented classpurely relational dato imply this is simplest solution, bis one of the most that has the potenmisuse.

We will begin with two design domainbridge: firstly the obmodel as representesecondly the relation

For each domain wmain features that wWe will then lookand issues involvedthe class model to tincluding object pbehaviour, relatioobjects and object conclude with a re

By Geoffrey SpaOriginally published

Database Modelling in UMLrks, [email protected] : http://www.sparxsystems.com.auin Methods & Tools e-newsletter : http://www.martinig.ch/mt/index.html

providing reliable,t object persistence, today's designersfaced with manyhe technologicalhoice is usuallyt-Oriented, Object-ure Relational andased on open orrmats (eg. XML,orage). From thee, IBM, Microsoft,offer similar butolutions.

only one of thosee layering of an model on top of atabase. This is notthe only, best orut pragmatically itcommon, and onetial for the most

a quick tour of thes we are trying toject-oriented class

d in the UML, andal database model.

e look only at theill affect our task.

at the techniques in mapping fromhe database model,ersistence, objectnships betweenidentity. We will

view of the UML

Data Profile (as proposed by RationalSoftware).Some familiarity with object-orienteddesign, UML and relational databasemodelling is assumed.

The Class Model

The Class Model in the UML is themain artefact produced to represent thelogical structure of a software system.It captures the both the datarequirements and the behaviour ofobjects within the model domain. Thetechniques for discovering andelaborating that model are outside thescope of this article, so we will assumethe existence of a well designed classmodel that requires mapping onto arelational database.

The class is the basic logical entity inthe UML. It defines both the data andthe behaviour of a structural unit. Aclass is a template or model fromwhich instances or objects are createdat run time. When we develop a logicalmodel such as a structural hierarchy inUML we explicitly deal with classes.

When we work with dynamicdiagrams, such as sequence diagramsand collaborations, we work withobjects or instances of classes and theirinter-actions at run-time.

The principal of data hiding orencapsulation is based on localisationof effect. A class has internal dataelements that it is responsible for.Access to these data elements shouldbe through the class's exposedbehaviour or interface. Adherence to

Page 2: Database Modeling in UML

this principal results in moremaintainable code.

Behaviour

Behaviour is captured in the classmodel using the operations that aredefined for the class. Operations maybe externally visible (public), visible tochildren (protected) or hidden(private). By combining hidden datawith a publicly accessible interface andhidden or protected data manipulation,a class designer can create highlymaintainable structural units thatsupport rather than hinder change.

Relationships and Identity

Association is a relationship between 2classes indicating that at least one sideof the relationship knows about andsomehow uses or manipulates the otherside. This relationship may byfunctional (do something for me) orstructural (be something for me). Forthis article it is the structural

Perso

- Address: CA# Name:

St i# Age: double

+ getAge() : in+ setAge(n)+ getName() :

St i+ setName(s)

Attributes and optrun-time and the bj t

Person

A simple person classwith no state orbehaviour shown

Figure 1 - Classes, a

relationship that is most interesting: for

n

ddress

t

Class attributes :the encapsulateddata

Class operations:the behaviour

erations define the state of an objectcapabilities or behaviour of the

ttributes and operations

example an Address class may beassociated with a Person class. Themapping of this relationship into therelational data space requires somecare.

Aggregation is a form of associationthat implies the collection of one classof objects within another. Compositionis a stronger form of aggregation thatimplies one object is actuallycomposed of others. Like theassociation relationship, this implies acomplex class attribute that requirescareful consideration in the process ofmapping to the relational domain.

While a class represents the templateor model from which many objectinstances may be created, an object atrun time requires some means ofidentifying itself such that associatedobjects may act upon the correct objectinstance. In a programming languagelike C++, object pointers may bepassed around and held to allow

Page 3: Database Modeling in UML

objects access to a unique objectinstance.Often though, an object will bedestroyed and require that it be re-created as it was during its last activeinstance. These objects require astorage mechanism to save theirinternal state and associations into andto retrieve that state as required.

Inheritance provides the class modelwith a means of factoring out commonbehaviour into generalised classes thatthen act as the ancestors of manyvariations on a common theme.Inheritance is a means of managingboth re-use and complexity. As we willsee, the relational model has no directcounterpart of inheritance, whichcreates a dilemma for the datamodeller mapping an object modelonto a relational framework.

Navigation from one object at run timeto another is based on absolutereferences. One object has some formof link (a pointer or unique object ID)

with which to locate or re-create therequired object.

The Relational Model

The relational data model has beenaround for many years and has aproven track record of providingperformance and flexibility. It isessentially set based and has as itsfundamental unit the 'table', which iscomposed of a set of one or more'columns', each of which contains adata element.

Tables and Columns

A relational table is collection of oneor more columns each of which has aunique name within the table construct.Each column is defined to be of acertain basic data type, such as anumber, text or binary data. A tabledefinition is a template from whichtable rows are created, each row beingan instance of a possible table instance.

Parent

Person

Association capturesa having or usingrelationship betweenclasses

A class hierarchyshowing a generalisedperson class fromwhich other classesare derived

Aggregofcollecti

2 1..n

Figure 2 - UML C

Child

Familyation captures the concepton or composition between classes

The main relationships we areinterested in are Association,Aggregation and Inheritance.Thesedescribe the ways classes interactor relate to eachother

lass model notation

Page 4: Database Modeling in UML

Person ID Document

Share

Address

A Person mayreside at zeroor moreaddresses

An Address mayhave zero ormore Persons inresidence

A Person is composed ofa strict set of IDdocuments (having nelements)

A Person mayown a set ofShares

Three forms of the Aggregation relationship. The weak form isdepicted with an unfilled diamond head, the strong form(composition) with a filled head.

n1

0..n

0..1

0..n 0..n

Figure 3- Aggregation Relationships

Public Data Access

The relational model only offers apublic data access model. All data isequally exposed and open to anyprocess to update, query or manipulateit. Information hiding is unknown.

Behaviour

The behaviour associated with a tableis usually based on the business orlogical rules applied to that entity.

Constraints may be applied to columnsin the form of uniquenessrequirements, relational integrityconstraints to other tables/rows,allowable values and data types.

Triggers provide some additionalbehaviour that can be associated withan entity. Typically this is used toenforce data integrity before or afterupdates, inserts and deletes.

Database stored procedures provide ameans of extending databasefunctionality through proprietarylanguage extensions used to constructfunctional units (scripts). Thesefunctional procedures do not mapdirectly to entities, nor have a logicalrelationship to them.

Navigation through relational data setsis based on row traversal and tablejoins. SQL is the primary languageused to select rows and locateinstances from a table set.

Relationships and Identity

The primary key of a table provides theunique identifying value for aparticular row. There are two kinds ofprimary key that we are interested in:firstly the meaningful key, made up ofdata columns which have a meaningwithin the business domain, andsecond the abstract unique identifier,such as a counter value, which have no

Page 5: Database Modeling in UML

business meaning but uniquely identifya row. We will discuss this and theimplications of meaningful keys later.

A table may contain columns that mapto the primary key of another table.This relationship between tablesdefines a foreign key and implies astructural relationship or associationbetween the two tables.

Summary

From the above overview we can seethat the object model is based ondiscrete entities having both state(attributes/data) and behaviour, withaccess to the encapsulated datagenerally through the class publicinterface only. The relational modelexposes all data equally, with limitedsupport for associating behaviour withdata elements through triggers, indexesand constraints.

You navigate to distinct information inthe object model by moving fromobject to object using unique objectidentifiers and established objectrelationships (similar to a networkdata model). In the relational modelyou find rows by joining and filteringresult sets using SQL using generalisedsearch criteria.

Identity in the object model is either arun-time reference or persistent uniqueID (termed an OID). In the relationalworld, primary keys define theuniqueness of a data set in the overalldata space.

In the object model we have a rich setof relationships: inheritance,aggregation, association, composition,dependency and others. In therelational model we can really onlyspecify a relationship using foreignkeys.

Having looked at the two domains ofinterest and compared some of theimportant features of each, we willdigress briefly to look at the notationproposed to represent relational datamodels in the UML.

The UML Data Model Profile

The Data Model Profile is a proposedUML extension (and currently underreview - Jan 2001) to support themodelling of relational databases inUML. It includes custom extensionsfor such things as tables, data baseschema, table keys, triggers andconstraints. While this is not a ratifiedextension, it still illustrates onepossible technique for modelling arelational database in the UML.

Tables

Customer

A table in the UML Data Profile is aclass with the «Table» stereotype,displayed as above with a table icon inthe top right corner.

Columns

Customer

PK OID: intName: VARCHAR2Address: VARCHAR2

Database columns are modelled asattributes of the «Table» class. Forexample, the figure above shows someattributes associated with the Customertable. In the example, an object id hasbeen defined as the primary key, as

Page 6: Database Modeling in UML

well as two other columns, Name andAddress. Note that the example abovedefines the column type in terms of thenative DBMS data types.

Behaviour

So far we have only defines the logical(static) structure of the table; inaddition we should describe thebehaviour associated with columns,including indexes, keys, triggers,procedures & etc. Behaviour isrepresented as stereotyped operations.

The figure below shows our tableabove with a primary key constraintand index, both defined as stereotypedoperations:

Customer

PK OID: intName: VARCHAR2Address: VARCHAR2

+ «PK» idx_customer00()+ «index» idx_customer01()

Note that the PK flag on the column'OID' defines the logical primary key,while the stereotyped operation "«PK»idx_customer00" defines theconstraints and behaviour associatedwith the primary key implementation(that is, the behaviour of the primarykey).

Adding to our example, we may nowdefine additional behaviour such astriggers, constraints and storedprocedures as in the example below:

Customer

PK OID: intName: VARCHAR2Address: VARCHAR2

+ «PK» idx_customer00()+ «FK» idx_customer02()+ «Index» idx_customer01()+ «Trigger» trg_customer00()+ «Unique» unq_customer00()+ «Proc» spUpdateCustomer()+ «Check» chk_customer00()

The example illustrates the followingpossible behaviour:

1. A primary key constraint (PK);2. A Foreign key constraint (FK);3. An index constraint (Index);4. A trigger (Trigger);5. A uniqueness constraint (Unique);6. A stored procedure (Proc) - not

formally part of the data profile,but an example of a possiblemodelling technique; and a

7. Validity check (Check).

Using the notation provided above, it ispossible to model complex datastructures and behaviour at the DBMSlevel. In addition to this, the UMLprovides the notation to expressrelationships between logical entities.

Relationships

The UML data modelling profiledefines a relationship as a dependencyof any kind between two tables. It isrepresented as a stereotypedassociation and includes a set ofprimary and foreign keys.

The data profile goes on to require thata relationship always involves a parentand child, the parent defining aprimary key and the childimplementing a foreign key based onall or part of the parent primary key.

The relationship is termed 'identifying'if the child foreign key includes all the

Page 7: Database Modeling in UML

elements of the parent primary key and'non-identifying' if only some elementsof the primary key are included.

The relationship may includecardinality constraints and be modelledwith the relevant PK - FK pair namedas association roles. Figure 4 illustratesthis kind of relationship modellingusing UML.

The Physical Model

UML also provides some mechanismsfor representing the overall physicalstructure of the database, its contentsand deployed location. To represent aphysical database in UML, use astereotyped component as in the figurebelow:

«Database»MainOraDB

A component represents a discrete anddeployable entity within the model. Inthe physical model, a component maybe mapped on to a physical piece of

hardware (a 'node' in UML).

To represent schema within thedatabase, use the «schema» stereotypeon a package. A table may be placed ina «schema» to establish its scope andlocation within a database.

«schema»User

ChildGrandchildGrandparentParentPerson

Mapping from the Class Model tothe Relational Model

Having described the two domains ofinterest and the notation to be used, wecan now turn our attention as to how tomap or translate from one domain tothe other. The strategy and sequencepresented below is meant to besuggestive rather than proscriptive -adapt the steps and procedures to yourpersonal requirements and

Parent Child

An identifying relationship between child and parent, with role namesbased on primary to foreign key relationship.

PK_PersonID

2 «identifying»

FK_PersonID

0..n

Figure 4 - UML relationship

Page 8: Database Modeling in UML

environment.

1. Model Classes

Firstly we will assume we areengineering a new relational databaseschema from a class model we havecreated. This is obviously the easiestdirection as the models remain underour control and we can optimise therelational data model to the classmodel. In the real world it may be thatyou need to layer a class model on topof a legacy data model - a moredifficult situation and one that presentsits own challenges. For the currentdiscussion will focus on the firstsituation. At a minimum, your classmodel should capture associations,inheritance and aggregation betweenelements.

2. Identify persistent objects

Having built our class model we needto separate it into those elements thatrequire persistence and those that donot. For example, if we have designedour application using the Model-View-Controller design pattern, then onlyclasses in the model section wouldrequire persistent state.

3. Assume each persistent class mapsto one relational table

A fairly big assumption, but one thatworks in most cases (leaving theinheritance issue aside for themoment). In the simplest model a classfrom the logical model maps to arelational table, either in whole or inpart. The logical extension of this isthat a single object (or instance of aclass) maps to a single table row.

4. Select an inheritance strategy

Inheritance is perhaps the mostproblematic relationship and logical

construct from the object-orientedmodel that requires translating into therelational model. The relational spaceis essentially flat, every entity beingcomplete in its self, while the objectmodel is often quite deep with a well-developed class hierarchy.

The deep class model may have manylayers of inherited attributes andbehaviour, resulting in a final, fullyfeatured object at run-time. There arethree basic ways to handle thetranslation of inheritance to a relationalmodel:

1. Each class hierarchy has a singlecorresponding table that containsall the inherited attributes for allelements - this table is therefore theunion of every class in thehierarchy. For example, Person,Parent, Child and Grandchild mayall form a single class hierarchy,and elements from each will appearin the same relational table;

2. Each class in the hierarchy has acorresponding table of only theattributes accessible by that class(including inherited attributes). Forexample, if Child is inherited fromPerson only, then the table willcontain elements of Person andChild only;

3. Each generation in the classhierarchy has a table containingonly that generation's actualattributes. For example, Child willmap to a single table with Childattributes only

There are cases to be made for eachapproach, but I would suggest thesimplest, easiest to maintain and lesserror prone is the third option. The firstoption provides the best performanceat run-time and the second is acompromise between the first and last.

Page 9: Database Modeling in UML

The first option flattens the hierarchyand locates all attributes in one table -convenient for updates and retrievalsof any class in the hierarchy, butdifficult to authenticate and maintain.Business rules associated with a roware hard to implement, as each rowmay be instantiated as any object in thehierarchy. The dependencies betweencolumns can become quitecomplicated. In addition, an update toany class in the hierarchy willpotentially impact every other class inthe hierarchy, as columns are added,deleted or modified from the table.

The second option is a compromisethat provides better encapsulation andeliminates empty columns. However, achange to a parent class may need to

be replicated in many child tables.Even worse, the parental data in two ormore child classes may be redundantlystored in many tables; if a parent'sattributes are modified, there isconsiderable effort in locatingdependent children and updating theaffected rows.

The third option more accuratelyreflects the object model, with eachclass in the hierarchy mapped to itsown independent table. Updates toparents or children are localised in thecorrect space. Maintenance is alsorelatively easier, as any modificationof an entity is restricted to a singlerelational table also. The down side isthe need to re-construct the hierarchyat run-time to accurately re-create a

tbl_ParentAddressOID: VARCHARName: VARCHAR

PK OID: VARCHARSex: VARCHAR

Parent- OID: GUID# Name: String# Sex: Gender

+ setName(String)+ getName() : String+ setSex(String)+ getSex() : String

Address- OID: GUID# City: String# Phone: String# State: String# Street: String

+ getCity() : String+ getStreet() : String+ setCity(String)+ setStreet(String)

tbl_AddressCity: VARCHAR

PK OID: VARCHARPhone: VARCHARState: VARCHARStreet: VARCHAR

The Address association from the logical model becomes a foreign key relationship in the data model

A Parent class with unique ID (OID) and Name and Sex attributes maps to a relational table.

The Address class in the logical model becomes a table in the data model

<<realises>>

m_Address 0..n

1

<<realises>>

Figure 5 - Class to Table mapping

Page 10: Database Modeling in UML

child class's state. A Child object mayrequire a Person member variable torepresent their model parentage. Asboth require loading, two databasecalls are required to initialise oneobject. As the hierarchy deepens, withmore generations, the number ofdatabase calls required to initialise orupdate a single object increases.

It is important to understand the issuesthat arise when you map inheritanceonto a relational model, so you candecide which solution is right for you.

5. For each class add a uniqueobject identifier

In both the relational and the objectworld, there is the need to uniquelyidentify an object or entity.

In the object model, non-persistentobjects at run-time are typicallyidentified by direct reference or by apointer to the object. Once an object iscreated, we can refer to it by its run-time identity. However, if we write outan object to storage, the problem ishow to retrieve the exact same instanceon demand.

The most convenient method is todefine an OID (object identifier) that isguaranteed to be unique in thenamespace of interest. This may be atthe class, package or system level,depending on actual requirements.

An example of a system level OIDmight be a GUID (globally uniqueidentifier) created with Microsoft's'guidgen' tool; eg. {A1A68E8E-CD92-420b-BDA7-118F847B71EB}. A classlevel OID might be implemented usinga simple numeric (eg. 32 bit counter).

If an object holds references to otherobjects, it may do so using their OID.

A complete run-time scenario can thenbe loaded from storage reasonablyefficiently.

An important point about the OIDvalues above is that they have noinherent meaning beyond simpleidentity. They are only logical pointersand nothing more. In the relationalmodel, the situation is often quitedifferent.

Identity in the relational model isnormally implemented with a primarykey. A primary key is a set of columnsin a table that together uniquelyidentify a row. For example, name andaddress may uniquely identify a'Customer'. Where other entities, suchas a 'Salesperson', reference the'Customer', they implement a foreignkey based on the 'Customer' primarykey.

The problem with this approach for ourpurposes is the impact of havingbusiness information (such as customername and address) embedded in theidentifier. Imagine three or four tablesall have foreign keys based on thecustomer primary key, and a systemchange requires the customer primarykey to change (for example to include'customer type'). The work required tomodify both the 'customer' table andthe entities related by foreign key isquite large.

On the other hand, if an OID wasimplemented as the primary key andformed the foreign key for other tables,the scope of the change is limited tothe primary table and the impact of thechange is therefore much less.

Also, in practice, a primary key basedon business data may be subject tochange. For example a customer maychange address or name. In this casethe changes must be propagated

Page 11: Database Modeling in UML

correctly to all other related entities,not to mention the difficulty ofchanging information that is part of theprimary key.

An OID always refers to the sameentity - no matter what otherinformation changes. In the aboveexample, a customer may change nameor address and the related tablesrequire no change.

When mapping object models intorelational tables, it is often moreconvenient to implement absoluteidentity using OID's rather thanbusiness related primary keys. TheOID as primary and foreign keyapproach will usually give better loadand update times for objects andminimise maintenance effort. Inpractice, a business related primarykey might be replaced with:

1. A uniqueness constraint or indexon the columns concerned;

2. Business rules embedded in theclass behaviour;

3. A combination of 1 and 2.

Again, the decision to use meaningfulkeys or OID's will depend on the exactrequirements of the system beingdeveloped.

6. Map attributes to columns

In general we will map the simple dataattributes of a class to columns in therelational table. For example a text andnumber field may represent a person'sname and age respectively. This sort ofdirect mapping should pose noproblem - simply select the appropriatedata type in the vendor's relationalmodel to host your class attribute.

For complex attributes (ie. attributesthat are other objects) use the approach

detailed below for handlingassociations and aggregation.

7. Map associations to foreign keys

More complex class attributes (ie.those which represent other classes),are usually modelled as associations.An association is a structuralrelationship between objects. Forexample, a Person may live at anAddress. While this could be modelledas a Person has City, Street and Zipattributes, in both the object and therelational world we are inclined tostructure this information as a separateentity, an Address.

In the object domain an addressrepresents a unique physical object,possibly with a unique OID. In therelational, an address may be a row inan Address table, with other entitieshaving a foreign key to the Addressprimary key.

In both models then, there is thetendency to move the addressinformation into a separate entity. Thishelps to avoid redundant data andimproves maintainability.

So for each association in the classmodel, consider creating a foreign keyfrom the child to the parent table.

Page 12: Database Modeling in UML

8. Map Aggregation andComposition

Aggregation and compositionrelationships are similar to theassociation relationship and map totables related by primary-foreign keypairs. There are however, some pointsto bear in mind.

Ordinary aggregation (the weak form)models relationships such as a Personresides at one or more Addresses. Inthis instance, more than one personcould live at the same address, and ifthe Person ceased to exist, theAddresses associated with them wouldstill exist. This example parallels themany-to-many relationship inrelational terminology, and is usuallyimplemented as a separate tablecontaining a mapping of primary keysfrom one table to the primary keys ofanother.

A second example of the weak form ofaggregation is where an entity has useor exclusive ownership of another. Forexample, a Person entity aggregates aset of shares. This implies a Personmay be associated with zero or more

shares from a Share table, but eachShare may be associated with zero orone Person. If the Person ceases toexist, the Shares become un-owned orare passed to another Person. In therelational world, this could beimplemented as each Share having an'owner' column which stored a PersonID (or OID) .

The strong form of aggregation,however, has important integrityconstraints associated with it.Composition, implies that an entity iscomposed of parts, and those partshave a dependent relationship to thewhole. For example, a Person mayhave identifying documents such as aPassport, Birth Certificate, Driver'sLicense & etc. A Person entity may becomposed of the set of such identifyingdocuments. If the Person is deletedfrom the system, then the identifyingdocuments must be deleted also, asthey are mapped to a uniqueindividual.

If we ignore the OID issue for themoment, a weak aggregation could beimplemented using either anintermediate table (for the many-to-

Customer

PK OID: intName: VARCHAR2Address: VARCHAR2Salesperson: int

+ «PK» PK_Customer()+ «FK» FK_SalesPerson()

Salesperson

PK OID: intName: VARCHAR2Department: VARCHAR2

+ «PK» PK_Salesperson()

Relationships are based on the PK- FK pair. This example relates a Salesperson to aCustomer by the appropriate primary and foreign keys. The assumption is that acustomer may only be associated with one salesperson.

PK_Salesperson

1

FK_Salesperson

0..*

Figure 6 - Table relationships in UML

Page 13: Database Modeling in UML

many case) or with a foreign key in theaggregated class/table (one-to-manycase). In the case of the many-to-manyrelationship, if the parent is deleted,the entries in the intermediate table forthat entity must also be deleted also. Inthe case of the one-to-manyrelationship, if the parent is deleted,the foreign key entry (ie. 'owner') mustbe cleared.

In the case of composition, the use of aforeign key is mandatory, with theadded constraint that on deletion of theparent the part must be deleted also.Logically there is also the implicationwith composition that the primary keyof the part forms part of the primarykey of the whole - for example, aPerson's primary key may composed oftheir identifying documents ID's. Inpractice this would be cumbersome,but the logical relationship holds true.

9. Define relationship roles

For each association type relationship,each end of the relationship may befurther specified with role information.Typically, you will include the PrimaryKey constraint name and the ForeignKey Constraint name. Figure 6illustrates this concept. This logicallydefines the relationship between thetwo classes.

In addition, you may specify additionalconstraints (eg. {Not NULL}) on therole and cardinality constraints (eg.0..n).

10. Model behaviour

We now come to another difficultissue: whether to map some or all classbehaviour to the functional capabilitiesprovided by database vendors in theform of triggers, stored procedures,

uniqueness and data constraints, andrelational integrity.

A non-persistent object model wouldtypically implement all the behaviourrequired in one or more programminglanguages (eg. Java or C++). Eachclass will be given its requiredbehaviour and responsibilities in theform of public, protected and privatemethods.

Relational databases from differentvendors typically include some form ofprogrammable SQL based scriptinglanguage to implement datamanipulation. The two commonexamples are triggers and storedprocedures.

When we mix the object and relationalmodels, the decision is usually whetherto implement all the business logic inthe class model, or to move some tothe often more efficient triggers andstored procedures implemented in therelational DBMS.

From a purely object-oriented point ofview, the answer is obviously to avoidtriggers and stored procedures andplace all behaviour in the classes. Thislocalises behaviour, provides for acleaner design, simplifies maintenanceand provides good portability betweenDBMS vendors.

In the real world, the bottom line maybe scaling to 100's or 1000's oftransactions per second, somethingstored procedures and triggers arepurpose designed for.

If purity of design, portability,maintenance and flexibility are themain drivers, localise all behaviour inthe object methods.

If performance is an over-ridingconcern, consider delegating some

Page 14: Database Modeling in UML

behaviour to the more efficient DBMSscripting languages. Be aware thoughthat the extra time taken to integratethe object model with the storedprocedures in a safe way, includingissues with remote effects anddebugging, may cost more indevelopment time than simplydeploying to more capable hardware.

As mentioned earlier, the UML DataProfile provides the followingextensions (stereotyped operations)with which you can model DBMSbehaviour:

! Primary key constraint (PK);! Foreign key constraint (FK);! Index constraint (Index);! Trigger (Trigger);! Uniqueness constraint (Unique);! Validity check (Check).

11. Produce a physical model

In UML, the physical model describeshow something will be deployed intothe real world - the hardware platform,network connectivity, software,operating system, dll's and othercomponents. You produce a physicalmodel to complete the cycle - from aninitial use case or domain model,through the class model and datamodels and finally the deploymentmodel.

Typically for this model you willcreate one or more nodes that will hostthe database(s) and place DBMSsoftware components on them. If thedatabase is split over more than oneDBMS instance, you can assignpackages («schema») of tables to asingle DBMS component to indicatewhere the data will reside.

Conclusion

This concludes this short article ondatabase modelling using the UML. As

MainServer

«schema»System

sys_aliasessys_procssys_queriessys_tablessys_users

«schema»User

ChildGrandchildGrandparentParentPerson

«Database»MainOraDB

A Node is a physical piece of hardw are (such as a Unix server) on w hich components are deployed. The database component in this example is also mapped to tw o logical «schema», each of w hich contains a number of tables.

Figure 7 - The Physical Model

Page 15: Database Modeling in UML

you can see, there are quite a fewissues to consider when mapping fromthe object world to the relational. TheUML provides support for bridging thegap between both domains, andtogether with extensions such as theUML Data Profile is a good languagefor successfully integrating bothworlds.

References

Muller, Robert J., Database Design forSmarties, Morgan Kaufman, 1999.

Rational Software, The UML and DataModelling, Rational Software

Ambler, Scott W., Mapping Objects toRelational Databases, AmbySoft inc,1999

About the Author

Geoffrey Sparks is the director ofSparx Systems, an Australian companythat specialises in UML tools.

The Sparx Systems web site is at:www.sparxsystems.com.au

Geoffrey may be contacted [email protected]

Page 16: Database Modeling in UML

A quick summary guide to data modelling in UML

1. Create a class model for your development domain

2. Identify persistent classes from the model

3. Assume each persistent class in the model will map to one relational table

4. Select a suitable inheritance strategy for each class hierarchy

5. For each class add a unique ID (OID) or select a suitable primary key

6. For each class map simple data types to table columns

7. For each class, map complex attributes (association, aggregation) to PK/ FK pairs. Takespecial note of the strong and weak forms of aggregation.

8. For related classes, map PK, FK pairs naming the role ends according to selected key.

9. Label relationship roles with their appropriate cardinality and stereotype: <<identifying>>or <<non-identifying>>

10. Add stereotyped operations for table behaviour (keys, indexes, uniqueness, checks, andtriggers)

11. Divide persistent classes into logical schema

12. Create a deployment model and link database components to physical nodes