lesson 3: an introduction to data modeling
TRANSCRIPT
-
8/14/2019 Lesson 3: An Introduction to Data Modeling
1/23
Michael Brydon ([email protected])Last update: 02-May-01 1 o f 23
Lesson 3: An introduction to data modeling
3.1 Introduction: The importance ofconceptual models
Before you sit down in front of the keyboardand start creating a database application, it iscritical that you take a step back and consideryour business problemin this case, the kitchensupply scenario presented in Lesson 2 from aconceptual point of view. To facilitate this
process, a number of conceptual modeling techniques have been developed by computerscientists, psychologists, and consultants.
For our purposes, we can think of aconceptual model as a picture of theinformation system we are going to build.
To use an analogy, conceptual models areto information systems what blueprintsare to buildings.
There are many different conceptual modelingtechniques used in practice. Each techniqueuses a different set of symbols and may focus on
a different part of the problem (e.g., data,processes, information flows, objects, and soon). Despite differences in notation and focus,however, the underlying rationale forconceptual modeling techniques is always the
same: understand the problem before you startconstructing a solution.
There are two important things to keep in mindwhen learning about and doing data modeling:1. Data modeling is first and foremost a tool
for communication .Their is no single rightmodel. Instead, a valuable model highlightstricky issues, allows users, designers, andimplementors to discuss the issues using thesame vocabulary, and leads to better designdecisions.
2. The modeling process is inherentlyiterative: you create a model, check itsassumptions with users, make the necessarychanges, and repeat the cycle until you are
sure you understand the critical issues.In this background lesson, you are going to use adata modeling techniquespecifically, Entity-Relationship Diagrams (ERDs)to model thebusiness scenario from Lesson 2. The datamodel you create in this lesson will form the
foundation of the database that you usethroughout the remaining lessons.
?
http://scenario.pdf/http://scenario.pdf/http://scenario.pdf/http://scenario.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
2/23
Introduction: The importance of conceptualAn introduction to data modeling
3.1.1 What is data modeling?
A data model is a simply a diagram thatdescribes the most important things in your
business environment from a data-centric pointof view. To illustrate, consider the simple ERDshown in Figure 3.1 . The purpose of the diagramis to describe the relationship between the datastored about products and the data storedabout the organizations that supply theproducts.
3.1.1.1 Entities and attributes
The rectangles in Figure 3.1 are called entitytypes (typically shortened to entities) and
the ovals are called attributes . The entities arethe things in the business environment aboutwhich we want to store data. The attributesprovide us with a means of organizing andstructuring the data. For example, we need tostore certain information about the productsthat we sell, such as the typical selling price ofthe product (Unit price) and the quantity ofthe product currently in inventory (Qty onhand). These pieces of data are attributes ofthe Product entity.
It is important to note that the precise mannerin which data are used and processed within a
particular business application is a separateissue from data modeling. For example, thedata model says nothing about how the value ofQty on hand is changed over time. The focusin data modeling is on capturing data about theenvironment. You will learn how to change thisdata (e.g., process orders so that the inventory
values are updated) once you have masteredthe art of database design.
A data modeler assumes that if the rightdata is available, the other elements ofthe application will fall into placeeffortlessly and wonderfully. For now, thisis a good working assumption.
FIGURE 3.1: An ERD showing a relationshipbetween products and suppliers.
Product
suppliedby
Supplier
Unit price
ProductID
Qty on hand
Name
Address
Entity
Attributes
Relationship
Cardinality
?
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
3/23
Introduction: The importance of conceptualAn introduction to data modeling
3.1.1.2 Notation for relationships
In addition to entities and attributes, Figure 3.1 shows a relationship between the two entities
using a line and a diamond. The relationshipconstruct is usednot surprisinglyto indicatethe existence or absence of a relationshipbetween entities. A crows foot at either end ofa relationship line is used to denote thecardinality of the relationship.
For example, the crows foot on the productside of the relationship in Figure 3.1 indicatesthat a particular supplier may provide yourcompany with several different products, suchas bowls, spatulas, wire whisks and so on. Theabsence of a crows foot on the supplier sideindicates that each product in your inventory is
provided by a single supplier. Thus, therelationship in Figure 3.1 indicates that youalways buy all your wire whisks from the samecompany.
3.1.1.3 Modeling assumptions
The relationship shown in Figure 3.1 is calledone-to-many : each supplier supplies manyproducts (where many means any numberincluding zero) but each product is supplied byone supplier (where one means at mostone).
The decision to use a one-to-many relationshipreflects an assumption about the business
environment in which your wholesale companyoperates. However, it is easy to imagine adifferent environment in which each product issupplied by multiple suppliers. For example,many suppliers may carry a particular brand ofwire whisk. When you run out of whisks, it is upto you to decide where to place your order. Inother words, it is possible that a many-to-many relationship exists between suppliers andproducts.
If multiple supplier exist, attributes of theproduct, such as its price and product numbermay vary from supplier to supplier. In thissituation, the data requirements of a many-to-many environment are slightly more complexthan those of the one-to-many environment. Ifyou design and implement your database aroundthe one-to-many assumption but then discoverthat certain goods are supplied by multiplesuppliers, much effort is going to be required tofix the problem.
Herein lies the point of drawing an ERD:The diagram makes your assumptionsabout the relationships within a particularbusiness environment explicit before youstart building things.
3.1.1.4 The role of the modeler
In the environment used in these tutorials, youare the user, the designer, and the implementor
?
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
4/23
Introduction: The importance of conceptualAn introduction to data modeling
of the system. In a more realistic environment,however, these roles are played by differentindividuals (or groups) with differentbackgrounds and priorities. For example, acommon stereotype of implementors(programmers, database specialists, and so on)is that they seldom leave their cubicles tocommunicate with end-users of the softwarethey are writing. Similarly, it is generally safe toassume that users have no interest in, or
understanding of, low-level technical details(such as the cardinality of relationships onERDs, mechanisms to enforce referentialintegrity, and so on). Thus, it is up to thebusiness analyst to bridge the communicationgap between the different groups involved inthe construction, use, and administration of an
information system.As a business analyst (or more generally, adesigner), it is critical that you walk throughyour conceptual models with users and makesure that your modeling assumptions areappropriate. In some cases, you may have toexamine sample data from the existingcomputer-based or manual system to determinewhether (for instance) there are any productsthat are supplied by multiple suppliers.
At the modeling stage, making changes such asconverting a one-to-many relationship to amany-to-many relationship is trivialall that isrequired is the addition of a crows foot to one
end of the relationship, as shown in Figure 3.2 .In contrast, making the same change once youhave implemented tables, built a userinterface, and written code is a time-consumingand frustrating chore.
Generally, you can count on the 10
ruleof thumb when building software: thecost of making a change increases by anorder of magnitude for each stage of thesystems development lifecycle that youcomplete.
FIGURE 3.2: An ERD for an environment inwhich there is a many-to-many relationship
between products and suppliers.
Product
suppliedby
Supplier
Unit price
ProductID
Qty on hand
Name
Address
The addition of asecond crowsfoot transformsthe one-to-manyrelationship into amany-to-manyrelationship.
!
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
5/23
-
8/14/2019 Lesson 3: An Introduction to Data Modeling
6/23
Introduction: The importance of conceptualAn introduction to data modeling
as the labels make the diagram easier tointerpret.
To illustrate, consider the relationship betweenproducts and suppliers shown in Figure 3.1 . Therelationship is described by the verb phrasesupplied by. Although one could have optedfor the shorter relationship name has instead,the resulting diagram (e.g., Supplier hasproduct) would be more difficult for readers ofthe diagram to interpret.
3.1.2.3 Relationship direction
One issue that sometimes troubles neophytedata modelers is that the direction of the
relationship is not made explicit on thediagram. Returning to Figure 3.1 , it is obviousto me (since I drew the diagram) that therelationship should be read: Product issupplied by supplier. Reading the relationshipin the other direction (Supplier is supplied byproduct) makes very little sense to anyone whois familiar with the particular problem domain.
Generally, ERDs make certain assumptionsabout the readers knowledge of the underlyingbusiness domain.
A notational convention supported bysome CASE tools is to require two namesfor each relationship: one that makessense in one direction (e.g., is suppliedby), and another that makes sense in theopposite direction (e.g., supplies).Although double-naming may make thediagram easier to read, it also addsclutter (twice as many labels) andimposes an additional burden on themodeler.
3.1.2.4 Cardinality
As discussed in Section 3.1.1.2 , the cardinalityof a relationship constrains the number ofinstances of one entity type that can beassociated with a single instance of the otherentity type.
The cardinality of relationships has animportant impact on number andstructure of the tables in the database.Consequently, it is important to get thecardinality right on paper before startingthe implementation.
buys
FIGURE 3.4: A relationship named buys.
?
!
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
7/23
Introduction: The importance of conceptualAn introduction to data modeling
There are three fundamental types ofcardinality in ERDs:
One-to-many You have already seen an
example of a one-to-many relationship inFigure 3.1 . You will soon discover that one-to-many relationships are the bread andbutter of relational databases.
One-to-one At this point in your datamodeling career, you should avoid one-to-one relationships. To illustrate the basicissue, consider the ERD shown in Figure 3.5 .Based on an existing paper-based system,the modeler has assumed that eachcustomer is associated with one customerrecord (i.e., a paper form containinginformation about the customer, such as
address, fax number, and so on). Clearly,each customer has only one customerrecord and each customer record belongs toa single customer. However, if we automatethe system and get rid of the paper form,then there is no reason not to combine theCustomer and Customer Record entities into
a single entity called Customer.
In many cases, one-to-one relationshipsindicate a modeling error. When you havea one-to-one relationship such as the oneshown in Figure 3.5 , you should combinethe two entities into a single entity.
Many-to-many The world is full of many-to-many relationships. A well-used exampleis Student takes course. Many-to-manyrelationships also arise when you considerthe history of an entity. To illustrate,consider the ERD shown in Figure 3.6 . Atfirst glance, the relationship betweenFamily and Single-Family Dwelling (SFD)might seem to be one-to-one since aparticular family can only live in one SFD ata time and each SFD can (by definition) onlycontain a single family. However, it ispossible for a family to live in different
houses over time. Similarly, it is possiblethat many families inhabit a particularhouse over the years. Thus, if the conceptof time is considered, the relationshipbecomes many-to-many.
We will discuss how you go about determiningcardinality in subsequent sections. At this point,it is sufficient to recognize that there are two
!
FIGURE 3.5: An incorrect one-to-onerelationship
Customer associatedwith
CustomerRecord
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
8/23
Introduction: The importance of conceptualAn introduction to data modeling
popular (and equivalent) approaches todenoting one and many on an ERD: thecrows foot notation you have already seen andthe 1:N notation.1. Crows foot notation In the crows foot
notation, three little lines (resembling acrows foot) are used to indicate many.Not surprisingly, the absence of a crowsfoot indicates one. Thus, the relationshipin Figure 3.7 indicates that each product issupplied by at most one supplier, whereaseach supplier may supply many products..
2. 1:N notation In the 1:N notation, thesymbol N (and/or M) is used to indicatemany whereas 1 is used to indicateone. An example of the 1:N notation isshown in Figure 3.8 .
The ERD model also supports additionalcardinality information in the form ofcardinality constraints. To keep things
simple, however, the discussion ofcardinality constraints is deferred untilSection 6.3.2 .
3.1.2.5 Attributes
Attributes are properties or characteristics of a
particularly entity about which we wish tocollect and store data. In addition, there istypically one attribute that uniquely identifiesparticular instances of the entity. For example,each of your customers may have a uniquecustomer ID. Such attributes are known as keyattributes .
FIGURE 3.6: What is the cardinality of thisrelationship?
Family livesin
Single-Family
Dwelling
Product suppliedby
Supplier
FIGURE 3.7: A one-to-many relationship incrows foot notation
Productsupplied
by Supplier
N 1
FIGURE 3.8: A one-to-many relationship in1:N notation
?
http://tables2.pdf/http://tables2.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
9/23
Learning objectives9 o f 23
An introduction to data modeling
For ERDs that are drawn manually, attributesare traditionally shown as ovals containing thename of the attribute. If the attribute is a key,it is typically underlined. A number ofattributes for the Customer entity are shown inFigure 3.9 .
Adding attributes to ERDs can result invery cluttered diagrams. Some CASE toolslist the attributes inside the entityrectangle (see the discussion of CASEtools in Section 3.4.5 ). Another way to
reduce clutter is to only show a handful ofcritical attributes on the diagram.
3.2 Learning objectives! understand the core constructs of the
entity-relationship model
! create an ERD based on yourunderstanding of a business scenario
! use associative entities to add
attributes to relationships! gain some familiarity with the role ofdata modeling and CASE tools in thedevelopment process
3.3 Exercises
In the sections that follow, we step through theconstruction of an ERD for the kitchen supplyscenario. By following along, you should gain abetter understanding of the basic techniquesinvolved in data modeling as well as some of thedesign pitfalls that should be avoided.
If this lesson was on how to play golf, youwould not read it and then assume thatyou are a good golfer. Golf is a skill thatrequires both theoretical knowledge andhours of practice. Thus, the only way tobecome a good golfer is to acquire a solidunderstanding of the fundamentals andthen go out and hit thousands of balls.The same principles apply to datamodeling.
3.3.1 Starting simple
Let us begin with the simplest and most
essential statement one can make about the
Customer
CustIDName Phone No.
Contact person
FIGURE 3.9: A number of attributes of theCustomer entity are shown in ovals.
?!
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
10/23
Exercises10 o f 23
An introduction to data modeling
wholesaling environment: customers buyproducts . It is natural that you would want toboth automate and informate (recallSection 2.4 ) this important business process.
3.3.1.1 Step one: identify the entities
Entities are physical things, organizations, rolesand events about which we want to storeinformation. In the wholesaling scenario, twoentities are immediately obvious: Customer and
Product. However, before we add the entities toour diagram, it is important that we have a firmunderstanding of what exactly these entitiescorrespond to in the real world:1. Customers In the wholesaling
environment, a customer is an organization,not a person. There may be a single personat the organization through whom weconduct our business. We will refer to thisperson as the contact person for thecustomer in order to maintain a cleandistinction between people andorganizations.
2. Products It is not immediately clearwhether the Product entity refers to aspecific item or a class of similar items. Forexample, one of the products you sell is theFat Cat mug. The Product ID of the mug is88 4017 and it normally sells for $5.50.Note, however, that there are many
individual Fat Cat mugs and each one is
slightly different due to irregularities,variations in painting, and so on. In ourcase, there are advantages to ignoring theindividuality of each mug and treating themall as a single group of interchangeableitems. Thus, when we talk about aproduct or a SKU, we are talking aboutan entire class of similar instances, notindividual instances themselves. 1
Having made these assumptions explicitly, we
can now create our first ERD.
Take out a piece of paper and a pencil.
Unless you have a special-purpose CASEtool, it is seldom worth the effort to drawthe early drafts of your conceptualmodels on a computer.
ERDs typically require many modificationsso you should not invest much timemaking your diagrams look nice. In fact,the diagram you are about to begin will
1 In some environments, it may be necessary to treatproducts as individual items. For example, in theaerospace industry, there is a requirement to trackindividual parts by serial number in case a part fails.The requirement for unit effectivity necessitates adifferent set of assumptions about the Product entityand thus leads to a different database design.
?
!
http://scenario.pdf/http://scenario.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
11/23
E iA i t d ti t d t d li g
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
12/23
Exercises12 o f 23
An introduction to data modeling
3.3.1.4 Step four: identify a few important attributes
There is a need to find a balance between thedescriptiveness of the ERD and the ease withwhich others can decipher it. As such, I preferto show only a handful of attributes on theERDs.
Add a small number of important attributesto your diagram, as shown in Figure 3.13 .
By adding attributes such as Qty on hand tothe Product entity, it is clear that the entity
refers to a class of products, not an individual
product. In contrast, if Serial number wereadded to the diagram as an attribute instead ofQty on hand, the reader of the ERD wouldcome to a different conclusion about themeaning of the Product entity.
3.3.2 Dealing with many-to-manyrelationships
Although the diagram in Figure 3.13 istechnically correct, it is missing a great deal ofinformation about how a purchase transactionoccurs in reality.
To illustrate, consider the attribute Unit pricebelonging to the Product entity. Unit pricecontains the default selling price of theproduct. To understand why it is the defaultprice, consider the case of a Fat Cat mug that
typically sells for $5.50. What if there is a
FIGURE 3.12: Indicate the many-to-manycardinality of the relationship.
Customer Productbuys
Add a crows foot to the customer side to indicate that each productis purchased by many customers.
1111
Add a second crows foot to theproduct side to indicate that eachcustomer purchases many products.
2222
FIGURE 3.13: An initial ERD for the kitchensupply environment.
Customer Productbuys
ProductID
Unit price Qty on hand
CustID
Name Contact person
ExercisesAn introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
13/23
Exercises13 o f 23
An introduction to data modeling
particular mug that has a minor flaw? Althoughthe mug can still be sold, the customer mayexpect a discounted price to compensate forthe flaw. The question is therefore: Where onthe diagram do we indicate the actual sellingprice of a particular mug?
A second example is the purchase quantity.What if the customer purchases a dozen FatCat mugs? Where is this information recorded?The Qty ordered attribute does not belong to
the Customer entity because customers ordermany products besides mugs. Similarly, Qtyordered does not belong to the Product entitybecause different customers may orderdifferent quantities.
This is a problem that typically arises in many-
to-many relationships: There are certainimportant attributes that do not seem to belongto either of the entities participating in therelationship. The solution is to assign theattributes to the relationship itself.
3.3.2.1 Attributes of relationships
The issues surrounding price and order quantityarise because the attributes belong to theinteraction of the entities in the many-to-manyrelationship. Thus, the price of a flawed mug isan attribute of a particular product being soldto a particular customer on a particular day.
To summarize, there are a number of attributesthat should be attached to the buysrelationship in Figure 3.13 :
Date the date on which the purchase ismade;
Actual price the price at which the item(or multiple items within the same class ofproducts) are actually sold to the customer;
Quantity ordered the number of items
with a certain product ID requested by thecustomer; and,
Quantity shipped the actual number ofitems shipped to the customer.
3.3.2.2 Associative entities
Given the number and importance of theattributes attached to the buys relationship,it makes sense to treat the relationship as anentity in its own right. To transform arelationship into an entity on an ERD, we use aspecial symbol called an associative entity . Thenotation for an associative entity is arelationship diamond nested inside of an entityrectangle, as shown in Figure 3.14 .
To transform your many-to-many relationship(without attributes) into an associative entity(with attributes), do the following:
ExercisesAn introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
14/23
Exercises14 o f 23
An introduction to data modeling
Draw a rectangle around the buysrelationship.
Replace the relationship name buys withan appropriate noun, for example Sale.
Remember, entitiesincluding associativeentitiesare named with nouns.
Decompose the many-to-many relationshipinto two one-to-many relationships.
Add the attributes to the associative entity,as shown in Figure 3.14 .
Although Sale is now treated as an entity,there is no requirement to add
relationship diamonds between Product
and Sale or Customer and Sale. Anassociative entity serves as an entity and a relationship at the same time.
The meaning of the Sale associative entity inFigure 3.14 is the following: Each customer canbe involved in many sales transactions, but eachindividual sales transaction involves only onecustomer. Similarly, each product can beinvolved in many sales transactions, but eachsales transaction involves only one type ofproduct.
3.3.2.3 Illustration
To better understand how an associative entityworks, it is worthwhile to jump ahead a bit andconsider what the data might look like. InFigure 3.15 , sample data for the Customer,Product, and Sale entities are shown. There area couple of interesting things to notice aboutthe sample data:1. Each entity contains information relevant to
that entity only. For example, Customeronly contains information about customers;Product only contains information aboutproducts, and so on.
2. Each row in the Sale entity shows thedetails of a single sales transaction. Eachsales transaction consists of a particularproduct being sold to a particular customer.
The transaction-specific information (such
FIGURE 3.14: Transform a many-to-manyrelationship into an associative entity.
Customer ProductSale
DateQuantity ordered
Quantity shipped Actual price
?
!
ExercisesAn introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
15/23
Exercises15 o f 23
An introduction to data modeling
as the actual selling price and quantity
ordered) are attributes of the Sale entity.3. By using the data for the Sale entity, it is
possible to determine which products havebeen purchased by a particular customer.Similarly, it is possible to determine whichcustomers have purchased a particularproduct.
4. Only the minimum amount of information
required to identify the customer andproduct is included in the Sale associativeentity. For example, neither the name ofthe customer or the description of theproduct appear in Sale since thisinformation can easily be found elsewhereusing the values of Customer ID and
Product ID respectively. In the context of
FIGURE 3.15: Data showing the role of an associative entity
Each customer can
participate in manysales transactions.
Each product canparticipate in manysales transactions.
Customer Product
Sale
ExercisesAn introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
16/23
Exercises16 o f 23
An introduction to data modeling
the Sale entity, the Customer ID andProduct ID attributes are called foreignkeys (foreign keys are so important thatLesson 6 is devoted to the topic).
3.3.3 Revising the ERD
There is an important problem with the ERD asits now stands. The constraint that each salestransaction involves only a single productappears to be at odds with the reality of the
business situation described in Lesson 2. Forexample, the sales transactions with whichwe are most familiarthe customer orders youreceive by fax typically request manydifferent products: a dozen Fat Cat mugs,two dozen spatulas, some wire whisks, and soon. The mismatch between the diagram and thebusiness environment means that the ERD mustbe revised.
3.3.3.1 Identifying the problem
The problem with the ERD in Figure 3.14 is thatignores the technology (broadly speaking) used
by customers to place orders. Specifically,customers normally wait until they need enoughstock to make an order worthwhile. In addition,factors such as minimum order values andshipping costs favor the batching of small,single-product orders into large, multi-productorders.
By taking the technology used for ordering intoaccount, it becomes clear that we have failedto model an important event entity : the arrivalof an order.
Be carefulnot all pieces of paper in theexisting business process areautomatically event entities. Forexample, the invoices that we send to ourcustomers are more properly thought ofas reports (which can be generated fromthe information already contained inother entities). With practice, thedistinction between entities and non-entities will become clear.
3.3.3.2 Adding the new entity
In this section, you are going to modify your ERDto include an Order entity.
Create a new ERD consisting of entities forcustomers, orders and products.
Here is where a CASE tool pays off: youcan delete entities or move entitiesaround the screen and the relationshipsand connector lines follow automatically.
Create relationships to reflect the fact thatcustomers place orders and orders consistof products.
!
?
ExercisesAn introduction to data modeling
http://tables2.pdf/http://scenario.pdf/http://scenario.pdf/http://tables2.pdf/http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
17/23
17 o f 23g
Add cardinality symbols to reflect the factthat each customer can place many orders,but each order belongs to a single customer.
Add cardinality symbols to reflect the factthat each order can contain many productsand each product may be contained in manyorders.
Add a handful of attributes to the diagramto help clarify the meaning of the entities.
The resulting ERD is shown in Figure 3.16 .
3.3.3.3 Creating OrderDetails
Although Figure 3.16 is a great improvementover our previous ERD, much of the sameinformation missing from Figure 3.13 such asactual price and quantity orderedis missingfrom the new ERD. As a consequence, we musttransform the contains relationship into anassociative entity with its own attributes.
Transform the contains relationship into
an associative entity using the proceduredescribed in Section 3.3.2.2 .
To remain consistent with M ICROSOFTssample databases, I recommend using thename Order Detail for the associativeentity. Alternatives include Line Item
and Order Item.
The resulting ERD is shown in Figure 3.17 . Tohelp understand the relationship between theOrder entity and the Order Detail associativeentity, look at the orders included in the projectpackage . Each order has header information(such as customer, order date, and so on) andmultiple order details. Each detail hasinformation such as quantity ordered, quantityshipped, and price.
?
FIGURE 3.16: Revise the ERD to include anOrder entity.
Customer
Product
places Order
contains
ProductID
Unit price
Qty on hand
CustID Name
Contact person
OrderID
Date
DiscussionAn introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
18/23
18 o f 23g
3.4 Discussion
3.4.1 Logical versus physical models
A distinction is typically made between logical and physical data models:
Logical data models logical models
capture general information about entities
and relationships and are used forcommunication with business users.
Physical data models physical models
serve as a precise specification for theimplemented system. As a consequence,the models must take into account thetechnology used to store the data. Forexample, a given logical data modeltranslates into very different physical datamodels depending on whether the target
technology is a file-based system, arelational database, or an object-orienteddatabase.
Normally, you start with a high-level logicalmodel and refine with the help of users overseveral iterations. Once you are happy with thelogical model, you transform it into a physicalmodel and hand it to a database administrator (DBA) for implementation as a database.
For relational databases, the translationprocess from logical to physical is relativelystraightforward and involves the followingsteps:1. Decompose all many-to-many
relationships Since the relationaldatabase model does not support many-to-many relationships, you must replace allmany-to-many relationships withassociative entities as described in
Section 3.3.2.2 .
FIGURE 3.17: Create an associative entity tomodel individual order details.
Customer
Product
places Order
ProductID
Unit price
Qty on hand
CustID Name
Contact person
OrderID
Date
Orderdetail
Actual price
Quantity ordered
Quantity shipped
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
19/23
DiscussionAn introduction to data modeling
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
20/23
20 o f 23
between the release of A CCESSversion 2.0 andACCESS2000 to make the product moreaccessible to data modeling neophytes. Forexample, there is a table analyzer, a tabledesign wizard, and all sorts of other aidsintended to automate the database designprocess. In my view, there are three problemswith MICROSOFTs dumbing-down strategy:1. No wizard or add-in tool is going to change
the fact that database management systems
(DBMSs) are specialized software packagesthat presuppose an enormous amount ofprior knowledge. Even the error messagesgenerated by A CCESS(as you will soondiscover) can only be understood if youhave a firm grasp on the theoreticalfundamentals of the relational database
model. For example, what do you do whenyou accidentally violate a referentialintegrity constraint? What is a primarykey? What is an ambiguous outer join?
2. Trends in application developmentincreasingly emphasize data and de-emphasize programming. Thus, a solidunderstanding of your data is critical. Putanother way, just about anyone can build areasonably sophisticated system if theunderlying database is well designed(indeed, by doing these tutorials, you willsee just how far you can go without writinga single line of programming code). In
contrast, if the database design is poor, youwill have to be a programming wizard justto create the illusion that the applicationworks.
3. CASE tools from vendors such as O RACLE,COMPUTERASSOCIATES, VISIO(now part ofMICROSOFT), and many others translate ERDsdirectly into database tables. Thus, if youknow how to draw diagrams similar to theone shown in Figure 3.17 , you know how to
design databases.In short, the last thing you want to do is rely onwizards to shelter you from the database designprocess. Instead, you want to get in at thenitty-gritty level and understand the trade-offsbetween various designs. Once the design iscomplete you can use wizards and shortcuts foreverything else.
3.4.4 How do I learn about data modeling?
One important problem that I perceive as auniversity professor is that despite theimportance of data modeling, it is very difficultto find good practical training as a datamodeler.
In the standard computer science databasecourse, we tend to focus on theoretical issuessuch as relational algebra, set theory, indexing,normalization, and so on. Although such
knowledge is certainly important for DBAs and
Application to the projectAn introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
21/23
21 o f 23
other technical professionals, it provides littleguidance when we are faced with real-worldmodeling problems.
Conversely, the introductory informationsystems (IS) courses that we offer in businessschools provide only cursory treatment of datamodeling and database design. I suppose therationale is that it should be possible to hire acomputer science graduate to do the datamodeling!
3.4.5 CASE tools and the design process
Computer-aided software engineering (CASE)tools are software packages that simplify theprocess of creating conceptual models. Inaddition, some CASE packages are more than
just drawing tools: they translate the datamodels into database tables, programming codetemplates, and so on.
To illustrate, consider the diagram inFigure 3.19 which was drawn using thedatabase modeling tool in V ISIOENTERPRISE.The database modeling tool allows a designer tostart with a a physical ERD-like diagram and addimplementation-level metadata (data aboutdata). For example, in Figure 3.20 , thephysical-level properties of the ActualPrice attribute are specified in a dialog box.
Once the physical-level metadata has been
added to the model, many CASE tools can
generate table schemas for the targetdatabase. For example, O RACLEDESIGNERcangenerate structured query language (SQL) datadefinition commands for O RACLEdatabases. V ISIO ENTERPRISEcan generate SQL or create the tablesin various database packages (including A CCESS)directly.
Given CASE tools with this type of functionality,it is clear that the most important skill fordatabase designers is not a complete knowledge
of SQL syntax. Instead, it is the ability toanalyze a real-world problem and create a datamodel that accurately and elegantly capturesthe critical elements of the problem.
3.5 Application to the project
Complete your own ERD for the order entryscenario. Remember that at this point, thescope of the project is very limited. It willbe expanded somewhat in subsequentlessons.
HINT: If you get stuck, you can refer to the V ISIO diagram in Figure 3.19 . Keep in mind,however, that Figure 3.19 is a physical ERD; you should be working with logical ERDs at this stage of the developmentprocess.
Application to the project22 f 23
An introduction to data modeling
http://map.pdf/http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
22/23
22 o f 23
CustNameBillingAddressCityProvStatePostalCodeRegionCodeContactPersonContactPhoneOnLineOrdering
CustID
Customer
CustIDOrderDateProcessed
OrderID
Order
places
QtyOrderedQtyShippedActualPrice
OrderIDProductID
OrderDetail
consists of
DescriptionUnitUnitPriceQtyOnHand
ProductID
Product
is a
FIGURE 3.19: An physical database model for the kitchen supply environment created using a CASEtool.
VISIO ENTERPRISE canbe used to drawvarious styles of logical and physicaldata models. This isan example of aphysical model.
Attributes areshown within thebody of the entityand primary keysare shown in bold.
Crows footnotation is usedindicate thecardinality of therelationships.
Since this is a physical database diagram, thereare no many-to-many relationships or associativeentities. For example, the Order Detail associativeentity is shown as a regular entity.
http://map.pdf/ -
8/14/2019 Lesson 3: An Introduction to Data Modeling
23/23