ICTDBS403
Create basic databases
Learner Guide
© Copyright, 2016 by North Coast TAFEnow
Date last saved: 20 January 2016 by Amanda Walker Version: 1.0 # of Pages = 53
John Chapman and Amanda Walker – Content writer and course adviser
TAFEnow Resource Development Team – Instructional and graphic design
Copyright of this material is reserved to the Crown in the right of the State of New South Wales.
Reproduction or transmittal in whole, or in part, other than in accordance with the provisions of the Copyright Act, is
prohibited without written authority of North Coast TAFEnow.
Disclaimer: In compiling the information contained within, and accessed through, this document ("Information")
DET has used its best endeavours to ensure that the Information is correct and current at the time of publication but
takes no responsibility for any error, omission or defect therein. To the extent permitted by law, DET and its
employees, agents and consultants exclude all liability for any loss or damage (including indirect, special or
consequential loss or damage) arising from the use of, or reliance on, the Information whether or not caused by any
negligent act or omission. If any law prohibits the exclusion of such liability, DET limits its liability to the extent
permitted by law, to the re-supply of the Information.
Third party sites/links disclaimer: This document may contain links to third party sites. DET is not
responsible for the condition or the content of those sites as they are not under DET's control. The link(s) are
provided solely for your convenience and do not indicate, expressly or impliedly, any endorsement of the site(s) or
the products or services provided there. You access those sites and use their products and services solely at your
own risk.
Contents
Getting Started
  About this unit
  Elements and performance criteria
  Icon Legends
Topic 1 - Analysis
  Analysing the client needs
  Documenting the client needs
  Client review and approval
  Summary
  Suggested answers to Activities
Topic 2 – Data modelling
  Entity relationship diagrams
  Preparing models
  Data dictionaries
  Preparing documentation for the client
  Client review and approval
  Summary
  Suggested answers to Activities
Getting Started
About this unit
This unit describes the skills and knowledge required to design, develop and test a database
in order to meet a specification.
It applies to individuals who may be either database, or web designers, required to create a
simple database to store information for an online application, using a simple entity relational
database.
Elements and performance criteria
Elements define the essential outcomes of a unit of competency. The Performance Criteria
specify the level of performance required to demonstrate achievement of the Element. They
are also called Essential Outcomes.
Follow this link to find the essential outcomes needed to demonstrate competency in this
Unit: http://training.gov.au/Training/Details/ICTDBS403
i | P a g e I C T D B S 4 0 3 _ T O P I C 1 _ L G _ V 1 . 0
T A F E n o w
Icon Legends
Learning Activities
Learning activities are the tasks and exercises that assist you in gaining a
clear understanding of the content in this workbook. It is important for you
to undertake these activities, as they will enhance your learning.
Activities can be used to prepare you for assessments. Refer to the
assessments before you commence so that you are aware which activities
will assist you in completing your assessments.
Case Studies
Case studies help you to develop advanced analytical and problem-solving
skills; they allow you to explore possible options and/or solutions to
complex issues and situations and to subsequently apply this knowledge
and these newly acquired skills to your workplace and life.
Discussions/Live chat
Whether you discuss your learning in an online forum or in a face-to-face
environment, discussions allow you to create and consolidate new
meaningful knowledge.
Readings (Required and suggested)
The required reading is referred to throughout this Learner Guide. You will
need the required text for readings and activities.
The suggested reading is quoted in the Learner Guide, however you do not
need a copy of this text to complete the learning. The suggested reading
provides supplementary information that may assist you in completing the
unit.
Reference
A reference will refer you to a piece of information that will assist you with
understanding the information in the Learner Guide or required text.
References may be in the required text, another textbook or on the internet.
Self-check
A self-check is an activity that allows you to assess your own learning
progress. It is an opportunity to determine the levels of your learning and to
identify areas for improvement.
Work Flow
Shows a logical series of processes for completing tasks.
Topic 1 - Analysis
This is topic 1 of 4 topics. These topics will cover analysing the client needs, documenting
those needs including preparation of appropriate database models, creation of a database,
testing the database and debugging any problems that might be encountered during testing.
In order to complete this unit you will require some Structured Query Language (‘SQL’)
development skills. If you do not have these skills, consider whether there is an SQL unit
that you could complete before this unit. If there is no such unit available to you, you
should do some additional reading about SQL.
There are many good references on the internet; you might like to start with this SQL
Tutorial (w3schools, n.d.).
Analysing the client needs
In this context the client may be an individual, your employer, a third party or any
organisation or individual that has approached you to build them a database.
Although the focus in this unit is on the database, some analysis will likely be required with
respect to an interface or report output so that the client can make use of the
database. You may need to work with those responsible for the development of the database
interfaces to ensure that the database and the interface are able to operate together.
Before you start analysis of the database specifics you should get a good understanding of the
problem that the client is working to address. With respect to a database build the problem is
likely to be the capture, storage or querying of data. The following are some key terms that
are used when defining the problem:
> Scope: This is the high level description of the requirements or expectations. These are
specific and relate to deliverable items. For example, ‘The database will capture all
contact details for employees and also details about any contact made with employees’.
> Boundaries: These typically detail anything outside of scope, to provide clarity. For
example, ‘The database will not capture data about the training that staff have received’.
> Objectives: These are the measures that will be used to determine whether the effort has been
successful. For example, ‘The time taken by staff to process an enrolment will on average,
by the end of the first year, be reduced to 6 minutes’.
Setting client expectations using objectives
Objectives are critical to any development project. If the project is large enough there may be
a project manager or other person responsible for this, but in a small project you may be the
only resource and you will therefore need to address the objectives so that you and the client
have the same expectations.
Objectives should be SMART, an acronym commonly expanded as Specific, Measurable, Achievable,
Realistic and Time-bound.
In your role you will need to ensure that the objectives that relate to the database are SMART.
Part of that is completing analysis of the database requirements and ensuring that those are
agreed and realistic.
LEARNING ACTIVITIES ACTIVITY 1
Which of the following might be an appropriate objective associated with the creation of a
database?
a. Capture all the data we need
b. Capture data about client visits including client name, address, phone number and date of
visit/s in an Access database to be implemented by December this year.
c. Capture all client data for competitors' customers
d. Capture name, address and phone numbers for customers in a database or spreadsheet
e. All of the above
Requirements gathering
In order to determine what information will be captured you will need to use an information
gathering technique. Some examples of these are:
> Interviews or meetings
> Questionnaires
> Observation
> Research
> Sampling.
When the system is basic the most efficient and effective method of determining the
requirements may well be interviews, however other methods should be considered.
Consider what method might be suitable in the following circumstances.
> There is an existing system that meets the business needs but which is not able to
operate on the new operating system, and therefore needs to be replaced.
> The system will be implemented to capture details associated with a group of businesses
that wish to pool together to analyse a trend that is affecting their industry.
The following are possible methods to address the two scenarios you just considered:
> In the case of the existing system meeting the business needs it would be prudent to
research using the materials available for the existing system. If these are not available,
observation of the database and how it is used would enable you to gain valuable
knowledge about the client information capture requirements.
> In the case of a solution for a diverse group it may be useful to develop some
questionnaires to get an understanding of what the different members of the group
perceive is required. You could then follow that up with an interview to consolidate
ideas and come to agreement on the final requirements.
Asking questions
The following are some examples of question types that you might use in obtaining details
about the client needs:
Open Questions
Open questions promote the provision of substantial detail. The additional detail provided
should enable you to then ask further questions or express your thoughts and opinions.
These demonstrate an interest in the needs of the client. For example: Why do you need the
database? Have you any specific thoughts on what you might capture?
Closed Questions
Closed questions require a simple response, for example: Have you prepared any
documentation? Is there a budget?
Direct Questions
These draw people into the conversation, for example: John, have you been involved in the
planning to date? Jessica, do you think you would be able to attend a session about the
business processes?
Leading Questions
Leading questions drive the discussion in a direction of your choosing, for example: I presume you will want to
have that reviewed by someone from the user group? I suppose you will consider the impact
on the front desk staff?
Indirect Questions
Indirect questions try to get everyone involved in the discussion, for example: How can we make
this cost effective for the business?
Probing Questions
Probing questions are used to follow up on something someone has said, for example: John,
you said that the customers come into the office to lodge the forms, do you not get any
coming in by post, email or any other method? You might also use Hypothetical Questions
here for example: So that has not happened yet, but let’s say in future you wanted to improve
efficiency, might you want to do that then?
Recording what you learn
Regardless of the method you use, you need to make a record of what you learn. This may be in any useful
form, for example meeting minutes, a report summarising the feedback from questionnaires
or an observation checklist. If you do not capture these details you will likely overlook
something when you go to document the requirements.
It is always best to compile the details on the day that you obtain them while they are fresh in
your mind.
LEARNING ACTIVITIES ACTIVITY 2
Q1. What are three methods you might use to obtain details about requirements from a client?
1
2
3
Q2. What is the aim of an open question?
Considering database specific requirements
Systems analysis is concerned with gathering information about the needs of a business.
In this unit we are focusing on database related analysis and so it is important to consider
what types of issues need to be considered that are specific to the creation of a basic
database.
This unit is associated with the creation of a basic database. It does not cover selection of the
Database Management System (‘DBMS’) or operating environment. It would be
necessary though to consider this in a situation where there is no existing operating
environment. Put simply, questions like the following would need to be asked and taken into
consideration when choosing a suitable solution:
> How many users will there be?
> How many records will be stored in a given time period?
> How large are the records? This specifically includes any non-text elements to be captured
(e.g. images).
> Are there any potential scalability issues? For example, can you foresee additional modules,
more users, higher usage or resale of the database?
> What security is required?
> What backup/recovery requirements are there?
> Will there be a migration?
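The questions about record volume and record size can be combined into a rough storage estimate. The following Python sketch is illustrative only: the function name `estimate_storage_bytes` and every figure used are assumptions made for the example, and the calculation ignores indexes and any DBMS overhead.

```python
def estimate_storage_bytes(records_per_year, avg_text_bytes,
                           avg_attachment_bytes=0, years=1):
    """Rough estimate: number of records multiplied by average record size."""
    bytes_per_record = avg_text_bytes + avg_attachment_bytes
    return records_per_year * bytes_per_record * years


# Hypothetical client: 10,000 records a year, kept for 5 years.
text_only = estimate_storage_bytes(10_000, avg_text_bytes=500, years=5)

# The same client, but each record also stores a 200 KB image.
with_images = estimate_storage_bytes(
    10_000, avg_text_bytes=500, avg_attachment_bytes=200_000, years=5
)

print(f"Text only:   {text_only:,} bytes")
print(f"With images: {with_images:,} bytes")
```

Even with made-up numbers, the comparison suggests why non-text elements such as images deserve a specific question: they can dominate the storage requirement.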
REFERENCE REFERENCE 1
For the purposes of this unit we will assume that the database is relational as that is the most likely
database model you will be using for a basic database. If you are unfamiliar with the different
database models you may choose to refer to UnixSpace (UnixSpace, n.d.) for more information.
LEARNING ACTIVITIES ACTIVITY 3
What are three questions you might ask of a client that relate directly to the database design?
Note: These are not like the questions above; these are questions that you would ask to determine
what you will actually capture in the database.
1
2
3
Documenting the client needs
When you are working on a basic system there may be no need to develop substantial
documents that detail the system requirements; however, there should be sufficient
documentation to enable the stakeholders, including the customer and yourself,
to understand what will be delivered.
Requirements Report
A requirements report may go by a number of different names depending on who you are
talking to. Sometimes it is called a functional specification, requirements document or
business problem document. Regardless of the name used, the document outlines the
requirements for the database to be created.
The requirements report is the official statement of what is required of systems developers. It
is not a design document. It should say what the system should do rather than how it should
do it.
It is a formal report which details all that has been learned and concluded about the system
and is the starting point for Systems Design. It acts as a contract of deliverables to the end
users and as such all key stakeholders will sign off this document.
Since the requirements document contains the details of what the client has asked for it will
also serve as a reference document for testing the system in the future.
Your organisation may have a standard layout for this type of document, but it will usually
include:
> an introduction
> a management summary
> a sign–off sheet
> a version control table
> a glossary – define all the technical terms that may cause confusion
> background to the project
> functional requirements
> non–functional requirements
> external interfaces.
Management summary
This should be a concise summary of the main points of the document. Management will be
interested in:
> The facts needed in order to make a decision; note that supporting details for these facts
are found elsewhere in the document
> The objectives of the system
> An outline of the developmental efforts to date.
It is sometimes called an executive summary or management overview.
Background
This section should contain details about the background of the project and a recap of the
developmental effort to date.
It describes:
> Problems with the current system
> The anticipated benefits and objectives of developing a new system
> The scope of the project.
Functional Requirements
This will be the logical design of the new system, i.e. what it will do. The functional
requirements may be described in terms of:
> What data will be captured, at a high level.
> How the items of data relate to each other (e.g. a customer may have an unlimited number of
file notes, or may have none).
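The requirements report itself stays at a high level, but it can help to see where a statement like the file notes example eventually leads. The following is a minimal sketch using Python's built-in sqlite3 module; the table names, column names and sample rows are assumptions made for illustration and are not part of any requirements document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

# One customer can have many file notes; nothing forces a customer to have any.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE file_note (
    note_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    note_text   TEXT NOT NULL
);
""")

# Hypothetical data: Ana has two file notes, Ben has none.
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Ana'), (2, 'Ben')")
conn.execute(
    "INSERT INTO file_note (customer_id, note_text) VALUES "
    "(1, 'Called about invoice'), (1, 'Sent follow-up email')"
)

# LEFT JOIN keeps customers that have no file notes, counting them as zero.
note_counts = conn.execute("""
    SELECT c.name, COUNT(n.note_id)
    FROM customer AS c
    LEFT JOIN file_note AS n ON n.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(note_counts)
```

In the requirements report only the plain-language rule is needed; a structure like this belongs to the later design stages.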
Non–Functional Requirements
These are the constraints under which the system must operate and should include:
> operating constraints
> external constraints
> hardware and systems software constraints
> control requirements
> security requirements.
Operating constraints include:
> volumes, sizes, and frequencies
> timing considerations
> reporting deadlines
> on–line response times
> processing schedules.
External constraints include:
> Statutory restrictions
> Requirements by regulatory and governmental agencies – examples are data retention
requirements and privacy considerations.
External Interfaces
This section covers how the new system will interface with the other systems with which it must
interact. It should describe in detail what form the interfaces will take. For example, a particular
output from the proposed system may be required as input to an existing system. In this
case it will be important to include a sample of the required input to the existing system.
Appendices
Frequently, the requirements document has a separate appendix with copies of important
documents from previous phases of the project, e.g., meeting minutes.
Information sources and references should also be included, for example sample
questionnaires, sample forms, and lists of interviews with end users and software package
vendors.
LEARNING ACTIVITIES ACTIVITY 4
There are many examples of requirements documents on the web, including sites that provide
document templates. Use a search engine with the keywords “user requirements document
template” to investigate some of these.
The following are some considerations for the documentation produced. It must be:
> Able to be understood by someone who does not have knowledge of the
problem
> Logically organised and relevant
> Complete, accurate and current
> Well formatted and grammatically sound
> Word processed - hand written documents are not professional
> Version controlled.
LEARNING ACTIVITIES ACTIVITY 5
Q1. Should the Requirements Report include details about all data that will be captured in the
database?
Q2. If the Requirements Report identifies limitations of the solution that will be offered (e.g. budget
will not allow for certain functions that were suggested) would that be included in the
Requirements Report?
Client review and approval
Once you complete a document it will need to undergo appropriate quality processes. A number
of factors might affect the types of processes used. For a basic database it is likely
that the quality processes will not be overly complex; however, this will depend on
the standards that you and/or the client have for system development, the methodology used
and the complexity of the concepts. It is possible in some situations for complex concepts to
result in basic solutions.
The most likely scenario for a basic database requirements document is:
> Document is drafted
> Document is reviewed by the author
> Updates are made if required
> (Optional) Document is reviewed by a third party internally (e.g. where there is a project
team or multiple analysts)
> Updates are made if required
> Document is released to client for review.
> Updates are made if required
> Document is approved
Internal reviews should be completed to identify issues including but not limited to:
> Content accuracy and completeness
> Formatting/consistency issues
> Compliance with standards or style guides.
Client Review
Once internal reviews are complete the document should be prepared for release to the
client. The document should include an approval sheet or review request, which should allow
for capture of issues and/or approval of the document. Where possible provide the document
electronically. This has a variety of benefits, including reduced printing costs, less
impact on the environment and more timely delivery of the document.
The following are some other considerations:
> Ensure the client knows if/how/to whom they can distribute the document
> Consider if PDF is appropriate
> Do not provide in a format the client may not have, e.g. Microsoft Visio – check if you are
not sure or insert into a common format e.g. Microsoft Word
> If the client has updates, how will these be made? For example, will they provide text (e.g.
‘Add new details about File Notes’), will they write on the document and provide a hard
copy or scanned copy, will they use tracked changes in Microsoft Word, or is there a
document management solution that will be used?
When you provide a document to a client for review it is important that you address the
following with them:
> Why they are doing it? Without knowing this they may not realise how important it is
that they do a detailed job. They need to understand the implications of a poor review, e.g.
failing to notice that a major requirement is missing, which will have time and cost
implications later.
> What will happen if they do it poorly? This might include the fact that any changes
identified after this point will be considered out of scope.
> How to approach it? Sometimes reviewing a document like this is daunting to a client.
Make some suggestions about how they might approach this task; for example, they
could read it once and then work through it systematically, checking against their business
functions. This will also include explaining any document elements that the client
may be unfamiliar with, for example a diagram or chart.
> How to provide feedback? Is there a specific form? Did you want to have a meeting?
> Where to get assistance? This might include suggesting resources of their own, such as the ICT
manager or key users.
> When it is due for completion and what the implications of failing to do that are? Would
it for instance affect the go live date?
It is important to consider the scope of work that you were asked to complete. Just because a
client asks for a change does not mean it is in scope. Look back to the original request,
particularly if the work has fixed cost or time constraints. Additional inclusions will mean
additional effort that needs to be able to be accommodated.
LEARNING ACTIVITIES ACTIVITY 6
What are two benefits of providing a document for review electronically?
1
2
What are two benefits of using PDF format for documents released for formal review?
1
2
Acceptance
It may take a number of iterations for the client to accept that the requirements are complete.
Once the client is satisfied with the requirements these need to be formally accepted.
Formal acceptance should be completed by an authorised person or persons. If you are unsure
who is authorised, check the contract or agreement, or ask your primary contact at the client.
If there is any question about this seek confirmation in writing.
The authorised person/s should sign and date either a page in the document or a separate
document that has a corresponding document name, date and version number. Less formal
processes (e.g. email confirmation) may be used for low risk projects, such as a basic database
being implemented for an internal client, however if there is any doubt a formal signature
should be sought.
This acceptance needs to be retained in the project documents as evidence of the agreed
requirements. Any variance from these should be handled as a change to the scope of the work,
which could therefore have cost or time implications.
LEARNING ACTIVITIES ACTIVITY 7
If a signoff was obtained verbally because the authorised person was unable to provide this in
writing how might you go about keeping a formal record of this?
Summary
Obtaining information about the needs of your client is a critical part of the process for
building a database. In the event that there is a failure to obtain accurate and detailed
requirements this will likely result in you spending a significant amount of time developing
something that does not meet the client needs. You will then need to spend more time
redesigning and building the database. A little more time spent during the analysis in order
to ensure you do understand what it is that you need to build will likely save a lot of time later.
Be sure you document the requirements clearly. Walk the client through the requirements so
that they understand what it is that they are agreeing to and then have the authorised
representative/s of the client formally approve the documents. Approval in this way makes
the client accountable for having checked the accuracy and completeness of the
requirements as provided to you. That means that should they change their mind later you
can clearly demonstrate that you have built the system in line with what you had understood
and they had confirmed. It also provides a valuable document that can be checked against
during later stages in the project including design and testing.
Suggested answers to Activities
Activity 1
b
Activity 2
Q1. Any 3 of Interviews or meetings, Questionnaires, Observation, Research or Sampling
Q2. To obtain significant information, not a short answer. Also intended to demonstrate your
interest in the needs of the client.
Activity 3
The types of questions that will aid in identifying database requirements are:
> What do we need to capture details about?
> How do those things relate? For example does a customer have one or more purchases
or can you have a customer that has not purchased anything?
> What will we record about these things? For example what will we record about customers,
their name and what else?
> What will the output from the database be? This is a very important question; it will often
identify missing detail. For example if the client says that they want to query any
customers that have not purchased in the last year but they have already said sales will
be stored in a separate system you need to ask where that data will be coming from.
> What if any system integration with other databases or sources of data will be required?
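To see why the output question exposes missing detail, consider what the "customers that have not purchased in the last year" query would need. The sketch below uses Python's built-in sqlite3 module; the `customer` and `purchase` tables, their columns and the sample rows are all hypothetical. The query only works because purchase dates are held in the same database, which is exactly the detail the client's answer would call into question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id   INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    purchase_date TEXT NOT NULL  -- ISO 8601 date text, e.g. 2016-01-20
);
INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO purchase (customer_id, purchase_date) VALUES
    (1, '2015-11-02'),  -- Ana: within the last year
    (2, '2013-06-15');  -- Ben: more than a year ago; Cy has never purchased
""")

as_at = "2016-01-20"  # fixed reference date so the example is repeatable

# Customers with no purchase on or after the date one year before as_at.
inactive = conn.execute("""
    SELECT c.name
    FROM customer AS c
    WHERE NOT EXISTS (
        SELECT 1
        FROM purchase AS p
        WHERE p.customer_id = c.customer_id
          AND p.purchase_date >= date(?, '-1 year')
    )
    ORDER BY c.name
""", (as_at,)).fetchall()
print(inactive)
```

If sales are stored in a separate system there is no `purchase` table to query, so either the data must be integrated or the reporting requirement must change.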
Activity 5
Q1. No – this is not a design document, it should provide high level detail only.
Q2. Yes – anything that the client had hoped to include but which has been identified as
being out of scope should be detailed.
Activity 6
Q1. Electronic documents are sustainable; they avoid use of paper, ink, delivery services etc.
Electronic distribution is fast. Electronic distribution is cost effective. Electronic documents
can be searched easily.
Q2. PDF is a format that is able to be read on a wide range of devices. It can be read using
software that is available without any cost to the user and therefore largely avoids issues with
incompatible software. Most clients will not make any effort to update a PDF document; they
may though directly update Microsoft Word documents for instance.
Activity 7
If verbal approval is necessary it would be prudent to confirm that in writing. You could
perhaps send an email saying something like: ‘I am writing to confirm our conversation at
2:00pm today where you formally approved Version 2.0 of the Requirements Document for
the ABC Database.’ You may then go on to request the form that was not completed be
completed as soon as possible and provide a copy attached for their convenience.
Topic 2 – Data modelling
Topic 2 will provide you with information about how you can model a database. Database
modelling is used in the design of a database; specifically we will be covering
Entity-Relationship Diagrams (ERD).
To understand this you will need to:
> Design an entity-relationship (ER) diagram to model the relationships between the
entities and attributes the database will hold
> Develop the primary and foreign keys to link the entities
> Develop a data dictionary
> Complete documentation and submit it to the appropriate person for approval
Entity relationship diagrams
Entity Relationship Diagrams (ERD) are used to model databases. There are three levels of
data modelling:
> conceptual
> logical
> physical.
Each of these is completed in order and they each build on one another, thus becoming more
complex as the design approaches database build. The detail in each model is shown in the
table below:
Table 1
Conceptual Logical Physical
Entity names
Relationships
Attributes
Keys
Table names
Column names
Data types
Indexes
Integrity constraints
Views
Why have multiple models?
The models each have a different purpose. This provides for ongoing analysis from
conception to development. The models separate the physical from the conceptual and
logical, so as technologies change there may be no need to update the earlier models.
The physical implementation may be very different to the logical model, while still carrying
out the initial objectives as set forth in the conceptual model.
Different notations
There are a number of different notations available for developing ERDs. Which one is used
will often depend on the preference of the client or the standard your employer sets.
In this unit we will be using the Information Engineering style (also known as crow’s foot
notation). The following are the possible combinations you may use when creating an ERD
using the Information Engineering style:
Figure 1 (Jewell, 2005)
Conceptual model
The conceptual model is the simplest and earliest of the database models. It shows major
entities and their relationships to one another, i.e. one to one, one to many or many to many.
It should be suitable for use by technical and non-technical stakeholders.
The conceptual model shows data used in the enterprise independent of all physical
limitations, including which database management system is used, how many users there are
and what capacity the system will have. It is prepared in language that is used by the
customer; it may include terms that are specific to their business but will not include
technical terms or jargon. It describes the functions of the system and is linked to the
requirements document.
Entities
At this point it is important that you understand what an entity is. Entities are things that you
might wish to store information about. They include, for example:
> Tangibles (e.g. equipment)
> Concepts (e.g. account)
> People (e.g. staff)
> Events (e.g. sale)
> Places (e.g. office location)
An instance of an entity is a single occurrence. For example, if Bill Smith worked for Platinum
Pty Ltd then Bill Smith is a single instance of the entity STAFF for that business in their HR
system.
When interviewing clients, nouns that crop up regularly are often entities.
When you are naming entities it is important that you are consistent. To do this it is useful
to apply the following standards:
> Be accurate, name things functionally and technically accurately
> Be concise, as few words as possible
> Be unique, don’t use the same word for multiple entities or attributes
> Be atomic, represent a single concept
> Contain only letters, numbers and separators
> Use names rather than abbreviations where possible
> Be consistent, if you use _ to separate words do so everywhere e.g. SHOP_ITEM,
ONLINE_ITEM
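Some of the naming standards above can be checked mechanically. The following is a minimal Python sketch, assuming a house rule of upper-case words separated by underscores; the pattern and the function name check_entity_names are illustrative assumptions, not part of any standard.

```python
import re

# Hypothetical house rule: upper-case letters and digits, words separated by "_".
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")


def check_entity_names(names):
    """Return a list of (name, problem) pairs for names that break the rules."""
    problems = []
    seen = set()
    for name in names:
        if not NAME_PATTERN.match(name):
            problems.append((name, "use only letters, numbers and '_' separators"))
        if name in seen:
            problems.append((name, "name is not unique"))
        seen.add(name)
    return problems


# SHOP-ITEM uses a different separator; the second SHOP_ITEM is a duplicate.
result = check_entity_names(["SHOP_ITEM", "ONLINE_ITEM", "SHOP-ITEM", "SHOP_ITEM"])
for name, problem in result:
    print(name, "->", problem)
```

A check like this only covers the mechanical rules. Whether a name is accurate, concise and atomic still needs a person to decide.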
Relationships
Relationships are used to show that entities relate to one another. You should use consistent
naming across the database model. The following are recommended standards:
> Be concise, as few words as possible
> Use names rather than abbreviations where possible
> Typically noun (entity) verb noun (entity) e.g. STUDENT enrolls_in COURSE
> Label in both directions showing the name closest to the entity from which it will be read
e.g.
Cardinality
Cardinality is how one entity relates to another entity. In some types of modelling this is
referred to as multiplicity. The following are some key features of cardinality in ERDs:
> Maximum Cardinality refers to the maximum number of instances of one entity that can
be associated with an instance of another.
> Minimum Cardinality refers to the minimum number of instances of one entity that can
be associated with an instance of another.
> For example, one PERSON may have zero or more instances of PET and one PET may have
one or more owners.
> The trick is to determine one side at a time. In each case you are considering the
association to one instance of the other.
If you find it hard, ask yourself the question using a single instance, for example ‘How many
owners could Jack the dog have?’
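Minimum and maximum cardinality can be thought of as a pair of numbers attached to one end of a relationship. The following Python sketch illustrates the PERSON and PET example; the Cardinality class and its method names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    minimum: int            # minimum cardinality (0 = optional, 1 = mandatory)
    maximum: Optional[int]  # maximum cardinality; None stands for "many"

    def describe(self) -> str:
        upper = "many" if self.maximum is None else str(self.maximum)
        return f"{self.minimum}..{upper}"

    def allows(self, count: int) -> bool:
        """True if `count` related instances satisfies this cardinality."""
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


# One PERSON may have zero or more PETs; one PET has one or more owners.
pets_per_person = Cardinality(minimum=0, maximum=None)
owners_per_pet = Cardinality(minimum=1, maximum=None)

print(pets_per_person.describe())  # 0..many
print(owners_per_pet.allows(0))    # False: Jack the dog needs at least one owner
```

Each end of the relationship gets its own pair, which mirrors the advice to determine one side at a time.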
The following are some standards that are typically applied to conceptual models:
> Have as many entities as necessary, but break the model into subject areas of no more
than 10-15 entities each
> Entities must have the same name regardless of which part of the model they are shown
on
> Each part of the model must have a unique name and version number
> Every entity should have at least one relationship
> Many to many relationships are supported
LEARNING ACTIVITIES ACTIVITY 8
Complete the following sentences to provide a text description of the following diagram,
specifically addressing cardinality:
A salesperson makes
A contact by
LEARNING ACTIVITIES ACTIVITY 9
Which of the following is not something that you should consider when naming entities?
a. Be accurate
b. Be unique, don’t use the same word for multiple entities or attributes
c. Use the plural (e.g. Sales)
d. Contain only letters, numbers and separators
Logical model
The logical data model shows how data is used in a given enterprise based on a specific
model of data storage (e.g. relational) but independent of a given DBMS and all other physical
considerations.
The logical model varies from the conceptual model in the following ways:
> It includes Attributes and details about them such as data types, null status and length
> It includes Primary Keys and Foreign Keys
> It is Normalised
Attributes
In ERDs attributes are items we hold in relation to an entity, e.g. the entity INVOICE may have
attributes including INVOICENO, INVOICEDATE and INVOICEAMOUNT. Sometimes these are
called data items. During analysis some attributes may be found not to be required, and others
may need to be separated into a new entity (i.e. to normalise).
You should use consistent naming of attributes across the database model. The following are
recommended standards:
> Be accurate, name things functionally and technically accurately
> Be concise, as few words as possible
> Be unique, don’t use the same word for multiple entities or attributes
> Be atomic, represent a single concept
> Contain only letters, numbers and separators
> Use names rather than abbreviations where possible
> Be consistent, if you use NO for number do that everywhere (e.g. CLASS_NO,
PAYMENT_NO)
The following should be considered with respect to the attributes you have identified.
> Significant? Only include things that will be useful to users.
> Direct not derived? Derived attributes are available elsewhere in the database (e.g. age is
able to be derived from date of birth)
> Nondecomposable? Something with repeating groups is not an attribute
> Consistent data type? Every value of an attribute must be of the same type. For example,
a date of birth field holds only dates; you cannot store names in it.
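To illustrate the 'direct not derived' consideration above, the following Python sketch shows why age does not need to be stored: it can be calculated from the date of birth whenever it is needed. The function name age_on and the sample dates are assumptions for illustration only.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age_on(date(1990, 6, 15), date(2016, 1, 20)))  # 25
```

A stored age would be out of date after the next birthday, whereas the stored date of birth never changes.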
When you define an attribute you need to assign it a Data Type. Common data types include
the following:
> INTEGER or NUMBER – a signed or unsigned whole number
> FLOAT or DECIMAL – a signed number with decimal places
> DATE – date only
> DATETIME – date and time
> TIMESTAMP – a reference to an exact point in time
> CHAR/CHARACTER/TEXT – character string
> IMAGE/FILE
A data type ‘tells’ you what sort of data will be stored in the attribute. The following are key
considerations in assigning data types:
> Logical models are free of physical limitations so at this stage it is sufficient to use a set of
standard terms that are not associated with the Database Management System (DBMS)
being used.
> It is best though to use types that are commonly used by a variety of DBMS (in this case
Relational DBMS) to save effort/rework when you get to physical design.
> For the physical design the Data Type must correspond with the available data types in
the specific DBMS.
It is common to include the following additional details associated with each attribute:
> Numbers : Minimum, Maximum, Length, Signed/Unsigned flag
> Characters: Maximum Length
> Images/Files : anticipated file types
> Null/Not Null flag (or Mandatory flag)
Primary keys
Primary keys are used to identify and locate data within a database. Each entity must have a
primary key, which uniquely identifies an instance of the entity and is used to access that
instance directly. For example, to access a specific supplier in the SUPPLIER entity you would
use the associated primary key, e.g. SUPPLIER_NO. A primary key may be a single attribute or a
group of attributes that together uniquely identify a record.
There are a number of different types of primary keys, as follows:
> Simple – a single attribute that uniquely identifies the instance of the entity
> Compound – two or more attributes that uniquely identify an instance of the entity
> Hierarchic or Composite – one or more foreign keys and a qualifying non-foreign key
attribute. For example: STUDENTS study UNITS and UNITS have GRADES so GRADES
might use STUDENTID, UNITNO and GRADENO where GRADENO is a consecutive number
sequence for each unique combination of STUDENTID and UNITNO
The following is another example of a composite key: If invoice numbers are allocated each
financial year sequentially starting at 1 then YEAR and INVOICENO would together be unique:
> YEAR = 2012 would exist for all invoices that year
> INVOICENO =1 would exist for the first invoice each year
but
> YEAR = 2012 and INVOICENO = 1 together exist only once across all records, so the
combination is a unique reference to a single record (i.e. a primary key)
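The YEAR and INVOICENO example can be tried out in a small sketch. The code below uses Python's built-in sqlite3 module as a stand-in for a full DBMS (an assumption made so the example runs without installing anything); the table name INVOICE comes from the earlier attribute example, and the column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE INVOICE (
        YEAR      INTEGER NOT NULL,
        INVOICENO INTEGER NOT NULL,
        PRIMARY KEY (YEAR, INVOICENO)  -- the two attributes form one key
    )
    """
)
conn.execute("INSERT INTO INVOICE VALUES (2012, 1)")
conn.execute("INSERT INTO INVOICE VALUES (2012, 2)")  # same year, new number
conn.execute("INSERT INTO INVOICE VALUES (2013, 1)")  # same number, new year

try:
    conn.execute("INSERT INTO INVOICE VALUES (2012, 1)")  # repeats the pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM INVOICE").fetchone()[0]
print(duplicate_rejected, row_count)  # True 3
```

Neither column is unique on its own, but the database refuses a second row with the same pair of values.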
Foreign keys
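When there is a relationship between two entities there needs to be a foreign key in one
entity that holds the primary key of the other, to link instances in one entity to instances
in the other. In a one to many relationship the foreign key is placed in the entity on the
‘many’ side.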
For example:
> Primary Key for ORDER is ORDERNO
> Primary Key for CUSTOMER is CUSTOMERNO
> The CUSTOMERNO included in the ORDER entity is used to locate the customer
associated with the order.
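The ORDER and CUSTOMER example above might look like the following when built. This is a sketch only, using Python's built-in sqlite3 module as a stand-in for MySQL; the NAME column and the sample values are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.execute(
    "CREATE TABLE CUSTOMER (CUSTOMERNO INTEGER PRIMARY KEY, NAME TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE "ORDER" (              -- ORDER is a reserved word, so it is quoted
        ORDERNO    INTEGER PRIMARY KEY,
        CUSTOMERNO INTEGER NOT NULL REFERENCES CUSTOMER (CUSTOMERNO)
    )
    """
)
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Bill Smith')")
conn.execute('INSERT INTO "ORDER" VALUES (100, 1)')

# Follow the foreign key from the order to its customer.
customer_name = conn.execute(
    'SELECT c.NAME FROM "ORDER" o JOIN CUSTOMER c ON c.CUSTOMERNO = o.CUSTOMERNO '
    "WHERE o.ORDERNO = 100"
).fetchone()[0]

try:
    conn.execute('INSERT INTO "ORDER" VALUES (101, 99)')  # customer 99 does not exist
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(customer_name, orphan_rejected)  # Bill Smith True
```

Note that the foreign key sits only in the ORDER table. The CUSTOMER table needs no column pointing back to its orders.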
Normalisation
Normalisation is the process of organising the data in a database to reduce redundancy and
protect data integrity.
Normalisation is completed by applying sets of rules in stages. These stages are known as first
normal form (1NF), second normal form (2NF) etc. Normal forms continue up to 6NF, but for the
purpose of this unit, which relates to basic databases, we will stop at 3NF. Often if you
comply with 3NF you will also comply with 4NF and 5NF, but not 6NF.
The goal of normalisation is to:
> Protect the data
> Make the database more flexible by eliminating redundancy and inconsistent dependency.
One of the key purposes of normalisation is to remove or reduce data redundancy. Data
redundancy is the storage of the same data in more than one place. The issues with
redundant data include:
> Duplication of maintenance
> Potential of update in a subset of locations impacting data integrity
> Storage implications.
First normal form
First normal form is achieved when the following conditions are met:
> when all entities have a unique identifier or key
> when every column in every table contains only a single value (i.e. doesn’t contain
repeating groups or composite fields).
To ensure that this is the case we:
> identify/create a primary key per table
> split any composite fields into multiple separate fields
> move any repeating groups into separate tables where each repetition will be a single
row in the table - identified by a unique key.
The following is an example:
> An entity contains the attributes Order Date, Ordered By, Item and Order Amount. None
of these uniquely identifies the row, so we need to add OrderID as a unique primary key.
> Ordered By, according to the client’s needs, contains both the customer name and contact
details. This should be split into separate fields (e.g. name, address, suburb,
state, postcode and phone number).
> Item and Amount may repeat if the order is for many different items, these then need to
be moved into a new entity (e.g. Order Items).
In the following you can see that Items are contained within the table:
Orders
OrderID (PK) ItemNo ItemNo2 ItemNo…
1 1000 1001 …
> They should be separated and have an ItemNo or similar along with the OrderID as the
Primary Key.
> OrderID will be a foreign key to the Order table too.
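The move from a repeating group to one row per item can be sketched in a few lines of Python. The data structure and the function name split_repeating_items are assumptions for illustration, and order 2 is an invented extra row.

```python
# One wide row per order, with a repeating group of item columns (not in 1NF).
wide_orders = [
    {"OrderID": 1, "ItemNo": 1000, "ItemNo2": 1001},
    {"OrderID": 2, "ItemNo": 1000, "ItemNo2": None},  # invented second order
]


def split_repeating_items(rows):
    """Return one (OrderID, ItemNo) row per item actually ordered."""
    order_items = []
    for row in rows:
        for column, value in row.items():
            if column.startswith("ItemNo") and value is not None:
                order_items.append((row["OrderID"], value))
    return order_items


order_item_rows = split_repeating_items(wide_orders)
print(order_item_rows)  # [(1, 1000), (1, 1001), (2, 1000)]
```

The new rows need no empty columns for orders with few items, and an order with many items simply has more rows.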
Second normal form
Second normal form is achieved when the following conditions are met:
> when the table is in First Normal Form
AND
> every non-primary key column in the table must depend on the entire primary key, not
just part of it (if there is a composite key)
To ensure that this is the case we:
> comply with first normal form
> move any field that is dependent on only part of the key into a separate table
For example, the primary key you have identified for a table is Order ID and Order Item
Number; this is the table that contains order items. If you included Date Ordered in that table
it would be dependent only on the Order ID, not the Order Item Number, therefore you need
to move Date Ordered to a table with only Order ID as the primary key.
In the following you can see this example shown using the table format:
Orders
OrderID (PK) ItemNo (PK) DateOrdered …
1 1000 1/1/2012
1 1001 1/1/2012
The DateOrdered value will be the same for every item on an order, so it is repeated and
redundant. The solution is to hold it once per order:
Orders
OrderID (PK) DateOrdered …
1 1/1/2012
Order Items
OrderID (PK) ItemNo (PK) …
1 1000
1 1001
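The same split can be sketched in Python, as an illustration only. The variable names are assumptions; the values are those from the tables above.

```python
# Rows keyed by (OrderID, ItemNo); DateOrdered depends on OrderID alone.
rows = [
    {"OrderID": 1, "ItemNo": 1000, "DateOrdered": "1/1/2012"},
    {"OrderID": 1, "ItemNo": 1001, "DateOrdered": "1/1/2012"},
]

orders = {}       # OrderID -> DateOrdered (one entry per order)
order_items = []  # (OrderID, ItemNo) pairs (one entry per item on an order)

for row in rows:
    stored_date = orders.setdefault(row["OrderID"], row["DateOrdered"])
    # Two different dates for one order would mean the source data is inconsistent.
    if stored_date != row["DateOrdered"]:
        raise ValueError("conflicting dates for order %s" % row["OrderID"])
    order_items.append((row["OrderID"], row["ItemNo"]))

print(orders)       # {1: '1/1/2012'}
print(order_items)  # [(1, 1000), (1, 1001)]
```

The date is now held once per order, so it cannot disagree between two items of the same order.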
Third normal form
Third normal form is achieved when the following conditions are met:
> when the table is in First and Second Normal Form
AND
> each column that isn't part of the primary key doesn't depend on another column that
isn't part of the primary key
To ensure that this is the case we:
> comply with first and second normal form
> move any columns that depend on a column other than the primary key into another
table
The following table relates to the example below:
Orders
OrderID (PK) Customer Name Customer Address Customer Suburb Customer …
1 Bill Smith 12 White St Ryde …
2 Bill Smith 12 White St Ryde …
Looking at this example:
> The Customer Details depend on the customer, not directly on the Order
> They could stand alone and therefore should be moved to a separate table
> The Orders table would then hold a foreign key (e.g. a customer number) relating each
order to its customer in the new table.
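As a sketch of this step, the following Python code moves the repeated customer details into their own collection and leaves each order holding only a reference to its customer. The CustomerNo value is a hypothetical arbitrary key invented for the illustration, as are the variable names.

```python
orders = [
    {"OrderID": 1, "CustomerName": "Bill Smith",
     "CustomerAddress": "12 White St", "CustomerSuburb": "Ryde"},
    {"OrderID": 2, "CustomerName": "Bill Smith",
     "CustomerAddress": "12 White St", "CustomerSuburb": "Ryde"},
]

customers = {}          # customer details -> CustomerNo (hypothetical arbitrary key)
normalised_orders = []  # (OrderID, CustomerNo): the order keeps only a reference

for order in orders:
    details = (order["CustomerName"], order["CustomerAddress"], order["CustomerSuburb"])
    # Reuse the existing number if these details were seen before, else allocate the next.
    customer_no = customers.setdefault(details, len(customers) + 1)
    normalised_orders.append((order["OrderID"], customer_no))

print(customers)          # {('Bill Smith', '12 White St', 'Ryde'): 1}
print(normalised_orders)  # [(1, 1), (2, 1)]
```

Bill Smith's address is now stored once, so a change of address is made in one place only.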
Third normal form may be too much
Sometimes if you normalise to 3NF the database may become very inefficient. For example,
splitting things like SUBURB, STATE and POSTCODE into separate tables because of their
interdependency may make loading data to screens, reports, letters etc. too inefficient.
Consider each case and what the data will be used for as you encounter it.
LEARNING ACTIVITIES ACTIVITY 10
Q1. Does the Logical model include all the different fields that will be stored in the database?
Q2. Look at the following example Logical model and answer the questions below:
1 What do you think this system might be?
2 Is it possible to have an Item that does not have a Type recorded for it in the system?
3 According to this model can the same Patron both Loan and Reserve a particular item?
4 Can a Patron loan more than one item?
5 Can a Patron reserve more than one Item?
There are 2 parts to the Logical Data Model:
> A diagram called the logical data structure (LDS)
> A set of associated textual descriptions that explain each part of the diagram.
Unless directed otherwise, you should produce both of these elements whenever you are
asked to produce a Logical Model for a client.
The following are the steps involved in logical modelling of a relational database:
1 Identify the entities – removing features not compatible with the data model (i.e.
relational model). In Conceptual Data Modelling we included entities, the same definition
applies. Now it is time to recheck the entities you identified. Initially ask yourself:
> Is there more than one occurrence of this concept?
> Is each occurrence uniquely identifiable?
> Do we need to hold data about this concept?
> Is this thing relevant to the system I am designing? (i.e. the business may have sales
but if this is the HR system that is likely not relevant)
2 Identify the attributes
3 Model the relationships – normalise and verify cardinalities
4 Choose Primary Keys – good keys are:
> Unique
> Non-null
> Data-less
> Never change
If your key does not meet these criteria then create an arbitrary key (i.e. a key that has
been created solely to be a unique identifier; these are generally 1, 2, 3 etc.).
5 Check the model
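The arbitrary key described in step 4 is usually generated by the DBMS. The sketch below uses Python's built-in sqlite3 module as a stand-in (MySQL provides the AUTO_INCREMENT column attribute for the same purpose); the STAFF table reuses the earlier entity example, and the STAFFID and SURNAME columns are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY makes SQLite assign 1, 2, 3 ... when no value is given.
conn.execute("CREATE TABLE STAFF (STAFFID INTEGER PRIMARY KEY, SURNAME TEXT NOT NULL)")
conn.execute("INSERT INTO STAFF (SURNAME) VALUES ('Smith')")
conn.execute("INSERT INTO STAFF (SURNAME) VALUES ('Smith')")  # same name, new key

staff_rows = conn.execute(
    "SELECT STAFFID, SURNAME FROM STAFF ORDER BY STAFFID"
).fetchall()
print(staff_rows)  # [(1, 'Smith'), (2, 'Smith')]
```

The generated key is unique, non-null, carries no business data and never needs to change, even though the surname is repeated and could change.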
LEARNING ACTIVITIES ACTIVITY 11
What two separate parts must be provided when you are asked to supply a Logical model?
1
2
Physical model
The Physical Model is representative of the physical implementation of the database. In
practice this is often done directly in the DBMS.
In large scale projects you should have documents including:
> Physical Database Model
> Tables
> Columns
> Data types
> Keys
> Constraints
> Indexes
> Views
> A word processed document that outlines other details like the Security, Disk Space
Requirements and Backups
> Data Dictionary – updated from logical design to include additional physical
characteristics
Given that this unit is associated with basic databases we will be limiting our attention to the
tables, columns, data types and keys associated with a database only. The other aspects of
the Physical Database Model are associated with more complex databases and will not be
covered within this unit. Therefore our physical model will be very close to our logical model,
except that the data types must correspond with the database management system we are
using.
Data Types
We will be using MySQL in this unit. The following are data types that are used within
MySQL and which you would therefore need to use in your physical model.
REFERENCE REFERENCE 2
MySQL Reference Manual includes details about the available data types and pros and cons of
using various options. (MySQL, n.d.)
LEARNING ACTIVITIES ACTIVITY 12
This table will contain the data types you are most likely to use in MySQL in this unit of study.
Complete the following table, use the web reference above to find the answers if you are not
familiar with these data types.
Table 2
Data type Description
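VARCHAR A variable-length string; the declared maximum length can be from 0 to 65,535 characters (0 to 255 before MySQL 5.0.3), subject to the maximum row size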
SMALLINT
FLOAT
DATE
DATETIME
Preparing models
When you prepare an ERD for a client there are a number of different approaches. They
include:
> Paper and a Pencil – lacks the professional touch.
> Products like ERwin, Smartdraw, Embarcadero's ER/Studio, Sybase's PowerDesigner,
Microsoft Visio or Popkin's System Architect.
> In the DBMS (e.g. SQL Server 7 or greater includes Database Design, as does the current
version of MySQL).
Always assume that the client might not have the specific tool that you have used to prepare
the diagram and with this in mind present the diagram to the client in a common format, e.g.
PDF.
LEARNING ACTIVITIES ACTIVITY 13
Q1. Using pencil and paper, model the following scenario. Prepare both a conceptual and a logical model:
Scenario
Peter sells vegetables from his home garden. Peter wants to keep a database that contains details
about which of his neighbours purchased which vegetables when he had them available. The
most important things for Peter are that he can:
> Know which neighbour lives where.
> Know the names of his neighbours. Note there may be multiple neighbours at one address,
Peter wants to know all their names.
> Know which produce they purchased.
> Know when they purchased it (i.e. date).
Peter does not want to know about how much they purchased on a given day, just whether they
purchased any of that produce that day.
Q2. Take the model that you completed on paper in Q1 and now prepare it in a suitable electronic
format. Be sure to use Information Engineering notation.
If you do not have access to a suitable tool, you can make use of available products online, for
example Draw.io. If you wish to use Microsoft Visio and do not have a Microsoft Dreamspark
account, please contact your facilitator to request one.
Data dictionaries
Data Dictionaries capture definitions for the various components of the system. They allow
participants in the project to access details about the system similarly to a standard English
dictionary.
Data dictionaries, in this context, have the following attributes for data element related
entries (elements is another name that refers to attributes or columns):
> Data type
> Length
> Range
> Discrete values (i.e. where only a specific set of values are valid)
> Required (or Allows Nulls)
> A description of the element.
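Since data dictionaries are commonly kept in spreadsheets, the attributes listed above map naturally to columns. The following Python sketch writes two entries in CSV form, which a spreadsheet can open. The column headings follow the list above; the INVOICE entries, their types and their ranges are assumed values for illustration only.

```python
import csv
import io

FIELDS = ["Entity", "Attribute", "Data type", "Length", "Range",
          "Discrete values", "Required", "Description"]

# Assumed example entries; the values are illustrative, not from a real system.
entries = [
    {"Entity": "INVOICE", "Attribute": "INVOICENO", "Data type": "INTEGER",
     "Length": "", "Range": "1 or more", "Discrete values": "",
     "Required": "Y", "Description": "Number allocated to the invoice"},
    {"Entity": "INVOICE", "Attribute": "INVOICEDATE", "Data type": "DATE",
     "Length": "", "Range": "", "Discrete values": "",
     "Required": "Y", "Description": "Date the invoice was issued"},
]


def to_csv(rows):
    """Return the data dictionary as CSV text with one line per element."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(entries)
print(csv_text)
```

Keeping one row per element makes it easy to sort or filter the dictionary by entity, data type or required flag.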
Data dictionaries can be formatted in a number of different ways, most commonly now they
are produced in spreadsheets or for larger projects in one of many tools that are designed to
prepare and maintain data dictionaries.
The following is an example of a data dictionary:
LEARNING ACTIVITIES ACTIVITY 14
Refer to the scenario in Activity 13 and prepare a data dictionary that covers the elements in the
Logical database you designed. This may be best completed in a spreadsheet.
Preparing documentation for the client
Present your diagrams professionally. Even if you have only a single conceptual model to
present, it should be presented consistently with other documents and conform with
presentation guidelines. Generally it is accepted that the ERDs will be provided to the client
as part of a document, not as stand-alone diagrams.
The design document will typically include the following:
> Introduction/Document Purpose
> Standards Used
> Assumptions, if you have made any be sure to detail them.
> Interpreting the Model (i.e. instructions that enable the client to understand what the
diagram means)
> ERD (may be a single model or a number of models depending on the complexity of the
database and the standards being followed)
> Data Dictionary
> Signoff facilities inbuilt or attached (e.g. review request).
It is significant to note at this point that when the database is basic it may be possible to
produce only a single ERD that takes into consideration all the aspects of the three models.
This is not recommended for more complex databases. In the event that you do produce only
a single ERD the following are some important considerations:
> Double check your ideas with the client before commencing, make sure that the
concepts you would have included in your conceptual diagram are well founded
> Develop the ERD by working through what you would have done if you prepared the
three separate models
> Avoid rework by normalising as early as possible and using data types that match with
your database management system
Version Control
It is important that each ERD is version controlled. Some reasons for this are:
> So you know why, how and when changes were made
> So you know who requested a change
> So you know which version the changes applied in
> So you can cross check changes against system updates
> So stakeholders can review only updates that have been made since the last version,
rather than the entire document again
Internal review
Before the document is released to the client it should be reviewed internally, preferably by
someone other than the person who prepared the document. Sometimes this is not
possible, e.g. in the case of a single contractor. The review should include the following:
> Systematically check across all source documents (forms, reports, and minutes of
meetings) that have not been superseded.
> Start with the earliest documents (e.g. initial request) and move to most recent
documents (e.g. session minutes).
> Check with the client if there is any ambiguity or where documents are contradictory
> If you have an entity with only one attribute you need to seriously consider why. Should it
be an attribute of another entity instead? It is likely there is an error if this is the case.
LEARNING ACTIVITIES ACTIVITY 15
If you were working on a basic database and you chose to complete just a single ERD rather than
all three models, which of the three models would the one ERD resemble?
Client review and approval
As with all formal documents produced during the design phase the ERD should be reviewed
by the client and their formal approval obtained. Refer to the review and approval section in
Notes 1 for this topic if you feel unsure about how to approach review and approval.
Most importantly you should:
> Ensure the client understands the diagrams they are being asked to review.
> Provide details where the client can seek assistance during the review.
> Obtain the approval in writing if possible. If not, confirm the verbal approval in writing
and keep the approval on record.
Summary
When designing a database there are three different models that are prepared in stages: the
conceptual model, the logical model and the physical model.
The conceptual model is a high level model that enables you to picture how the main
elements or concepts in the database will relate to one another. The logical model takes into
account the type of database (e.g. relational); this model includes keys and attributes and is
normalised. Keys identify data in the database. Attributes are like fields; they are what you
will capture about the entity. Normalisation is the process of reducing data redundancy, which
should make the database easier to maintain and more efficient. Finally, the physical model
takes into consideration all the physical requirements, including which database
management system will be used.
When you are designing a basic database it may be possible to reduce the number of
diagrams that are prepared, however you should consider the steps required to produce each
diagram so that you do not overlook any requirements of good database design.
Suggested answers to Activities
Activity 8
A sales person makes none, one or many contacts.
A contact is related to one and only one sales person.
Activity 9
c. ‘Use the plural’ does not apply. You should use the singular, e.g. SALE not SALES.
Activity 10
Q1. No – Fields are called columns in a database and these are not included until the
physical ERD.
Q2.
1 A library management system
2 Yes, the relationship shows that an ITEM may have zero or one TYPE
3 Yes, there is nothing to limit the PATRON from both having an ITEM on LOAN and RESERVE
4 Yes, a PATRON may LOAN zero, one or many ITEMS
5 Yes, a PATRON may RESERVE zero, one or many ITEMS
Activity 11
1 A diagram called the Logical Data Structure (LDS).
2 A set of associated textual descriptions that explain each part of the diagram.
Activity 12
Table 3
Data type Description
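VARCHAR A variable-length string; the declared maximum length can be from 0 to 65,535 characters (0 to 255 before MySQL 5.0.3), subject to the maximum row size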
SMALLINT -32768 to 32767 normal OR 0 to 65535 UNSIGNED
FLOAT An approximate-value number with a floating decimal point, stored in 4 bytes (single precision)
DATE YYYY-MM-DD
DATETIME YYYY-MM-DD HH:MM:SS
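The SMALLINT limits in the table are not arbitrary: a SMALLINT is stored in 2 bytes (16 bits), and the ranges follow from that. A short Python check of the arithmetic, with variable names chosen for this illustration:

```python
BITS = 16  # a SMALLINT is stored in 2 bytes

# Signed: one bit is used for the sign, leaving 15 bits for the magnitude.
signed_range = (-(2 ** (BITS - 1)), 2 ** (BITS - 1) - 1)
# Unsigned: all 16 bits are used for the value.
unsigned_range = (0, 2 ** BITS - 1)

print(signed_range)    # (-32768, 32767)
print(unsigned_range)  # (0, 65535)
```

This is why a SMALLINT key suits a small table such as a list of neighbours but would run out of values in a table expected to exceed about 65,000 rows.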
Activity 13
Conceptual
Note that in the conceptual model we have not resolved the many to many between person
and produce.
Logical
Remember that database design is not a precise art, your design may vary from the answer,
but take a look at this design and consider why it might vary from yours.
Activity 14
Entity Attribute Key Type Format Required Content
PERSON PERSONID PK SMALLINT Y Unique identifier for the person
PERSON FIRSTNAME VARCHAR(30) Y
PERSON SURNAME VARCHAR(30) Y
PERSON ADDRESSID FK SMALLINT Y
ADDRESS ADDRESSID PK SMALLINT Y
ADDRESS ADDRESS1 VARCHAR(50) Y
ADDRESS ADDRESS2 VARCHAR(50)
ADDRESS SUBURB VARCHAR(30) Y
PERSONPRODUCE PERSONID FK SMALLINT Y
PERSONPRODUCE PRODUCEID FK SMALLINT Y
PERSONPRODUCE DATE DATE DDMMYYYY Y
PRODUCE PRODUCEID PK SMALLINT Y
PRODUCE PRODUCENAME VARCHAR(50) Y
Note the following:
> The Address is separate from the person due to normalisation, otherwise where there are
multiple people at the same address the address would be repeated and cause
redundancy.
> Given this is a neighbourhood I have not included state, but if this was a border town like
Tweed Heads that might be necessary.
> The Required value relates to whether the value is required if there is an entry in the
table, therefore if any value is entered in the PERSONPRODUCE table then all fields in that
entity are required.
> It is possible that your model could be different. That does not make it wrong; consider
why it is different and whether it meets the needs of Peter.
Activity 15
The ERD would typically be representative of the physical model, as you would need to have
included the attributes of each model in order to develop the single model. Therefore given
that the physical model is produced last and is the most complete of all models the single ERD
would need to include physical characteristics such as data types.