An Introduction to Database and Database Designing
R. C. Goyal, Principal Scientist, IASRI, New Delhi
Databases Before the Use of Computers
Data kept in books, ledgers, card files, folders, and file cabinets
Long response time
Labor-intensive
Often incomplete or inaccurate
Data & Information
Data: known facts that can be recorded and that have implicit meaning
Information: processed data
Meaningless data becomes information when it is processed and presented to the decision maker in a meaningful way
Database
A set of data that is required for a specific purpose or is fundamental to a system, project, enterprise, or business. A formally structured collection of data.
A database may consist of one or more data banks and be geographically distributed among several repositories.
In automated information systems, the database is manipulated using a database management system.
File Mgmt vs. Database Mgmt
File Management
  Each data entity is in a separate file
  System for creating, retrieving and manipulating files
Database Management
  Same file but data elements are integrated and shared among different files
  Program controls the structure of a database and access to data
Database Management for Strategic Advantage
We live in the Information Age
Information used to make organizations more productive and competitive
Databases used to support business operations
Databases used across a range of applications
  Personal, department, enterprise, web
Technical Advantages of Database Management
Reduced data redundancy
Improved data integrity
More program independence
Increased user productivity
Increased security
Disadvantages of DBMS
Cost issues
Implementation & maintenance issues
Security issues
Privacy issues
Features of DBMS
Database Engine
  heart of DBMS
  stores, retrieves and updates data
  enforces business rules
Data dictionary
  holds definitions of all of the data tables
  Describes the data type, allows DBMS to keep track of data, helps user find data they need
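As a rough illustration of the data dictionary idea, the sketch below uses Python's built-in sqlite3 module. SQLite is chosen here only for convenience; the slides do not name a particular DBMS, and the `students` table and its columns are hypothetical.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_no INTEGER PRIMARY KEY, st_name TEXT NOT NULL)"
)

# SQLite keeps the definitions of all tables in its catalog table, sqlite_master.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(students)")}

print(table_names)
print(columns)
```

Other systems expose the same kind of metadata through their own catalog tables or views.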
Features of DBMS…Cont.
Query processor
  Enables users to store and retrieve data
  Uses English-like commands built from words such as SELECT, DELETE, MODIFY
Report generator
  Formats and prints reports after user uses query processor
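A minimal sketch of what such query commands look like in practice, assuming SQL as the query language (the slide lists the keywords but does not name a language). In SQL, the modify operation is written UPDATE. The `advisors` table and the new room number are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE advisors (adv_id INTEGER PRIMARY KEY, adv_name TEXT, adv_room TEXT)"
)
conn.executemany(
    "INSERT INTO advisors VALUES (?, ?, ?)",
    [(1, "Goyal", "412"), (2, "Gupta", "216")],
)

# Modify: move advisor Goyal to a different (made-up) room.
conn.execute("UPDATE advisors SET adv_room = ? WHERE adv_name = ?", ("415", "Goyal"))

# Delete: remove advisor Gupta.
conn.execute("DELETE FROM advisors WHERE adv_name = ?", ("Gupta",))

# Retrieve: read back whatever is left.
remaining = conn.execute("SELECT adv_name, adv_room FROM advisors").fetchall()
print(remaining)
```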
Features of DBMS…Cont.
Forms generator
Application generator
Access security
  Set up access privileges to protect data from unauthorized access and sabotage
System recovery
Database Development Cycle
Database planning
System definition
Requirements collection and analysis
Database design
DBMS selection
Application design
Prototyping
Implementation
Data conversion and loading
Testing
Operational maintenance
[Figure: Database Life Cycle. Stages shown: Database Planning; Systems Definition; Requirements Collection and Analysis; Database Design; DBMS Selection; Application Design; Prototyping; Implementation; Data Conversion and Loading; Testing; Evaluation & Maintenance]
Source: http://www.cs/ucf.edu/courses/cgs2545/CH02/index.htm
Database Planning
Current systems evaluation
Development of standards
Technological feasibility
Operational feasibility
Economical feasibility
Requirements Collection and Analysis
Identifying management information requirements, determining information requirements by functional area, and establishing hardware and software requirements
Systems definition
Data dictionary
Metadata
Database design methodology
A structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design.
* Conceptual database design
* Logical database design
* Physical database design
Critical Success Factors in Database Design
Work interactively with the users as much as possible.
Follow a structured methodology throughout the data modelling process.
Incorporate structural and integrity considerations into the data models.
Combine conceptualisation, normalisation, and transaction validation techniques into the data modelling methodology.
Tips - Planning
Before you start up the computer to build a database, PLAN ON PAPER!!!
Gather all of the paper forms you use to collect your data
Interview the people who will be using the database, from those who conduct the field work to those who analyze the data
Think about what you want to get back out of your database, not just about how to get data in
Steal - see if it’s been done before
Tips - Tables
Normalize, normalize, normalize
Use field properties to help maintain data integrity
When possible limit the possible entries in a field to a lookup list (domain)
Choose your key fields carefully
Index fields that you will search often
Many-to-many relationships in Access
Steal - link to existing tables
Normalization
The process of breaking down large
tables into smaller ones by removing
all unnecessary or duplicate fields,
eliminating redundant data, and
making sure that each table
represents only one thing.
Why Normalize?
If you don’t normalize, you may run into anomalies (unexpected results)
  Deletion anomaly - deleting a record unexpectedly removes a value we wanted to keep
  Insertion anomaly - we can’t add a record because we don’t know the value of all of the required fields
  Change anomaly - one change must be applied to many records in a single table
Non-normalized Table
Student#  StName  AdvName  AdvRoom  Class1  Class2  Class3
1022      Rajat   Goyal    412      101-07  143-01  159-02
4123      Patel   Gupta    216      202-01  211-02  214-01
Typical flat-file table
Contains all data in one record.
Contains repeating fields.
Contains information on more than one thing.
First Normal Form
Eliminate repeating fields within
tables.
Create a separate table for each set of
related data.
Identify each set of related data with a
primary key.
First Normal Form
No Repeating Fields: Class# is now a single field instead of repeating for each class a student takes.
Now multiple records for each student because >1 record is needed to accommodate all classes for each student
Student#  StName  AdvName  AdvRoom  Class#
1022      Rajat   Goyal    412      101-07
1022      Rajat   Goyal    412      143-01
1022      Rajat   Goyal    412      159-02
4123      Patel   Gupta    216      210-01
4123      Patel   Gupta    216      211-02
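The step from repeating fields to one record per class can be sketched in a few lines of Python. This is only an illustration of the idea, not a prescribed procedure; the function name `to_first_normal_form` is hypothetical, and the sample record reuses the values for student 1022 shown on the slides.

```python
# A flat record as in the non-normalized table: one row per student,
# with the classes held in the repeating fields Class1..Class3.
flat = [
    {"Student#": 1022, "StName": "Rajat", "AdvName": "Goyal", "AdvRoom": "412",
     "Class1": "101-07", "Class2": "143-01", "Class3": "159-02"},
]


def to_first_normal_form(records):
    """Emit one row per (student, class) so that Class# is a single field."""
    rows = []
    for rec in records:
        for field in ("Class1", "Class2", "Class3"):
            class_no = rec.get(field)
            if class_no:  # skip empty class slots
                rows.append((rec["Student#"], rec["StName"], rec["AdvName"],
                             rec["AdvRoom"], class_no))
    return rows


first_nf = to_first_normal_form(flat)
for row in first_nf:
    print(row)
```

Each student now appears once per class, which removes the repeating fields but repeats the name and advisor details.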
Second Normal Form
Create separate tables for values
that apply to multiple records.
Relate these tables with a foreign
key.
Second Normal Form
Eliminate Redundant Data: no longer repeating student name, advisor, and adv-room for each class
Students:
Student#  StName  AdvName  AdvRoom
1022      Rajat   Goyal    412
4123      Patel   Gupta    216
Classes:
Student#  Class#
1022      101-07
1022      143-01
1022      159-02
4123      210-01
4123      211-02
4123      214-01
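A small sketch of this split, assuming the first-normal-form rows are held as Python tuples. The function name `to_second_normal_form` is hypothetical, and the input uses the five rows of the first-normal-form table.

```python
# Rows in first normal form: (Student#, StName, AdvName, AdvRoom, Class#).
first_nf = [
    (1022, "Rajat", "Goyal", "412", "101-07"),
    (1022, "Rajat", "Goyal", "412", "143-01"),
    (1022, "Rajat", "Goyal", "412", "159-02"),
    (4123, "Patel", "Gupta", "216", "210-01"),
    (4123, "Patel", "Gupta", "216", "211-02"),
]


def to_second_normal_form(rows):
    """Split into a Students table (one entry per student) and a Classes table."""
    students = {}  # Student# -> (StName, AdvName, AdvRoom), stored once
    classes = []   # (Student#, Class#); Student# acts as the foreign key
    for student_no, st_name, adv_name, adv_room, class_no in rows:
        students[student_no] = (st_name, adv_name, adv_room)
        classes.append((student_no, class_no))
    return students, classes


students, classes = to_second_normal_form(first_nf)
print(students)
print(classes)
```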
Third Normal Form
Eliminate fields that do not depend on the key.
AdvRoom is independent of the key field (Student#). It does not depend on the student, it depends on
the advisor.
Students:
Student#  StName  AdvName  AdvRoom
1022      Rajat   Goyal    412
4123      Patel   Gupta    216
Third Normal Form
Eliminate Data Not Dependent on Key: remove AdvName and AdvRoom from the Students table and create a separate table describing the advisors with its own primary key (AdvID).
Relate Advisors to Students using AdvID.
Students:
Student#  StName  AdvID
1022      Rajat   1
4123      Patel   2
Advisors:
AdvID  AdvName  AdvRoom
1      Goyal    412
2      Gupta    216
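The third-normal-form design can be tried out with Python's built-in sqlite3 module. This is a sketch under the assumption that SQL is used; the lower-case table and column names are adaptations of the slide's field names, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE advisors (
        adv_id   INTEGER PRIMARY KEY,
        adv_name TEXT NOT NULL,
        adv_room TEXT NOT NULL
    );
    CREATE TABLE students (
        student_no INTEGER PRIMARY KEY,
        st_name    TEXT NOT NULL,
        adv_id     INTEGER NOT NULL REFERENCES advisors(adv_id)
    );
    INSERT INTO advisors VALUES (1, 'Goyal', '412'), (2, 'Gupta', '216');
    INSERT INTO students VALUES (1022, 'Rajat', 1), (4123, 'Patel', 2);
""")

# Switching Rajat to advisor Gupta now touches exactly one row.
changed = conn.execute(
    "UPDATE students SET adv_id = 2 WHERE student_no = 1022"
).rowcount

# A join rebuilds the combined view whenever it is needed.
report = conn.execute("""
    SELECT s.student_no, s.st_name, a.adv_name, a.adv_room
    FROM students AS s
    JOIN advisors AS a ON a.adv_id = s.adv_id
    ORDER BY s.student_no
""").fetchall()
print(changed, report)
```

Because the advisor's room is stored only in the advisors table, it follows the advisor automatically when a student is reassigned.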
Deletion Anomalies
If we delete student Rajat, we lose the only records for advisor Goyal
Student#  StName  AdvName  AdvRoom  Class#
1022      Rajat   Goyal    412      101-07
1022      Rajat   Goyal    412      143-01
1022      Rajat   Goyal    412      159-02
4123      Patel   Gupta    216      210-01
4123      Patel   Gupta    216      211-02
Insertion Anomalies
If we want to add a new student we have to know what class they are in (it’s part of the key)
Student#  StName  AdvName  AdvRoom  Class#
1022      Rajat   Goyal    412      101-07
1022      Rajat   Goyal    412      143-01
1022      Rajat   Goyal    412      159-02
4123      Patel   Gupta    216      210-01
4123      Patel   Gupta    216      211-02
Change Anomalies
If student Rajat switches to advisor Gupta, we have to change three records instead of just one
Student#  StName  AdvName  AdvRoom  Class#
1022      Rajat   Goyal    412      101-07
1022      Rajat   Goyal    412      143-01
1022      Rajat   Goyal    412      159-02
4123      Patel   Gupta    216      210-01
4123      Patel   Gupta    216      211-02
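The three anomalies can be reproduced on this single table with a short sqlite3 sketch. The table name `enrolment`, the NOT NULL declarations, and the new student (5001, Singh) are assumptions made for the example.

```python
import sqlite3

ROWS = [
    (1022, "Rajat", "Goyal", "412", "101-07"),
    (1022, "Rajat", "Goyal", "412", "143-01"),
    (1022, "Rajat", "Goyal", "412", "159-02"),
    (4123, "Patel", "Gupta", "216", "210-01"),
    (4123, "Patel", "Gupta", "216", "211-02"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrolment (
        student_no INTEGER NOT NULL,
        st_name    TEXT,
        adv_name   TEXT,
        adv_room   TEXT,
        class_no   TEXT NOT NULL,
        PRIMARY KEY (student_no, class_no)
    )
""")
conn.executemany("INSERT INTO enrolment VALUES (?, ?, ?, ?, ?)", ROWS)

# Change anomaly: count the rows that must be edited if Rajat changes advisor.
rows_to_change = conn.execute(
    "SELECT COUNT(*) FROM enrolment WHERE student_no = 1022"
).fetchone()[0]

# Insertion anomaly: a new (hypothetical) student without a class is rejected,
# because class_no is part of the key and may not be NULL.
try:
    conn.execute("INSERT INTO enrolment VALUES (5001, 'Singh', 'Goyal', '412', NULL)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Deletion anomaly: deleting Rajat also removes every trace of advisor Goyal.
conn.execute("DELETE FROM enrolment WHERE student_no = 1022")
goyal_rows_left = conn.execute(
    "SELECT COUNT(*) FROM enrolment WHERE adv_name = 'Goyal'"
).fetchone()[0]

print(rows_to_change, insert_rejected, goyal_rows_left)
```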
Field Properties
Data Type - text, integer, double, boolean, etc.
Field Size - for text fields
Input Mask - a pattern for entering data
Default Values - auto-entered for new records
Validation Rule - limits values entered
Required? - force entry of data
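Field properties of this kind have approximate counterparts in SQL column constraints. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption (the slides refer to Access elsewhere). The table `class_enrolment`, the `status` column, and the 30-character limit are invented for illustration. A CHECK with a GLOB pattern only validates the stored value; it does not guide typing the way an input mask does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE class_enrolment (
        student_no INTEGER NOT NULL,                             -- data type, required
        st_name    TEXT NOT NULL CHECK (length(st_name) <= 30),  -- field size
        class_no   TEXT NOT NULL                                 -- pattern check, like a mask
                   CHECK (class_no GLOB '[0-9][0-9][0-9]-[0-9][0-9]'),
        status     TEXT NOT NULL DEFAULT 'active'                -- default value
                   CHECK (status IN ('active', 'dropped'))       -- validation rule
    )
""")

conn.execute(
    "INSERT INTO class_enrolment (student_no, st_name, class_no) "
    "VALUES (1022, 'Rajat', '101-07')"
)
default_status = conn.execute("SELECT status FROM class_enrolment").fetchone()[0]


def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Class number without the hyphen fails the pattern check.
bad_pattern = is_rejected(
    "INSERT INTO class_enrolment (student_no, st_name, class_no) "
    "VALUES (4123, 'Patel', '21102')"
)
# Omitting the required name fails the NOT NULL constraint.
missing_name = is_rejected(
    "INSERT INTO class_enrolment (student_no, class_no) VALUES (4123, '211-02')"
)
print(default_status, bad_pattern, missing_name)
```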
Many-to-Many Relationship
Exists between two tables when:
  for one record in the first table, there can be many corresponding records in the second table, and
  for one record in the second table, there can be many corresponding records in the first table
Many-to-Many Relationship
One student can take many classes, and one class can be taken by many students.
Students
  StudentID, Name, Address, City, State, Zip, WorkPhone, HomePhone
Classes
  ClassNumber, StudentID, Subject, InstructorID, Days, Time, Comments
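One common way to implement a many-to-many relationship is a third, linking table that holds one row per student and class pair. The slide does not spell this technique out, so the sketch below is an assumption; the `enrolments` table, the subject names, and the enrolment rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes  (class_number TEXT PRIMARY KEY, subject TEXT);
    -- Linking table: each row pairs one student with one class.
    CREATE TABLE enrolments (
        student_id   INTEGER REFERENCES students(student_id),
        class_number TEXT REFERENCES classes(class_number),
        PRIMARY KEY (student_id, class_number)
    );
    INSERT INTO students VALUES (1022, 'Rajat'), (4123, 'Patel');
    INSERT INTO classes VALUES ('101-07', 'Subject A'), ('211-02', 'Subject B');
    INSERT INTO enrolments VALUES (1022, '101-07'), (1022, '211-02'), (4123, '211-02');
""")

# One student, many classes.
classes_of_rajat = [
    row[0]
    for row in conn.execute(
        "SELECT class_number FROM enrolments "
        "WHERE student_id = 1022 ORDER BY class_number"
    )
]

# One class, many students.
students_in_211 = [
    row[0]
    for row in conn.execute(
        "SELECT s.name FROM enrolments AS e "
        "JOIN students AS s ON s.student_id = e.student_id "
        "WHERE e.class_number = '211-02' ORDER BY s.name"
    )
]
print(classes_of_rajat, students_in_211)
```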
Tips - Queries
Take advantage of action queries to handle batch record operations
Use queries to present calculated values rather than storing the calculated values in your tables
Remember that null never equals another null
  Joining two tables on a field that may contain a null value may not give you the results you expect
  Searching for duplicate values will not return two records that have a null
Realize that you can link a table to itself
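The null behaviour in joins can be checked with a short sqlite3 sketch. SQLite stands in here for whatever DBMS is actually used, and details may differ between systems; the tables and the 'Unassigned' advisor row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with NULL yields NULL (unknown), which sqlite3 returns as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

conn.executescript("""
    CREATE TABLE students (student_no INTEGER, adv_id INTEGER);
    CREATE TABLE advisors (adv_id INTEGER, adv_name TEXT);
    INSERT INTO students VALUES (1022, 1), (4123, NULL);
    INSERT INTO advisors VALUES (1, 'Goyal'), (NULL, 'Unassigned');
""")

# The student with a NULL adv_id does not match the advisor row with a NULL adv_id,
# so an inner join silently leaves that student out.
joined = conn.execute(
    "SELECT s.student_no, a.adv_name "
    "FROM students AS s JOIN advisors AS a ON a.adv_id = s.adv_id"
).fetchall()
print(null_equals_null, joined)
```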
DBMS Selection
* Costs
* Features and Tools
* Underlying model
* Portability
* DBMS hardware requirements
* Organisational requirements
Implementation
The physical realisation of the database and application designs
The detailed model is converted to the appropriate implementation model, the data dictionary is built, the database is populated, application programs are developed, and users are trained
Data Conversion and Loading & Testing
Transferring any existing or new data into the new database and converting any existing applications to run on the new database
Finding errors
Operational maintenance
preventive maintenance (backup)
corrective maintenance (recovery)
adaptive maintenance
regular monitoring & periodical check up
Recent Developments Affecting Database Design and Use
Data Mining (On-Line Analytical Processing)
  Drill down from summary data to detailed data
Data Warehouses/Data Marts
  Integrates many large databases into one repository
Linking Web Site Applications to Organizational Databases
  Users have Web view to organizational database
  Improves customer contact and service
  Adds security as a concern
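The drill-down idea listed under data mining above, moving from summary data to the detail behind it, can be sketched in plain Python. The sales records and the function names `summarize` and `drill_down` are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail records: (region, product, units_sold).
sales = [
    ("North", "Wheat", 120),
    ("North", "Rice", 80),
    ("South", "Wheat", 50),
    ("South", "Rice", 200),
]


def summarize(records):
    """Summary level: total units per region."""
    totals = defaultdict(int)
    for region, _product, units in records:
        totals[region] += units
    return dict(totals)


def drill_down(records, region):
    """Detail level: the individual product figures behind one region's total."""
    return [(product, units) for r, product, units in records if r == region]


summary = summarize(sales)
north_detail = drill_down(sales, "North")
print(summary)
print(north_detail)
```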
Data Warehouses and Database
In the data warehouse, data are organized around major subjects
Data in the warehouse are stored as summarized rather than detailed raw data
Data in the data warehouse cover a much longer time frame than in a traditional transaction-oriented database
Data warehouses are organized for fast queries
Data warehouses are usually optimized for answering complex queries, known as OLAP
Data Warehouses and Database
Data warehouses allow for easy access via data-mining software called siftware
Data warehouses include multiple databases that have been processed so that data are uniformly defined, containing what is referred to as “clean” data
Data warehouses usually contain data from outside sources
Data Mining Patterns
Data mining patterns that decision makers try to identify include:
  Associations, patterns that occur together
  Sequences, patterns of actions that take place over a period of time
  Clustering, patterns that develop among groups of people
  Trends, the patterns that are noticed over a period of time
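As a toy illustration of the association pattern, the sketch below counts how often pairs of items occur together in a set of transactions. Real data-mining tools use more elaborate measures; the transactions and the function name `pair_counts` are made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, each a set of items recorded together.
baskets = [
    {"seed", "fertilizer"},
    {"seed", "fertilizer", "pesticide"},
    {"seed", "pesticide"},
]


def pair_counts(transactions):
    """Count how many transactions contain each pair of items."""
    counts = Counter()
    for items in transactions:
        # Sorting gives every pair a fixed order, so (a, b) and (b, a) are one key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)
print(counts.most_common())
```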
Web Based Databases and XML
Web-based databases are used for sharing data
Extensible markup language (XML) is used to define data used primarily for business data exchange over the Web
An XML document contains only data and the nature of the data
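A small sketch of what "data and the nature of the data" means in practice: the tags describe what each value is, and a program can read the values back by tag name. The element and attribute names below are hypothetical, and the parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: tags name the data, text and attributes carry it.
document = """
<students>
  <student number="1022"><name>Rajat</name><advisor>Goyal</advisor></student>
  <student number="4123"><name>Patel</name><advisor>Gupta</advisor></student>
</students>
"""

root = ET.fromstring(document)

# Read each student element back into a (number, name, advisor) tuple.
records = [
    (student.get("number"), student.findtext("name"), student.findtext("advisor"))
    for student in root.findall("student")
]
print(records)
```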